github.com/moses-smt/mosesdecoder.git
author Hieu Hoang <hieuhoang@gmail.com> 2014-09-04 19:31:14 +0400
committer Hieu Hoang <hieuhoang@gmail.com> 2014-09-04 19:31:14 +0400
commit eb75e588204cfa6f4524f45bfd0894628c23ff6f
tree 657b7a71ef9e5d02ab329725b1c2deed61e97522 /scripts/share
parent c861e191b3848ba63827e2d059a8af4ec391b450
parent 1da3df93bcd7e115d9bfe78888dc71460a544cc0
Merge pull request #72 from flammie/master
Add Finnish non-breaking prefixes
Diffstat (limited to 'scripts/share')
-rw-r--r-- scripts/share/nonbreaking_prefixes/nonbreaking_prefix.fi 138
1 files changed, 138 insertions, 0 deletions
diff --git a/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.fi b/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.fi
new file mode 100644
index 000000000..466c6a837
--- /dev/null
+++ b/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.fi
@@ -0,0 +1,138 @@
+#Anything in this file, followed by a period (and an upper-case word), does NOT
+#indicate an end-of-sentence marker. Special cases are included for prefixes
+#that ONLY appear before 0-9 numbers.
+
+#This list is compiled from omorfi <http://code.google.com/p/omorfi> database
+#by Tommi A Pirinen.
+
+
+#any single upper case letter followed by a period is not a sentence ender
+A
+B
+C
+D
+E
+F
+G
+H
+I
+J
+K
+L
+M
+N
+O
+P
+Q
+R
+S
+T
+U
+V
+W
+X
+Y
+Z
+
+#List of titles. These are often followed by upper-case names, but do not indicate sentence breaks
+alik
+alil
+amir
+apul
+apul.prof
+arkkit
+ass
+assist
+dipl
+dipl.arkkit
+dipl.ekon
+dipl.ins
+dipl.kielenk
+dipl.kirjeenv
+dipl.kosm
+dipl.urk
+dos
+erikoiseläinl
+erikoishammasl
+erikoisl
+erikoist
+ev.luutn
+evp
+fil
+ft
+hallinton
+hallintot
+hammaslääket
+jatk
+jääk
+kansaned
+kapt
+kapt.luutn
+kenr
+kenr.luutn
+kenr.maj
+kers
+kirjeenv
+kom
+kom.kapt
+komm
+konst
+korpr
+luutn
+maist
+maj
+Mr
+Mrs
+Ms
+M.Sc
+neuv
+nimim
+Ph.D
+prof
+puh.joht
+pääll
+res
+san
+siht
+suom
+sähköp
+säv
+toht
+toim
+toim.apul
+toim.joht
+toim.siht
+tuom
+ups
+vänr
+vääp
+ye.ups
+ylik
+ylil
+ylim
+ylimatr
+yliop
+yliopp
+ylip
+yliv
+
+#misc - odd period-ending items that NEVER indicate breaks (p.m. does NOT fall
+#into this category - it sometimes ends a sentence)
+e.g
+ent
+esim
+huom
+i.e
+ilm
+l
+mm
+myöh
+nk
+nyk
+par
+po
+t
+v
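
The header comment of the new file states the rule a sentence splitter is expected to apply: an entry followed by a period (and an upper-case word) is not a sentence end, with a special case for prefixes that only count before digits. The following is a minimal Python sketch of that rule, not the Moses implementation. The function names (load_prefixes, is_sentence_break, split_sentences) are hypothetical, the whitespace tokenization is a simplification, and the #NUMERIC_ONLY# tag is assumed from the convention in other Moses prefix files; none of the Finnish entries above carries it.

```python
NUMERIC_ONLY_TAG = "#NUMERIC_ONLY#"  # assumed marker, not present in the Finnish list


def load_prefixes(lines):
    """Parse prefix-file lines into {prefix: numeric_only_flag}."""
    prefixes = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        numeric_only = line.endswith(NUMERIC_ONLY_TAG)
        if numeric_only:
            line = line[: -len(NUMERIC_ONLY_TAG)].strip()
        if line:
            prefixes[line] = numeric_only
    return prefixes


def is_sentence_break(token, next_token, prefixes):
    """Decide whether a sentence boundary follows a period-terminated token."""
    if not token.endswith("."):
        return False
    stem = token[:-1]
    if stem in prefixes:
        if not prefixes[stem]:
            return False  # ordinary prefix: never a break
        if next_token[:1].isdigit():
            return False  # numeric-only prefix followed by a number
    # Simplification: only an upper-case next word signals a new sentence.
    return next_token[:1].isupper()


def split_sentences(text, prefixes):
    """Split whitespace-tokenized text into sentences using the prefix list."""
    words = text.split()
    sentences, current = [], []
    for i, word in enumerate(words):
        current.append(word)
        nxt = words[i + 1] if i + 1 < len(words) else ""
        if nxt and is_sentence_break(word, nxt, prefixes):
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    # A few entries from the list above, supplied inline for illustration.
    demo = load_prefixes(["#titles", "prof", "dipl.ins", "esim"])
    for sentence in split_sentences(
        "Tapasin prof. Virtasen eilen. Hän asuu Turussa.", demo
    ):
        print(sentence)
```

In practice the entries would be read from nonbreaking_prefix.fi rather than an inline list. Storing the numeric-only flag next to each prefix keeps the lookup to a single dictionary access per token.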