github.com/moses-smt/mosesdecoder.git
author Linas Vepstas <linasvepstas@gmail.com> 2017-01-05 07:01:45 +0300
committer Linas Vepstas <linasvepstas@gmail.com> 2017-01-05 07:01:45 +0300
commit 2a5e40ed60d351f05ca58ad3be6ec0865d08373f
tree 3bed6cb7a0420d835ced0810d35f87182b495792 /scripts/share/nonbreaking_prefixes
parent 1da0dc249cd1412bb3dba1517a33fdeb2660fb88
New file: Lithuanian
Diffstat (limited to 'scripts/share/nonbreaking_prefixes')
-rw-r--r-- scripts/share/nonbreaking_prefixes/nonbreaking_prefix.lt 110
1 file changed, 110 insertions, 0 deletions
diff --git a/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.lt b/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.lt
new file mode 100644
index 000000000..d7829e3c0
--- /dev/null
+++ b/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.lt
@@ -0,0 +1,110 @@
+# Anything in this file, followed by a period (and an upper-case word),
+# does NOT indicate an end-of-sentence marker.
+# Special cases are included for prefixes that ONLY appear before 0-9 numbers.
+
+# Any single upper case letter followed by a period is not a sentence ender
+# (excluding I occasionally, but we leave it in)
+# usually upper case letters are initials in a name
+A
+B
+C
+D
+E
+F
+G
+H
+I
+J
+K
+L
+M
+N
+O
+P
+Q
+R
+S
+T
+U
+V
+W
+X
+Y
+Z
+
+# Abbreviations: m. (mėnesis), d. (diena), g. (gimęs)
+m
+d
+g
+
+# Day and month abbreviations
+# Pirmadienis Penktadienis
+Pr
+Pn
+Pirm
+Antr
+Treč
+Ketv
+Penkt
+Šešt
+Sekm
+Saus
+Vas
+Kov
+Bal
+Geg
+Birž
+Liep
+Rugpj
+Rugs
+Spal
+Lapkr
+Gruod
+
+# List of titles. These are often followed by upper-case names, but do
+# not indicate sentence breaks
+#
+# Gerbiamasis
+Gerb
+
+# XXX TODO: the entries below are not quite correct; they were copied from Latvian
+dr
+Dr
+med
+prof
+Prof
+inž
+Inž
+ist.loc
+Ist.loc
+kor.loc
+Kor.loc
+v.i
+vietn
+Vietn
+
+# misc - odd period-ending items that NEVER indicate breaks (p.m. does NOT
+# fall into this category - it sometimes ends a sentence)
+# angl: angliškai
+# dab: dabartinė
+angl
+dab
+
+
+# Numbers only. These are treated as non-breaking only when followed by a numeric sequence.
+# Add #NUMERIC_ONLY# after the word to enable this behavior.
+# This case is mostly for the English "No.", which can either be a sentence of its own or,
+# if followed by a number, a non-breaking prefix.
+No #NUMERIC_ONLY#
+Nr #NUMERIC_ONLY#
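
The rules stated in the file's comments can be illustrated with a small Python sketch. This is not the Moses sentence splitter, whose actual logic handles more cases; it is a simplified reading of the comments above. The function names `load_prefixes` and `is_sentence_break` are hypothetical, and the sketch assumes whitespace-separated tokens and one prefix per line.

```python
# Marker used in the prefix file for entries that only protect a following number.
NUMERIC_ONLY = "#NUMERIC_ONLY#"


def load_prefixes(lines):
    """Parse prefix-file lines into a dict: prefix -> numeric_only flag."""
    prefixes = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.endswith(NUMERIC_ONLY):
            prefixes[line[: -len(NUMERIC_ONLY)].strip()] = True
        else:
            prefixes[line] = False
    return prefixes


def is_sentence_break(token, next_token, prefixes):
    """Hypothetical helper: does the period ending `token` end a sentence?

    Simplified assumption: a period ends a sentence only when the next
    token starts with an upper-case letter and the word before the period
    is not protected by the prefix list.
    """
    if not token.endswith("."):
        return False
    word = token[:-1]
    if word in prefixes:
        numeric_only = prefixes[word]
        if not numeric_only:
            return False  # ordinary prefix: never a sentence break
        if next_token[:1].isdigit():
            return False  # numeric-only prefix followed by a number
    return next_token[:1].isupper()


if __name__ == "__main__":
    sample = ["# titles", "Dr", "Gerb", "", "Nr #NUMERIC_ONLY#"]
    table = load_prefixes(sample)
    print(is_sentence_break("Dr.", "Jonas", table))
    print(is_sentence_break("Nr.", "5", table))
    print(is_sentence_break("Nr.", "Jonas", table))
```

In this sketch an ordinary entry such as `Dr` always suppresses the break, while a `#NUMERIC_ONLY#` entry such as `Nr` suppresses it only when the next token begins with a digit.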