github.com/moses-smt/mosesdecoder.git
author Linas Vepstas <linasvepstas@gmail.com> 2017-01-05 20:34:38 +0300
committer Linas Vepstas <linasvepstas@gmail.com> 2017-01-05 20:34:38 +0300
commit 203c7c63875ff8748abc68886f62d0b1a2b20b26
tree e280f2e4a4a86bda4387e66cbdeed0f95f149bd3 /scripts/share/nonbreaking_prefixes
parent 144f43495e73b4e7d6224d798dad20065cd6b21f
Preliminary support for Chinese.
Diffstat (limited to 'scripts/share/nonbreaking_prefixes')
-rw-r--r-- scripts/share/nonbreaking_prefixes/nonbreaking_prefix.zh | 53
1 files changed, 53 insertions, 0 deletions
diff --git a/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.zh b/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.zh
new file mode 100644
index 000000000..077710c87
--- /dev/null
+++ b/scripts/share/nonbreaking_prefixes/nonbreaking_prefix.zh
@@ -0,0 +1,53 @@
+#
+# Chinese (Mandarin, Cantonese)
+#
+# Anything in this file, followed by a period,
+# does NOT indicate an end-of-sentence marker.
+#
+# English/Euro-language given-name initials (appearing in
+# news, periodicals, etc.)
+A
+B
+C
+D
+E
+F
+G
+H
+I
+J
+K
+L
+M
+N
+O
+P
+Q
+R
+S
+T
+U
+V
+W
+X
+Y
+Z
+
+# Numbers only. These are non-breaking only when followed by
+# a numeric sequence; otherwise the period ends the sentence.
+# Add NUMERIC_ONLY after the word for this function. This case is
+# mostly for the English "No.", which can either be a sentence of its
+# own or, if followed by a number, a non-breaking prefix.
+No #NUMERIC_ONLY#
+Nr #NUMERIC_ONLY#
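
The rules described in the file's comments can be illustrated with a short sketch. This is not the Moses implementation (the scripts that consume this file are not part of this diff); it is a simplified Python approximation, and the names `load_prefixes` and `is_sentence_end` are hypothetical, introduced only for illustration. It assumes whitespace-separated tokens and treats a prefix marked `#NUMERIC_ONLY#` as non-breaking only when the next token starts with a digit.

```python
NUMERIC_ONLY_TAG = "#NUMERIC_ONLY#"


def load_prefixes(lines):
    """Parse prefix-file lines into {prefix: numeric_only_flag}.

    Hypothetical helper: blank lines and lines starting with '#' are
    skipped, as the comment lines in the file above suggest.
    """
    prefixes = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith(NUMERIC_ONLY_TAG):
            # e.g. "No #NUMERIC_ONLY#" -> prefix "No", numeric-only
            word = line[: -len(NUMERIC_ONLY_TAG)].strip()
            prefixes[word] = True
        else:
            prefixes[line] = False
    return prefixes


def is_sentence_end(token, next_token, prefixes):
    """Decide whether the period ending `token` closes a sentence.

    Assumption: a token that does not end in '.' never closes a
    sentence in this simplified model.
    """
    if not token.endswith("."):
        return False
    stem = token[:-1]
    if stem not in prefixes:
        return True  # ordinary word followed by a period
    if prefixes[stem]:
        # Numeric-only prefix: non-breaking only if a digit follows.
        return not (bool(next_token) and next_token[0].isdigit())
    return False  # plain non-breaking prefix, e.g. an initial


# A few lines in the same format as the file above.
SAMPLE = ["# comment", "A", "", "No #NUMERIC_ONLY#"]
prefixes = load_prefixes(SAMPLE)
```

With these sample lines, `is_sentence_end("A.", "Smith", prefixes)` and `is_sentence_end("No.", "5", prefixes)` both report no sentence break, while `is_sentence_end("No.", "Thanks", prefixes)` reports one.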