github.com/moses-smt/mosesdecoder.git
author Ulrich Germann <ugermann@inf.ed.ac.uk> 2014-08-04 20:20:33 +0400
committer Ulrich Germann <ugermann@inf.ed.ac.uk> 2014-08-04 20:20:33 +0400
commit ef307b29c206747778ab959286b50f57ede8bace
tree 16d16154af82257cd7fd7162ef7ea3338e68b080 /doc
parent 2711360ce7cddc24761c207a9fbba1ec36eb3d2d
Replaced content by pointer to online documentation.
Diffstat (limited to 'doc')
-rw-r--r-- doc/PhraseDictionaryBitextSampling.howto 33
1 files changed, 3 insertions, 30 deletions
diff --git a/doc/PhraseDictionaryBitextSampling.howto b/doc/PhraseDictionaryBitextSampling.howto
index 143a634ed..69ab11b5b 100644
--- a/doc/PhraseDictionaryBitextSampling.howto
+++ b/doc/PhraseDictionaryBitextSampling.howto
@@ -1,31 +1,4 @@
-How to use memory-mapped suffix array phrase tables in the moses decoder
-(phrase-based decoding only)
-
-1. Compile with the bjam switch --with-mm
-
-2. You need
- - sentences aligned text files
- - the word alignment between these files in symal output format
-
-3. Build binary files
-
- Let
- ${L1} be the extension of the language that you are translating from,
- ${L2} the extension of the language that you want to translate into, and
- ${CORPUS} the name of the word-aligned training corpus
-
- % zcat ${CORPUS}.${L1}.gz | mtt-build -i -o /some/path/${CORPUS}.${L1}
- % zcat ${CORPUS}.${L2}.gz | mtt-build -i -o /some/path/${CORPUS}.${L2}
- % zcat ${CORPUS}.${L1}-${L2}.symal.gz | symal2mam /some/path/${CORPUS}.${L1}-${L2}.mam
- % mmlex-build /some/path/${CORPUS} ${L1} ${L2} -o /some/path/${CORPUS}.${L1}-${L2}.lex -c /some/path/${CORPUS}.${L1}-${L2}.coc
-
-4. Define line in moses.ini
-
- The best configuration of phrase table features is still under investigation.
- For the time being, try this:
-
- PhraseDictionaryBitextSampling name=PT0 output-factor=0 num-features=9 path=/some/path/${CORPUS} L1=${L1} L2=${L2} pfwd=g pbwd=g smooth=0 sample=1000 workers=1
-
- You can increase the number of workers for sampling (a bit faster),
- but you'll lose replicability of the translation output.
+The documentation for memory-mapped, dynamic suffix arrays has moved to
+ http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc40
+Search for PhraseDictionaryBitextSampling.
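
The build commands in step 3 of the removed how-to can be sketched as a dry-run shell script. It only prints the four commands, so it runs without Moses installed. The values `corpus`, `fr`, and `en` and the `print_build_commands` helper are hypothetical placeholders, not part of the original file; the tool names and flags are copied from the removed text and may have changed since this commit, so check them against the online documentation.

```shell
#!/bin/sh
# Dry-run sketch: prints the build commands from the removed how-to
# instead of executing them. All default values are hypothetical.
CORPUS="${CORPUS:-corpus}"   # name of the word-aligned training corpus
L1="${L1:-fr}"               # extension of the source language
L2="${L2:-en}"               # extension of the target language
OUT="${OUT:-/some/path}"     # output directory, placeholder as in the how-to

# Hypothetical helper: echoes one command per line, in the how-to's order.
print_build_commands() {
    base="${OUT}/${CORPUS}"
    echo "zcat ${CORPUS}.${L1}.gz | mtt-build -i -o ${base}.${L1}"
    echo "zcat ${CORPUS}.${L2}.gz | mtt-build -i -o ${base}.${L2}"
    echo "zcat ${CORPUS}.${L1}-${L2}.symal.gz | symal2mam ${base}.${L1}-${L2}.mam"
    echo "mmlex-build ${base} ${L1} ${L2} -o ${base}.${L1}-${L2}.lex -c ${base}.${L1}-${L2}.coc"
}

print_build_commands
```

Setting `CORPUS`, `L1`, `L2`, and `OUT` in the environment before running the script substitutes real values into the printed commands.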