github.com/marian-nmt/marian-examples.git
author Marcin Junczys-Dowmunt <marcinjd@microsoft.com> 2018-11-26 22:11:36 +0300
committer Marcin Junczys-Dowmunt <marcinjd@microsoft.com> 2018-11-26 22:11:36 +0300
commit 7f8d6b435a8e45c7c22c9a321b16a9391eba6b82
tree ddf564a6cec5d16447e0550dc80e3076938fb123
parent 6ab33f71542e48d0c47628281a3ff6776dacd1f0
add comment on lack of processing for translation
-rw-r--r-- training-basics-sentencepiece/README.md | 5
1 files changed, 4 insertions, 1 deletions
diff --git a/training-basics-sentencepiece/README.md b/training-basics-sentencepiece/README.md
index 5841db7..1e445c1 100644
--- a/training-basics-sentencepiece/README.md
+++ b/training-basics-sentencepiece/README.md
@@ -240,7 +240,10 @@ stops improving. Depending on the number of and generation of GPUs you are using
### Translating the test and validation sets with evaluation
After training, the model with the highest translation validation score is used
-to translate the WMT2016 dev set and test set with `marian-decoder`:
+to translate the WMT2016 dev set and test set with `marian-decoder`. Note again
+that none of the commands below requires any pre- or post-processing. The
+decoder consumes and outputs raw text, with SentencePiece doing the tokenization,
+normalization and segmentation on the fly. Similarly, sacreBLEU expects raw text.
```
# translate dev set