From 7f8d6b435a8e45c7c22c9a321b16a9391eba6b82 Mon Sep 17 00:00:00 2001
From: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
Date: Mon, 26 Nov 2018 11:11:36 -0800
Subject: add comment on lack of processing for translation

---
 training-basics-sentencepiece/README.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/training-basics-sentencepiece/README.md b/training-basics-sentencepiece/README.md
index 5841db7..1e445c1 100644
--- a/training-basics-sentencepiece/README.md
+++ b/training-basics-sentencepiece/README.md
@@ -240,7 +240,10 @@ stops improving. Depending on the number of and generation of GPUs you are using
 ### Translating the test and validation sets with evaluation
 
 After training, the model with the highest translation validation score is used
-to translate the WMT2016 dev set and test set with `marian-decoder`:
+to translate the WMT2016 dev set and test set with `marian-decoder`. Note again
+that none of the commands below requires any pre- or post-processing. The
+decoder consumes and outputs raw text, with SentencePiece handling tokenization,
+normalization and segmentation on the fly. Similarly, sacreBLEU expects raw text.
 
 ```
 # translate dev set
-- 
cgit v1.2.3