github.com/marian-nmt/sentencepiece.git
author Taku Kudo <taku910@users.noreply.github.com> 2020-10-17 06:28:36 +0300
committer GitHub <noreply@github.com> 2020-10-17 06:28:36 +0300
commit 496f22507529d6c4e2935a5967fd4fb4e53ebd47
tree 75ed571eae158d56238d02f7ba5e9595f7271508
parent 0b5bd12205364c064a93c77d425ac7e6f8e41df3
parent d796421cbaa4b8f57f8005bb3c1a1ad4173d68d8
Merge pull request #556 from equivalence1/fix_readme_sil_symbol
Fix space symbol in code snippet: _ -> ▁
-rw-r--r-- README.md 2
1 file changed, 1 insertion, 1 deletion
diff --git a/README.md b/README.md
index 5543d79..d1873e7 100644
--- a/README.md
+++ b/README.md
@@ -84,7 +84,7 @@ Then, this text is segmented into small pieces, for example:
Since the whitespace is preserved in the segmented text, we can detokenize the text without any ambiguities.
```
- detokenized = ''.join(pieces).replace('_', ' ')
+ detokenized = ''.join(pieces).replace('▁', ' ')
```
This feature makes it possible to perform detokenization without relying on language-specific resources.
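
Why the one-character change matters can be seen in a minimal sketch. The piece list below is hypothetical, not output from a real model; it only assumes, as the README states, that the whitespace marker is `▁` (U+2581) and not the ASCII underscore `_`.

```python
# The whitespace marker is U+2581 (LOWER ONE EIGHTH BLOCK), not ASCII "_".
WS = "\u2581"


def detokenize(pieces):
    """Join pieces and map the whitespace marker back to a plain space."""
    return "".join(pieces).replace(WS, " ")


# Hypothetical pieces for illustration; note the literal underscore in the text.
pieces = ["Hello", WS + "snake", "_", "case", WS + "world", "."]

# Snippet after this commit: only the marker is replaced.
fixed = detokenize(pieces)

# Snippet before this commit: the underscore is replaced, the marker is left behind.
buggy = "".join(pieces).replace("_", " ")

print(fixed)
print(repr(buggy))
```

With the old snippet, a genuine underscore in the input is turned into a space while the marker itself stays in the output, so the round trip is no longer lossless.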