author | John Pope <jp@bellgeorge.com> | 2018-07-02 21:44:32 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2018-07-02 21:44:32 +0300 |
commit | 11bbbf27819ad32f209b306bcada4960fcb56b44 (patch) | |
tree | 401dcd8a041512277c1b563ac9d78fbbf702292a /README.md | |
parent | ee4ca7fd2e9850660a3b82cae7a9abe33bd29855 (diff) |
typo
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 2 |
1 files changed, 1 insertions, 1 deletions
@@ -43,7 +43,7 @@ Note that BPE algorithm used in WordPiece is slightly different from the origina
 ## Overview
 ### What is SentencePiece?
-SentencePiece is a re-impelemtation of **sub-word units**, an effective way to alleviate the open vocabulary
+SentencePiece is a re-implementation of **sub-word units**, an effective way to alleviate the open vocabulary
 problems in neural machine translation. SentencePiece supports two segmentation algorithms, **byte-pair-encoding (BPE)** [[Sennrich et al.](http://www.aclweb.org/anthology/P16-1162)] and **unigram language model** [[Kudo.](https://arxiv.org/abs/1804.10959)]. Here are the high level differences from other implementations.
 #### The number of unique tokens is predetermined