github.com/marian-nmt/sentencepiece.git
author John Pope <jp@bellgeorge.com> 2018-07-02 21:44:32 +0300
committer GitHub <noreply@github.com> 2018-07-02 21:44:32 +0300
commit 11bbbf27819ad32f209b306bcada4960fcb56b44
tree 401dcd8a041512277c1b563ac9d78fbbf702292a /README.md
parent ee4ca7fd2e9850660a3b82cae7a9abe33bd29855
typo
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 2
1 files changed, 1 insertions, 1 deletions
diff --git a/README.md b/README.md
index cdc18f0..b8964eb 100644
--- a/README.md
+++ b/README.md
@@ -43,7 +43,7 @@ Note that BPE algorithm used in WordPiece is slightly different from the origina
## Overview
### What is SentencePiece?
-SentencePiece is a re-impelemtation of **sub-word units**, an effective way to alleviate the open vocabulary
+SentencePiece is a re-implementation of **sub-word units**, an effective way to alleviate the open vocabulary
problems in neural machine translation. SentencePiece supports two segmentation algorithms, **byte-pair-encoding (BPE)** [[Sennrich et al.](http://www.aclweb.org/anthology/P16-1162)] and **unigram language model** [[Kudo.](https://arxiv.org/abs/1804.10959)]. Here are the high level differences from other implementations.
#### The number of unique tokens is predetermined