
github.com/marian-nmt/marian-examples.git
-rw-r--r-- training-basics-sentencepiece/README.md | 6
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/training-basics-sentencepiece/README.md b/training-basics-sentencepiece/README.md
index 4facab0..b15d996 100644
--- a/training-basics-sentencepiece/README.md
+++ b/training-basics-sentencepiece/README.md
@@ -267,7 +267,7 @@ BLEU+case.mixed+lang.ro-en+numrefs.1+smooth.exp+test.wmt16+tok.13a+version.1.2.1
## Is Normalization Actually Required?
We also quickly tested if the normalization of Romanian characters is actually neccessary and if there are other methods
-of dealing with the noise. SentencePiece supports a method called subword-regularization ((Kudo 2018)[]) that samples different
+of dealing with the noise. SentencePiece supports a method called subword-regularization ([Kudo 2018](https://arxiv.org/abs/1804.10959)) that samples different
subword splits at training time; ideally resulting in a more robust translation at inference time.
Here's the table:
@@ -279,8 +279,8 @@ Here's the table:
| raw+sampling | | |
We see that keeping the noise untouched (raw) results indeed in the worst of the three system, normalization (normalized) is best,
-closely followed by sampled subwords splits (raw+sampling). This is an interesting result: although normalization is generally better
-it is not trivial to discover the problem in the first place. Creating a normalization table is another added difficulty and on top of
+closely followed by sampled subwords splits (raw+sampling). This is an interesting result: although normalization is generally better,
+it is not trivial to discover the problem in the first place. Creating a normalization table is another added difficulty - and on top of
that normalization breaks reversibility. Subword sampling seems to be a viable alternative when dealing with character-level noise with
no added complexity compared to raw text. It does however take longer to converge, being a regularization method.