author | Hieu Hoang <hieuhoang@gmail.com> | 2019-10-19 05:54:46 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2019-10-19 05:54:46 +0300 |
commit | 286188b82a1fc80145e7e7213337c2f8577adbbf (patch) | |
tree | 35eafd850639ad032670410f987555aba4825a1a | |
parent | 555829a771cd897bb807f495a95737953a7ca9a3 (diff) | |
parent | 5d3331b922d4443b86a74960c7ebb7fea4ce7d50 (diff) |
Merge pull request #214 from JetRunner/patch-1
Fix the incorrect processing considering fullwidth number character
-rwxr-xr-x | scripts/tokenizer/replace-unicode-punctuation.perl | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/scripts/tokenizer/replace-unicode-punctuation.perl b/scripts/tokenizer/replace-unicode-punctuation.perl
index b0bc811fe..faed2cd9d 100755
--- a/scripts/tokenizer/replace-unicode-punctuation.perl
+++ b/scripts/tokenizer/replace-unicode-punctuation.perl
@@ -29,7 +29,7 @@ while(<STDIN>) {
   s/！/\!/g;
   s/（/\(/g;
   s/；/;/g;
-  s/１/"/g;
+  s/１/1/g;
   s/」/"/g;
   s/「/"/g;
   s/０/0/g;
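
The effect of this one-line change can be illustrated with a small standalone sketch. It assumes the character on the left-hand side of the substitution is the fullwidth digit １ (U+FF11), as the commit message suggests. The helper names `replace_fullwidth_one` and `normalize_fullwidth_digits` are hypothetical and are not part of the Moses script. Unlike the original script, which leaves its `binmode` calls disabled, this sketch uses `use utf8` and works on decoded character strings.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                               # this file contains literal fullwidth characters
binmode(STDOUT, ":encoding(UTF-8)");

# Hypothetical helper (not in the Moses script): applies only the
# substitution touched by this commit, in its old or its new form.
sub replace_fullwidth_one {
    my ($text, $fixed) = @_;
    if ($fixed) {
        $text =~ s/１/1/g;              # after the patch: U+FF11 becomes ASCII "1"
    } else {
        $text =~ s/１/"/g;              # before the patch: U+FF11 becomes a double quote
    }
    return $text;
}

# Hypothetical alternative, NOT what the script does: map every
# fullwidth digit (U+FF10..U+FF19) to its ASCII counterpart in one pass.
sub normalize_fullwidth_digits {
    my ($text) = @_;
    $text =~ tr/０-９/0-9/;
    return $text;
}

my $input = "第１章";
print "before patch: ", replace_fullwidth_one($input, 0), "\n";   # 第"章
print "after patch:  ", replace_fullwidth_one($input, 1), "\n";   # 第1章
print "tr variant:   ", normalize_fullwidth_digits("２０１９"), "\n"; # 2019
```

The `tr` variant is shown only as a design comparison: the diff context suggests the script handles such characters with one `s///` rule each, which keeps every mapping explicit but makes a single wrong replacement, like the one fixed here, easy to miss.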