
github.com/moses-smt/mosesdecoder.git
Age  Commit message  (Author)
2021-03-13  Add tokenisation support for the Tetun language  (Raphael Merx)
2020-07-31  adding rules for Catalan  (Cristina España i Bonet)
2020-02-20  Revert "line buffering for tokeniser and truecaser"  (Kenneth Heafield)
2020-02-17  line buffering for tokeniser and truecaser  (William Waites)
2019-12-17  Modernized  (HjalmarrSv)
2019-11-25  Single quotes should be escaped as single quotes. [alvations-patch-normalization]  (alvations)
2019-10-14  Update replace-unicode-punctuation.perl  (Kevin Canwen Xu)
2018-11-12  Merge branch 'master' of github.com:moses-smt/mosesdecoder  (Hieu Hoang)
2018-11-12  removing python port. Sacremoses is newer  (Hieu Hoang)
2018-11-10  Add option "-b" (unbuffer output) to tokenizer scripts  (Loïc Vial)
2018-11-09  rename directory to work with python import  (Hieu Hoang)
2018-11-09  python wrapper works  (Hieu Hoang)
2018-11-07  start borging Luis Gomes code  (Hieu Hoang)
2018-11-07  tokenizer.perl: split final dots unconditionally  (Ozan Caglayan)
2018-09-06  Handle glottal stops in Somalian  (Barry Haddow)
2018-04-10  Contributing MosesTokenizer from NLTK to Moses [alvations-python-tokenizer]  (alvations)
2018-02-14  add fi/sv-specific colon handling in tokenizer.perl  (Scherrer Yves)
2018-01-19  Korean words has spaces =) [patch-detokenizer-ko]  (alvations)
2016-12-23  Merge pull request #168 from tofula/master  (Hieu Hoang)
2016-07-31  Separate comma after a number end sentence  (Antoine Dusséaux)
2016-04-19  add script for acquis cleaning  (Hieu Hoang)
2016-02-23  .' at end of sentence is missed  (Jim Regan)
2015-10-12  Named group added for the safer 'protected patterns' recognition regexp.  (Tomáš Fulajtár)
2015-09-23  ga (mostly) behaves more like fr/it  (Jim Regan)
2015-09-23  ga (mostly) behaves more like fr/it  (Jim Regan)
2015-05-29  Add license notices to scripts.  (Jeroen Vermeulen)
2015-05-17  Fix a lot of lint, mostly trailing whitespace.  (Jeroen Vermeulen)
2015-05-16  Fix more Python lint.  (Jeroen Vermeulen)
2015-04-26  Merge branch 'master' of https://github.com/moses-smt/mosesdecoder  (alvations)
2015-04-19  add pre tokenization cleaning script. In case training has bad, overlying lon...  (Hieu Hoang)
2015-04-13  add use warnings to all perl scripts  (Hieu Hoang)
2015-04-07  examples  (Flammie Pirinen)
2015-04-07  full set of cases and caps  (Flammie Pirinen)
2015-04-07  add fi to list to silence warnings  (Flammie Pirinen)
2015-04-07  also lowercase if case fail  (Flammie Pirinen)
2015-04-07  fix detokenising : in abbrev. case suffix case  (Flammie Pirinen)
2015-04-02  consistently use 'env perl' command for environments where the 1st perl in PA...  (Hieu Hoang)
2015-04-02  Conditional import of Thread package for perl installations that don't suppor...  (Hieu Hoang)
2015-03-20  added Gacha Filter from WMT14  (alvations)
2015-02-13  Remove debug  (Barry Haddow)
2015-01-30  "just put it in. I'll verify it if i can be bovvered" --Hieu /usr/bin/env  (Kenneth Heafield)
2015-01-30  Revert "env perl shebang"  (Matthias Huck)
2015-01-28  env perl shebang  (Kenneth Heafield)
2015-01-16  don't normalise quotes if tokenizing like Penn /Phil Williams  (Hieu Hoang)
2015-01-16  move normalisation of quotes into normalize-punctuation.perl /Tom Hoar  (Hieu Hoang)
2014-11-21  makemteval and small change to tokenizer. /Tom Hoar and Tomas Fulajtar  (Hieu Hoang)
2014-10-10  Penn Tree Bank compliant versions of preprocessing  (Philipp Koehn)
2014-08-04  changes to protecting specified patterns (with example patterns)  (Philipp Koehn)
2014-08-01  grrrr...  (Philipp Koehn)
2014-08-01  added flush switch (-b) to normalize-punctuation.perl  (Philipp Koehn)