author | Martin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com> | 2021-03-26 19:17:12 +0300 |
---|---|---|
committer | Martin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com> | 2021-03-26 19:17:12 +0300 |
commit | 7d1f941242928c976640a20f37e1bd9ac10011e8 (patch) | |
tree | a8f895b2d26bc1d947fe8a5fcb215d88a747dd6f /CHANGELOG.md | |
parent | 08bb158974597e92c3b5b0e20d938697bf6146b8 (diff) |
Merged PR 18309: Cleaner suppression of unwanted output words
This PR adds cleaner suppression of unwanted output words. We identified a situation where SentencePiece (SPM) with byte-fallback can generate random bytes when output sampling is used.
This is particularly harmful when the random byte happens to be a newline symbol. Here we suppress newlines in the output unless they are explicitly wanted.
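The idea described above can be illustrated with a minimal Python sketch. This is not the Marian implementation, only an illustration under stated assumptions: the toy vocabulary, the `<0x0A>` byte piece standing in for the newline, and the helper names `suppress_tokens` and `sample` are all hypothetical. The sketch masks the scores of unwanted symbols to negative infinity before sampling, so they receive zero probability, and skips the masking when the caller opts out (loosely mirroring what `--allow-special` is described as doing).

```python
import math
import random

# Hypothetical toy vocabulary; "<0x0A>" stands in for the newline byte piece
# that a byte-fallback SentencePiece model could emit.
VOCAB = ["</s>", "hello", "world", "<0x0A>"]
NEWLINE_ID = VOCAB.index("<0x0A>")


def suppress_tokens(logits, suppressed_ids, allow_special=False):
    """Return a copy of logits with the suppressed ids masked to -inf.

    With allow_special=True nothing is masked (assumed analogue of opting out).
    """
    masked = list(logits)
    if allow_special:
        return masked
    for token_id in suppressed_ids:
        masked[token_id] = -math.inf
    return masked


def sample(logits, rng):
    """Sample a token id from softmax(logits).

    Assumes at least one logit is finite; exp(-inf) evaluates to 0.0,
    so masked ids can never be drawn.
    """
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.1, 0.2, 0.3, 5.0]  # the newline piece is by far the most likely
    masked = suppress_tokens(logits, {NEWLINE_ID})
    print([VOCAB[sample(masked, rng)] for _ in range(5)])
```

Masking before normalization, rather than filtering after sampling, keeps the remaining probabilities properly renormalized and avoids resampling loops.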
Diffstat (limited to 'CHANGELOG.md')
-rw-r--r-- | CHANGELOG.md | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index d300bb69..56ede4e5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 ## [Unreleased]
 
 ### Added
+- Better suppression of unwanted output symbols, specifically "\n" from SentencePiece with byte-fallback. Can be deactivated with --allow-special
 - Display decoder time statistics with marian-decoder --stat-freq 10 ...
 - Support for MS-internal binary shortlist
 - Local/global sharding with MPI training via `--sharding local`