Welcome to mirror list, hosted at ThFree Co, Russian Federation.

github.com/marian-nmt/marian.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMartin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com>2021-03-26 19:17:12 +0300
committerMartin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com>2021-03-26 19:17:12 +0300
commit7d1f941242928c976640a20f37e1bd9ac10011e8 (patch)
treea8f895b2d26bc1d947fe8a5fcb215d88a747dd6f /CHANGELOG.md
parent08bb158974597e92c3b5b0e20d938697bf6146b8 (diff)
Merged PR 18309: Cleaner suppression of unwanted output words
This PR adds cleaner suppression of unwanted output words. We identified a situation where SPM with byte-fallback can generate random bytes with output-sampling. That is particularly harmful when that random bytes happens to be a newline symbol. Here we suppress newline in output unless explicitly wanted.
Diffstat (limited to 'CHANGELOG.md')
-rw-r--r--CHANGELOG.md1
1 files changed, 1 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index d300bb69..56ede4e5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [Unreleased]
### Added
+- Better suppression of unwanted output symbols, specifically "\n" from SentencePiece with byte-fallback. Can be deactivated with --allow-special
- Display decoder time statistics with marian-decoder --stat-freq 10 ...
- Support for MS-internal binary shortlist
- Local/global sharding with MPI training via `--sharding local`