| author | Guillaume Klein <guillaume.klein@systrangroup.com> | 2020-11-23 14:14:16 +0300 |
|---|---|---|
| committer | Guillaume Klein <guillaume.klein@systrangroup.com> | 2020-11-23 14:14:16 +0300 |
| commit | 129047ea7975f73747c6448211db24c293e44da0 | |
| tree | 0e887b5245d95982dddc9f4873216d034ff08a5e | |
| parent | 7c54f53242da9b79e2d61452e9f057a320a60ad1 | |
Bump version to 1.16.1 (tag: v1.16.1)
-rw-r--r-- | CHANGELOG.md | 9 |
-rw-r--r-- | python/setup.py | 2 |
2 files changed, 10 insertions, 1 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 867c7318..fbfffec7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,15 @@
 
 ### Fixes and improvements
 
+## [v1.16.1](https://github.com/OpenNMT/CTranslate2/releases/tag/v1.16.1) (2020-11-23)
+
+### Fixes and improvements
+
+* Fuse dequantization and bias addition on GPU for improved INT8 performance
+* Improve performance of masked softmax on GPU
+* Fix error when building the CentOS 7 GPU Docker image
+* The previous version listed "Pad size of INT8 matrices to a multiple of 16 when the GPU has INT8 Tensor Cores". However, the padding was not applied due to a bug and fixing it degraded the performance, so this behavior is not implemented for now.
+
 ## [v1.16.0](https://github.com/OpenNMT/CTranslate2/releases/tag/v1.16.0) (2020-11-18)
 
 ### Changes
diff --git a/python/setup.py b/python/setup.py
index 4df18aba..918ce75e 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -35,7 +35,7 @@ ctranslate2_module = Extension(
 
 setup(
     name="ctranslate2",
-    version="1.16.0",
+    version="1.16.1",
     license="MIT",
     description="Fast inference engine for OpenNMT models",
     long_description=_get_long_description(),
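
The first changelog entry in this commit mentions fusing dequantization and bias addition. As a rough illustration of what that kind of fusion means, the sketch below compares a two-pass version with a single-pass one in plain Python. It is not the CTranslate2 GPU kernel: the function names are hypothetical, and the scale convention (each INT32 accumulator is divided by the product of a per-row and a per-column scale) is an assumption made for the example.

```python
# Illustrative only: hypothetical helpers, not CTranslate2 code.
# Assumed convention: quantized = real * scale, so an INT32 accumulator
# acc[i][j] from an INT8 GEMM is dequantized by dividing by
# a_scales[i] * b_scales[j].

def dequantize(acc, a_scales, b_scales):
    """Unfused pass 1: convert integer accumulators to floats."""
    return [
        [acc[i][j] / (a_scales[i] * b_scales[j]) for j in range(len(b_scales))]
        for i in range(len(a_scales))
    ]

def add_bias(out, bias):
    """Unfused pass 2: add a per-column bias to the dequantized output."""
    return [[value + bias[j] for j, value in enumerate(row)] for row in out]

def fused_dequantize_add_bias(acc, a_scales, b_scales, bias):
    """Fused: each output element is read and written once."""
    return [
        [acc[i][j] / (a_scales[i] * b_scales[j]) + bias[j] for j in range(len(bias))]
        for i in range(len(acc))
    ]

acc = [[127, -254], [64, 32]]   # pretend INT32 GEMM output
a_scales = [127.0, 64.0]        # one scale per output row
b_scales = [1.0, 2.0]           # one scale per output column
bias = [0.5, -0.5]

unfused = add_bias(dequantize(acc, a_scales, b_scales), bias)
fused = fused_dequantize_add_bias(acc, a_scales, b_scales, bias)
print(fused)
```

Both paths compute the same values; the fused form avoids materializing the intermediate dequantized matrix, which on a GPU would typically save one kernel launch and one round trip through memory.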