# AmuNMT [![Join the chat at https://gitter.im/amunmt/amunmt](https://badges.gitter.im/amunmt/amunmt.svg)](https://gitter.im/amunmt/amunmt?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![CUDABuild Status](http://vali.inf.ed.ac.uk/jenkins/buildStatus/icon?job=amunmt_compilation_cuda)](http://vali.inf.ed.ac.uk/jenkins/job/amunmt_compilation_cuda/) [![CPU Build Status](http://vali.inf.ed.ac.uk/jenkins/buildStatus/icon?job=amunmt_compilation_cpu)](http://vali.inf.ed.ac.uk/jenkins/job/amunmt_compilation_cpu/) A C++ inference engine for Neural Machine Translation (NMT) models trained with Theano-based scripts from Nematus (https://github.com/rsennrich/nematus) or DL4MT (https://github.com/nyu-dl/dl4mt-tutorial) If you use this, please cite: Marcin Junczys-Dowmunt, Tomasz Dwojak, Hieu Hoang (2016). Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions (https://arxiv.org/abs/1610.01108) ## Recommended for GPU version: Tested on Ubuntu 14.04 LTS * CMake 3.5.1 (due to CUDA related bugs in earlier versions) * GCC/G++ 4.9 * Boost 1.54 * CUDA 7.5 Tested on Ubuntu 16.04 LTS * CMake 3.5.1 (due to CUDA related bugs in earlier versions) * GCC/G++ 5.4 * Boost 1.61 * CUDA 8.0 Also compiles the CPU version. ## Recommended for CPU version: The CPU-only version will automatically be compiled if CUDA cannot be detected by CMAKE. Tested on different machines and distributions: * CMake 3.5.1 * The CPU version should be a lot more forgiving concerning GCC/G++ or Boost versions. ## Compilation The project is a standard Cmake out-of-source build: mkdir build cd build cmake .. make -j If you want to compile only CPU version on a machine with CUDA, add `-DCUDA=OFF` flag: cmake -DCUDA=OFF .. ## Vocabulary files Vocabulary files (and all other config files) in AmuNMT are by default YAML files. AmuNMT also reads gzipped yml.gz files. * Vocabulary files from models trained with Nematus can be used directly as JSON is a proper subset of YAML. * Vocabularies for models trained with DL4MT (\*.pkl extension) need to be converted to JSON/YAML with either of the two scripts below: ``` python scripts/pkl2json.py vocab.en.pkl > vocab.json python scripts/pkl2yaml.py vocab.en.pkl > vocab.yml ``` ## Running AmuNMT ./bin/amun -c config.yml <<< "This is a test ." ## Configuration files An example configuration: # Paths are relative to config file location relative-paths: yes # performance settings beam-size: 12 devices: [0] normalize: yes gpu-threads: 1 # scorer configuration scorers: F0: path: model.en-de.npz type: Nematus # scorer weights weights: F0: 1.0 # vocabularies source-vocab: vocab.en.yml.gz target-vocab: vocab.de.yml.gz ## BPE Support AmuNMT has integrated support for [BPE encoding](https://github.com/rsennrich/subword-nmt). There are two option `bpe` and `debpe`. The `bpe` option receives a path to a file with BPE codes (here `bpe.codes`). To turn on desegmentation on the ouput, set `debpe` to `true`, e.g. bpe: bpe.codes debpe: true ## Python Bindings Python bindings allow to run AmuNMT decoder in python scripts. The compilation of the bindings requires `python-dev` package. To compile the bindings run: ``` make python ``` The Python bindings consist of 2 function: `init` and `translate`: ```python import libamunmt libamunmt.init('-c config.yml') print libamunmt.translate(['this is a little test .']) ``` The `init` function init the decoder and the syntax is the same as in command line. The `translate` function takes a list of sentences to translate. For real-world example, see the `scripts/amunmt_erver.py` script, which uses python bindings to run REST server. ## Using GPU/CPU threads AmuNMT can use GPUs, CPUs, or both, to distribute translation of different sentences. **However, it is unlikely that CPUs used together with GPUs yield any performance improvement. It is probably better to only use the GPU if one or more are available.** cpu-threads: 8 gpu-threads: 2 devices: [0, 1] The setting above uses 8 CPU threads and 4 GPU threads (2 GPUs x 2 threads). The `gpu-threads` and `devices` options are only available when AmuNMT has been compiled with CUDA support. Multiple GPU threads can be used to increase GPU saturation, but will likely not result in a large performance boost. By default, `gpu-threads` is set to `1` and `cpu-threads` to `0` if CUDA is available. Otherwise `cpu-threads` is set to `1`. To disable the GPU set `gpu-threads` to `0`. Setting both `gpu-threads` and `cpu-threads` to `0` will result in an exception. ## Example usage * [Data and systems for our winning system in the WMT 2016 Shared Task on Automatic Post-Editing](https://github.com/emjotde/amunmt/wiki/AmuNMT-for-Automatic-Post-Editing)