author    Marcin Junczys-Dowmunt <junczys@amu.edu.pl>  2018-03-14 19:46:19 +0300
committer Marcin Junczys-Dowmunt <junczys@amu.edu.pl>  2018-03-14 19:46:19 +0300
commit    980a70bf0a94d8000508e4ec129ffe9d225fb9e6 (patch)
tree      d5a2fe03170e6223ebe6f7938f9a80284b4d4c8a
parent    56afd8ddd6c61b78b8b98d56d04d14c9d7625f16 (diff)
fixed formatting
-rw-r--r--  wmt2017-transformer/README.md  14
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/wmt2017-transformer/README.md b/wmt2017-transformer/README.md
index ae7c478..77db692 100644
--- a/wmt2017-transformer/README.md
+++ b/wmt2017-transformer/README.md
@@ -64,21 +64,21 @@ $MARIAN/build/marian \
 Running the complete script from start to end should result in numbers similar
 to the following, improving on Edinburgh's system submission by 1.2 BLEU:
 
-System | test2014 | test2015 | test2016(valid) | test2017 |
-|------|----------|----------|-----------------|----------|
-|Edinburgh WMT17| -- | -- | 36.20 |28.30|
-|This example | 29.08 | 31.04 | 36.80 | 29.50|
+|System          | test2014 | test2015 | test2016 (valid) | test2017 |
+|----------------|----------|----------|------------------|----------|
+|Edinburgh WMT17 | --       | --       | 36.20            | 28.30    |
+|This example    | 29.08    | 31.04    | 36.80            | 29.50    |
 
 Training all components for more than 8 epochs is likely to improve results further.
-So could increasing model dimensions (e.g. with `--dim-emb 1024` or `--transformer-dim-ffn 4096``), but that will
+So could increasing model dimensions (e.g. with `--dim-emb 1024` or `--transformer-dim-ffn 4096`), but that will
 require careful hyperparameter tuning, especially dropout regularization, for instance adding:
 
-``
+```
 --dim-emb 1024 \
 --transformer-dim-ffn 4096 \
 --transformer-dropout-attention 0.1 \
 --transformer-dropout-ffn 0.1 \
-``
+```
 This would be more similar to Google's Transformer-Big architecture. This will
 also likely require reducing the workspace to around 8500 or 8000 on multiple 12GB GPUs.
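For context, the Transformer-Big style flags discussed in the patch would be appended to a Marian training invocation along these lines. This is only a sketch: the model and data paths below are placeholders, and the remaining options (vocabularies, validation settings, learning-rate schedule, etc.) come from the example's actual training script in `wmt2017-transformer/`, not from this patch.

```shell
# Sketch only -- paths and data files are placeholders; the full set of
# options lives in the example's own run script.
$MARIAN/build/marian \
    --type transformer \
    --model model/model.npz \
    --train-sets data/corpus.bpe.en data/corpus.bpe.de \
    --dim-emb 1024 \
    --transformer-dim-ffn 4096 \
    --transformer-dropout-attention 0.1 \
    --transformer-dropout-ffn 0.1 \
    --workspace 8000
```

The `--workspace 8000` value follows the README's note that the larger model will likely require reducing the workspace to around 8500 or 8000 MB per 12GB GPU.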