[1.18.13]
Fixed
- Fixed two bugs with training resumption:
  - Removed an overly strict assertion in the data iterator for model states before the first checkpoint.
  - Removed the deletion of the TensorBoard log directory.
Added
- Added support for config files. Command line parameters take precedence over the values read from the config file.
  Minimal working example:

      python -m sockeye.train --config config.yaml

  with the contents of config.yaml as follows:

      source: source.txt
      target: target.txt
      output: out
      validation_source: valid.source.txt
      validation_target: valid.target.txt
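The precedence rule (command line over config file) can be illustrated with a minimal sketch. This is not Sockeye's actual implementation: the helper names `read_flat_config` and `parse_args` are hypothetical, only three of the arguments are modelled, and a tiny `key: value` reader stands in for a real YAML loader so that the example runs with the standard library alone.

```python
import argparse


def read_flat_config(text):
    """Parse flat `key: value` lines (hypothetical stand-in for a YAML loader)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


def parse_args(argv, config_text=""):
    """Parse argv, using values from the config text as defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--source")
    parser.add_argument("--target")
    parser.add_argument("--output")
    # Values from the config file become the parser defaults ...
    parser.set_defaults(**read_flat_config(config_text))
    # ... so anything given explicitly on the command line overrides them.
    return parser.parse_args(argv)


CONFIG = """\
source: source.txt
target: target.txt
output: out
"""

args = parse_args(["--output", "other_out"], CONFIG)
print(args.source, args.output)
```

Here `source` comes from the config text, while `output` is taken from the command line because it was passed explicitly.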
Changed
- The full set of arguments is now serialized to out/args.yaml at the beginning of training (previously, JSON was used).
[1.18.12]
Changed
- An additional end-of-sentence (EOS) symbol is now appended to all source-side sequences. This change is backwards
  compatible, meaning that inference with older models will still work without the EOS symbol.
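As a rough illustration of what appending an EOS symbol to a source sequence means, consider the sketch below. It is not taken from the Sockeye code base: the function `append_eos` is hypothetical, and the literal `"</s>"` is assumed here as the EOS token.

```python
EOS_SYMBOL = "</s>"  # assumed EOS token for this illustration


def append_eos(tokens):
    """Return a copy of the token list with the EOS symbol appended."""
    return list(tokens) + [EOS_SYMBOL]


source_sentence = "the cat sat"
source_tokens = append_eos(source_sentence.split())
print(source_tokens)
```

The input list is copied rather than modified in place, so the original tokens remain available to the caller.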
[1.18.11]
Changed
- Default training parameters have been changed to reflect the setup used in our arXiv paper. Specifically, the default
  is now to train a 6-layer Transformer model with word-based batching. The only difference from the paper is that weight
  tying is still turned off by default, as there may be use cases in which tying the source and target vocabularies is
  not appropriate. Turn it on using --weight-tying --weight-tying-type=src_trg_softmax. Additionally, BLEU scores from
  a checkpoint decoder are now monitored by default.