1.18.13

@fhieber fhieber released this 19 May 16:35
3f8cb0b

[1.18.13]

Fixed

  • Fixed two bugs with training resumption:
    1. Removed an overly strict assertion in the data iterator for model states before the first checkpoint.
    2. Removed the deletion of the TensorBoard log directory.

Added

  • Added support for config files. Command line parameters take precedence over the values read from the config file.
    Minimal working example:
        python -m sockeye.train --config config.yaml
    with the contents of config.yaml as follows:
        source: source.txt
        target: target.txt
        output: out
        validation_source: valid.source.txt
        validation_target: valid.target.txt
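
    The precedence rule can be illustrated with a small sketch. This is not Sockeye's actual implementation. It uses plain argparse, a reduced set of hypothetical arguments, and an already-parsed dict in place of the YAML file, and only shows one common way to let command line values override config values:

```python
import argparse


def build_parser():
    # Hypothetical, reduced argument set for illustration only.
    parser = argparse.ArgumentParser()
    parser.add_argument("--source")
    parser.add_argument("--target")
    parser.add_argument("--output")
    return parser


def parse_with_config(config_values, argv):
    """Use config values as defaults, then let command line arguments override them."""
    parser = build_parser()
    # Values from the config file become the parser defaults ...
    parser.set_defaults(**config_values)
    # ... and anything given explicitly on the command line replaces them.
    return parser.parse_args(argv)


# Stand-in for the parsed contents of config.yaml (assumed to be loaded already).
config = {"source": "source.txt", "target": "target.txt", "output": "out"}

args = parse_with_config(config, ["--output", "other_out"])
print(args.source, args.target, args.output)
```

    Here --output is given on the command line and therefore wins over the config value, while source and target fall back to the config file.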

Changed

  • The full set of arguments is now serialized to out/args.yaml at the beginning of training (previously, JSON was used).

[1.18.12]

Changed

  • An additional end-of-sentence (EOS) symbol is now appended to all source-side sequences. This change is backwards
    compatible: inference with older models still works without the EOS symbol.
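
    As a rough illustration of the change, the following sketch appends an EOS token to a tokenized source sentence. The symbol string "</s>" and the helper name append_eos are assumptions made for this example, not code taken from Sockeye:

```python
from typing import List

EOS_SYMBOL = "</s>"  # assumed EOS token string, for illustration only


def append_eos(tokens: List[str]) -> List[str]:
    """Return a new token list with the EOS symbol appended (hypothetical helper)."""
    return tokens + [EOS_SYMBOL]


source_tokens = "ein kleines Beispiel".split()
with_eos = append_eos(source_tokens)
print(with_eos)
```

    The helper returns a new list rather than modifying its input, so the original token sequence stays available unchanged.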

[1.18.11]

Changed

  • Default training parameters have been changed to reflect the setup used in our arXiv paper. Specifically, the default
    is now to train a 6-layer Transformer model with word-based batching. The only difference from the paper is that weight
    tying is still turned off by default, as there may be use cases in which tying the source and target vocabularies is
    not appropriate. Turn it on using --weight-tying --weight-tying-type=src_trg_softmax. Additionally, BLEU scores from
    a checkpoint decoder are now monitored by default.