
2.3.2

fhieber released this 18 Nov 13:41
26c02b1

[2.3.2]

Fixed

  • Fixed an edge case that unintentionally skipped the softmax for sampling when the beam size is 1.

[2.3.1]

Fixed

  • Optimizing for BLEU/CHRF with Horovod requires the secondary workers to also create checkpoint decoders; they now do so.

[2.3.0]

Added

  • Added support for target factors.
    If provided with additional target-side tokens/features (token-parallel to the regular target side) at training
    time, the model can now learn to predict these in a multi-task setting. Target factor data is provided in the
    same way as source factors: --target-factors <factor_file1> [<factor_fileN>]. During training, Sockeye optimizes
    one loss per factor, and the weights of these losses can be controlled with --target-factors-weight.
    At inference time, target factors are decoded greedily; they do not participate in beam search.
    The predicted factor at each time step is the argmax over its separate output
    layer distribution. To receive the target factor predictions at inference time, use
    --output-type translation_with_factors (see the sketch below).
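
    The command lines below are a minimal sketch of how the new options fit into a training and translation run.
    Only --target-factors, --target-factors-weight, and --output-type translation_with_factors come from this
    release; all file names, the weight value, and the remaining flags are illustrative, and validation-side
    factor options are omitted.

        # Training with two token-parallel target factor files (illustrative names)
        python -m sockeye.train \
            --source train.src --target train.trg \
            --target-factors train.trg.factor1 train.trg.factor2 \
            --target-factors-weight 0.5 \
            --validation-source dev.src --validation-target dev.trg \
            --output model_dir

        # Inference: also emit the greedily decoded target factor predictions
        python -m sockeye.translate \
            --models model_dir \
            --input test.src \
            --output-type translation_with_factors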

Changed

  • load_model(s) now returns a list of target vocabs.
  • Default source factor combination changed to sum (was concat before).
  • SockeyeModel class has three new properties: num_target_factors, target_factor_configs,
    and factor_output_layers.
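
    A minimal Python sketch of the changed API, assuming load_model returns (model, source_vocabs, target_vocabs)
    as implied above; the exact call signature (e.g. the context argument) and the model directory name are
    assumptions for illustration.

        import mxnet as mx
        from sockeye.model import load_model

        # load_model now returns a *list* of target vocabularies:
        # the main target vocab plus one vocab per target factor.
        model, source_vocabs, target_vocabs = load_model("model_dir", context=mx.cpu())

        print(len(target_vocabs))           # 1 + number of target factors
        print(model.num_target_factors)     # new SockeyeModel property
        print(model.target_factor_configs)  # per-factor configurations
        print(model.factor_output_layers)   # separate output layers for the factors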