Improve sampling (fixes #905) (#911)
bricksdont authored Nov 10, 2020
1 parent 6e9dba6 commit 500e6f8
Showing 4 changed files with 10 additions and 4 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,11 @@ Note that Sockeye has checks in place to not translate with an old model that wa

Each version section may have subsections for: _Added_, _Changed_, _Removed_, _Deprecated_, and _Fixed_.

+## [2.3.2]
+### Fixed
+
+- Fixed edge case that unintentionally skips softmax for sampling if beam size is 1.
+
## [2.3.1]
### Fixed

5 changes: 3 additions & 2 deletions docs/inference.md
@@ -189,9 +189,10 @@ Options:

Instead of filling the beam with the best items at each step of the decoder, Sockeye can sample from the target distributions of each hypothesis using `--sample [N]`.
If the optional parameter `N` is specified, the sampling will be limited to the top `N` vocabulary items.
-The default, `N = 0`, which means to sample from the full distribution over all target vocabulary items.
+If `--sample` is used without an integer, the default `N = 0` applies. `N = 0` means to sample from the full distribution over all target vocabulary items.
Limiting `N` to a value that is much smaller than the target vocabulary size (say, 5%) can lead to much more sensible samples.
-Likewise, you can use `--softmax-temperature T` to make the target distributions more peaked (`T < 1.0`) or smoother (`T > 1.0`).

You can use this with `--nbest-size` to output multiple samples for each input.
However, note that since each beam item is sampled independently, there is no guarantee that sampled items will be unique.
+You can use `--softmax-temperature T` to make the target distributions more peaked (`T < 1.0`) or smoother (`T > 1.0`).
+Also note that the samples in an nbest list will be sorted according to model scores.
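
For illustration, the sampling behaviour that this documentation hunk describes could be sketched roughly as below. This is a minimal pure-Python sketch and not Sockeye's implementation: `sample_top_n`, its parameters, and the use of plain lists of logits are all hypothetical names chosen for the example. It assumes that `N = 0` means "no limit" and that the temperature divides the logits before the softmax.

```python
import math
import random
from typing import List, Optional


def sample_top_n(logits: List[float],
                 n: int = 0,
                 temperature: float = 1.0,
                 rng: Optional[random.Random] = None) -> int:
    """Sample one vocabulary index from softmax(logits / temperature).

    Hypothetical helper for illustration only.
    n == 0 samples from the full distribution; n > 0 restricts
    sampling to the n highest-scoring vocabulary items.
    """
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    if rng is None:
        rng = random.Random()

    # Candidate indices: everything, or only the top n by logit.
    candidates = list(range(len(logits)))
    if n > 0:
        candidates = sorted(candidates, key=lambda i: logits[i], reverse=True)[:n]

    # T < 1.0 sharpens the distribution, T > 1.0 flattens it.
    scaled = [logits[i] / temperature for i in candidates]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices normalises the weights internally.
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `n=1` the sketch always returns the highest-scoring index, which is why a small `N` tends to give more conservative samples.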
2 changes: 1 addition & 1 deletion sockeye/__init__.py
@@ -11,4 +11,4 @@
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

-__version__ = '2.3.1'
+__version__ = '2.3.2'
2 changes: 1 addition & 1 deletion sockeye/beam_search.py
@@ -795,7 +795,7 @@ def get_beam_search(models: List[SockeyeModel],

     inference = None  # type: Optional[_Inference]
     if len(models) == 1:
-        skip_softmax = beam_size == 1 and not output_scores and not sample
+        skip_softmax = beam_size == 1 and not output_scores and sample is None
         if skip_softmax:
             logger.info("Enabled skipping softmax for a single model and greedy decoding.")
         inference = _SingleModelInference(model=models[0],
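
The one-line change above matters because `not sample` and `sample is None` disagree when `sample` is `0`. The sketch below isolates the two conditions in standalone functions. The names `skip_softmax_old` and `skip_softmax_new` are hypothetical, and it assumes (based on the documentation change in this commit) that `sample` is `None` when `--sample` is not given and `0` when it is given without an integer.

```python
from typing import Optional


def skip_softmax_old(beam_size: int, output_scores: bool, sample: Optional[int]) -> bool:
    # Condition before the fix: `not sample` is True for both None and 0.
    return beam_size == 1 and not output_scores and not sample


def skip_softmax_new(beam_size: int, output_scores: bool, sample: Optional[int]) -> bool:
    # Condition after the fix: only a missing --sample (None) allows the skip.
    return beam_size == 1 and not output_scores and sample is None


# Assumed meaning of the values: None = no sampling, 0 = sample from the
# full vocabulary, 5 = sample from the top 5 items.
for sample in (None, 0, 5):
    print(sample,
          skip_softmax_old(1, False, sample),
          skip_softmax_new(1, False, sample))
```

Under these assumptions, the old condition returns `True` for `sample=0` with beam size 1, so the softmax would be skipped even though sampling was requested. The new condition returns `False` in that case.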