New features
Add decoding option suppress_sequences to prevent specific sequences of tokens from being generated
Add decoding option end_token to stop the decoding on a different token than the model EOS token
Allow returning multiple random hypotheses from greedy search with random sampling by setting num_hypotheses > 1
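The first two options above can be illustrated with a toy greedy decoder (a self-contained sketch, not the CTranslate2 implementation): before each step, any suppressed sequence whose prefix matches the tail of the output has its final token masked out, and decoding stops on a configurable end token instead of only the model EOS.

```python
import math

def greedy_decode(next_scores, start, end_token, suppress_sequences, max_len=16):
    """Toy greedy decoder sketching how ``suppress_sequences`` and
    ``end_token`` can work (illustration only, not CTranslate2 code).

    next_scores(tokens) returns a dict mapping candidate token -> score.
    """
    output = [start]
    for _ in range(max_len):
        scores = dict(next_scores(output))
        for seq in suppress_sequences:
            prefix, last = seq[:-1], seq[-1]
            # If the generated tail matches the banned prefix, mask the
            # token that would complete the banned sequence.
            if output[max(len(output) - len(prefix), 0):] == prefix:
                scores[last] = -math.inf
        token = max(scores, key=scores.get)
        if token == end_token:  # stop on a custom token, not only EOS
            break
        output.append(token)
    return output[1:]
```

A single-token sequence (e.g. `[["stop"]]`) has an empty prefix, so that token is suppressed at every step.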
Fixes and improvements
Improve support for batch generation with the Whisper model:
Improve performance of batch generation with a context: the prompts are only required to have the same length, which is easily achieved by adapting the number of previous text tokens
Support batch mode for option return_no_speech_prob
Support cases where some prompts in the batch include the token <|notimestamps|> while others do not
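Equalizing prompt lengths in a batch can be done by trimming the oldest previous-text tokens. The helper below is a hypothetical illustration of that idea (CTranslate2 itself only requires the prompts to have equal length; it does not ship this function):

```python
def equalize_prompt_lengths(prompts):
    """Trim the oldest previous-text tokens so all prompts in a batch
    share the same length. Hypothetical helper for illustration; assumes
    each prompt starts with a marker token (e.g. <|startofprev|>)
    followed by the previous text tokens."""
    target = min(len(p) for p in prompts)
    trimmed = []
    for p in prompts:
        excess = len(p) - target
        # Keep the leading marker and drop the oldest context tokens.
        trimmed.append(([p[0]] + p[1 + excess:]) if excess > 0 else p)
    return trimmed
```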
Enable the Conv1D layer in more Python wheels:
macOS x64 (using oneDNN)
macOS ARM64 (using a custom implementation)
Linux AArch64 (using a custom implementation)
Update the OpenNMT-py converter to support the latest checkpoint structure
Generalize the TransformerSpec constructor to accept arbitrary encoder and decoder specifications
Remove the global compilation flag -ffast-math, which introduces unwanted side effects, and enable it only for the layer norm CPU kernel, where it is actually useful
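Restricting a compiler flag to a single source file can be expressed in CMake roughly as follows (a sketch; the file path is illustrative, not the project's actual layout):

```cmake
# Instead of adding -ffast-math to the global flags, attach it only to
# the layer norm CPU kernel source file (path is illustrative).
set_source_files_properties(
  src/cpu/layer_norm.cc
  PROPERTIES COMPILE_OPTIONS "-ffast-math")
```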
Fix CMake error on Windows when setting -DOPENMP_RUNTIME=COMP