Commit 7a4623b

Bump version to 1.23.0
1 parent ee51033 commit 7a4623b

2 files changed: +19 -1 lines changed

CHANGELOG.md

+18
@@ -23,6 +23,24 @@ The project follows [semantic versioning 2.0.0](https://semver.org/). The API co
 
 ### Fixes and improvements
 
+## [v1.23.0](https://github.com/OpenNMT/Tokenizer/releases/tag/v1.23.0) (2020-12-30)
+
+### Changes
+
+* Drop Python 2 support
+
+### New features
+
+* Publish Python wheels for macOS
+
+### Fixes and improvements
+
+* Improve performance in all tokenization modes (up to 2x faster)
+* Fix missing space escaping within protected sequences in "none" and "space" tokenization modes
+* Fix a regression introduced in 1.20 where `segment_alphabet_*` options behave differently on characters that appear in multiple Unicode scripts (e.g. some Japanese characters can belong to both Hiragana and Katakana scripts and should not trigger a segmentation)
+* Fix a regression introduced in 1.21 where a joiner is incorrectly placed when using `preserve_segmented_tokens` and the word is segmented by both a `segment_*` option and BPE
+* Fix incorrect tokenization when using `support_prior_joiners` and some joiners are within protected sequences
+
 ## [v1.22.2](https://github.com/OpenNMT/Tokenizer/releases/tag/v1.22.2) (2020-11-12)
 
 ### Fixes and improvements

bindings/python/setup.py

+1 -1
@@ -45,7 +45,7 @@ def _maybe_add_library_root(lib_name, header_only=False):
 
 setup(
     name="pyonmttok",
-    version="1.22.2",
+    version="1.23.0",
     license="MIT",
     description="OpenNMT tokenization library",
     long_description=_get_long_description(),
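
This commit changes the version in two places: the newest release heading in CHANGELOG.md and the `version=` argument in bindings/python/setup.py. A minimal sketch of how one might check that the two stay in sync is shown below. It is not part of the repository. The helper names (`latest_changelog_version`, `setup_version`, `versions_match`) and both regular expressions are assumptions made for illustration, and the inline strings only mirror the lines shown in the diff above.

```python
import re

# Hypothetical patterns, written to match the lines shown in this commit's diff.
CHANGELOG_HEADING = re.compile(r"^## \[v(\d+\.\d+\.\d+)\]", re.MULTILINE)
SETUP_VERSION = re.compile(r"""version\s*=\s*["'](\d+\.\d+\.\d+)["']""")


def latest_changelog_version(changelog_text):
    """Return the version in the first '## [vX.Y.Z]' heading, or None."""
    match = CHANGELOG_HEADING.search(changelog_text)
    return match.group(1) if match else None


def setup_version(setup_text):
    """Return the version passed as version="X.Y.Z", or None."""
    match = SETUP_VERSION.search(setup_text)
    return match.group(1) if match else None


def versions_match(changelog_text, setup_text):
    """True only if both versions are found and are identical."""
    found = latest_changelog_version(changelog_text)
    return found is not None and found == setup_version(setup_text)


# Inline excerpts standing in for the two files touched by the commit.
changelog = """\
## [v1.23.0](https://github.com/OpenNMT/Tokenizer/releases/tag/v1.23.0) (2020-12-30)

## [v1.22.2](https://github.com/OpenNMT/Tokenizer/releases/tag/v1.22.2) (2020-11-12)
"""
setup_py = 'setup(\n    name="pyonmttok",\n    version="1.23.0",\n)'

print(versions_match(changelog, setup_py))
```

In practice the two strings would be read from the files themselves. The check relies on the newest release heading being the first `## [vX.Y.Z]` heading in the changelog, which holds for the excerpt shown here.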
