
uralicNLP.string_processing

Mika Hämäläinen edited this page Jul 9, 2020 · 7 revisions

The uralicNLP.string_processing module provides the following functions:

char_split

Splits a word into characters more carefully than naive approaches in plain Python such as list(word), which can separate a combining diacritic from its base character. This function tries to keep each diacritic attached to the character it belongs to.
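As a rough illustration of the idea, and not uralicNLP's actual implementation, the sketch below groups combining marks with the preceding base character using only the standard library's unicodedata module. The helper name split_keeping_diacritics and the sample word are hypothetical and were chosen only for this example.

```python
import unicodedata


def split_keeping_diacritics(word):
    """Hypothetical helper: split a word into characters, keeping
    combining marks (Unicode categories Mn, Mc, Me) attached to the
    character that precedes them."""
    chunks = []
    for ch in word:
        is_mark = unicodedata.category(ch).startswith("M")
        if chunks and is_mark:
            # Attach the mark to the previous base character.
            chunks[-1] += ch
        else:
            chunks.append(ch)
    return chunks


# "o" followed by U+0303 COMBINING TILDE, then "s".
word = "o\u0303s"

naive = list(word)                        # one item per code point
grouped = split_keeping_diacritics(word)  # tilde stays with its "o"

print(len(naive), len(grouped))
```

With uralicNLP installed, the corresponding call would presumably be string_processing.char_split(word) after importing string_processing from uralicNLP. That call signature is assumed from the function name above, so check the module before relying on it.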
