
Russian Language processing tools

Mikhail Rozhkov edited this page Feb 28, 2017 · 2 revisions

##Syntactic models

Parsey's Cousins models

Language | No. tokens | POS | fPOS | Morph | UAS | LAS
-------- | :--: | :--: | :--: | :--: | :--: | :--:
Russian-SynTagRus | 107737 | 98.27% | - | 94.91% | 91.68% | 87.44%
Russian | 9573 | 95.27% | 95.02% | 87.75% | 81.75% | 77.71%

These models were trained on the Universal Dependencies v1.3 datasets. The table above shows their accuracy on the Universal Dependencies test sets for each type of annotation. Source: https://github.com/mnrozhkov/models/blob/master/syntaxnet/universal.md
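For readers unfamiliar with the UAS and LAS columns, here is a minimal sketch of how such attachment scores are commonly computed: UAS counts tokens whose predicted head is correct, and LAS counts tokens whose head and dependency label are both correct. The function name `attachment_scores` and the toy sentence are hypothetical, and the evaluation script actually used for the table may differ in details such as punctuation handling.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as fractions for two aligned lists of (head, label) pairs."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty token list")
    # UAS: the predicted head index matches the gold head index.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: both the head index and the dependency label match.
    full_hits = sum(g == p for g, p in zip(gold, predicted))
    return head_hits / len(gold), full_hits / len(gold)


# Hypothetical four-token sentence: (head index, dependency label) per token.
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "dobj")]
predicted = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "nmod")]

uas, las = attachment_scores(gold, predicted)
print(f"UAS = {uas:.2%}, LAS = {las:.2%}")
```

Because a token can only count toward LAS if its head is already correct, LAS never exceeds UAS, which matches the ordering of the two columns in the table.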

##NER

###Rule-based NER Natasha

Source: https://github.com/bureaucratic-labs/natasha

###Tomita Parser (Yandex)

Source: https://tech.yandex.ru/tomita/
