ugurcanarikan edited this page May 30, 2019 · 37 revisions

Models with optimized parameters

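| Embedding | Epochs | Hidden size | Learning rate | Mini-batch size | RNN layers | F1-score | Precision | Recall | Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| GloVe | 150 | 32 | 0.15 | 16 | 2 | 0.7685 | 0.7706 | 0.7665 | 0.6241 |
| fastText | 150 | 256 | 0.1 | 16 | 2 | 0.8042 | 0.8351 | 0.7755 | 0.6725 |
| GloVe and fastText | 150 | 256 | 0.1 | 16 | 2 | 0.8334 | 0.8632 | 0.8056 | 0.7144 |
| GloVe and fastText | 150 | 256 | 0.1 | 16 | 4 | 0.8319 | 0.8616 | 0.8041 | 0.7121 |
| BERT | | | | | | | | | |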

Models with different learning rates

Fixed Parameters:

| Epochs | Hidden size | Mini-batch size | RNN layers |
|---|---|---|---|
| 150 | 256 | 16 | 2 |

Models

| Embedding | Learning rate | F1-score | Precision | Recall | Accuracy | zor_cumleler accuracy |
|---|---|---|---|---|---|---|
| GloVe(50) | 0.05 | 0.8031 | 0.8217 | 0.7853 | 0.6709 | |
| GloVe(50) | 0.1 | 0.7973 | 0.8379 | 0.7605 | 0.6630 | |
| GloVe(50) | 0.15 | 0.7885 | 0.8276 | 0.7530 | 0.6509 | |
| GloVe(50) | 0.2 | 0.7866 | 0.8306 | 0.7470 | 0.6482 | |
| GloVe(300) | 0.2 | 0.8422 | 0.8709 | 0.8153 | 0.7274 | 0.55 |
| fastText | 0.05 | 0.8178 | 0.8361 | 0.8003 | 0.6918 | |
| fastText | 0.1 | 0.8042 | 0.8351 | 0.7755 | 0.6725 | |
| fastText | 0.15 | 0.8115 | 0.8448 | 0.7808 | 0.6829 | |
| fastText | 0.2 | 0.8323 | 0.8705 | 0.7973 | 0.7128 | 0.64 |
| word2vec | 0.2 | 0.8339 | 0.8767 | 0.7950 | 0.7151 | 0.71 |
| word2vec and fastText | 0.2 | 0.8364 | 0.8759 | 0.8003 | 0.7188 | 0.69 |
| GloVe(50) and fastText | 0.05 | 0.8266 | 0.8471 | 0.8071 | 0.7045 | |
| GloVe(50) and fastText | 0.1 | 0.8334 | 0.8632 | 0.8056 | 0.7144 | |
| GloVe(50) and fastText | 0.15 | 0.8355 | 0.8625 | 0.8101 | 0.7174 | |
| GloVe(50) and fastText | 0.2 | 0.8409 | 0.8769 | 0.8078 | 0.7256 | |
| GloVe(300) and fastText | 0.05 | 0.8264 | 0.8542 | 0.8003 | 0.7041 | |
| GloVe(300) and fastText | 0.1 | 0.8374 | 0.8648 | 0.8116 | 0.7202 | |
| GloVe(300) and fastText | 0.15 | 0.8595 | 0.9116 | 0.8131 | 0.7537 | |
| GloVe(300) and fastText | 0.2 | 0.8605 | 0.9055 | 0.8198 | 0.7552 | 0.66 |
| GloVe(300) and word2vec | 0.2 | 0.8563 | 0.8979 | 0.8183 | 0.7486 | 0.67 |
| GloVe(300) and fastText and word2vec | 0.1 | 0.8548 | 0.8983 | 0.8153 | 0.7464 | |
| GloVe(300) and fastText and word2vec | 0.15 | 0.8597 | 0.9074 | 0.8168 | 0.7540 | |
| GloVe(300) and fastText and word2vec | 0.2 | 0.8667 | 0.9156 | 0.8228 | 0.7648 | 0.65 |

Accuracy on Hard Sentences (zor_cumleler)

| Embedding | Learning rate | zor_cumleler accuracy |
|---|---|---|
| GloVe(300) | 0.2 | 0.55 |
| fastText | 0.2 | 0.64 |
| word2vec | 0.2 | 0.71 |
| GloVe(300) and fastText | 0.2 | 0.66 |
| GloVe(300) and word2vec | 0.2 | 0.67 |
| fastText and word2vec | 0.2 | 0.69 |
| GloVe(300) and fastText and word2vec | 0.2 | 0.65 |
| Google Docs | | 0.34 |
| Microsoft Office | | 0.29 |
| LibreOffice | | 0.0 |
| ITU NLP Pipeline | | 0.0 |

Comparison Methods

These baseline methods do not use semantic analysis; each one applies a single fixed syntactic rule to every occurrence of 'de/da'.

Model 1: Always write 'de/da' separately. If a word ends with 'de' or 'da', mark it as an error.

Model 2: Always write 'de/da' jointly. If the word is 'de' or 'da' on its own, mark it as an error.
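
The two rules above can be sketched in Python as follows. This is only an illustration, not the code used for the experiments: the function names are hypothetical, and the sketch assumes tokens are whitespace-separated words and that Model 1 does not flag a standalone 'de' or 'da' (that case is left to Model 2).

```python
SUFFIXES = ("de", "da")


def model_1_flags(token: str) -> bool:
    """Always-separate rule: flag a word that ends with 'de' or 'da'.

    Assumption: a standalone 'de'/'da' is already written separately,
    so it is not flagged here.
    """
    t = token.lower()
    return len(t) > 2 and t.endswith(SUFFIXES)


def model_2_flags(token: str) -> bool:
    """Always-joint rule: flag a word that is exactly 'de' or 'da'."""
    return token.lower() in SUFFIXES


def flag_sentence(sentence: str, rule) -> list[bool]:
    """Apply one rule to every whitespace-separated token of a sentence."""
    return [rule(tok) for tok in sentence.split()]
```

Under these assumptions the two rules never flag the same token, which matches the confusion counts reported for them: Model 1's true positives are Model 2's false negatives, and so on.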

| Model | True positive | True negative | False positive | False negative | F1-score | Precision | Recall |
|---|---|---|---|---|---|---|---|
| Model 1 | 3521 | 6828 | 29692 | 10194 | 0.15 | 0.1060 | 0.2567 |
| Model 2 | 10194 | 29692 | 6828 | 3521 | 0.6633 | 0.5989 | 0.7432 |
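
For reference, the precision, recall, and F1-score columns follow from the confusion counts by the standard definitions. A minimal sketch, with a hypothetical helper name and the counts copied from the table:

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the comparison table (true negatives are not needed).
model_1 = precision_recall_f1(tp=3521, fp=29692, fn=10194)
model_2 = precision_recall_f1(tp=10194, fp=6828, fn=3521)

for name, (p, r, f1) in (("Model 1", model_1), ("Model 2", model_2)):
    print(f"{name}: precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

The computed values agree with the table to within rounding in the last reported digit.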