First steps towards a CG-based UD parser; point to the lexicon-proofreading-effort in the docs; some corrections in puupankki #16
Open
IlnarSelimcan wants to merge 151 commits into apertium:main from taruen:master
Commits
bb8e40f
add kaz-tagger mode to the Racket interface
IlnarSelimcan 8b855f7
add <num><ord> reading to digits; select that reading if a proper nou…
IlnarSelimcan 1b8267a
test kaz-morph's output too
IlnarSelimcan 5b06b86
add some shitty dependency rule (for parsing the first sentence of kk…
IlnarSelimcan d2269e8
few more mapping and attachement rules (sents 1-3 of ud treebank cove…
IlnarSelimcan bd5a51a
remove ерекше<adv> since it doesn't make sense to add someting as bot…
IlnarSelimcan c4f90dd
add few more dependency parsing rules
IlnarSelimcan 67b53b7
don't output in the kaz-tagger-deterministic (Racket) mode
IlnarSelimcan d24c55c
remove entries marked as Use/MT from the LR transducer (morphological…
IlnarSelimcan be5f9f8
add few more disambiguation rules
IlnarSelimcan 13efd51
minor update in the docs: pipeline for getting UD output
IlnarSelimcan cf2679d
minor
IlnarSelimcan bf27ca6
comment out kaz-morph tests for now
IlnarSelimcan e493c86
update documentation a bit
IlnarSelimcan 638c796
Merge remote-tracking branch 'upstream/master'
IlnarSelimcan b0e3d6e
revert the change in which I was deleting lines marked as Use/MT from…
IlnarSelimcan b3d5b30
minor fix in the docs
IlnarSelimcan 4abb038
get rid of kaz-morph tests as they are somewhat volatile atm
IlnarSelimcan f60e4fe
add commands for converting apertium-kaz's output into CoNLLU format
IlnarSelimcan 1e8dd15
untabify the conllu example output in the docs
IlnarSelimcan 0b4e427
untabify
IlnarSelimcan ea50d10
add two more disambiguation rules
IlnarSelimcan 43c346a
add examples and clarifications on how we evaluate CG parser's output
IlnarSelimcan 2852e54
fix a typo in the docs
IlnarSelimcan 8a23e58
set the lemma of маңызды to маңызды (its POS tag is adj)
IlnarSelimcan 9ce7f2f
set the lemma of маңызды to маңызды (its POS tag is adj)
IlnarSelimcan 50582f6
treat санал as passive form of сана
IlnarSelimcan 39ba759
add a todo note about spaceafter=no
IlnarSelimcan 42a2b47
add few more CG rules
IlnarSelimcan a2d9b0a
change lemma of 'өсті (grow)' from өст to өс
IlnarSelimcan 3d38bdc
s/ 'өс (grow)' transitive/to intransitive
IlnarSelimcan 8717126
add hyperlinks to the GPLv3 and CC-BY 4.0
IlnarSelimcan c6c8a1f
change formatting of a table in the docs in minor ways
IlnarSelimcan a3bc3bd
add a note on whole treebank eval
IlnarSelimcan f2110ea
add few more cg rules
IlnarSelimcan 326c9aa
minor
IlnarSelimcan 492e74b
minor
IlnarSelimcan 159c150
add a mixing dependency for the noun of a phrasal verb; s/amod/nummod…
IlnarSelimcan 94685ca
add a note on how to add Zhenis' disambiguator into the pipeline
IlnarSelimcan c7d4a89
add few more CG rules
IlnarSelimcan b74f3af
add a few words about the UD treebanki into the docs
IlnarSelimcan 3242f2d
fix a typo in the documentation
IlnarSelimcan 518a3c4
Merge remote-tracking branch 'upstream/master'
IlnarSelimcan 71981c9
change lemma of саналады from санал to сана in puupankki
IlnarSelimcan c877060
add ud-scripts/conllu-nospaceafter.py into the pipeline and update stats
IlnarSelimcan cac8cfa
start keeping track of open questions about categorization
IlnarSelimcan 806dfff
add few doubts
IlnarSelimcan e1a8dc8
add few doubts in the documentation
IlnarSelimcan f6c2bdb
add few more annotation doubts
IlnarSelimcan 749b284
add a note on how to parse hand-tagged texts with udpipe
IlnarSelimcan 4c45a9c
[annotation] minor fixes in puupankki
IlnarSelimcan ff2110c
[annotation] minor fixes in puupankki; add labels
IlnarSelimcan ced5b1d
[annotation] minor fixes in the UD treebank; mark sentences I've read…
IlnarSelimcan 2b43e41
[documentation] keep track of open questions/doubts/смутные сомнения
IlnarSelimcan b37cce8
[annotation] minor fixes in the UD treebank; mark sentences I've read…
IlnarSelimcan 438b92b
[annotation] minor fixes (well, я так думаю \!) in the UD treebank; m…
IlnarSelimcan 346c598
[docs]
IlnarSelimcan fb38e49
[annotation] minor fixes in the UD treebank; converting into udv2 etc
IlnarSelimcan b876220
[docs]
IlnarSelimcan 8d8a1d7
[docs] minor
IlnarSelimcan 80854ce
[docs] minor: add a css style
IlnarSelimcan ffbef26
[documentation] build documentation on Github actions and thus stop t…
IlnarSelimcan 7bf67b2
[documentation] run ./autogen.sh beforehand so that I can run 'make d…
IlnarSelimcan 19b6694
[documentation] install Apertium core so that I can run ./autogen.sh …
IlnarSelimcan 5ab0a71
[documentation] install Apertium core so that I can run ./autogen.sh …
IlnarSelimcan 0d8d685
[documentation] minor, still fiddling with Github Actions config file
IlnarSelimcan 5f0d88c
Deploying to master from @ 0d8d6859fb58a5c9a4f0f9e7ad9d785344be9821 🚀
IlnarSelimcan 1eff62b
[documentation] abandon the idea of building the documentation on git…
IlnarSelimcan 1ac86d2
Merge branch 'master' of github.com:taruen/apertium-kaz
IlnarSelimcan 1b9de16
[annotation] minor fixes to make ud-tools/validation.py happy
IlnarSelimcan 37f6c75
[documentation] re-organise; add a note on how to train a udpipe model
IlnarSelimcan d5a6372
[ud annotation] minor fixes to make the treebank ud version 2 conform…
IlnarSelimcan 7095ccc
[ud annotation] minor fixes to make the treebank ud version 2 conform…
IlnarSelimcan 7079f89
Merge remote-tracking branch 'upstream/master'
IlnarSelimcan 6eaa074
[ud annotation] minor fixes to make the treebank ud version 2 conform…
IlnarSelimcan 8d567fd
[ud annotation] minor fixes to make the treebank ud version 2 conformant
IlnarSelimcan e0e3ac5
validate with ud-tools/validate.py
IlnarSelimcan 28b5154
[ud annotation] minor fixes
IlnarSelimcan 098c7e8
[ud-annotation] [docs] validate with ud-tools/validate.py
IlnarSelimcan 7df24e6
Merge remote-tracking branch 'upstream/master'
IlnarSelimcan d0e26a7
[ud annotation] minor fixes based on validate.py's output and also to…
IlnarSelimcan ab5d141
[docs] minor tweak so that it appears as a package
IlnarSelimcan 77349a7
Revert "[ud annotation] minor fixes based on validate.py's output and…
IlnarSelimcan 0c758e6
rewind to d0e26a77318aaf93dd93c165c1216f71796d2c41
IlnarSelimcan 0570b4a
correcciones; <s>afegitons</s>
IlnarSelimcan be100e9
<s>docs</s> scribblings
IlnarSelimcan d0d8f8c
[ud annotation] correcciones
IlnarSelimcan 8b7021a
[UD annotacion] add checked_IFS labels to sentences which IFS has che…
IlnarSelimcan 4593da1
[UD annotacion] validacion partially
IlnarSelimcan 40935a7
[ud annotation] correcciones
IlnarSelimcan 303889b
[ud annotation] correcciones (I'd like to think so)
IlnarSelimcan 2c8c9fc
cleanup the consequences of GA misfortune
IlnarSelimcan e586de9
cleanup the consequences of GA misfortune
IlnarSelimcan 912086e
[docs] one more doubt about ud
IlnarSelimcan 87c7d67
[docs] clarification to a doubt
IlnarSelimcan 201a601
[docs] minor note
IlnarSelimcan a324938
[docs] minor fixes regarding ud annotation
IlnarSelimcan 6673112
[docs] minor note
IlnarSelimcan b74a216
[ud] merge puupankki.kaz.conllu file from the UD_Kazakh-KTB-v2.7 branch
IlnarSelimcan 40b1ee2
rename Лейнг_Рональд_Дэвис.txt to Лейнг_Рональд_Дэвис.tagged.txt sinc…
IlnarSelimcan c8ee0b2
point to pr17 instead pr16 for the revised puupankki-treebank
IlnarSelimcan c7d88e7
update puupankki.conllu to the version in the UD_v2.7 branch
IlnarSelimcan 188586a
s/lt-proc/hfst-proc because of the following issue
IlnarSelimcan 9903cbe
[docs] minor updates & clarifications
IlnarSelimcan d781a05
[docs] minor additions
IlnarSelimcan f09c3c3
Merge branch 'issue#20' of github.com:taruen/apertium-kaz
IlnarSelimcan a7aba36
switch back to lt-proc in kaz-morph and kaz-tagger, and now also in k…
IlnarSelimcan 99108bc
[docs] update parsing eval
IlnarSelimcan cf63639
[docs] add stats for the case when using the extended lexicon
IlnarSelimcan 8b2f6d5
Merge remote-tracking branch 'upstream/master'
IlnarSelimcan 26abf80
[data] put +e on a separate line as a subreading, as it shold've been…
IlnarSelimcan 9aad41d
[data] s/+е/"е"/g
IlnarSelimcan 47a9d86
s/мен post/мен cnjcoo/g here too (i.e. so that it matches puupankki),…
IlnarSelimcan 5773c33
eval scripts
IlnarSelimcan 8fe0578
revert
IlnarSelimcan 879f3cf
[ud] add two more lines for converting the хрглораца adv attr into ADJ
IlnarSelimcan f88c232
[ud] make the lemmas of қазіргі consistent (although I'm not too happ…
IlnarSelimcan 7586867
[ud] minor fix
IlnarSelimcan 3f9bea4
[ud] жүзеге асыр -> жүзе N Case=Dat obl. Can be changed easily into s…
IlnarSelimcan e2e1820
fix my own typo жүзеге s/N/n/ all over the place
IlnarSelimcan 8b5ccd2
[ud] in атап өтті, make атап the root and өтті aux of it, consistently
IlnarSelimcan 6b08c41
[ud] s/VerbForm=Cov/VerbForm=Conv/g
IlnarSelimcan dcc0e9e
[ud] make барлық consistently DET QNT instead of arbitrarily DET.QNT …
IlnarSelimcan f9bfda7
[ud puupankki] minor fixes
IlnarSelimcan 2bf5d7b
[docs] complete installation and usage notes
IlnarSelimcan 2dc4b95
[docs] when compiling for taruen.com (with DOCSFOR=TARUEN make docs),…
IlnarSelimcan c9f4e2e
[ud puupankki] жаттығу n -> v ger
IlnarSelimcan 19469e7
[ud puupankki] s/жаттық v.get/жаттығу n/; орталықтың езгісіне s/Case=…
IlnarSelimcan 051cd83
[ud puupankki] 2-жартысында s/adj/n/
IlnarSelimcan a67b942
[ud puupankki] s/қат v.coop/қатыс v not coop/ = to participate
IlnarSelimcan 3fb8b59
[ud] convert сондықтан cnjadv not as SCONJ but as ADV, because in puu…
IlnarSelimcan 933a21e
add some work-in-progress cg3 files for ~retokenizaion (with ud in mi…
IlnarSelimcan 816f565
[docs] fix a typo
IlnarSelimcan e47149d
[ud puupankki] fix a type in feature name
IlnarSelimcan 47f538a
[ud puupankki] fix https://github.com/apertium/apertium-kaz/issues/25
IlnarSelimcan ad8e63f
[ud puupankki] s/жеңістерге жетіңізn<num>/жеңістерге жетіңіз<v><imp>
IlnarSelimcan b92b526
[ud puupankki] бүгінгі is ADJ adv attr; оңтүстік-шығысы is a NOUN n a…
IlnarSelimcan 965aa41
[ud puupankki] миллион s/NOUN/NUM/g'
IlnarSelimcan 11ccc05
[ud puupankki] handle 4.3 мың as ->compound, just like any other comp…
IlnarSelimcan 1326f10
[ud puupankki] тығыз топтасқан. тығыз is adj.advl here
IlnarSelimcan 53f8d1c
[ud puupankki] тарихымызға -> келуге
IlnarSelimcan 3cfbded
[ud puupankki] s/М. NOUN abbr/М. PROPN np Case=Nom/ etc
IlnarSelimcan 16a803e
[ud] convert adj advl as ADV; s/VerbForm=Cov/VerbForm=Conv
IlnarSelimcan 891aced
[ud puupankki] minor fixes
IlnarSelimcan c45a9ce
[ud puupankki] minor fixes
IlnarSelimcan 27464fe
[ud puupankki] minor fixes
IlnarSelimcan 7bcfff7
[ud puupankki] minor fix: tag -GAн vadjes as acl:relcl consistently, …
IlnarSelimcan f2b0f4a
remove the empty file puupankki.conllx
IlnarSelimcan a19bb8a
label X айы as nmod:poss consistently, and not compound in one place …
IlnarSelimcan b0c4e0e
Neues aus der Werkstatt
IlnarSelimcan ebc1ad7
Merge remote-tracking branch 'upstream/master'
IlnarSelimcan
2020 in the phrase "Еуровидение 2020" receives <num><ord> in the UD treebank, hence this change.
Hmm, @ftyers, what do you think about this?
This looks ok.