ARK-style twitter tags and direct universal tags #5

brendano · 2016-10-16T01:31:44Z

make it work for twitter. don't bother wrapping the ark tagger; just make it work when called as get_phrases(pos=..., tokens=...)

just take the bare one-character tags (Gimpel et al. 2011), so there's no need for the Coarse* conversion layer the old openfst/foma/pyfst version had. and while we're at it, why not use the all-caps Petrov tags directly too? hopefully there are no tag system naming conflicts with all this?
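a rough sketch of what accepting both tagsets directly could look like, plus a quick check on the naming-conflict question. everything here is hypothetical (the table names, the `coarsen` helper, the coarse class names, and the choice of which tags count as nouns/adjectives); it is not the existing code. the tag inventories are the 25 one-character tags from Gimpel et al. 2011 and the 12 Petrov universal tags.

```python
# Sketch only: names and coarse classes are illustrative assumptions,
# not the project's actual API.

# 25 one-character ARK Twitter tags (Gimpel et al. 2011)
ARK_TAGS = set("N O ^ S Z V L M A R ! D P & T X Y # @ ~ U E $ , G".split())

# 12 all-caps Petrov universal tags
PETROV_TAGS = set("NOUN VERB ADJ ADV PRON DET ADP NUM CONJ PRT X .".split())

# Hypothetical coarse classes relevant to phrase extraction.
# Which tags belong in each class is an assumption made for this sketch.
ARK_TO_COARSE = {"N": "NOUN", "^": "NOUN", "A": "ADJ", "P": "ADP", "D": "DET"}
PETROV_TO_COARSE = {"NOUN": "NOUN", "ADJ": "ADJ", "ADP": "ADP", "DET": "DET"}

TABLES = {"ark": ARK_TO_COARSE, "petrov": PETROV_TO_COARSE}


def coarsen(tag, tagset):
    """Map one tag to a coarse class; anything unlisted becomes OTHER.

    The tagset is passed explicitly so a tag string that exists in both
    systems is never guessed at.
    """
    return TABLES[tagset].get(tag, "OTHER")


def shared_tags():
    """Tag strings that appear in both inventories."""
    return ARK_TAGS & PETROV_TAGS


if __name__ == "__main__":
    print(sorted(shared_tags()))
    print([coarsen(t, "ark") for t in ["^", "V", "A", "N"]])
    print([coarsen(t, "petrov") for t in ["NOUN", "VERB", "ADJ", "NOUN"]])
```

with these two inventories the only shared string is `X` (existential "there"/predeterminers in ARK, "other" in Petrov). in this sketch both end up as OTHER, so the overlap is harmless here, but a single merged lookup table would need to keep that in mind.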

backburner: see what the nltk tagset conversion systems are now (@nschneid submitted something a while back)


brendano · 2016-10-17T15:01:32Z

more TODO: coarsen_POS_tags.R needs to be updated too. current Coarse* inputs won't do anything, since these codepaths never normalize to Coarse*. only the old pre-openfst preprocessor did that, and we've ditched it.

more TODO: write bilingual tests for POS coarsening

brendano · 2018-02-04T03:26:56Z

need to look into: NLTK has some tagset conversion methods now: http://www.nltk.org/_modules/nltk/tag/mapping.html

AbeHandler mentioned this issue Nov 29, 2017

Unexpected output on pre-coarsened POS tags #13

Closed
