Make it work for Twitter. Don't bother wrapping the ARK tagger; just support calling it as get_phrases(pos=..., tokens=...)
Just take the bare one-character tags (Gimpel et al. 2011), so there's no need for the Coarse* conversion layer the old openfst/foma/pyfst version had. And while we're at it, why not accept the all-caps Petrov tags directly too? Hopefully there are no tag-system naming conflicts with all this?
Backburner: see what the NLTK tagset conversion systems look like now (@nschneid submitted something a while back)
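A quick way to sanity-check the naming-conflict question, as a rough sketch: the two tag inventories below are transcribed from the Gimpel et al. (2011) and Petrov et al. universal tagset descriptions, and the variable names are made up for illustration (they are not anything in this codebase). If the transcription is right, there is exactly one clash to handle.

```python
# Hypothetical sketch: check whether the two tag inventories share any tag names.
# Both inventories are transcribed by hand (an assumption worth double-checking).

# Gimpel et al. (2011) Twitter tagset: 25 one-character tags.
GIMPEL_TAGS = {
    "N", "O", "^", "S", "Z", "L", "M",   # nominal and nominal+verbal tags
    "V", "A", "R", "!",                  # verb, adjective, adverb, interjection
    "D", "P", "&", "T", "X", "Y",        # closed-class tags
    "#", "@", "~", "U", "E",             # Twitter/online-specific tags
    "$", ",", "G",                       # numeral, punctuation, other
}

# Petrov et al. universal tagset: 12 tags, mostly all-caps.
PETROV_TAGS = {
    "NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
    "ADP", "NUM", "CONJ", "PRT", ".", "X",
}

# Any tag present in both inventories is ambiguous without knowing the tagset.
conflicts = GIMPEL_TAGS & PETROV_TAGS
print(sorted(conflicts))
```

The overlap is the single tag X, which means "existential there / predeterminers" in the Twitter tagset but "other" in the Petrov tagset, so accepting both tagsets side by side would need some rule (or an explicit tagset argument) for that one tag.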
More TODO: coarsen_POS_tags.R needs to be updated too. The current Coarse* inputs won't do anything, since these codepaths never normalize to Coarse*. Only the old pre-openfst prepreprocessor did that, and we've ditched it.
More TODO: write bilingual tests for POS coarsening.