How to handle "e.g.", "i.e.", "viz.", etc. #13795
Unanswered
fi11222
asked this question in
Help: Other Questions
Replies: 1 comment
-
The tag X is used to denote words that don’t clearly fit into other standard part-of-speech categories, often referred to as the "other" or "unknown" category. This tag is assigned when the model cannot confidently classify the word into a more specific category like noun, verb, adverb, etc. For |
-
The standard English models (I currently use en_core_web_trf) do not seem to be able to properly identify these common abbreviations.
Currently, they are tagged as "foreign words" (pos_ = X, tag_ = FW), and the final "." of "viz." is split off and treated as a separate token.
Is there something I can do to avoid that?
I did not find anything about this issue, either here or elsewhere on the web.
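One possible workaround, sketched here under assumptions rather than confirmed by the maintainers, is to combine a tokenizer special case (so "viz." keeps its period) with an attribute ruler pattern that overrides the tags. The sketch uses a blank English pipeline to stay self-contained. With en_core_web_trf you would load that pipeline and fetch its existing ruler with nlp.get_pipe("attribute_ruler") instead. The RB/ADV values are only an illustrative choice of tags.

```python
import spacy
from spacy.symbols import ORTH

# Blank English pipeline: tokenizer only, no trained tagger.
# Assumption: with a trained pipeline, load it with spacy.load(...) and use
# nlp.get_pipe("attribute_ruler") rather than adding a new component.
nlp = spacy.blank("en")

# 1. Tokenizer special case: keep "viz." together with its final period.
for form in ("viz.", "Viz."):
    nlp.tokenizer.add_special_case(form, [{ORTH: form}])

# 2. Attribute ruler: override tag and POS for the abbreviations.
#    RB / ADV is an illustrative choice, not an official recommendation.
ruler = nlp.add_pipe("attribute_ruler")
ruler.add(
    patterns=[[{"LOWER": {"IN": ["e.g.", "i.e.", "viz."]}}]],
    attrs={"TAG": "RB", "POS": "ADV"},
)

doc = nlp("Three colours, viz. red, green and blue, are used.")
tokens = [(t.text, t.tag_, t.pos_) for t in doc]
print(tokens)
```

The special case only affects tokenization. The tag override comes entirely from the ruler, so either half can be used on its own.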