Make HighlightedTextClassifier work with <b> tags #61
Labels
contributions-welcome
Intended for completion by you, the contributor
feature:elements
Parsing all the other elements correctly
Comments
Elijas added the contributions-welcome label on Dec 22, 2023
I would like to work on this issue.
Elijas added the status:in-progress label (Work underway. Reach out if you're interested in helping!) on Dec 22, 2023
Sorry, I was too busy to let you know sooner: I will no longer be able to work on this issue due to other obligations.
Elijas removed the status:in-progress label on Feb 14, 2024
No worries, thanks for letting us know!
I'd like to work on this.
Discussed in https://github.com/orgs/alphanome-ai/discussions/56
Originally posted by Elijas November 24, 2023
Example document
https://www.sec.gov/Archives/edgar/data/1675149/000119312518236766/d828236d10q.htm
Goal
The text "G. Accumulated Other Comprehensive Loss" should be recognized as a HighlightedTextElement (and therefore as a TitleElement).
Most likely, you will have to compute the percentage of the element's text that is covered by the <b> tag, reusing the parts already implemented for HighlightedTextElement. This will help you avoid situations where "text text text <b>bold</b> text text" is recognized as highlighted.
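
The coverage check described above could be sketched roughly as follows. This is only an illustration built on the standard library's html.parser, not the project's actual implementation: the names BoldCoverageParser, bold_text_percentage, and is_mostly_bold, as well as the 80% threshold, are assumptions made for the example and do not come from the existing HighlightedTextElement code.

```python
from html.parser import HTMLParser


class BoldCoverageParser(HTMLParser):
    """Counts non-whitespace characters inside and outside <b> tags.

    Hypothetical helper for illustration; not part of the project's API.
    """

    def __init__(self):
        super().__init__()
        self.bold_depth = 0   # > 0 while inside one or more <b> tags
        self.bold_chars = 0   # non-whitespace characters inside <b>
        self.total_chars = 0  # all non-whitespace characters

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.bold_depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.bold_depth > 0:
            self.bold_depth -= 1

    def handle_data(self, data):
        # Ignore whitespace so that indentation and line breaks
        # do not skew the ratio.
        count = sum(1 for ch in data if not ch.isspace())
        self.total_chars += count
        if self.bold_depth > 0:
            self.bold_chars += count


def bold_text_percentage(html: str) -> float:
    """Return the percentage (0-100) of text that sits inside <b> tags."""
    parser = BoldCoverageParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    if parser.total_chars == 0:
        return 0.0
    return 100.0 * parser.bold_chars / parser.total_chars


def is_mostly_bold(html: str, threshold: float = 80.0) -> bool:
    """Assumed rule: treat the element as highlighted only when at least
    `threshold` percent of its text is bold. The default is a guess."""
    return bold_text_percentage(html) >= threshold
```

With this sketch, a fully bold line such as "<b>G. Accumulated Other Comprehensive Loss</b>" yields 100% coverage, while "text text text <b>bold</b> text text" yields well under the threshold and would not be treated as highlighted. The real fix would presumably plug a similar ratio into whatever style-coverage logic HighlightedTextElement already uses.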