Eyecite Fails to Parse Complex Citations Correctly #185

Open
flooie opened this issue Oct 8, 2024 · 1 comment

flooie commented Oct 8, 2024

Recently, @anseljh highlighted missing citations in CourtListener (CL), and while investigating, I encountered some challenging parsing issues—likely edge cases.

For example, consider the following from Jasmine v. Superior Court:

“This is a pure question of law, which we address without deference to the trial court’s ruling. (See In re K.F. (2009) 173 Cal.App.4th 655, 661 [92 Cal.Rptr.3d 784]; Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558 [66 Cal.Rptr.3d 1].)”

Problem:

Eyecite struggles to correctly parse this structure. There are two cases here, each with two citations (a primary citation followed by a bracketed parallel citation):

In re K.F. Citations:

1.	In re K.F. (2009) 173 Cal.App.4th 655, 661
2.	[92 Cal.Rptr.3d 784]

Yield Dynamics Citations:

1.	Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558
2.	[66 Cal.Rptr.3d 1]

Results from get_citations:

When parsing the string, Eyecite produces the following four citations:

1.	FullCaseCitation('173 Cal.App.4th 655', ...)
2.	FullCaseCitation('92 Cal.Rptr.3d 784', ...)
3.	FullCaseCitation('154 Cal.App.4th 547', ...)
4.	FullCaseCitation('66 Cal.Rptr.3d 1', ...)

Each of these is correct as to the citation itself, but things break down when Eyecite tries to attach the date and other metadata. This is what throws a wrench into our citation annotator.

  1. Date Issues:
    • The date for In re K.F. is incorrectly assigned the year of Yield Dynamics (2007). This happens because Eyecite is not separating the citations appropriately.

  2. Plaintiff/Defendant Parsing:

    • In cases like In re K.F., where there is no explicit plaintiff or defendant, Eyecite struggles to parse the parties correctly. For example, “K.F.” is being treated as a defendant.
    • Similarly, in Yield Dynamics, “Inc.” is assigned as the plaintiff, even though it is just part of the full case title.

  3. Extra Data Repairing:
    • The “extra” field is being populated with text from the citation that follows.
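My guess at how the year gets misassigned, not confirmed against Eyecite's source: if the year is looked up by scanning forward from the end of the citation, the first parenthesized year after `173 Cal.App.4th 655` belongs to Yield Dynamics. The sketch below uses only the standard library; `year_after` and `year_within_segment` are hypothetical helpers written for illustration, not Eyecite functions.

```python
import re
from typing import Optional

TEXT = (
    "(See In re K.F. (2009) 173 Cal.App.4th 655, 661 [92 Cal.Rptr.3d 784]; "
    "Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558 "
    "[66 Cal.Rptr.3d 1].)"
)

YEAR = re.compile(r"\((\d{4})\)")


def year_after(text: str, cite: str) -> Optional[str]:
    """Assumed forward lookup: first '(YYYY)' found after the citation."""
    start = text.index(cite) + len(cite)
    match = YEAR.search(text, start)
    return match.group(1) if match else None


def year_within_segment(text: str, cite: str) -> Optional[str]:
    """Bounded lookup: last '(YYYY)' before the citation, never crossing
    the semicolon that separates it from the previous case."""
    end = text.index(cite)
    start = text.rfind(";", 0, end) + 1  # 0 when no semicolon precedes
    years = YEAR.findall(text, start, end)
    return years[-1] if years else None


for cite in ("173 Cal.App.4th 655", "154 Cal.App.4th 547"):
    print(cite, year_after(TEXT, cite), year_within_segment(TEXT, cite))
```

With this string, the forward lookup picks up the Yield Dynamics year for In re K.F., while the bounded lookup stays inside the semicolon-delimited segment.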

I think we need to add:

  1. More sophisticated citation boundary detection, or at least more effective use of semicolons.
  2. Better party parsing to handle cases without a plaintiff or defendant, e.g. In re K.F.
  3. A new pattern for the (possibly common) TITLE (YEAR) CITATION format that we see in California.
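A rough sketch of how the three ideas could fit together, using only the standard library. Everything here is hypothetical and is not how Eyecite is implemented: `CAL_STYLE`, `SIGNAL`, `split_segments`, `split_parties` and `parse_segment` are illustrative names. It assumes semicolons only ever separate cases, and that the reporter abbreviation contains no spaces (true for `Cal.App.4th`, not for every reporter).

```python
import re
from typing import Dict, List, Optional, Tuple

TEXT = (
    "(See In re K.F. (2009) 173 Cal.App.4th 655, 661 [92 Cal.Rptr.3d 784]; "
    "Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558 "
    "[66 Cal.Rptr.3d 1].)"
)

# TITLE (YEAR) VOL REPORTER PAGE[, PIN] [[PARALLEL]]
CAL_STYLE = re.compile(
    r"(?P<title>.+?)\s+\((?P<year>\d{4})\)\s+"
    r"(?P<cite>\d+\s+\S+\s+\d+)"
    r"(?:,\s*(?P<pin>\d+))?"
    r"(?:\s+\[(?P<parallel>[^\]]+)\])?"
)

# Leading citation signals to drop from a segment (short, incomplete list).
SIGNAL = re.compile(r"^(?:See also|See|Cf\.)\s+")


def split_segments(text: str) -> List[str]:
    """Strip the wrapping '(' and '.)' and split on semicolons."""
    body = re.sub(r"^\(|\.?\)$", "", text.strip())
    return [SIGNAL.sub("", part.strip()) for part in body.split(";") if part.strip()]


def split_parties(title: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (plaintiff, defendant), or (None, None) for 'In re' style titles."""
    if re.match(r"(?:In re|Ex parte)\b", title):
        return None, None
    plaintiff, sep, defendant = title.partition(" v. ")
    return (plaintiff, defendant) if sep else (None, None)


def parse_segment(segment: str) -> Optional[Dict[str, Optional[str]]]:
    """Match one segment against the California-style pattern."""
    match = CAL_STYLE.match(segment)
    if match is None:
        return None
    fields = match.groupdict()
    fields["plaintiff"], fields["defendant"] = split_parties(fields["title"])
    return fields


for segment in split_segments(TEXT):
    print(parse_segment(segment))
```

Splitting on semicolons first keeps each year, pin cite and parallel citation tied to its own case, and treating "In re" titles as having no plaintiff or defendant avoids forcing "K.F." into the defendant slot.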
@flooie flooie moved this to General Backlog in Case Law Sprint Nov 19, 2024
@flooie flooie moved this from General Backlog to Backlog Dec 16 - Dec 27th in Case Law Sprint Dec 16, 2024
@flooie flooie moved this from Backlog Dec 16 - Dec 27th to To Do in Case Law Sprint Dec 17, 2024
@flooie flooie moved this from To Do to Buffer Zone in Case Law Sprint Jan 13, 2025

flooie commented Jan 13, 2025

This issue needs to be analyzed, as it may be multiple sub-issues. During whatever sprint this is added to, it should simply be analyzed.

@flooie flooie moved this from Buffer Zone to Backlog Jan 13 to Jan 24 in Case Law Sprint Jan 13, 2025
@flooie flooie self-assigned this Jan 14, 2025
@flooie flooie moved this from Backlog Jan 13 to Jan 24 to To Do in Case Law Sprint Jan 23, 2025