Fix parsing priority issue with NULL/TRUE/FALSE. Add support for IN Clause. #66

davidmezzetti · 2025-02-14T17:11:16Z

Updating the lark parser to the latest version broke parsing of NULL/TRUE/FALSE: these keywords are now matched as CNAME first.

lark has the ability to set a priority. This change makes sure the rule for NULL/TRUE/FALSE (case insensitive) is matched before CNAME.

This change also removes the deprecated Python builds, since I believe the unit tests would otherwise have caught this issue.

Fixes #64. Fixes #65.


davidmezzetti · 2025-02-14T17:34:01Z

I ended up putting both changes into a single PR.

The IN clause is related to something I'm working on with txtai. I'm trying to sync the SQL capability and graph querying so that vector search becomes a feature of txtai's graph search.

For example:

MATCH P=(A)-[]->(B)
WHERE SIMILAR(A, "vector query")
RETURN P
LIMIT 500

txtai pre-processes this query: it runs a vector search lookup and replaces the SIMILAR call with an IN clause. The following query is then passed to GrandCypher.

MATCH P=(A)-[]->(B)
WHERE A IN [1,2,3...500]
RETURN P
LIMIT 500
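The rewrite step described above could look roughly like the following sketch. This is not txtai's implementation: `rewrite_similar`, the regular expression, and the `search` callback (which stands in for the vector search lookup and is assumed to return integer node ids) are all hypothetical.

```python
import re

# Matches SIMILAR(<variable>, "<query text>"), case-insensitively.
SIMILAR = re.compile(r'SIMILAR\(\s*(\w+)\s*,\s*"([^"]*)"\s*\)', re.IGNORECASE)


def rewrite_similar(query, search):
    """Replace each SIMILAR(var, "text") call with `var IN [ids]`.

    `search` is a hypothetical callback: it takes the query text and
    returns a list of integer node ids (the vector search results).
    """

    def replace(match):
        variable, text = match.group(1), match.group(2)
        ids = ", ".join(str(node_id) for node_id in search(text))
        return f"{variable} IN [{ids}]"

    return SIMILAR.sub(replace, query)


query = 'MATCH P=(A)-[]->(B)\nWHERE SIMILAR(A, "vector query")\nRETURN P\nLIMIT 500'

# Stand-in search function that always returns three ids.
print(rewrite_similar(query, lambda text: [1, 2, 3]))
```

The rewritten text is an ordinary query string, so the graph query engine never needs to know that a vector search took place.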

I compared doing this vs a list of EQUAL/ORs and the IN clause is significantly faster.

For example, in one test 500 equal/or combos took 1m 56s whereas the same query with an IN clause took 1.4s.

Perhaps in the future, there is a way to build on this as mentioned in #58 in terms of registering custom functions to handle arbitrary logic.

j6k4m8 · 2025-02-15T18:55:30Z

Ooh yes I'd love to formalize this with something like #58; @ntjess did some awesome work and I'm still trying to figure out how to upstream it with minimal security question marks. (sorry @ntjess, I didn't forget about you!!!!)

j6k4m8 · 2025-02-15T18:56:02Z

.github/workflows/python-package.yml

@@ -15,7 +15,7 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
-        python-version: [3.7, 3.8, 3.9, '3.10', '3.11']
+        python-version: [3.9, '3.10', '3.11']


Thoughts on including 3.12 and 3.13 now that we're editing this anyway? I agree that EOL versions can go away!

I'll add those in and see what happens 😄

grandcypher/__init__.py

j6k4m8 · 2025-02-15T18:58:08Z

grandcypher/test_queries.py

+        host.add_edge(2, 3)
+
+        qry = """
+        MATCH (A)


Is this well-mapped to a cypher implementation? i.e., can you check if a vertex (A) is IN a list? I thought it'd have to be something like A.id IN [...] but I may be misremembering?

I can change the test to do that but I don't believe it would change the code.

Looking at the spec: https://s3.amazonaws.com/artifacts.opencypher.org/openCypher9.pdf

I see this:

MATCH (a) WHERE a.name IN ['Peter', 'Tobias'] RETURN a.name, a.age

MATCH (n) WHERE id(n) IN [0, 3, 5] RETURN n

The latter seems to be the same idea as the unit test within the confines of GrandCypher.

yeah maybe smartest for us to make id a no-op so that the syntax is consistent? I've been (very very gently) trying to keep the queries 1:1 with a "real" neo4j/cypher database so you could ostensibly copy-paste between the two, or just change the execution location to run on a graphdb... but in practice I'll defer to what you think is right here if you think there's an obvious win off-spec?

I just added a new function section to the lark spec and implemented id(). It can be built upon to add additional scalar functions as defined in the spec.
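To make the two spec forms quoted above concrete, here is a small sketch of their intended semantics in plain Python. It is not the code added in this PR: the graph is a hypothetical dict of node id to attributes, and `scalar_id`, `where_id_in`, and `where_property_in` are made-up names. It assumes, as discussed in the thread, that a node is already referenced by its id, so `id()` acts as the identity.

```python
# Hypothetical stand-in for a graph: node id -> attribute dict.
NODES = {
    0: {"name": "Peter", "age": 35},
    1: {"name": "Tobias", "age": 25},
    3: {"name": "Andres", "age": 36},
}


def scalar_id(node_id):
    """id(n): nodes are referenced by id in this sketch, so this is the identity."""
    return node_id


def where_id_in(nodes, candidates):
    """Semantics of: MATCH (n) WHERE id(n) IN [...] RETURN n"""
    wanted = set(candidates)
    return [n for n in nodes if scalar_id(n) in wanted]


def where_property_in(nodes, key, candidates):
    """Semantics of: MATCH (a) WHERE a.<key> IN [...] RETURN a"""
    wanted = set(candidates)
    return [n for n, attrs in nodes.items() if attrs.get(key) in wanted]


print(where_id_in(NODES, [0, 3, 5]))
print(where_property_in(NODES, "name", ["Peter", "Tobias"]))
```

Ids in the list that do not exist in the graph (5 in this example) simply match nothing.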

davidmezzetti · 2025-02-17T12:46:08Z

~~For now, I'll probably just go with the OR solution until this is ready.~~

EDIT: I was able to come up with a workaround that gives similar performance. I added an attribute to each node that matches the IN clause and just made that a simple equals check. The IN clause is cleaner but I'm good for now.
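The workaround described above can be sketched as follows. This is only an illustration of the idea, not txtai's code: the node store is a hypothetical dict of node id to attributes, and the attribute name `matched` and both helper names are assumptions.

```python
def tag_matches(nodes, ids, attribute="matched"):
    """Set `attribute` to True on nodes whose id is in `ids`, False elsewhere."""
    wanted = set(ids)
    for node_id, attrs in nodes.items():
        attrs[attribute] = node_id in wanted
    return nodes


def equals_check(nodes, attribute="matched"):
    """The simple equality filter that replaces the IN clause."""
    return [n for n, attrs in nodes.items() if attrs.get(attribute) is True]


nodes = {1: {}, 2: {}, 3: {}}
tag_matches(nodes, [1, 3])
print(equals_check(nodes))
```

The membership test happens once per node while tagging, so the query itself only needs a single equality comparison on that attribute.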

If there is further action on this PR or anything else you'd like to see, please let me know.

j6k4m8

Sorry for the delay. This is great, merging shortly and will ping when it's up on pypi :)

j6k4m8 · 2025-02-17T17:02:15Z

Available in grand-cypher>=0.13.0!

davidmezzetti · 2025-02-17T17:56:06Z

Wow, faster than I expected, thank you!

davidmezzetti added 2 commits February 14, 2025 12:07

Fix parsing priority issue with NULL/TRUE/FALSE. Remove deprecated Python builds.

0489a35

Add support for IN clause

92c3396

davidmezzetti changed the title ~~Fix parsing priority issue with NULL/TRUE/FALSE~~ Fix parsing priority issue with NULL/TRUE/FALSE. Add support for IN Clause. Feb 14, 2025

This was referenced Feb 14, 2025

Failed tests with new lark library #64

Closed

Add support for openCypher IN clause #65

Closed

j6k4m8 reviewed Feb 15, 2025


davidmezzetti added 3 commits February 15, 2025 14:16

Add additional Python versions to build workflow

3bcd1d5

Add id() function

0bafaf2

Rename function in spec to scalar_function

011b005

davidmezzetti mentioned this pull request Feb 17, 2025

Add similar query clause to graph queries neuml/txtai#875

Closed

j6k4m8 approved these changes Feb 17, 2025


j6k4m8 merged commit ef00478 into aplbrain:master Feb 17, 2025
6 checks passed
