feat: implement Approximate Nearest Neighbor support for DDL (CREATE TABLE, CREATE VECTOR INDEX) (#124)

* fix(testing+linting): add nox lint+format directives

This change introduces new nox directives:
* blacken: `nox -s blacken`
* format: `nox -s format` to apply formatting to files
* lint: `nox -s lint` to flag linting issues
* unit: to run unit tests locally

which are the basis to enable scalable development
and continuous testing as I prepare to bring in
Approximate Nearest Neighbors (ANN) functionality into
this package.

Also while here, fixed a typo in the README.rst file
that didn't have the correct import path.

* feat: add Approximate Nearest Neighbor support to distance strategies

This change adds ANN distance strategies for GoogleSQL semantics.
While here, started unit tests that exercise components
without requiring a running Cloud Spanner instance.

Updates #94
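
A minimal sketch of what mapping a distance strategy to a GoogleSQL function name could look like. The `DistanceStrategy` enum and the `distance_function` helper below are hypothetical stand-ins written for illustration, not the package's actual classes; the function names returned are the Spanner GoogleSQL exact and approximate distance functions.

```python
from enum import Enum


class DistanceStrategy(Enum):
    """Hypothetical stand-in for the package's distance strategy enum."""

    COSINE = "cosine"
    EUCLIDEAN = "euclidean"
    DOT_PRODUCT = "dot_product"


# Exact (K-nearest-neighbor) GoogleSQL distance functions.
_EXACT_FUNCTIONS = {
    DistanceStrategy.COSINE: "COSINE_DISTANCE",
    DistanceStrategy.EUCLIDEAN: "EUCLIDEAN_DISTANCE",
    DistanceStrategy.DOT_PRODUCT: "DOT_PRODUCT",
}

# Approximate (ANN) counterparts, available in the GoogleSQL dialect only.
_APPROX_FUNCTIONS = {
    DistanceStrategy.COSINE: "APPROX_COSINE_DISTANCE",
    DistanceStrategy.EUCLIDEAN: "APPROX_EUCLIDEAN_DISTANCE",
    DistanceStrategy.DOT_PRODUCT: "APPROX_DOT_PRODUCT",
}


def distance_function(strategy: DistanceStrategy, approximate: bool = False) -> str:
    """Return the SQL function name for a strategy (illustrative helper)."""
    table = _APPROX_FUNCTIONS if approximate else _EXACT_FUNCTIONS
    return table[strategy]
```

Keeping the exact and approximate names in separate lookup tables means the ANN path only has to flip one flag, which is one way to keep such a change small for review.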

* Ensure vector fits within limits in sample

* Update ANN query names + test expectations

* Pass in strategy inferred from initialization

* Hook up get_documents_from_query_results

* Link up __search_by_ANN to similarity_search_by_vector

* Incorporate pre_filter and post_filter plus update tests

* Address review feedback

* Simplified checking if using ANN

* Reduce the amount of changes

* More reductions

* More reductions to ease code review

* Fit with get_rows_by_similarity_search_ann

* Updates from nox

* Fix PostgreSQL
odeke-em authored Feb 4, 2025
1 parent b10dc28 commit 5a25f91
Showing 9 changed files with 767 additions and 55 deletions.
6 changes: 6 additions & 0 deletions README.rst
@@ -253,3 +253,9 @@ Disclaimer

This is not an officially supported Google product.


Limitations
-----------

* Approximate Nearest Neighbors (ANN) strategies are only supported for the GoogleSQL dialect
* ANN's `ALTER VECTOR INDEX` is not yet supported by [Google Cloud Spanner](https://cloud.google.com/spanner/docs/find-approximate-nearest-neighbors#limitations)
2 changes: 1 addition & 1 deletion noxfile.py
@@ -19,7 +19,6 @@
import os
import pathlib
import shutil
from pathlib import Path

import nox

@@ -33,6 +32,7 @@
"docfx",
"docs",
"format",
"integration",
"lint",
"unit",
]
4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,4 +1,4 @@
-google-cloud-spanner==3.49.1
-langchain-core==0.3.9
+google-cloud-spanner==3.51.0
+langchain-core==0.3.15
langchain-community==0.3.1
pydantic==2.9.1
1 change: 1 addition & 0 deletions src/langchain_google_spanner/graph_qa.py
@@ -288,6 +288,7 @@ def _call(
inputs: Dict[str, Any],
run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, str]:

intermediate_steps: List = []

"""Generate gql statement, uses it to look up in db and answer question."""
2 changes: 1 addition & 1 deletion src/langchain_google_spanner/graph_retriever.py
@@ -225,7 +225,7 @@ def __clean_element(self, element: dict[str, Any], embedding_column: str) -> Non
del element["properties"][embedding_column]

def __get_distance_function(
-self, distance_strategy=DistanceStrategy.EUCLIDEIAN
+self, distance_strategy=DistanceStrategy.EUCLIDEAN
) -> str:
"""Gets the vector distance function."""
if distance_strategy == DistanceStrategy.COSINE: