- Based on the official Pinecone Java library, which expects indices to be created externally via Ops.
- Map Document metadata to and from Pinecone's internal Struct, which Pinecone converts into its JSON format.
- Add integration tests and README.
1 parent 001ee99, commit 81a5eaf. Showing 5 changed files with 810 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
# Pinecone VectorStore | ||
|
||
This README walks you through setting up the Pinecone VectorStore to store document embeddings and perform similarity searches. | ||
|
||
## What is Pinecone? | ||
|
||
[Pinecone](https://www.pinecone.io/) is a popular cloud-based vector database, which allows you to store and search vectors efficiently. | ||
|
||
## Prerequisites | ||
|
||
1. Pinecone Account: Before you start, ensure you sign up for a [Pinecone account](https://app.pinecone.io/). | ||
2. Pinecone Project: Once registered, create a new project, an index, and generate an API key. You'll need these details for configuration. | ||
3. OpenAI Account: Create an account at [OpenAI Signup](https://platform.openai.com/signup) and generate an API key at [API Keys](https://platform.openai.com/account/api-keys). | ||
|
||
## Configuration | ||
|
||
To set up PineconeVectorStore, gather the following details from your Pinecone account: | ||
|
||
* Pinecone API Key | ||
* Pinecone Environment | ||
* Pinecone Project ID | ||
* Pinecone Index Name | ||
* Pinecone Namespace | ||
|
||
> **Note** | ||
> This information is available to you in the Pinecone UI portal. | ||
|
||
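The five settings listed above can be gathered in one place before the store is configured. The following is a minimal sketch, not part of this commit: the `PineconeSettings` class and the `PINECONE_*` variable names are illustrative assumptions, and the namespace is treated as optional.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects Pinecone settings from a map of environment variables. */
final class PineconeSettings {

    // Variable names are illustrative assumptions, not names defined by Spring AI or Pinecone.
    static final String[] REQUIRED = {
        "PINECONE_API_KEY", "PINECONE_ENVIRONMENT", "PINECONE_PROJECT_ID", "PINECONE_INDEX_NAME"
    };

    /** Returns the settings, failing fast with the names of any missing required values. */
    static Map<String, String> load(Map<String, String> env) {
        Map<String, String> settings = new LinkedHashMap<>();
        StringBuilder missing = new StringBuilder();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.append(missing.length() == 0 ? "" : ", ").append(name);
            } else {
                settings.put(name, value.trim());
            }
        }
        if (missing.length() > 0) {
            throw new IllegalStateException("Missing Pinecone settings: " + missing);
        }
        // The namespace is optional; default to the empty string when it is not set.
        settings.put("PINECONE_NAMESPACE", env.getOrDefault("PINECONE_NAMESPACE", ""));
        return settings;
    }

    public static void main(String[] args) {
        // Sample values only; a real application would pass System.getenv().
        Map<String, String> sample = new LinkedHashMap<>();
        sample.put("PINECONE_API_KEY", "example-key");
        sample.put("PINECONE_ENVIRONMENT", "gcp-starter");
        sample.put("PINECONE_PROJECT_ID", "89309e6");
        sample.put("PINECONE_INDEX_NAME", "spring-ai-test-index");
        System.out.println(load(sample).keySet());
    }
}
```

Passing the environment as a map rather than calling `System.getenv()` inside `load` keeps the check easy to exercise in tests.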
When creating the Pinecone index, select a vector dimension of 1536. This matches the dimensionality of OpenAI's `text-embedding-ada-002` model, which this guide uses. | ||
|
||
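An index only accepts vectors of the dimension it was created with, and `text-embedding-ada-002` produces 1536-dimensional vectors. A mismatch can therefore be caught before anything is sent to Pinecone. The guard below is a rough sketch with a hypothetical class name; it is not something this module provides.

```java
/** Hypothetical guard: rejects embeddings whose length differs from the index dimension. */
final class EmbeddingDimensionCheck {

    // 1536 is the output size of OpenAI's text-embedding-ada-002 model.
    static final int EXPECTED_DIMENSION = 1536;

    /** Throws if the embedding does not have the expected number of components. */
    static void requireDimension(float[] embedding) {
        if (embedding == null || embedding.length != EXPECTED_DIMENSION) {
            int actual = embedding == null ? 0 : embedding.length;
            throw new IllegalArgumentException(
                "Expected " + EXPECTED_DIMENSION + " dimensions but got " + actual);
        }
    }

    public static void main(String[] args) {
        requireDimension(new float[EXPECTED_DIMENSION]);
        System.out.println("dimension ok");
    }
}
```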
Additionally, you'll need to provide your OpenAI API Key. Set it as an environment variable like so: | ||
|
||
```bash | ||
export SPRING_AI_OPENAI_API_KEY='Your_OpenAI_API_Key' | ||
``` | ||
|
||
## Dependencies | ||
|
||
Add these dependencies to your project: | ||
|
||
1. OpenAI: Required for calculating embeddings. | ||
|
||
```xml | ||
<dependency> | ||
<groupId>org.springframework.experimental.ai</groupId> | ||
<artifactId>spring-ai-openai-spring-boot-starter</artifactId> | ||
<version>0.7.0-SNAPSHOT</version> | ||
</dependency> | ||
``` | ||
|
||
2. Pinecone | ||
|
||
```xml | ||
<dependency> | ||
<groupId>org.springframework.experimental.ai</groupId> | ||
<artifactId>spring-ai-pinecone</artifactId> | ||
<version>0.7.0-SNAPSHOT</version> | ||
</dependency> | ||
``` | ||
|
||
## Sample Code | ||
|
||
To configure Pinecone in your application, you can use the following setup: | ||
|
||
```java | ||
@Bean | ||
public PineconeVectorStoreConfig pineconeVectorStoreConfig() { | ||
|
||
    return PineconeVectorStoreConfig.builder() | ||
        // Assumes the API key is exported as the PINECONE_API_KEY environment variable. | ||
        .withApiKey(System.getenv("PINECONE_API_KEY")) | ||
        .withEnvironment("gcp-starter") | ||
        .withProjectId("89309e6") | ||
        .withIndexName("spring-ai-test-index") | ||
        .withNamespace("") // Leave empty on the free tier, which doesn't support namespaces. | ||
        .build(); | ||
} | ||
``` | ||
|
||
Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project. | ||
It provides an implementation of the `EmbeddingClient` that the vector store uses: | ||
|
||
```java | ||
@Bean | ||
public VectorStore vectorStore(PineconeVectorStoreConfig config, EmbeddingClient embeddingClient) { | ||
return new PineconeVectorStore(config, embeddingClient); | ||
} | ||
``` | ||
|
||
In your main code, create some documents: | ||
|
||
```java | ||
List<Document> documents = List.of( | ||
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", | ||
Collections.singletonMap("meta1", "meta1")), | ||
new Document("Hello World Hello World Hello World Hello World Hello World Hello World Hello World"), | ||
new Document( | ||
"Great Depression Great Depression Great Depression Great Depression Great Depression Great Depression", | ||
Collections.singletonMap("meta2", "meta2"))); | ||
``` | ||
|
||
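The documents above carry simple string metadata. Pinecone metadata values are limited to strings, numbers, booleans, and lists of strings, so richer metadata can be checked before it is stored. This is a small sketch under that assumption; `MetadataCheck` is a hypothetical helper, not how this module maps metadata.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper: checks that metadata only uses value types Pinecone can store. */
final class MetadataCheck {

    /** Accepts strings, numbers, booleans, and lists whose items are all strings. */
    static boolean isSupportedValue(Object value) {
        if (value instanceof String || value instanceof Number || value instanceof Boolean) {
            return true;
        }
        if (value instanceof List) {
            for (Object item : (List<?>) value) {
                if (!(item instanceof String)) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    /** Returns true when every metadata value has a supported type. */
    static boolean isSupported(Map<String, ?> metadata) {
        for (Object value : metadata.values()) {
            if (!isSupportedValue(value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSupported(Map.of("meta1", "meta1")));
    }
}
```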
Add the documents to your vector store: | ||
|
||
```java | ||
vectorStore.add(documents); | ||
``` | ||
|
||
And finally, retrieve documents similar to a query: | ||
|
||
```java | ||
List<Document> results = vectorStore.similaritySearch("Spring", 5); | ||
``` | ||
|
||
If all goes well, you should retrieve the document containing the text "Spring AI rocks!!". |
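A similarity search embeds the query and returns the stored vectors closest to it. The metric depends on how the index was created; the sketch below assumes cosine similarity and ranks plain arrays in memory, purely to illustrate the idea. It does not reflect how Pinecone implements the search, and `CosineSimilarity` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of ranking vectors by cosine similarity to a query. */
final class CosineSimilarity {

    /** Cosine of the angle between a and b; returns 0.0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indices of the k vectors most similar to the query, best match first. */
    static List<Integer> topK(double[] query, List<double[]> vectors, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            indices.add(i);
        }
        // Sort by descending similarity by negating the score.
        indices.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, vectors.get(i))));
        return new ArrayList<>(indices.subList(0, Math.min(k, indices.size())));
    }

    public static void main(String[] args) {
        List<double[]> vectors = List.of(
            new double[] {0, 1}, new double[] {1, 0}, new double[] {1, 1});
        System.out.println(topK(new double[] {1, 0}, vectors, 2));
    }
}
```

The second argument of `topK` plays the same role as the `5` passed to `similaritySearch` above: it caps how many matches are returned.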
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>org.springframework.experimental.ai</groupId> | ||
<artifactId>spring-ai</artifactId> | ||
<version>0.7.0-SNAPSHOT</version> | ||
<relativePath>../../pom.xml</relativePath> | ||
</parent> | ||
<artifactId>spring-ai-pinecone</artifactId> | ||
<packaging>jar</packaging> | ||
<name>spring-ai-pinecone</name> | ||
<description>spring-ai-pinecone</description> | ||
<url>https://github.com/spring-projects-experimental/spring-ai</url> | ||
|
||
<scm> | ||
<url>https://github.com/spring-projects-experimental/spring-ai</url> | ||
<connection>git://github.com/spring-projects-experimental/spring-ai.git</connection> | ||
<developerConnection>git@github.com:spring-projects-experimental/spring-ai.git</developerConnection> | ||
</scm> | ||
|
||
<properties> | ||
<maven.compiler.target>17</maven.compiler.target> | ||
<maven.compiler.source>17</maven.compiler.source> | ||
</properties> | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>org.springframework.experimental.ai</groupId> | ||
<artifactId>spring-ai-core</artifactId> | ||
<version>${project.parent.version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>io.pinecone</groupId> | ||
<artifactId>pinecone-client</artifactId> | ||
<version>${pinecone.version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>com.google.protobuf</groupId> | ||
<artifactId>protobuf-java-util</artifactId> | ||
<version>${protobuf-java-util.version}</version> | ||
</dependency> | ||
|
||
<!-- TESTING --> | ||
<dependency> | ||
<groupId>org.springframework.experimental.ai</groupId> | ||
<artifactId>spring-ai-openai-spring-boot-starter</artifactId> | ||
<version>${project.parent.version}</version> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.boot</groupId> | ||
<artifactId>spring-boot-starter-test</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.awaitility</groupId> | ||
<artifactId>awaitility</artifactId> | ||
<version>3.0.0</version> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
</dependencies> | ||
|
||
</project> |