Commit 81a5eaf

tzolov authored and markpollack committed
Add Pinecone VectorStore
- Based on the official Pinecone Java library. The latter expects that indices are created externally via Ops.
- Map Document metadata to and from Pinecone's internal Struct. The latter converts the metadata into Pinecone JSON format.
- Add integration tests and README.
1 parent 001ee99 commit 81a5eaf

5 files changed: 810 additions, 0 deletions

pom.xml

Lines changed: 4 additions & 0 deletions
@@ -28,6 +28,8 @@
     <module>document-readers/pdf-reader</module>
     <module>document-readers/tika-reader</module>
     <module>embedding-clients/transformers-embedding</module>
+    <module>vector-stores/spring-ai-pinecone</module>
+
   </modules>
 
   <organization>
@@ -82,6 +84,8 @@
     <pgvector.version>0.1.3</pgvector.version>
     <postgresql.version>42.6.0</postgresql.version>
     <milvus.version>2.3.0</milvus.version>
+    <pinecone.version>0.6.0</pinecone.version>
+    <protobuf-java-util.version>3.24.4</protobuf-java-util.version>
 
     <!-- testing dependencies -->
     <testcontainers.version>1.19.0</testcontainers.version>
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
# Pinecone VectorStore

This README walks you through setting up the Pinecone `VectorStore` to store document embeddings and perform similarity searches.

## What is Pinecone?

[Pinecone](https://www.pinecone.io/) is a popular cloud-based vector database that lets you store and search vectors efficiently.

## Prerequisites

1. Pinecone Account: Before you start, sign up for a [Pinecone account](https://app.pinecone.io/).
2. Pinecone Project: Once registered, create a new project and an index, and generate an API key. You'll need these details for configuration.
3. OpenAI Account: Create an account at [OpenAI Signup](https://platform.openai.com/signup) and generate an API key at [API Keys](https://platform.openai.com/account/api-keys).

## Configuration

To set up `PineconeVectorStore`, gather the following details from your Pinecone account:

* Pinecone API Key
* Pinecone Environment
* Pinecone Project ID
* Pinecone Index Name
* Pinecone Namespace

> **Note**
> This information is available to you in the Pinecone UI portal.

When creating the index, select a vector dimension of 1536. This matches the dimensionality of OpenAI's `text-embedding-ada-002` model, which this guide uses.

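
Because the index dimension is fixed when the index is created, it can help to check embedding sizes before sending vectors to Pinecone. The following is a minimal sketch in plain Java, not part of this module: the `EmbeddingDimensions` class and its `requireDimension` method are hypothetical names introduced only for illustration.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative helper (hypothetical, not part of Spring AI): checks embedding sizes. */
final class EmbeddingDimensions {

    // Output size of OpenAI's text-embedding-ada-002 model.
    static final int ADA_002_DIMENSION = 1536;

    private EmbeddingDimensions() {
    }

    /** Throws if the vector length differs from the dimension the index was created with. */
    static void requireDimension(List<Double> embedding, int indexDimension) {
        if (embedding.size() != indexDimension) {
            throw new IllegalArgumentException("Embedding has " + embedding.size()
                    + " dimensions, but the index expects " + indexDimension);
        }
    }

    public static void main(String[] args) {
        // A placeholder vector of the right size; a real one would come from the embedding client.
        List<Double> embedding = Collections.nCopies(ADA_002_DIMENSION, 0.0);
        requireDimension(embedding, ADA_002_DIMENSION); // passes silently
        System.out.println("Embedding size matches the index dimension");
    }
}
```

A check like this fails fast on the client side instead of waiting for the database to reject a mismatched vector.
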
Additionally, you'll need to provide your OpenAI API Key. Set it as an environment variable like so:

```bash
export SPRING_AI_OPENAI_API_KEY='Your_OpenAI_API_Key'
```

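
If you read such variables yourself (for example, a Pinecone API key), a small fail-fast lookup avoids passing `null` into your configuration. This is only a sketch: `RequiredSettings` and `require` are hypothetical names, and the Spring Boot starter may resolve `SPRING_AI_OPENAI_API_KEY` for you without any code like this.

```java
import java.util.Map;

/** Illustrative helper (hypothetical): looks up a required setting and fails fast if it is missing. */
final class RequiredSettings {

    private RequiredSettings() {
    }

    /** Returns the value for the given name, or throws if it is absent or blank. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            String key = require(System.getenv(), "SPRING_AI_OPENAI_API_KEY");
            // Never print the key itself; its length is enough to confirm it was found.
            System.out.println("OpenAI API key found (" + key.length() + " characters)");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a `Map` keeps the lookup easy to test without touching the real process environment.
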
## Dependencies

Add these dependencies to your project:

1. OpenAI: Required for calculating embeddings.

```xml
<dependency>
    <groupId>org.springframework.experimental.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>0.7.0-SNAPSHOT</version>
</dependency>
```

2. Pinecone

```xml
<dependency>
    <groupId>org.springframework.experimental.ai</groupId>
    <artifactId>spring-ai-pinecone</artifactId>
    <version>0.7.0-SNAPSHOT</version>
</dependency>
```

## Sample Code

To configure Pinecone in your application, you can use the following setup, replacing the sample environment, project ID, and index name with your own values:

```java
@Bean
public PineconeVectorStoreConfig pineconeVectorStoreConfig() {

    return PineconeVectorStoreConfig.builder()
        // Assumes the API key is exported as the PINECONE_API_KEY environment variable.
        .withApiKey(System.getenv("PINECONE_API_KEY"))
        .withEnvironment("gcp-starter")
        .withProjectId("89309e6")
        .withIndexName("spring-ai-test-index")
        .withNamespace("") // Leave empty on the free tier, which does not support namespaces.
        .build();
}
```

Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project.
This provides an `EmbeddingClient` implementation that the vector store uses:

```java
@Bean
public VectorStore vectorStore(PineconeVectorStoreConfig config, EmbeddingClient embeddingClient) {
    return new PineconeVectorStore(config, embeddingClient);
}
```

In your main code, create some documents:

```java
List<Document> documents = List.of(
    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!",
        Collections.singletonMap("meta1", "meta1")),
    new Document("Hello World Hello World Hello World Hello World Hello World Hello World Hello World"),
    new Document(
        "Great Depression Great Depression Great Depression Great Depression Great Depression Great Depression",
        Collections.singletonMap("meta2", "meta2")));
```

Add the documents to your vector store:

```java
vectorStore.add(documents);
```

And finally, retrieve documents similar to a query:

```java
List<Document> results = vectorStore.similaritySearch("Spring", 5);
```

If all goes well, you should retrieve the document containing the text "Spring AI rocks!!".
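
To build intuition for what a similarity search does, here is a toy, in-memory sketch in plain Java that ranks documents by cosine similarity. It is not how `PineconeVectorStore` is implemented: the class `ToySimilaritySearch`, the three-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration (the metric Pinecone applies depends on how the index was created).

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Toy in-memory ranking by cosine similarity (hypothetical, for illustration only). */
final class ToySimilaritySearch {

    private ToySimilaritySearch() {
    }

    /** Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the keys of the k entries most similar to the query, best match first. */
    static List<String> topK(Map<String, double[]> index, double[] query, int k) {
        return index.entrySet().stream()
            .sorted(Comparator
                .comparingDouble((Map.Entry<String, double[]> e) -> cosine(e.getValue(), query))
                .reversed())
            .limit(k)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up 3-dimensional "embeddings"; real ones have 1536 dimensions.
        Map<String, double[]> index = Map.of(
            "Spring AI rocks!!", new double[] {0.9, 0.1, 0.0},
            "Hello World", new double[] {0.1, 0.9, 0.0},
            "Great Depression", new double[] {0.0, 0.2, 0.9});
        double[] query = {1.0, 0.0, 0.0};
        System.out.println(topK(index, query, 1)); // prints [Spring AI rocks!!]
    }
}
```

A vector database performs the same kind of ranking, but over an index designed to avoid comparing the query against every stored vector.
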
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.experimental.ai</groupId>
        <artifactId>spring-ai</artifactId>
        <version>0.7.0-SNAPSHOT</version>
        <relativePath>../../pom.xml</relativePath>
    </parent>
    <artifactId>spring-ai-pinecone</artifactId>
    <packaging>jar</packaging>
    <name>spring-ai-pinecone</name>
    <description>spring-ai-pinecone</description>
    <url>https://github.com/spring-projects-experimental/spring-ai</url>

    <scm>
        <url>https://github.com/spring-projects-experimental/spring-ai</url>
        <connection>git://github.com/spring-projects-experimental/spring-ai.git</connection>
        <developerConnection>git@github.com:spring-projects-experimental/spring-ai.git</developerConnection>
    </scm>

    <properties>
        <maven.compiler.target>17</maven.compiler.target>
        <maven.compiler.source>17</maven.compiler.source>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.experimental.ai</groupId>
            <artifactId>spring-ai-core</artifactId>
            <version>${project.parent.version}</version>
        </dependency>

        <dependency>
            <groupId>io.pinecone</groupId>
            <artifactId>pinecone-client</artifactId>
            <version>${pinecone.version}</version>
        </dependency>

        <dependency>
            <groupId>com.google.protobuf</groupId>
            <artifactId>protobuf-java-util</artifactId>
            <version>${protobuf-java-util.version}</version>
        </dependency>

        <!-- TESTING -->
        <dependency>
            <groupId>org.springframework.experimental.ai</groupId>
            <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
            <version>${parent.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.awaitility</groupId>
            <artifactId>awaitility</artifactId>
            <version>3.0.0</version>
            <scope>test</scope>
        </dependency>

    </dependencies>

</project>
