A project to show good CLI practices with a fully fledged RAG system.
pip install rag-cli
- CLI tooling for RAG
- Embedder (Ollama)
- Vector store (Qdrant)
If you don't have a running instance of Qdrant or Ollama, you can use the provided docker-compose file to start one.
docker-compose up --build -d
This will start Ollama on http://localhost:11434 and Qdrant on http://localhost:6333.
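To confirm that both services came up, a quick reachability check can help. The following is a minimal sketch, assuming curl is installed. The helper name service_up is hypothetical, and the endpoints /api/tags (Ollama) and /healthz (Qdrant) come from those services' public HTTP APIs rather than from this project.

```shell
# service_up is a hypothetical helper, not part of rag-cli.
# It returns 0 if the URL answers with a successful HTTP status within 2 seconds.
service_up() {
  curl --silent --fail --max-time 2 --output /dev/null "$1"
}

# Example usage (assumes the default ports listed above):
#   service_up http://localhost:11434/api/tags && echo "Ollama is reachable"
#   service_up http://localhost:6333/healthz   && echo "Qdrant is reachable"
```

A non-zero exit status means the service did not respond or returned an HTTP error, which usually indicates that the container is still starting.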
This project uses a dev container, which is the easiest way to set up a consistent development environment. Dev containers provide all the necessary tools, dependencies, and configuration, so you can focus on coding right away. To get started:
- Open the project in Visual Studio Code.
- On Windows/Linux, press Ctrl+Shift+P and run the command Remote-Containers: Reopen in Container. On Mac, press Cmd+Shift+P and run the same command.
- VS Code will build and start the dev container, providing access to the project's codebase and dependencies.
Other editors may have similar functionality but this project is optimised for Visual Studio Code.
Before running this command, make sure you have a running instance of Ollama and the nomic-embed-text:v1.5 model is available:
ollama pull nomic-embed-text:v1.5
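To check whether the model is already present before pulling it again, the output of ollama list can be inspected. This is a sketch only: has_model is a hypothetical helper, and it assumes that ollama list prints a header row followed by one model per line with the model name in the first column.

```shell
# has_model is a hypothetical helper, not part of rag-cli or Ollama.
# It reads `ollama list` output on stdin and succeeds if the first column
# of any non-header row equals the given model name.
has_model() {
  awk -v model="$1" 'NR > 1 && $1 == model { found = 1 } END { exit !found }'
}

# Example usage:
#   ollama list | has_model nomic-embed-text:v1.5 || ollama pull nomic-embed-text:v1.5
```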
rag-cli embed --ollama-url http://localhost:11434 --file <INPUT_FILE>
You can alternatively use stdin to pass the text:
cat <INPUT_FILE> | rag-cli embed --ollama-url http://localhost:11434
rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name <COLLECTION_NAME> \
--data '{<JSON_DATA>}' \
--embedding <EMBEDDING_FILE>
You can alternatively use stdin to pass embeddings:
cat <EMBEDDING_FILE> | \
rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name <COLLECTION_NAME> \
--data '{<JSON_DATA>}'
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5 \
--file <INPUT_FILE>
You can alternatively use stdin to pass the text:
cat <INPUT_FILE> | \
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5
Here is an example of an end-to-end pipeline for storing embeddings. It takes the following steps:
- Get a random Wikipedia article
- Embed the article
- Store the embedding in Qdrant
Before running the pipeline make sure you have the following installed:
sudo apt-get update && sudo apt-get install parallel jq curl
Also make sure that the data/articles and data/embeddings directories exist:
mkdir -p data/articles data/embeddings
Then run the pipeline:
bash scripts/run_pipeline.sh
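The contents of scripts/run_pipeline.sh are not reproduced here. As a rough sketch of what a single run of the three steps could look like, the function below chains only commands that appear elsewhere in this README. The names run_once, embedding_path, OLLAMA_URL, QDRANT_URL and COLLECTION are illustrative assumptions, and the actual script may differ.

```shell
# Hypothetical sketch of a single pipeline run; the real scripts/run_pipeline.sh may differ.
# OLLAMA_URL, QDRANT_URL and COLLECTION are illustrative variable names.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"
COLLECTION="${COLLECTION:-nomic-embed-text-v1.5}"

# Map data/articles/<id>.txt to data/embeddings/<id>.
embedding_path() {
  printf 'data/embeddings/%s\n' "$(basename "$1" .txt)"
}

run_once() {
  local article embedding
  # /proc/sys/kernel/random/uuid is Linux-specific.
  article="data/articles/$(cat /proc/sys/kernel/random/uuid).txt"
  embedding="$(embedding_path "$article")"

  # 1. Get a random Wikipedia article summary.
  curl -L -s "https://en.wikipedia.org/api/rest_v1/page/random/summary" |
    jq -r ".title, .description, .extract" > "$article" || return 1

  # 2. Embed the article and keep only the vector.
  rag-cli embed --ollama-url "$OLLAMA_URL" --file "$article" |
    jq ".embedding" > "$embedding" || return 1

  # 3. Store the embedding in Qdrant.
  rag-cli vector-store \
    --qdrant-url "$QDRANT_URL" \
    --collection-name "$COLLECTION" \
    --embedding "$embedding"
}

# Example usage: run_once
```

Keeping the article and its embedding under the same identifier makes it easy to trace a stored vector back to the text it came from.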
The script scripts/run_pipeline.sh can be run in parallel with GNU Parallel to speed up the process:
parallel -j 5 -n0 bash scripts/run_pipeline.sh ::: {0..10}
parallel -n0 -j 10 '
curl -L -s "https://en.wikipedia.org/api/rest_v1/page/random/summary" | \
jq -r ".title, .description, .extract" | \
tee data/articles/$(cat /proc/sys/kernel/random/uuid).txt
' ::: {0..10}
parallel '
rag-cli embed --ollama-url http://localhost:11434 --file {1} 2>> output.log | \
jq ".embedding" | \
tee data/embeddings/$(basename {1} .txt) 1> /dev/null
' ::: $(find data/articles/*.txt)
parallel rag-cli vector-store --qdrant-url http://localhost:6333 --collection-name nomic-embed-text-v1.5 --embedding {} 2>> output.log ::: $(find data/embeddings/*)
echo "Who invented the blue LED?" | \
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5 \
2>> output.log
This example requires that articles relevant to the query have already been embedded and stored in Qdrant. You can do this with the example in the next section.
wikipedia_data=$(curl -L -s "https://en.wikipedia.org/api/rest_v1/page/summary/Shuji_Nakamura") && \
payload_data=$(jq "{title: .title, description: .description, extract: .extract}" <(echo "$wikipedia_data")) && \
text_to_embed=$(jq -r ".title, .description, .extract" <(echo "$wikipedia_data")) && \
echo "$text_to_embed" | \
rag-cli embed --ollama-url http://localhost:11434 | \
jq -r ".embedding" | \
rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--data "$payload_data"