A quick-start, locally run tool to test and use as a basis for various document-related use cases:
- RAG Query: Prompt an LLM that uses relevant context to answer your queries.
- Semantic Retrieval: Retrieve relevant passages from documents, showing sources and relevance.
- RAG Chat: Interact with an LLM that utilizes document retrieval and chat history.
- LLM Chat: Chat and test a local LLM, without document context.
The interface is divided into tabs so users can select and try the feature for their use case. The implementation focuses on simplicity, low-level components, and modularity, in order to expose the working principles and core elements, so that developers and Python enthusiasts can modify and build upon it.
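As an illustration of those working principles, here is a minimal sketch of the core RAG query loop, assuming the chromadb and ollama Python packages; the collection name, example passages, and prompt wording are illustrative and not taken from the tool's code:

```python
import chromadb
import ollama

# Illustrative in-memory vector store; the actual tool may organize storage differently.
client = chromadb.Client()
collection = client.create_collection(name="docs")

# Index a couple of example passages (normally these would come from the data/ folder).
collection.add(
    ids=["p1", "p2"],
    documents=[
        "Ollama runs large language models locally.",
        "The web interface is served on port 7860.",
    ],
)

def rag_query(question: str, n_results: int = 2) -> str:
    # 1. Retrieve the most relevant passages for the question.
    hits = collection.query(query_texts=[question], n_results=n_results)
    context = "\n".join(hits["documents"][0])
    # 2. Ask the LLM to answer using the retrieved context.
    response = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user",
                   "content": f"Context:\n{context}\n\nQuestion: {question}"}],
    )
    return response["message"]["content"]

print(rag_query("Which port does the web interface use?"))
```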
RAG systems rely on sentence embeddings and vector databases. More information on embeddings can be found in our MOOC Understanding Embeddings for Natural Language Processing.
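As a quick, hands-on illustration of sentence embeddings, the sketch below uses the sentence-transformers package and the all-MiniLM-L6-v2 model (assumptions made for this example, not necessarily what the tool uses internally):

```python
from sentence_transformers import SentenceTransformer, util

# A small, widely used embedding model; the tool may rely on a different one.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I install Ollama?",
    "Ollama can be installed with a shell script.",
    "The weather is nice today.",
]
embeddings = model.encode(sentences)

# Cosine similarity: semantically related sentences score noticeably higher.
print(util.cos_sim(embeddings[0], embeddings[1]))  # related pair
print(util.cos_sim(embeddings[0], embeddings[2]))  # unrelated pair
```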
If you are using Windows, you need to install the Windows Subsystem for Linux (WSL) before proceeding with the installation steps below. WSL allows you to run a Linux environment directly on Windows, which is required for running the installation scripts and Ollama. Note that for running Ollama, you need to install WSL2.
- Open PowerShell as Administrator (right-click the Start button and select "Windows PowerShell (Admin)").
- Run the following command:
wsl --install
This will install WSL2 (if your computer supports WSL2) and the default Ubuntu distribution. If prompted, restart your computer.
- After restart, open Ubuntu from the Start menu and follow the prompts to set up your Linux username and password.
- Update your Linux packages:
sudo apt update && sudo apt upgrade
Once WSL is installed and set up, you can continue with the installation steps below from your WSL terminal.
- Download or clone the repository.
- In bash, run the following installation script:
(.myvenv)$ bin/install.sh
The script might not work on macOS; in that case, follow the manual installation instructions below.
- Create and activate a virtual environment (optional).
$ uv venv .myvenv
$ source .myvenv/bin/activate
- Install dependencies.
(.myvenv)$ uv sync --active
- Install Ollama to run Large Language Models (LLMs) locally. (Or follow the installation instructions for your operating system: Install Ollama).
(.myvenv)$ curl -fsSL https://ollama.ai/install.sh | sh
- Choose and download an LLM model [*]. For example:
(.myvenv)$ ollama pull llama3.2
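Once a model has been pulled, a quick sanity check from Python can confirm that it responds. This sketch assumes the ollama Python package is installed in your environment; it is not part of the tool itself:

```python
import ollama

# One-off prompt to confirm the pulled model is available and answering.
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Reply with one word if you can read this."}],
)
print(response["message"]["content"])
```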
- Place your documents in the intended data folder (default: data/).
- Activate your virtual environment.
$ source .myvenv/bin/activate
- Start the tool. [†]
(.myvenv)$ python3 app.py
- Open http://localhost:7860 in your web browser.
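Port 7860 is Gradio's default, so the interface is presumably a Gradio app with one tab per use case; the following is only a hypothetical sketch of such a layout, not the actual contents of app.py:

```python
import gradio as gr

def answer(query):
    # Placeholder callback; the real app would run retrieval and the LLM here.
    return f"You asked: {query}"

with gr.Blocks() as demo:
    with gr.Tab("RAG Query"):
        question = gr.Textbox(label="Query")
        reply = gr.Textbox(label="Answer")
        gr.Button("Submit").click(answer, inputs=question, outputs=reply)
    with gr.Tab("LLM Chat"):
        gr.Markdown("Chat components would go here.")

demo.launch(server_port=7860)  # same port as in the steps above
```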
- Ensure you have Docker and Docker Compose installed.
- Install the latest NVIDIA drivers for your GPU on your host system, and install the NVIDIA Container Toolkit. (Only necessary for utilizing a GPU.)
- Build and start the app and Ollama:
./run.sh
- Wait for both services to start, which will take several minutes on the first run.
- Open http://localhost:7860 or http://127.0.0.1:7860 in your browser.
- Relevance threshold: Sets the minimum similarity threshold for retrieved passages. Lower values result in more selective retrieval.
- Top n results: Specifies the maximum number of relevant passages to retrieve.
- Top k: Ranks the output tokens in descending order of probability, keeps the k most probable tokens to form a new distribution, and samples the output from it. Higher values result in more diverse answers, and lower values produce more conservative answers.
- Temperature (Temp): Affects the 'randomness' of the answers by scaling the probability distribution of the output elements. Increasing the temperature makes the model answer more creatively.
- Top p: Works together with Top k, but instead of selecting a fixed number of tokens, it selects enough tokens to cover the given cumulative probability. A higher value will produce more varied text, and a lower value will lead to more focused and conservative answers. These three sampling parameters are illustrated in the sketch after this list.
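To make these sampling parameters concrete, the toy sketch below shows how temperature, Top k and Top p successively reshape a next-token distribution before sampling. It is purely illustrative; the actual sampling is handled inside Ollama:

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=3, top_p=0.9):
    # Temperature scaling: lower values sharpen the distribution, higher values flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    # Softmax over the scaled logits.
    z = sum(math.exp(v) for v in scaled.values())
    probs = {tok: math.exp(v) / z for tok, v in scaled.items()}
    # Top k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Top p: of those, keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize what is left and sample the next token from it.
    total = sum(p for _, p in kept)
    return random.choices([tok for tok, _ in kept],
                          weights=[p / total for _, p in kept])[0]

tokens_logits = {"cat": 2.0, "dog": 1.5, "car": 0.5, "idea": -1.0}
print(sample_next_token(tokens_logits))
```

With the example values, the improbable token 'idea' is dropped by the Top k cut and the remaining tokens are renormalized before sampling; raising the temperature flattens the distribution, so more tokens tend to survive the Top p cut.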
Check out the Frequently Asked Questions (FAQ) and please let us know if you encounter any problems.
[*] Performance consideration: On notebooks/PCs with dedicated GPUs, models such as llama3.1, mistral or gemma2 should run smoothly and rapidly. On a standard notebook, or if you encounter any memory or performance issues, prioritize smaller models such as llama3.2 or qwen2.5:3b.
Before committing, format the code using Black:
$ black -t py311 -S -l 99 .
Linters:
- Pylance
- flake8 (args: --max-line-length=100 --extend-ignore=E401,E501,E741)
For more detailed logging, set the LOG_LEVEL environment variable:
$ export LOG_LEVEL='DEBUG'
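For reference, a typical way for a Python entry point to honor this variable looks like the sketch below; the exact logging setup in app.py may differ:

```python
import logging
import os

# Read the desired level from the environment, defaulting to INFO.
logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO").upper())
logging.getLogger(__name__).debug("Debug logging is enabled")
```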