Compare the answering capabilities of different LLMs - for example LLaMA, GPT-4o, Mistral, Claude, Cohere and others - against user-provided document(s) and questions.
Compare the answering capabilities of different LLMs - for example LLaMA, ChatGPT, Cohere, Falcon - against user-provided document(s) and questions.
Specify the different models, embedding tools and vector databases in configuration files.
Maintain reproducible experiments reflecting combinations of these configurations.
The instructions assume a Python environment with Poetry installed. The tool is developed in Python 3.11. Poetry is not strictly required for the tool to function, but the examples assume it is installed.
The tool uses third-party hosted inference APIs. API keys need to be specified as environment variables.
The services used:
The API keys can be specified in a .env file. Use the provided .env.example file as a template (enter your own API keys and rename the file to .env).
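As a sketch, a filled-in .env file might look like the following; the variable names below are assumptions based on the services listed, so check .env.example for the exact names expected:

```
# Hypothetical .env contents — confirm the exact variable names in .env.example
OPENAI_API_KEY=sk-...
HUGGINGFACEHUB_API_TOKEN=hf_...
COHERE_API_KEY=...
REPLICATE_API_TOKEN=...
```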
At present, all services used in the example configuration have free tiers available.
sqlite3 v3.35 or higher is required. It is commonly included in Linux distributions.
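A quick way to check which SQLite version is available to Python (a sketch for convenience; quke itself does not ship this check):

```python
import sqlite3

# Version of the SQLite library linked into Python's sqlite3 module.
required = (3, 35, 0)
found = sqlite3.sqlite_version_info

print(f"SQLite {sqlite3.sqlite_version} detected (need >= 3.35)")
if found < required:
    print("This SQLite is too old; upgrade your system sqlite3.")
```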
Navigate to the directory that contains the pyproject.toml file, then run:
poetry install
For the examples, the project comes with a public financial document from a Canadian bank (CIBC) as the source PDF file.
Before running the first example, make sure to specify your HuggingFace API key.
Use the command
poetry run quke
to ask the default questions, using the default embedding and the default LLM.
The answers provided by the LLM - in addition to various other logging messages - are saved in the ./output/ or ./multirun directories, in separate date and time subdirectories, including in a file called chat_session.md.
The defaults are specified in the config.yaml file (in the ./quke/conf/ directory).
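As a sketch, the Hydra defaults in config.yaml select one option per configuration group along these lines (the group and option names below are assumptions inferred from the examples; check the file itself):

```yaml
# Sketch of ./quke/conf/config.yaml — names are assumptions
defaults:
  - embedding: huggingface   # default embedding
  - llm: falcon              # default LLM
  - question: eps            # default set of questions
```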
Make sure to specify your Cohere API key before running.
As per the configuration files, the default LLM is Falcon and the default embedding is from HuggingFace.
To specify a different LLM - Cohere in this example - run the following:
poetry run quke embedding=huggingface llm=cohere question=eps
Make sure to specify your OpenAI API key before running.
The LLMs, embeddings, questions and other configurations can be captured in experiment config files. The command
poetry run quke +experiment=openai
uses the experiment file openai.yaml (in the ./config/experiments folder), which specifies the LLM, embedding and questions to be used. It is equivalent to running:
poetry run quke embedding=openai llm=gpt4o question=eps
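An experiment file overrides the defaults from a single place. A sketch of what ./config/experiments/openai.yaml might contain, following the usual Hydra experiment pattern (the exact contents may differ; check the actual file):

```yaml
# Sketch — see the actual file in ./config/experiments
# @package _global_
defaults:
  - override /embedding: openai
  - override /llm: gpt4o
  - override /question: eps
```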
Multiple experiments can be run at once as follows:
Make sure to specify your Replicate API key before running.
poetry run quke --multirun +experiment=openai,llama2
LLMs, embeddings, questions, experiments and other items are specified in a set of configuration files. These are saved in the ./config directory.
The Hydra package is used for configuration management. The Hydra documentation explains the configuration system in more detail.
Four different models are specified (ChatGPT, LLaMA 2, Falcon and Cohere), using four different APIs (OpenAI, HuggingFace, Cohere and Replicate).
Additional LLMs (or embeddings, questions) can be set up by adding new configuration files.
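For example, after adding a hypothetical config file ./config/llm/mistral.yaml (modelled on one of the existing files in ./config/llm - the file name here is purely illustrative), it could be selected from the command line in the usual way:

```
poetry run quke llm=mistral
```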
The documents to be searched are stored in the ./docs/pdf directory. At present only pdf documents are considered.
Note: set vectorstore_write_mode to append or overwrite in the embedding configuration file (or delete the folder containing the existing vector database, in the ./idata folder).
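In the embedding configuration file the setting could look like this (the exact placement of the field is an assumption; check an existing embedding config file):

```yaml
# Sketch — in the embedding config file
vectorstore_write_mode: overwrite  # or: append
```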
The free tiers for the third-party services generally come with fairly strict limitations. These differ between services and may change over time.
To try out the tool with your own documents it is best to start with a single small source document, no more than two questions and only one combination of LLM/embedding.
Error messages due to limitations of the APIs are not always clearly indicated as such.
The tool uses third party APIs (OpenAI, HuggingFace, Cohere, Replicate). These process your source documents and your questions, to the extent that you provide these. They track your usage of their APIs. They may do other things; I do not know.
The tool uses the LangChain Python package to interact with the third party services. I do not know if the package 'leaks' any of the data in any way.
In general I do not know to what extent any of the data is encrypted during transmission.
The tool shares no information with me.
Distributed under the MIT License. See LICENSE.txt for more information.
Project Link: https://github.com/ejoosterop/quke