Leveraging the Power of Large Language Models and the Langchain Framework for an Innovative Approach to Document Querying
This project implements a document-based question-answering system built on OpenAI's GPT-3.5 Turbo model, Python, and the LangChain framework. It processes PDF documents, breaking them into ingestible chunks, and stores those chunks in a Chroma vector database for querying. It complements a Medium article called How to Build a Document-Based Q&A System Using OpenAI and Python.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
To install the project, you need Python installed on your machine.
The project uses Poetry for managing dependencies. After cloning the repository, navigate to the project directory and install dependencies with the following commands:
poetry install
poetry shell
Before you can ingest or query documents, make sure that a .env file exists. This file should contain a single line that reads:

    OPENAI_API_KEY=yourkey
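The ingest and query scripts can pick this key up at startup via python-dotenv. A minimal sketch, assuming the python-dotenv package is installed and the .env file sits in the working directory:

    import os

    from dotenv import load_dotenv  # provided by the python-dotenv package

    load_dotenv()  # reads OPENAI_API_KEY from the .env file into the environment
    api_key = os.environ["OPENAI_API_KEY"]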
To ingest documents, place your PDF files in the 'docs' folder, change into the app folder, and run the following commands:
cd app
python ingest.py
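The repository's ingest.py is the source of truth for what actually runs. As a rough sketch only, an ingestion script built on the classic LangChain API often looks like the following; module paths differ between LangChain versions, and the docs path, chunk sizes, and "db" persist directory are illustrative assumptions, not taken from this repo:

    import glob

    from dotenv import load_dotenv
    from langchain.document_loaders import PyPDFLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.vectorstores import Chroma

    load_dotenv()  # makes OPENAI_API_KEY available to the embeddings client

    # Load every PDF from the docs folder (path is an assumption; adjust it
    # to wherever docs lives relative to the app folder).
    documents = []
    for path in glob.glob("../docs/*.pdf"):
        documents.extend(PyPDFLoader(path).load())

    # Break the documents into ingestible, overlapping chunks.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(documents)

    # Embed the chunks and persist them in a local Chroma database.
    db = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="db")
    db.persist()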
To query the ingested documents, change into the app folder, run the following commands, and follow the interactive prompts:
cd app
python query.py
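Again, query.py defines the real behavior. A minimal sketch of a query script over the same assumed Chroma database, using the classic LangChain RetrievalQA chain, might look like this:

    from dotenv import load_dotenv
    from langchain.chains import RetrievalQA
    from langchain.chat_models import ChatOpenAI
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Chroma

    load_dotenv()

    # Reopen the persisted Chroma database and wrap it in a retrieval chain.
    db = Chroma(persist_directory="db", embedding_function=OpenAIEmbeddings())
    qa = RetrievalQA.from_chain_type(
        llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
        chain_type="stuff",  # stuff retrieved chunks directly into the prompt
        retriever=db.as_retriever(),
    )

    # Simple interactive loop: ask questions until an empty line is entered.
    while True:
        question = input("Question (blank line to quit): ").strip()
        if not question:
            break
        print(qa.run(question))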
To visualize and interact with the system via the Streamlit app, run the following command:
streamlit run streamlit_app.py
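The real UI lives in streamlit_app.py. Conceptually, a minimal Streamlit front end over the same hypothetical retrieval chain can be as small as the sketch below; the page title and "db" persist directory are assumptions:

    import streamlit as st
    from dotenv import load_dotenv
    from langchain.chains import RetrievalQA
    from langchain.chat_models import ChatOpenAI
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Chroma

    load_dotenv()
    st.title("Document Q&A")

    # Cache the chain so Streamlit does not rebuild it on every rerun.
    @st.cache_resource
    def build_chain():
        db = Chroma(persist_directory="db", embedding_function=OpenAIEmbeddings())
        return RetrievalQA.from_chain_type(
            llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
            retriever=db.as_retriever(),
        )

    question = st.text_input("Ask a question about your documents")
    if question:
        with st.spinner("Thinking..."):
            st.write(build_chain().run(question))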
This project is licensed under the MIT License - see the LICENSE.md file for details.