This is a FastAPI and Ray backend for QueryLake. Run the following commands to set up a conda environment. You must have CUDA installed to run models. We recommend the Lambda Stack.
conda create --name QueryLake python=3.10
conda activate QueryLake
After this, install PyTorch using the conda installation instructions on this webpage. Then continue with the following command:
pip install -r requirements.txt
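Before moving on, it can help to confirm that PyTorch is importable and can see a CUDA device. The script below is a minimal sketch and is not part of QueryLake; the function name `check_environment` is hypothetical, and the only PyTorch call it relies on is `torch.cuda.is_available()`.

```python
import importlib.util
import sys


def check_environment():
    """Report the Python version and whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; not part of QueryLake.
    """
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Stays None when PyTorch is not installed, so "unknown" is
        # distinguishable from "no CUDA device found".
        "cuda_available": None,
    }
    if report["torch_installed"]:
        import torch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    print(check_environment())
```

If `cuda_available` is `False`, the PyTorch build or the CUDA driver is the likely place to look before trying to load models.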
One of the dependencies installed is exllamav2; however, it occasionally causes build issues. To install it safely, build it from source by cloning the repository as follows:
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install -r requirements.txt
pip install .
cd ../
rm -rf exllamav2
We currently support Tesseract for OCR. This requires installing Tesseract with apt, like so:
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
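To check that the Tesseract binary ended up on your `PATH`, something like the following can be used. This is a sketch under the assumption that the executable is named `tesseract` (as installed by the `tesseract-ocr` package); the helper name `tesseract_version` is hypothetical.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if not installed.

    Hypothetical helper for illustration; not part of QueryLake.
    """
    path = shutil.which("tesseract")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract releases print the version banner to stderr.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version is not None else "tesseract not found on PATH")
```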
The database is a ParadeDB container. To initialize it, you must have Docker and docker-compose installed (use these instructions). Once they are installed, run the following to start or completely reset the database:
./restart_database.sh
To set up your models, run the setup.py CLI like so and follow the instructions:
python setup.py
We recommend using the presets for now, as custom model additions are under development.
We recommend starting a head node for the Ray cluster first. This starts the Ray dashboard and may make it easier to connect Serve deployments in the future. You can do so as follows:
ray start --head --port=6379 --dashboard-host 0.0.0.0
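To confirm the head node came up, you can probe its ports from Python. The sketch below assumes the node is reachable at `127.0.0.1`, uses port 6379 from the command above, and assumes the dashboard is on Ray's default port 8265 (it was not changed in the command). The helper name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # assumption: checking from the head node itself
    for name, port in (("ray head", 6379), ("ray dashboard", 8265)):
        state = "open" if port_is_open(host, port) else "closed"
        print(f"{name} ({host}:{port}): {state}")
```

A closed port only tells you nothing is listening there; `ray status` on the head node gives a more detailed view of the cluster.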
To start the server, run:
serve run server:deployment
Server settings are generated in config.json. The file can be modified to your preferred settings.
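Before editing the file, it may be useful to see what it contains. The snippet below is a sketch that assumes config.json sits in the current working directory and holds a JSON object at the top level; it makes no assumption about which keys exist. The helper names `load_config` and `summarize` are hypothetical.

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Parse the JSON settings file and return its contents."""
    with Path(path).open("r", encoding="utf-8") as f:
        return json.load(f)


def summarize(config):
    """Map each top-level key to the type name of its value."""
    if not isinstance(config, dict):
        return {}
    return {key: type(value).__name__ for key, value in config.items()}


if __name__ == "__main__":
    try:
        for key, kind in summarize(load_config()).items():
            print(f"{key}: {kind}")
    except FileNotFoundError:
        print("config.json not found; it is generated during setup")
```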