Rigel is a project for supplementing Llama's text generation with information from additional sources. It works by storing intermediate vectors (called context vectors) from the transformer stack in a searchable database. When a query is given to Llama, we find the most relevant document in the database and inject it into the Llama transformer stack to provide extra context for the query, improving the quality of the answer. This project was graciously funded by SingularityNet's Deep Funding initiative.
The following are the various components of this project:
- Generate lower-fidelity versions of a context vector for fast searching. Analogous to mipmaps in 3D rendering (https://en.wikipedia.org/wiki/Mipmap); see the pooling sketch below.
- Efficiently store context vectors, queryable by article and section names. Uses `indexed_binary_db` under the hood.
- `indexed_binary_db`: a binary database consisting of an index file and a data file. The index file stores the span of each entry `(start, end)` in the data file, along with some metadata. The index is kept small so that it can be loaded into memory quickly; an entry is found by searching the index for its metadata and reading its span, which is then used to load the actual entry from the data file. See the lookup sketch below.
- Llama2, modified to allow extraction of the context vectors.
- A collection of scripts to generate data for use in Rigel. See `scripts/Readme.md` for details.
- Read files generated by https://github.com/mlabs-haskell/wikipedia_parser/.
To run the tests:

- Run `just tests` to run the module-level tests.
- Run `just cv_hier_db_test_e2e` to test the `cv_hier_storage` module by generating a database and running some sanity checks on it.
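The "lower-fidelity versions" in the first component can be pictured with plain average pooling, in the spirit of the mipmap analogy: each level halves the vector's resolution, so coarse levels can be scanned cheaply before finer ones are consulted. This is only an illustration of the idea; Rigel's actual low-fidelity vectors come from the trained hierarchical compression model, not from pooling.

```python
import numpy as np

def build_pyramid(vector: np.ndarray, levels: int = 3) -> list[np.ndarray]:
    """Illustrative mipmap-style pyramid: repeatedly halve a vector by averaging
    adjacent pairs, producing progressively lower-fidelity copies."""
    pyramid = [vector]
    for _ in range(levels):
        v = pyramid[-1]
        if len(v) % 2:                      # pad odd lengths so pairs line up
            v = np.append(v, v[-1])
        pyramid.append(v.reshape(-1, 2).mean(axis=1))
    return pyramid

full = np.random.rand(4096).astype(np.float32)  # stand-in for a context vector
for level, v in enumerate(build_pyramid(full)):
    print(f"level {level}: {v.size} dimensions")
```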
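To make the `indexed_binary_db` component concrete, the lookup pattern it describes looks roughly like the sketch below: a small index maps an entry's metadata to its `(start, end)` span, and the data file is only read once the span is known. The JSON index keyed by article/section name is an assumption made for illustration; the module's actual on-disk format may differ.

```python
import json

# Illustration only -- the real indexed_binary_db layout may differ.
# index.json: {"Alan Turing/Early life": {"start": 0, "end": 512}, ...}
# data.bin:   all entries' bytes concatenated back to back.

def load_entry(index_path: str, data_path: str, key: str) -> bytes:
    # The index is small, so it can be held in memory for fast metadata lookups.
    with open(index_path) as f:
        index = json.load(f)
    span = index[key]                                # the entry's (start, end) span
    with open(data_path, "rb") as f:
        f.seek(span["start"])                        # jump straight to the entry
        return f.read(span["end"] - span["start"])   # read only its bytes
```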
When we deliver this project to SingularityNet, we will provide our generated database of raw context vectors, the hierarchical compression model, and the database of compressed context vectors. If you do not have these items, generate them by following the next section, then return here. Once you have them, continue with the instructions below.
- Download Llama
  - Follow the instructions in `modified_llama/README.md` to download the Llama model.
- Open the `rigel.ipynb` notebook, either in VS Code or in a Jupyter notebook. Set the `prompt` variable to whatever text you want completed. Run the notebook. Within a few seconds, you should see Llama finishing your prompt!
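For example, the prompt cell might be as simple as the following (the text is only a placeholder):

```python
# In rigel.ipynb: the text you want Llama to complete.
prompt = "The three laws of thermodynamics are"
```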
The steps in these instructions assume you are working without a database of context vectors, a hierarchical compression model, or the hierarchically compressed database of context vectors.
- Parse Wikipedia
  - Download the [Wikipedia parser](https://github.com/mlabs-haskell/wikipedia_parser) tool to this project's parent directory (i.e., this folder's parent directory should contain both `rigel/` and `wikipedia_parser/`). Go into that project's directory and follow the instructions there to gather text from Wikipedia so that it can be parsed by Rigel.
- Create a list of all articles parsed by the Wikipedia parser
  - The `output/subgraph` folder in the Wikipedia parser project will now contain a series of text files, each containing a list of articles. Concatenate them and store the result in `all.txt` in this folder (a short concatenation sketch appears after these steps).
- Collect context vectors
  - We will now collect context vectors created from the parsed Wikipedia articles. In this folder, run `just generate_context_vectors`. This will create a database of raw context vectors.
- Generate TFIDF vectors
  - Rigel uses TFIDF vectors generated from the raw Wikipedia text to train the hierarchical compression model. To generate these vectors, run `just generate_tfidf` (a brief TFIDF illustration appears after these steps).
- Train the compression model
  - Run `just train_compressor` to train the model. This will take some time.
- Generate the compressed context vector database
  - Run `just cv_hier_db_gen` to create the database. You are now ready to run Rigel!
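The "Create a list of all articles" step above is just a file concatenation. A minimal sketch, assuming the parser output sits in `../wikipedia_parser/output/subgraph` as described in the parsing step:

```python
from pathlib import Path

# Concatenate the article-list files produced by the Wikipedia parser into all.txt.
# The relative path assumes the directory layout described in the parsing step above.
subgraph_dir = Path("../wikipedia_parser/output/subgraph")

with open("all.txt", "w") as out:
    for part in sorted(p for p in subgraph_dir.iterdir() if p.is_file()):
        text = part.read_text()
        out.write(text if text.endswith("\n") else text + "\n")
```

On a Unix shell, a single `cat` of those files into `all.txt` does essentially the same thing.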
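For readers unfamiliar with TFIDF, the vectors produced in the "Generate TFIDF vectors" step are standard term-frequency/inverse-document-frequency representations of the article text. The scikit-learn snippet below is only an illustration of what such vectors are; `just generate_tfidf` uses the project's own scripts, which may work differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy illustration of TFIDF vectors; Rigel's own scripts build the real ones.
documents = [
    "Alan Turing was a pioneer of theoretical computer science.",
    "Mipmaps store progressively lower-resolution copies of a texture.",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(documents)  # sparse matrix: one row per document
print(tfidf.shape)                           # (2, number_of_distinct_terms)
```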