To get started with LLMLingua-2 experiments, install the llmlingua package using pip:
pip install llmlingua
To collect your own data using GPT-4, install the following packages:
pip install openai==0.28
pip install spacy
python -m spacy download en_core_web_sm
To train your own compressor on the collected data, install:
pip install scikit-learn
pip install tensorboard
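Before running any of the scripts, it can help to confirm that the packages above are importable in the current environment. The following is a minimal sketch, not part of the released code: the helper `missing_packages` and the `REQUIRED` grouping are hypothetical names introduced here, and the check only looks for top-level modules (it does not verify the pinned `openai==0.28` version).

```python
import importlib.util

# Import names for the packages installed above, grouped by the step that
# needs them. scikit-learn is imported as "sklearn", and the spaCy model
# installs as an importable package named "en_core_web_sm".
REQUIRED = {
    "compress": ["llmlingua"],
    "collect data": ["openai", "spacy", "en_core_web_sm"],
    "train": ["sklearn", "tensorboard"],
}


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for step, names in REQUIRED.items():
        missing = missing_packages(names)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{step}: {status}")
```

Each step is reported separately, so a missing training dependency does not block compression-only use.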
We will release our collected GPT-4 compression results on Hugging Face (HF) after review. We also provide the full data collection pipeline in collect_data.sh to help you construct a custom compression dataset.
To train a compressor on the collected data, run train.sh.
We provide a script, compress.sh, to compress the original contexts of several benchmarks. After compression, run evaluate.sh to evaluate the downstream tasks using the compressed prompts.
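The four scripts mentioned above can be chained into one run. The sketch below is only an illustration: the script names come from the text, but the stage order (collect, train, compress, evaluate), the argument-free `bash <script>` invocation, and the helpers `build_commands` and `run_pipeline` are assumptions, so check each script for the options it actually expects.

```python
import subprocess
from pathlib import Path

# Script names come from the text above; the order and the argument-free
# invocation are assumptions made for illustration.
STAGES = ["collect_data.sh", "train.sh", "compress.sh", "evaluate.sh"]


def build_commands(stages, start_from=None):
    """Return one `bash <script>` command per stage, optionally skipping earlier ones."""
    if start_from is not None:
        stages = stages[stages.index(start_from):]
    return [["bash", script] for script in stages]


def run_pipeline(workdir=".", start_from=None):
    """Run each stage in order and stop at the first failure."""
    for command in build_commands(STAGES, start_from):
        script = Path(workdir) / command[1]
        if not script.is_file():
            raise FileNotFoundError(f"expected script not found: {script}")
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, cwd=workdir, check=True)
```

For example, `run_pipeline(start_from="compress.sh")` would skip data collection and training, which may be useful when a trained compressor is already available.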