# LLM Context Enhancement Experiment

This project implements an experimental framework for evaluating how providing relevant, high-quality context affects the quality of language model responses. It uses a vector database to retrieve similar question-answer pairs and compares model outputs with and without this additional context.

## Overview

The system:

1. Loads reference QA pairs from specified datasets
2. Stores them in a vector database for similarity search
3. For each experimental question:
   - Retrieves similar QA pairs as context
   - Generates responses both with and without context (see the prompt sketch below)
   - Evaluates response quality using a reward model
   - Stores results for analysis
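
For illustration, here is a minimal sketch of how a retrieved QA pair might be folded into the prompt for the with-context run; the template and function name are assumptions, not the project's exact implementation:

```python
from typing import Optional


def build_prompt(question: str, context_qa: Optional[str] = None) -> str:
    """Assemble the prompt sent to the LLM, optionally prepending retrieved context.

    `context_qa` is a similar question-answer pair retrieved from the vector
    database; when it is None, the model sees only the raw question.
    """
    if context_qa is None:
        return question
    return (
        "Here is a related question and answer that may be helpful:\n"
        f"{context_qa}\n\n"
        f"Now answer the following question:\n{question}"
    )
```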

## Features

- Parallel processing for efficient vector database population
- Support for multiple LLM architectures
- Configurable embedding and reward models
- SQLite results storage with comprehensive metrics
- GPU acceleration support
- Batched processing for memory efficiency

## Requirements

- Python 3.8+
- PyTorch
- Transformers
- vLLM
- LangChain
- ChromaDB
- SQLAlchemy
- Datasets (HuggingFace)
- tqdm

## Installation

1. Clone the repository
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
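
For reference, a `requirements.txt` covering the packages listed above might look like the following (unpinned; the project's actual file may pin specific versions):

```text
torch
transformers
vllm
langchain
chromadb
sqlalchemy
datasets
tqdm
```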

## Configuration

Edit `config.py` to customize:

- Dataset sources
- Model selections
- Database paths
- Experiment parameters

Default configuration:

```python
reference_datasets = [
    ("mlabonne/orca-agentinstruct-1M-v1-cleaned", "default"),
]
experiment_dataset = "HuggingFaceTB/smoltalk"
embedding_model = "BAAI/bge-small-en-v1.5"
llm_model = "Qwen/Qwen2.5-7B-Instruct"
reward_model = "internlm/internlm2-7b-reward"
```

## Usage

### Quick Start

Run the complete experiment:

```bash
bash run_experiment.sh
```

This will:

1. Populate the vector database using parallel processing
2. Execute the main experiment
3. Store results in the SQLite database

### Manual Execution

1. Populate the vector database:

   ```bash
   python parallel_insertion.py --use_gpu
   ```

2. Run the experiment:

   ```bash
   python main.py
   ```

### Additional Options

Vector database population:

```bash
# CPU-only mode
python parallel_insertion.py --num_workers 4

# Specify GPU count
python parallel_insertion.py --use_gpu --num_workers 2
```

## Project Structure

- `config.py`: Configuration parameters
- `data_loader.py`: Dataset loading utilities
- `database.py`: Vector and SQL database management
- `experiment.py`: Core experimental logic
- `model_manager.py`: Model loading and inference
- `parallel_insertion.py`: Parallel vector database population
- `main.py`: Experiment entry point
- `run_experiment.sh`: Convenience script

## Key Components

### DataLoader

Handles loading and preprocessing of reference and experimental datasets.
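
A minimal sketch of what this amounts to, assuming the standard Hugging Face `datasets` API (split and configuration names are assumptions; the actual preprocessing lives in `data_loader.py`):

```python
from datasets import load_dataset

# Reference QA pairs that will populate the vector database
reference = load_dataset(
    "mlabonne/orca-agentinstruct-1M-v1-cleaned", "default", split="train"
)

# Questions used in the experiment itself ("all" is one of smoltalk's
# configuration names; the project may select a different subset)
experiment = load_dataset("HuggingFaceTB/smoltalk", "all", split="train")
```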

### DatabaseManager

Manages two database systems:

- ChromaDB for vector similarity search
- SQLite for experimental results storage
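
As a rough sketch of the vector side, assuming the stock `chromadb` client (the project may instead wire this up through LangChain, and it configures `BAAI/bge-small-en-v1.5` as the embedding model; the path and collection name below are illustrative):

```python
import chromadb

# Persistent vector store for similarity search over reference QA pairs
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("reference_qa")

# Insert a reference QA pair (ChromaDB embeds documents on insertion)
collection.add(
    ids=["qa-0"],
    documents=["Q: What is a vector database? A: A store indexed by embeddings..."],
)

# Retrieve the most similar stored QA pair for a new question
hits = collection.query(query_texts=["Explain vector databases"], n_results=1)
```

The SQLite side of the manager is covered under Results Storage below.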

### ModelManager

Handles:

- Model loading/unloading
- Response generation
- Response quality evaluation
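
Generation most likely goes through vLLM's offline-inference API; a minimal sketch (the sampling parameters here are assumptions) might look like this:

```python
from vllm import LLM, SamplingParams

# Generation model from config.py
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=512)

outputs = llm.generate(["What is a vector database?"], params)
answer = outputs[0].outputs[0].text

# Quality evaluation uses internlm/internlm2-7b-reward; it is loaded through
# transformers with trust_remote_code=True, and its scoring interface is
# defined by that model's remote code (omitted here).
```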

### OptimizedExperiment

Orchestrates the experimental process:

1. Vector database setup
2. Batch processing of questions
3. Context-based response generation
4. Quality evaluation
5. Results storage
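
Schematically, a single batch reduces to the loop below; all callables are injected because the project's actual method names are not documented in this README:

```python
def run_batch(questions, retrieve, generate, reward, store, build_prompt):
    """Process one batch: retrieve context, generate paired answers,
    score both with the reward model, and persist the comparison."""
    for question in questions:
        context_qa, context_score = retrieve(question)

        with_ctx = generate(build_prompt(question, context_qa))
        without_ctx = generate(build_prompt(question))

        with_score = reward(question, with_ctx)
        without_score = reward(question, without_ctx)

        store(
            question=question,
            context_qa=context_qa,
            context_score=context_score,
            with_context_answer=with_ctx,
            without_context_answer=without_ctx,
            with_context_score=with_score,
            without_context_score=without_score,
            with_context_better=with_score > without_score,
        )
```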

## Results Storage

Results are stored in SQLite with the following schema:

- `question`: Original question
- `context_score`: Similarity score of the retrieved context
- `context_qa`: Retrieved similar QA pair
- `with_context_answer`: Model response with context
- `without_context_answer`: Model response without context
- `with_context_score`: Quality score with context
- `without_context_score`: Quality score without context
- `with_context_better`: Boolean indicating whether context improved the response
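
As a sketch, an equivalent SQLAlchemy model for this schema could look like the following (class and table names are assumptions; the columns mirror the list above):

```python
from sqlalchemy import Boolean, Column, Float, Integer, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class ExperimentResult(Base):
    __tablename__ = "results"

    id = Column(Integer, primary_key=True)
    question = Column(Text)                # original question
    context_score = Column(Float)          # similarity score of retrieved context
    context_qa = Column(Text)              # retrieved similar QA pair
    with_context_answer = Column(Text)
    without_context_answer = Column(Text)
    with_context_score = Column(Float)
    without_context_score = Column(Float)
    with_context_better = Column(Boolean)  # True if context improved the response
```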