felipejfc/codeqa

CodeQA

CodeQA is a command-line tool for semantic search over your codebase. It embeds your code with a language model, ranks files by relevance to a natural-language query, and can assemble the files you select into a prompt ready to paste into an LLM.

Features

  • 🔍 Semantic code search using embeddings
  • 📝 Interactive query mode
  • 🌲 Project structure awareness
  • 📋 LLM-ready prompt generation
  • 🎯 Smart file selection for context
  • 🚀 GPU/MPS acceleration support
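
To illustrate what "semantic code search using embeddings" means in practice, here is a minimal sketch that ranks files by cosine similarity between a query vector and per-file vectors. It is not CodeQA's implementation: the tool stores embeddings in chromadb and produces them with a transformer model, whereas the names `cosine_similarity`, `rank_files`, and `TOY_INDEX`, and the three-dimensional vectors, are made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_files(query_vec, file_vecs, n=5):
    """Return the n (path, score) pairs most similar to the query vector."""
    scored = [(path, cosine_similarity(query_vec, vec))
              for path, vec in file_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]


# Toy 3-dimensional "embeddings" (hypothetical values); a real model
# produces vectors with hundreds of dimensions.
TOY_INDEX = {
    "utils/file_utils.py": [0.9, 0.1, 0.0],
    "embeddings/embedder.py": [0.1, 0.9, 0.2],
    "models/code_chunk.py": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    for path, score in rank_files([1.0, 0.0, 0.0], TOY_INDEX, n=2):
        print(f"{score:.3f}  {path}")
```

The `n` parameter plays the same role as the `--n` option: it caps how many top-scoring results are returned.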

Dependencies

The project requires Python 3.11+ and the following main dependencies:

chromadb>=0.4.22
torch>=2.0.0
transformers>=4.30.0
pathspec>=0.11.2
numpy
tqdm>=4.65.0
pyperclip

Installation

  1. Clone the repository:
git clone https://github.com/felipejfc/codeqa.git
cd codeqa
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

Usage

Basic usage:

python main.py --dir /path/to/your/codebase

Command Line Options

  • --dir: (Required) Path to the codebase directory to analyze
  • --db-path: Path to store embeddings database (default: "code_embeddings")
  • --n: Number of top results to return (default: 5)
  • --debug: Enable debug logging
  • --show-content: Show file contents in results
  • --code-extensions: Space-separated list of file extensions to include (e.g., .py .js .ts)
  • --full-path: Show full file paths in results
  • --copy-prompt: Enable prompt generation and copying to clipboard
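
The options above map naturally onto Python's standard `argparse` module. The sketch below shows one way such a parser could be declared. It is an assumption about how `main.py` might be organized, not a copy of it; the function name `build_parser` and the help strings are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Semantic search over a codebase",
    )
    parser.add_argument("--dir", required=True,
                        help="path to the codebase directory to analyze")
    parser.add_argument("--db-path", default="code_embeddings",
                        help="where to store the embeddings database")
    parser.add_argument("--n", type=int, default=5,
                        help="number of top results to return")
    parser.add_argument("--debug", action="store_true",
                        help="enable debug logging")
    parser.add_argument("--show-content", action="store_true",
                        help="show file contents in results")
    # nargs="+" accepts one or more space-separated values: .py .js .ts
    parser.add_argument("--code-extensions", nargs="+", default=None,
                        help="file extensions to include")
    parser.add_argument("--full-path", action="store_true",
                        help="show full file paths in results")
    parser.add_argument("--copy-prompt", action="store_true",
                        help="generate a prompt and copy it to the clipboard")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With `nargs="+"`, extensions are passed as separate arguments, which matches the `--code-extensions .py .ts .js` form used in the examples.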

Interactive Mode

After starting the tool, you can:

  1. Enter your queries about the codebase
  2. View matched files with relevance scores
  3. When using --copy-prompt:
    • Select specific files to include in the context
    • Get token count for the generated prompt
    • Have the prompt automatically copied to clipboard
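
The --copy-prompt flow (select files, count tokens, copy) can be pictured with a small sketch. Everything here is illustrative: `build_prompt` and `estimate_tokens` are hypothetical helpers, the prompt layout is invented, and the token count is a rough four-characters-per-token heuristic rather than whatever tokenizer CodeQA actually uses.

```python
def build_prompt(question, files):
    """Assemble a prompt from a question and a mapping of path -> source text."""
    parts = ["You are answering a question about the codebase below.", ""]
    for path, content in files.items():
        parts.append(f"### {path}")
        parts.append(content.rstrip())
        parts.append("")
    parts.append(f"Question: {question}")
    return "\n".join(parts)


def estimate_tokens(text):
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


if __name__ == "__main__":
    selected = {"utils/file_utils.py": "def read_file(path):\n    ...\n"}
    prompt = build_prompt("how does the file reading work", selected)
    print(prompt)
    print(f"~{estimate_tokens(prompt)} tokens")
    # With the pyperclip dependency installed, the prompt could then be
    # copied to the clipboard with: pyperclip.copy(prompt)
```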

Examples

Search for file handling code:

python main.py --dir ./myproject
> how does the file reading work

Generate an LLM-ready prompt:

python main.py --dir ./myproject --copy-prompt
> explain the authentication system

Analyze specific file types:

python main.py --dir ./myproject --code-extensions .py .ts .js
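
As a rough picture of what restricting the analysis to certain extensions involves, the following sketch walks a directory and keeps only files whose suffix is in the requested set. The helper name `find_code_files` is hypothetical, and the sketch does not apply any ignore rules (such as .gitignore patterns) that the real tool may honor.

```python
from pathlib import Path


def find_code_files(root, extensions):
    """Return sorted paths under root whose suffix matches one of extensions."""
    # Normalize so that both "py" and ".py" are accepted.
    wanted = {
        ext.lower() if ext.startswith(".") else "." + ext.lower()
        for ext in extensions
    }
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )


if __name__ == "__main__":
    for path in find_code_files(".", [".py", ".ts", ".js"]):
        print(path)
```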

Project Structure

codeqa/
├── main.py              # Main entry point
├── utils/
│   └── file_utils.py    # File handling utilities
├── embeddings/
│   └── embedder.py      # Code embedding generation
├── tokenization/
│   └── chunker.py       # Code chunking logic
└── models/
    └── code_chunk.py    # Data models
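
To give a feel for what the chunking and data-model modules are responsible for, here is a minimal sketch that splits a file into overlapping line windows and records each one in a small dataclass. The class `CodeChunk`, its fields, the function `chunk_lines`, and the default sizes are all assumptions made for illustration; the actual contents of `chunker.py` and `code_chunk.py` may differ.

```python
from dataclasses import dataclass


@dataclass
class CodeChunk:
    """One contiguous slice of a source file (hypothetical data model)."""
    file_path: str
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    content: str


def chunk_lines(file_path, text, max_lines=40, overlap=5):
    """Split text into windows of up to max_lines lines, sharing overlap lines."""
    if max_lines <= 0 or not 0 <= overlap < max_lines:
        raise ValueError("need max_lines > 0 and 0 <= overlap < max_lines")
    lines = text.splitlines()
    step = max_lines - overlap
    chunks = []
    for start in range(0, len(lines), step):
        window = lines[start:start + max_lines]
        chunks.append(
            CodeChunk(file_path, start + 1, start + len(window), "\n".join(window))
        )
        if start + max_lines >= len(lines):
            break  # this window already reached the end of the file
    return chunks


if __name__ == "__main__":
    sample = "\n".join(f"line {i}" for i in range(1, 11))
    for chunk in chunk_lines("example.py", sample, max_lines=4, overlap=1):
        print(chunk.start_line, chunk.end_line)
```

Overlapping windows are a common choice because a function that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some lines twice.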

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
