Enhancements for FAQ Chatbot for Vitess #1364

Merged 4 commits on Feb 7, 2025
27 changes: 27 additions & 0 deletions programs/summerofcode/2025.md
@@ -62,3 +62,30 @@ You can find the project ideas from previous year [here](./2024.md).
- Valentin Delaye (@jonesbusy, [email protected]) - primary
- Feynman Zhou (@FeynmanZhou, [email protected])
- Upstream Issues: https://github.com/oras-project/oras-java/issues

#### Vitess

#### Enhancements for FAQ Chatbot for Vitess

Vitess is a distributed database system built on MySQL. Developers often have to search through documentation, Slack
discussions, and GitHub issues to find answers. We are starting a project to implement an AI-powered FAQ chatbot using
**Retrieval-Augmented Generation** (RAG), integrating **vector search** with an **LLM** (such as OpenAI's GPT-4,
DeepSeek, Mistral, or Llama 3). The chatbot will be available via a **CLI and Slack bot** for developer support.
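
To make the retrieval step concrete, here is a minimal sketch of how a question could be matched against indexed content before the LLM is called. It is not the project's implementation: the `DOCS` corpus, the bag-of-words `embed` function, and the prompt wording are all assumptions made for illustration. A real system would use an embedding model and a vector database instead.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for indexed docs, Slack threads, and issues.
DOCS = [
    "VTGate routes queries to the correct shard.",
    "VReplication copies data between keyspaces.",
    "VTOrc detects and repairs replication failures.",
]


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(question, docs, k=1):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(retrieve(question, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt("How does VReplication copy data?", DOCS)` produces a prompt whose context is the VReplication entry, which would then be sent to whichever LLM is chosen.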

The next phase, to be implemented in this Summer of Code project, will add features such as:
* Content filtering for chatbot safety and response validation
* Fine-tuning the model for improved accuracy
* Pipelines for refreshing data from new/updated docs
* Caching previous replies to reduce LLM costs
* Rate-limiting
* Better benchmarking for iterative improvements
* User feedback integration to improve relevancy
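
Two of the items above, reply caching and rate-limiting, can be sketched in a few lines. This is only an illustration under stated assumptions: the class names `ReplyCache` and `TokenBucket`, the exact-match cache key, and the token-bucket policy are hypothetical choices, not the design the project has committed to.

```python
import hashlib
import time


class ReplyCache:
    """Caches replies by normalized question so repeated questions skip the LLM call."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(question):
        # Normalize case and whitespace so trivially different phrasings share a key.
        normalized = " ".join(question.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, question, compute):
        """Return the cached reply, calling compute(question) only on a miss."""
        key = self._key(question)
        if key not in self._store:
            self._store[key] = compute(question)
        return self._store[key]


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Consume one token if available; return False when the caller should wait."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A bot handler could check `bucket.allow()` per user before calling `cache.get_or_compute(question, ask_llm)`, where `ask_llm` is a placeholder for the actual model call. An exact-match cache only catches repeated wording; matching paraphrases would need a similarity-based lookup.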


- Expected Outcome: Improved chatbot that provides accurate Vitess-related answers via CLI and Slack, using indexed documentation and discussions for retrieval.
- Recommended Skills: Go, Python, LLM APIs, vector databases
- Expected project size: large (~350 hour projects)
- Mentor(s):
  - Rohit Nayak (@rohit-nayak-ps, [email protected])
  - Manan Gupta (@GuptaManan100, [email protected])
- Upstream Issue: https://github.com/vitessio/vitess/issues/17690