ver. 1.1.12 • Documentation (WIP): 🇺🇸 EN | 🇷🇺 RU
Commands • Style • Conventions
This is an all-in-one library built as part of my other SaaS project. It provides various techniques for managing, optimizing and testing prompts for LLMs in both production and research environments. With the client's permission, this demo illustrates a system designed to dynamically integrate data, monitor performance metrics such as latency and cost, and efficiently balance loads among various AI models.
The system simplifies the development and testing of prompt-based interactions with LLMs. By combining real-time monitoring, dynamic caching and integration across multiple models, it gives you the tools to understand what your AI-driven solution can do, refine your prompt design, and adapt it automatically as contexts evolve.
Tip
Check out some simple usage examples in examples/getting_started.ipynb
Some features:
- Dynamic prompt crafting
  Adapt and update prompts on the fly, ensuring you avoid issues like budget overflows by integrating live data.
- Multi-model compatibility
  Easily switch between various LLM providers, distributing workload intelligently based on configurable weights (see the sketch after this list).
- Real-time performance insights
  Gain immediate visibility into metrics such as latency, token usage and overall cost.
- CI/CD testing
  Automatically generate and execute tests during prompt calls by comparing responses with an ideal output provided by a human expert.
- Efficient prompt caching
  Leverage a caching system with a short TTL (Time-To-Live) of five minutes to ensure that prompt content is always current while minimizing redundant data fetches.
- Asynchronous interaction logging
  Log detailed interaction data in the background so that your application's performance remains unaffected.
- User feedback integration
  Enhance prompt quality continuously by incorporating explicit feedback and ideal answers for previous responses.
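The weight-based balancing can be pictured as weighted random selection over the configured providers. Below is a minimal sketch of that idea; the provider names and weights are hypothetical and do not come from the library itself:

```python
import random
from collections import Counter

# Hypothetical provider weights; in the real system these would come from configuration.
PROVIDER_WEIGHTS = {
    "openai:gpt-4o": 0.6,
    "anthropic:claude-3-5-sonnet": 0.3,
    "local:llama-3-8b": 0.1,
}

def pick_provider(weights: dict[str, float]) -> str:
    """Pick one provider at random, proportionally to its configured weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

if __name__ == "__main__":
    # Over many draws the distribution roughly matches the configured weights.
    print(Counter(pick_provider(PROVIDER_WEIGHTS) for _ in range(10_000)))
```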
The demo implements a smart caching mechanism with a five-minute lifespan (TTL) for each prompt. This includes:
- Automatic refresh: every prompt call checks the server for an updated version, so the cached copy stays fresh.
- Local backup: if the central service is unavailable, the system reverts to a locally stored version of the prompt.
- Version synchronization: consistent prompt versions are maintained across local and remote environments.
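To make that behaviour concrete, here is a minimal sketch of such a cache: a five-minute TTL, refresh from the server, and a local file as fallback. The class name, backup directory and `fetch_remote` callable are assumptions for illustration, not the library's actual API:

```python
import json
import time
from pathlib import Path
from typing import Callable

TTL_SECONDS = 5 * 60  # five-minute lifespan, as described above

class PromptCache:
    """Hypothetical TTL cache: refresh from the server, fall back to a local copy."""

    def __init__(self, backup_dir: str = ".prompt_backup"):
        self._entries: dict[str, tuple[float, str]] = {}  # name -> (fetched_at, text)
        self._backup = Path(backup_dir)
        self._backup.mkdir(exist_ok=True)

    def get(self, name: str, fetch_remote: Callable[[str], str]) -> str:
        cached = self._entries.get(name)
        if cached and time.time() - cached[0] < TTL_SECONDS:
            return cached[1]  # still fresh, no remote call needed
        try:
            text = fetch_remote(name)  # refresh from the central service
            (self._backup / f"{name}.json").write_text(json.dumps({"text": text}))
        except Exception:
            # Central service unavailable: revert to the locally stored version
            # (raises if no backup exists yet, which a real system would handle).
            text = json.loads((self._backup / f"{name}.json").read_text())["text"]
        self._entries[name] = (time.time(), text)
        return text
```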
The system supports two distinct methods for creating tests to ensure the quality of prompt outputs: inline and explicit. The inline method passes test data, including an ideal response, along with the LLM call, which automatically triggers test creation. The explicit method invokes a test-creation call for a given prompt directly, comparing the LLM's response against a predefined ideal answer.
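At the core of both methods is a comparison of the LLM's response against the expert's ideal answer. A minimal sketch of one way such a check might work; the function name, similarity measure and threshold are assumptions, not the library's API:

```python
from difflib import SequenceMatcher

def response_matches_ideal(response: str, ideal: str, threshold: float = 0.8) -> bool:
    """Hypothetical check: pass if the response is similar enough to the ideal answer."""
    ratio = SequenceMatcher(None, response.strip().lower(), ideal.strip().lower()).ratio()
    return ratio >= threshold

# Inline style: the ideal answer travels with the call, so a test is created automatically.
# Explicit style: the same comparison is invoked directly for an existing prompt/response pair.
assert response_matches_ideal(
    "Hello Alice, thanks for reaching out!",
    "Hi Alice, thanks for reaching out!",
)
```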
Interaction logging is asynchronous, so it happens in the background without impacting response times. The system automatically captures details like response latency, token count and associated costs, and stores complete snapshots of prompts, context and responses for analysis.
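Here is a minimal sketch of this kind of background logging, using a worker thread and a queue so the request path never blocks on I/O; the queue size, field names and JSONL sink are assumptions, not the library's internals:

```python
import json
import queue
import threading
import time

_log_queue: queue.Queue = queue.Queue(maxsize=1000)

def _writer() -> None:
    """Drain the queue in the background and append each record to a JSONL file."""
    with open("interactions.jsonl", "a", encoding="utf-8") as sink:
        while True:
            record = _log_queue.get()
            sink.write(json.dumps(record) + "\n")
            sink.flush()

threading.Thread(target=_writer, daemon=True).start()

def log_interaction(prompt: str, response: str,
                    latency_s: float, tokens: int, cost_usd: float) -> None:
    """Enqueue a snapshot without blocking the request path (drop if the queue is full)."""
    try:
        _log_queue.put_nowait({
            "ts": time.time(),
            "prompt": prompt,
            "response": response,
            "latency_s": latency_s,
            "tokens": tokens,
            "cost_usd": cost_usd,
        })
    except queue.Full:
        pass  # never let logging slow down or break the caller
```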
Feedback is integral to continuous improvement. You can attach ideal answers to previous responses, prompting the system to generate new tests and refine prompt formulations.
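Conceptually, attaching an ideal answer to a past interaction might look like the following sketch; the `Interaction` record and `attach_feedback` helper are hypothetical illustrations, not the library's API:

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    """Hypothetical record of one prompt call, plus optional expert feedback."""
    prompt: str
    response: str
    ideal_answer: str | None = None
    tests: list[str] = field(default_factory=list)

def attach_feedback(interaction: Interaction, ideal_answer: str) -> None:
    """Attach the expert's ideal answer and derive a regression test from it."""
    interaction.ideal_answer = ideal_answer
    # The new test simply pins the ideal answer as the expected output for this prompt.
    interaction.tests.append(f"expect({interaction.prompt!r}) == {ideal_answer!r}")

record = Interaction(prompt="Summarize the ticket", response="The user cannot log in.")
attach_feedback(record, ideal_answer="The user is unable to log in after a password reset.")
print(record.tests)
```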
MIT