Testing and evaluating prompts for real-world use is challenging. Completions are not deterministic, and quality is unpredictable across all the potential variables, especially with more complex prompts.
Take these two prompt template variants:

Prompt A:

```text
Assistant: You are Tyrion Lannister, known for your sharp wit and frequent sarcasm in Game of Thrones. Answer all questions with a maximum of ${words_limit} words.
User: Tell me, am I ${what}?
```

Prompt B:

```text
Assistant: You are Marvin, the paranoid android from Hitchhiker's Guide to the Galaxy. Answer all questions with a maximum of ${words_limit} words.
User: Tell me, am I ${what}?
```
Questions pile up quickly:

- Answers from which prompt do users like more?
- Should I use `gpt-4`, `gpt-3.5-turbo`, or is a free open-source model enough?
- What's the ideal temperature setting?
- Do all user-submitted `${what}` values play nicely?
Prompton helps you to:

- Evaluate your prompt template tweaks in-house
- A/B test versions with end users
- Continuously monitor feedback

On top of that, it doubles as an all-in-one log and monitor for your model calls.
It's a REST API microservice designed to fit into your existing ecosystem in a non-intrusive way.

The API is still in alpha and may change without notice. However, the schema is largely stable, and proper versioning will be introduced when it soon enters production.
The easiest way to try it is via the Prompton API documentation UI on our hosted staging environment.

There is no public signup currently, but drop us an email for early access: [email protected]
```sh
pip install prompton
```

```sh
npm install prompton
```
To set up a local dev environment, see Local setup.
- **User Auth**

  Prompton users are the people who manage prompts, plus the services that call inferences on behalf of the end users.

  User auth is via JWT tokens. Once you have a user/password, just click "Authorize" in the Prompton API documentation UI. From code, use the `/token` endpoint. OrgAdmins can add more users to their org via the `/users` endpoint.

  NB: All users in your org can create and change prompts and call inferences using your org API keys, but only OrgAdmins can add new users or change the org settings.
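  From code, obtaining a JWT might look like the sketch below. This is a minimal example under assumptions: the base URL is a placeholder, and `/token` is assumed to follow the common OAuth2 password flow (form-encoded `username`/`password`, JSON response with an `access_token`); check the docs UI for the exact contract.

  ```python
  import requests

  BASE_URL = "https://<your-prompton-host>"  # placeholder: your Prompton instance

  # Assumption: /token implements the OAuth2 password flow (form-encoded fields)
  resp = requests.post(
      f"{BASE_URL}/token",
      data={"username": "you@example.com", "password": "<your password>"},
  )
  resp.raise_for_status()
  token = resp.json()["access_token"]  # assumption: standard OAuth2 response shape

  # Send the JWT as a Bearer token on all subsequent calls
  headers = {"Authorization": f"Bearer {token}"}
  ```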
- **OpenAI API key**

  Once you have authenticated, you need to assign an `openai_api_key` to your org:

  - GET your `org_id` via `/orgs/me`
  - Set your `openai_api_key` via a PATCH to `/orgs`:

    ```json
    { "access_keys": { "openai_api_key": "<your OpenAI API key>" } }
    ```

  TIP: if you just want to play around, set `openai_api_key` to any string and use the mock responses (see the easter eggs below).
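  In code, the two steps could look like this sketch (reusing `BASE_URL` and `headers` from the auth example; the `/orgs/me` response shape and the `/orgs/{org_id}` PATCH path are assumptions, so verify them in the docs UI):

  ```python
  # Step 1: look up your org_id (assumption: /orgs/me returns your org as JSON with an "id")
  org_id = requests.get(f"{BASE_URL}/orgs/me", headers=headers).json()["id"]

  # Step 2: attach your OpenAI key to the org (assumption: PATCH /orgs/{org_id})
  resp = requests.patch(
      f"{BASE_URL}/orgs/{org_id}",
      headers=headers,
      json={"access_keys": {"openai_api_key": "<your OpenAI API key>"}},
  )
  resp.raise_for_status()
  ```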
- **Prompt**

  Prompts are your use cases: "headers" for your different prompt template variants. Create one via `/prompts`.
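  A creation call could be as simple as the sketch below (the request body is an assumption; check the `/prompts` schema in the docs UI):

  ```python
  # Create a prompt "header" to hang template versions off of
  prompt = requests.post(
      f"{BASE_URL}/prompts",
      headers=headers,
      json={"name": "sarcastic-answer"},  # assumed field name
  ).json()
  prompt_id = prompt["id"]  # assumption: the created prompt is returned as JSON
  ```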
- **Prompt version**

  A prompt version is the actual `template`, together with the `model_config`, to be used when the provider is called (an inference).

  Example prompt version with basic params:

  ```json
  {
    "status": "Live",
    "provider": "OpenAI",
    "name": "Test template",
    "prompt_id": "<your prompt_id>",
    "template": [
      {
        "role": "system",
        "content": "You are a sarcastic assistant. Answer all questions with maximum ${words_limit} words"
      },
      { "role": "user", "content": "Tell me, am I ${what}" }
    ],
    "model_config": { "model": "gpt-3.5-turbo", "temperature": 1, "max_tokens": 500 }
  }
  ```
  NB: Only `Draft` status prompt versions can be updated (apart from changing the status to any non-Draft status). This is to make sure inferences are comparable. If you need to change a prompt version, create a new one. Copying a prompt version is on the todo list, but until then it can be handled by a UI.
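  Creating a version from code might look like the sketch below; the `/prompt_versions` path is an assumption (this README only shows the payload), so verify the endpoint in the docs UI:

  ```python
  # Assumption: POST /prompt_versions accepts the payload shown above
  prompt_version = requests.post(
      f"{BASE_URL}/prompt_versions",
      headers=headers,
      json={
          "status": "Live",  # use "Draft" while iterating; only Draft versions can be updated
          "provider": "OpenAI",
          "name": "Test template",
          "prompt_id": prompt_id,  # from the /prompts sketch above
          "template": [
              {
                  "role": "system",
                  "content": "You are a sarcastic assistant. Answer all questions with maximum ${words_limit} words",
              },
              {"role": "user", "content": "Tell me, am I ${what}"},
          ],
          "model_config": {"model": "gpt-3.5-turbo", "temperature": 1, "max_tokens": 500},
      },
  ).json()
  prompt_version_id = prompt_version["id"]
  ```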
- **Inference**, aka calling the model: POST `/inferences`

  ```jsonc
  {
    "prompt_version_id": "<your prompt version id>", // or prompt_id, and it will pick one of the Live prompt versions
    "end_user_id": "your_end_user_id_for_linking", // optional, passed on to provider
    "template_args": { "words_limit": "10", "what": "crazy" }
  }
  ```
  It will:

  - Populate the template with the passed values
  - Log the request
  - Send the request to the provider
  - Log the response and send it back to the client

  It also handles errors and timeouts and updates the inference accordingly. It will still process the response if the client disconnects before it arrives.
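  From Python, a call might look like this sketch (reusing `BASE_URL`, `headers`, and `prompt_version_id` from the sketches above; it assumes the POST returns the inference document shown below):

  ```python
  inference = requests.post(
      f"{BASE_URL}/inferences",
      headers=headers,
      json={
          "prompt_version_id": prompt_version_id,
          "end_user_id": "user-42",  # optional; passed on to the provider
          "template_args": {"words_limit": "10", "what": "crazy"},
      },
  ).json()

  # Assumption: the response mirrors the inference document shown below
  print(inference["response"]["first_message"]["content"])
  ```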
  You can use a few easter eggs to test without a valid API key:

  - `"end_user_id": "mock_me_softly"`
  - `"end_user_id": "timeout_me_softly"`
  - `"end_user_id": "fail_me_softly"`
  Successful response (NB: the full raw request data is also accessible via the `/inferences` GET endpoint):

  ```json
  {
    "id": "648230b9cdd2a95d2190ed23",
    "response": {
      "completed_at": "2023-06-08T19:49:15.293824",
      "completition_duration_seconds": 1.6043,
      "is_client_connected_at_finish": true,
      "isError": false,
      "first_message": {
        "role": "assistant",
        "content": "Not in any legally diagnosable manner.",
        "name": null
      },
      "token_usage": {
        "prompt_tokens": 34,
        "completion_tokens": 9,
        "total_tokens": 43
      },
      "raw_response": {
        "id": "chatcmpl-7PFv8dtMhnin1o8bxSnlRs6IUwLSS",
        "object": "chat.completion",
        "created": 1686253754,
        "model": "gpt-3.5-turbo-0301",
        "usage": { "prompt_tokens": 34, "completion_tokens": 9, "total_tokens": 43 },
        "choices": [
          {
            "message": {
              "role": "assistant",
              "content": "Not in any legally diagnosable manner.",
              "name": null
            },
            "finish_reason": "stop",
            "index": 0
          }
        ]
      }
    }
  }
  ```
- **Query inferences**: see the `/inferences` GET endpoint. Currently only filtering by `prompt_id` and `prompt_version_id` is supported. Let us know what else you need.
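  For example (a sketch; the query-parameter names are assumed to match the filter names above):

  ```python
  # List inferences for one prompt (or filter by prompt_version_id instead)
  inferences = requests.get(
      f"{BASE_URL}/inferences",
      headers=headers,
      params={"prompt_id": prompt_id},
  ).json()
  ```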
- **Response feedback and flagging**

  Stay tuned: support for end-user feedback (following an inference) and optional external feedback (i.e. expert feedback, flagging, etc.) is coming.
- Check out the repo:

  ```sh
  gh repo clone prompton/prompton
  cd prompton/server  # workdir needs to be the server folder for these instructions
  ```
- Install `just`:
  - Mac: `brew install just`
  - Other platforms: see Just install
- Install packages:

  ```sh
  cd server
  just install
  ```
- Create your local `.env`. See: `.env.example`
- Launch MongoDB:

  ```sh
  just devdb-up
  ```

  The launched container provides:

  - A MongoDB instance on `localhost:27017` with the username/password set in your `.env`. It stores data in the `.mongo-data` folder. The DB is initialised with the scripts in the `mongo-init-docker-dev` folder on first run.
  - Mongo Express on http://localhost:8081

  Stop the DB container:

  ```sh
  just devdb-down
  ```

  Purge the DB and re-initialise:

  ```sh
  just devdb-init-purge
  ```

  If you want to connect to another instance instead of the dev container, configure your `.env`.
- Run the server:

  ```sh
  just run
  ```
  Run the server in a container if you need to test the container or want to deploy it yourself:

  ```sh
  docker build -t prompton-api-server:dev .
  docker run --env-file .env -it -p 8080:8080 prompton-api-server:dev
  ```
- Run the tests:

  ```sh
  just test
  just test-quick  # cut a few secs by skipping slower tests (password hashing etc.)
  ```
- Set your `.env` based on `.env.example`
- Optional: if you only have an admin-role DB user, create one for the API with only readWrite access, either manually or with this script:

  ```sh
  poetry run python -m scripts.db_init_add_api_user
  ```
- Create the initial org and SuperAdmin user:

  ```sh
  poetry run python -m scripts.db_init_indexes_and_first_user
  ```
This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.