Dialogs are collected and self-annotated through the chat interface and logged in 3 formats:
- json [example] - Primary log format. Logs all events and variables. Used for training and evaluation.
- txt [example] - Convenient, human-readable format.
- py [example] - Python executable to replay a dialog's events.
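Since the json format is the one consumed downstream, here is a minimal sketch of inspecting a logged dialog with the Python standard library; the path and the `events` field name are illustrative assumptions, not the actual log schema:

```python
import json

# Load one logged dialog (path and field name are hypothetical).
with open("dialogs/dialog_001.json") as f:
    dialog = json.load(f)

# The json log records all events and variables; dump each event.
for event in dialog.get("events", []):
    print(event)
```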
Dialog datasets are generated from the chat interface's json dialogs above.
```
cd dataset/
python make_dataset.py \
    --num_test_files <num_dialogs_to_allocate_to_test_set> \
    --num_dev_files <num_dialogs_to_allocate_to_dev_set> \
    <input_dir_with_json_dialogs> \
    <output_dir> \
    <output_format: {click_level,action_level,json}>
```
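For instance, an invocation that reproduces the split sizes reported below might look like this (the directory names are illustrative):

```
python make_dataset.py \
    --num_test_files 47 \
    --num_dev_files 46 \
    ../json_dialogs/ \
    ../action_level_dataset/ \
    action_level
```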
The command above can generate datasets in 3 formats: click-level, action-level, and json.
- click-level [example] - Each action performed by the agent is annotated with PREDICT.

  ```
  # Note that there are 4 *separate* prediction annotations to train/evaluate on.
  PREDICT: find_place
  PREDICT: Urth Cafe
  PREDICT: 33.9816425
  PREDICT: -118.4409761
  ```
- action-level [example] - Agent actions and parameters are grouped together (see the parsing sketch after this list).

  ```
  # Only a single prediction annotation to train on.
  PREDICT: [ACTION] find_place [PARAM] Urth Cafe [PARAM] 33.9816425 [PARAM] -118.4409761
  ```
- json [example] - The input json dialogs are left unchanged but split into train/dev/test directories.
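As a minimal sketch of how the action-level format could be consumed, assuming each PREDICT annotation sits on its own line; the parsing helper below is illustrative and not part of the released tooling:

```python
# Parse one action-level PREDICT line into an action name and its parameters.
# The [ACTION]/[PARAM] markers come from the action-level example above;
# the function itself is a hypothetical helper.
def parse_action_level(line: str):
    payload = line.removeprefix("PREDICT:").strip()
    tokens = payload.replace("[ACTION]", "").split("[PARAM]")
    action, *params = [t.strip() for t in tokens]
    return action, params

action, params = parse_action_level(
    "PREDICT: [ACTION] find_place [PARAM] Urth Cafe "
    "[PARAM] 33.9816425 [PARAM] -118.4409761"
)
assert action == "find_place"
assert params == ["Urth Cafe", "33.9816425", "-118.4409761"]
```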
We collected 497 dialogs for specifying a trip destination. In each dialog, the user has a destination in mind that they convey to the agent, and the agent confirms or clarifies the destination until both sides agree on where they are driving. After the dialog completes, a map is shown to the user to verify that the agreed-upon destination is the one they intended.
The dialogs are split into train, development, and test sets. As mentioned above, the dataset is available in 3 formats; links to each format are below.
The train/dev/test split of the dataset is as follows:
| Train | Dev | Test |
|---|---|---|
| 404 | 46 | 47 |
This dialog dataset can be used to train automated agents and to evaluate the agents that are developed. For more details, please refer to training and evaluating a GPT agent.