Supporting material for the thesis "Thinking Forwards, Backwards, and in Code: Improving Tool Acquisition in Large Language Models"
Authors: Huey Sun
-
Results:
The ToolSandbox evaluation results can be found here. Each result has a summary at the end, which was used to create the tables in the thesis.
-
Data:
The data used to finetune the models can be found here. Because Mistral's format lacks structured chat formatting, incremental masking is impossible (the output is not perfectly autoregressive at each conversation turn), so each assistant message has been individually preformatted.
-
Scripts:
The scripts used to convert JSON conversations to Mistral's expected formatting can be found here.
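As a rough illustration of what such a conversion involves, here is a minimal sketch. The JSON message schema (`{"role", "content"}`), the helper names, and the masking convention (`-100` as the loss-ignore index) are assumptions for illustration, not the thesis's actual script: each user turn is wrapped in Mistral's `[INST] ... [/INST]` markers and excluded from the loss, while each assistant message is formatted individually and kept in the loss.

```python
# Hypothetical sketch of per-message preformatting for Mistral-style
# finetuning; the JSON schema and masking convention are assumptions.

IGNORE_INDEX = -100  # conventional PyTorch cross-entropy ignore index


def preformat(messages):
    """Render a conversation into (text, in_loss) segments.

    User turns are wrapped in Mistral's [INST] ... [/INST] markers and
    masked out of the loss; assistant turns are emitted verbatim (with
    an end-of-sequence marker) and kept in the loss, so each assistant
    message is formatted individually rather than relying on an
    autoregressive chat template.
    """
    segments = []
    for msg in messages:
        if msg["role"] == "user":
            segments.append((f"[INST] {msg['content']} [/INST]", False))
        elif msg["role"] == "assistant":
            segments.append((msg["content"] + "</s>", True))
    return segments


def char_labels(segments):
    """Character-level stand-in for token labels: IGNORE_INDEX on
    masked spans, the character itself on in-loss spans."""
    labels = []
    for text, in_loss in segments:
        labels.extend(list(text) if in_loss else [IGNORE_INDEX] * len(text))
    return labels
```

In a real pipeline the masking would operate on token IDs rather than characters, but the structure is the same: only assistant spans contribute to the training loss.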
You can find my fork of the torchtune library here, which adds support for tool chat and Mistral formatting during training.