Uses GPT-4 to automatically generate pytest-based tests.

A CLI is exposed at `ibl_github_bot/__main__.py`:
```
$ python -m ibl_github_bot --help
Usage: python -m ibl_github_bot [OPTIONS]

Options:
  --repo TEXT             Repository to clone. Must be of the format
                          username/reponame. eg. ibleducation/ibl-ai-github-bot
  --branch TEXT           Branch to clone repository from.
  -f, --file TEXT         Target file in repository to test. Defaults to all
                          files. You can pass multiple files with -f file1 -f
                          file2
  --cleanup               Delete cloned repository after test generation.
  --github-token TEXT     Github token used to authenticate and clone
                          repository. Token must have write access to the
                          repository.
  --github-username TEXT  Username associated with the github token
  --help                  Show this message and exit.
```
For example:

```
$ python -m ibl_github_bot --repo ibleducation/ibl-ai-bot-app --branch slack --cleanup -f ibl_ai_bot/views.py
```
**Important**

You may export your GitHub token as an environment variable or place it in a `.env` file in the current working directory. Name the environment variable `GH_TOKEN`.
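For example, assuming a Unix-like shell (the token value is a placeholder for your own):

```
$ export GH_TOKEN=<your-github-token>
$ python -m ibl_github_bot --repo ibleducation/ibl-ai-bot-app --cleanup
```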
A new branch and an associated pull request containing the generated tests will be created on the specified repository.
**Warning**

Do not blindly merge the pull requests created. Always check out the pull request and run the tests.
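As a minimal sketch of that check, assuming the pull request's branch is named `generated-tests` (substitute the actual branch name shown on the pull request):

```
$ git fetch origin generated-tests
$ git checkout generated-tests
$ pytest
```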
Place the following variables in a `.env` file in the current working directory, or export them as system environment variables:

- `GH_TOKEN`: a valid GitHub token used by the bot to pull repositories, push commits, and create pull requests.
- `GH_USERNAME`: the username associated with the GitHub token.
- `OPENAI_API_KEY`: a valid OpenAI API key with GPT-4 access.

Below is a sample `.env` file:

```
OPENAI_API_KEY=sk....
GH_USERNAME=username
GH_TOKEN=gh-............
```
The bot can load configuration from the specified repository to alter its behaviour. These configurations include the programming language, the frameworks used, and the testing library, as well as module dependencies. Configuration is specified in `ibl_test_config.yaml`. Below is a sample configuration file:
```yaml
exclude: # project-wide excludes
  - "tests"
test_library: pytest
frameworks:
  - Django
  - Djangorestframework
modules: # configurations for specific modules/directories
  directory1:
    depends_on:
      - directory2
      - directory3
    exclude:
      - "*.txt"
      - "tests.py"
  directory2:
    depends_on:
      - directory3
      - directory4
    exclude:
      - "templates"
```
The `exclude` entry lists files or directories to ignore for a module (or globally, if it is a top-level entry). Setting module dependencies appropriately can greatly reduce LLM costs and context size, leading to better performance. However, incorrect dependency relationships can be detrimental.
When no configuration file is provided in the repository, the following default configuration is used instead:
```yaml
exclude:
  - .git
  - __pycache__
  - tests
  - migrations
  - requirements
test_library: pytest
frameworks:
  - django
  - djangorestframework
language: python
```
- Should the tests depend on lesser-known projects (e.g. private apps in separate repositories), it is best to manually write sample tests from which the LLM can learn how to generate the relevant fixtures and how those external dependencies are used (see the sketch after this list).
- This tool works best for monorepos, where the LLM can understand the entire project scope at once.
- Provide an `ibl_test_config.yaml` file at the root directory of the repository for optimal performance. Ensure that all entries provided are correct, and make sure that all dependency relationships specified are exhaustive.
- For very small projects, dependency relationships can be ignored in favor of loading the entire project as context.
- In situations where the project depends on a lesser-known package that is essential to testing, some generated tests may end up being incorrect.
- Merging the code generated by the LLM as-is may not be the best approach. It is essential to check out the test branch created, run the tests, and issue fixes where necessary.
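As an illustration of the first point, a hand-written sample test might look like the sketch below. Everything in it is hypothetical: the `billing_client` fixture and its stubbed `charge` call stand in for a private dependency your project might use.

```python
# tests/test_billing_sample.py
# A hypothetical hand-written sample test the LLM can learn from.
# The "billing client" stands in for a private app in a separate
# repository; it is replaced with a mock so the test never touches it.
from unittest import mock

import pytest


@pytest.fixture
def billing_client():
    # Stub the private client's API surface that the code under test uses.
    client = mock.Mock()
    client.charge.return_value = {"status": "ok", "invoice_id": "inv_001"}
    return client


def test_charge_returns_ok(billing_client):
    result = billing_client.charge(100)
    billing_client.charge.assert_called_once_with(100)
    assert result["status"] == "ok"
```

Committing one or two such tests to the repository gives the LLM a concrete pattern for the fixtures and for how the external dependency is exercised.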