Today, GenAI coding assistants are good enough to produce excellent proposals for even complex problems. But except for the simplest cases, they rarely get it right the first time. Hence, the modal GenAI chat during coding is a refinement call in which the developer pastes the stack trace or logs to direct the model's attention to specific issues. We can automate and improve this loop.
When programmers evaluate a solution, they rely on (1) outputs of static analysis, (2) stack traces, (3) structured logs, and (4) CI/CD outputs that pinpoint which tests are failing and why. We built a tool that automatically appends these outputs for a Python program and lets the GenAI iterate until there are no errors or the maximum number of iterations is reached.
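Concretely, the idea is to bundle these signals into the next prompt. The sketch below is not the repo's implementation; it assumes `pyflakes` and `pytest` are installed and uses hypothetical helper names (`collect_diagnostics`, `refine_prompt`) purely to show the shape of the feedback that gets appended:

```python
import subprocess
import sys

def collect_diagnostics(path: str) -> str:
    """Run a linter, the program itself, and the test suite, and bundle their output."""
    sections = []

    # Static analysis (pyflakes here; the repo may use a different linter).
    lint = subprocess.run([sys.executable, "-m", "pyflakes", path],
                          capture_output=True, text=True)
    sections.append("## Static analysis\n" + (lint.stdout + lint.stderr or "clean"))

    # Run the program; a crash leaves the stack trace on stderr.
    run = subprocess.run([sys.executable, path], capture_output=True, text=True)
    sections.append("## Stack trace / stderr\n" + (run.stderr or "no errors"))

    # Unit tests, as CI/CD would run them.
    tests = subprocess.run([sys.executable, "-m", "pytest", "-q"],
                           capture_output=True, text=True)
    sections.append("## Test output\n" + tests.stdout)

    return "\n\n".join(sections)

def refine_prompt(task: str, code: str, path: str) -> str:
    """Append the diagnostics to the original task so the model can refine its answer."""
    return (f"Task: {task}\n\nCurrent code:\n{code}\n\n"
            f"Fix the issues reported below:\n{collect_diagnostics(path)}")
```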
https://loogyai.streamlit.app/
```sh
cd loogy
streamlit run src/loogy/app.py        # Launch the Streamlit app
python src/loogy/process_dataset.py   # Process the dataset
```
The app allows you to (a minimal UI sketch follows this list):
- Select a model provider (Ollama or OpenAI)
- Choose a specific model
- Enter a development task
- Start the development process
- Clear outputs
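As a rough illustration of how these controls could be wired in Streamlit (widget labels, model names, and `run_crew` are placeholders, not the actual `app.py`):

```python
import streamlit as st

st.title("Loogy")

# Model selection (the real provider/model lists live in app.py).
provider = st.selectbox("Model provider", ["Ollama", "OpenAI"])
model = st.selectbox("Model", ["llama3", "gpt-4o-mini"])  # placeholder names

# Task entry.
task = st.text_area("Development task",
                    placeholder="e.g., parse a CSV and plot column means")

col_run, col_clear = st.columns(2)
if col_run.button("Start development"):
    # run_crew is a stand-in for the CrewAI pipeline defined in crew.py.
    st.session_state["output"] = f"(would call run_crew({provider!r}, {model!r}, task))"
if col_clear.button("Clear outputs"):
    if "output" in st.session_state:
        del st.session_state["output"]

st.write(st.session_state.get("output", ""))
```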
```
loogy/
├── outputs/                  # Generated outputs
├── src/
│   └── loogy/
│       ├── app.py            # Streamlit application
│       ├── crew.py           # CrewAI implementation
│       ├── config/
│       │   ├── agents.yaml   # Agent definitions
│       │   └── tasks.yaml    # Task definitions
│       └── outputs/          # Module-specific outputs (if any)
└── README.md                 # This file
```
Data and scripts for evaluating which system, naive or one that appends logs and stack traces, produces correct code more quickly.
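As an illustration of the comparison, here is a minimal sketch assuming a hypothetical `results.csv` with `system` and `iterations_to_pass` columns (the actual schema lives in the evaluation scripts):

```python
import pandas as pd

# Hypothetical schema: one row per (system, task) run.
# system is "naive" or "loogy"; iterations_to_pass is NaN if the run never passed.
df = pd.read_csv("outputs/results.csv")

summary = df.groupby("system")["iterations_to_pass"].agg(["mean", "median", "count"])
print(summary)
```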
- The Developer agent creates code based on the given topic
- The Tester agent writes unit tests for the code
- The Executor agent runs the tests and reports results
- The Exit agent checks if tests pass and decides whether to continue
- If tests fail, the system iterates, carrying forward the logs from previous runs (a schematic of this loop follows)
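The real agents and tasks are defined in `config/agents.yaml` and `config/tasks.yaml` and orchestrated in `crew.py`; the loop below is only a schematic of that control flow, with hypothetical stand-ins for each agent:

```python
# Hypothetical stand-ins for the four agents; the real ones are CrewAI agents.
def write_code(topic: str, feedback: str) -> str:
    return f"# code for: {topic}\n# feedback considered: {bool(feedback)}"

def write_tests(code: str) -> str:
    return "def test_placeholder():\n    assert True"

def run_tests(code: str, tests: str) -> tuple[bool, str]:
    return True, ""  # (passed, captured logs)

MAX_ITERS = 5  # stand-in for the configured iteration cap

def develop(topic: str) -> str:
    logs = ""   # diagnostics carried forward from the previous iteration
    code = ""
    for _ in range(MAX_ITERS):
        code = write_code(topic, feedback=logs)   # Developer agent
        tests = write_tests(code)                 # Tester agent
        passed, logs = run_tests(code, tests)     # Executor agent
        if passed:                                # Exit agent's decision
            break
    return code
```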
Atul Dhingra and Gaurav Sood