Replies: 3 comments
-
The approach suggested focuses on the ongoing flow of an app, leveraging `git diff` for real-time test generation. However, for `git diff` to be meaningful, the agent needs initial context or pre-existing test coverage. Without this, the agent wouldn't know what the changes are affecting, or what is expected behavior versus a genuine issue. For example, if we assigned this task to a human QA, they would first need to understand what the app does, establish a baseline of test cases (with an instance of the app before the merge/commit), and then analyze what the diff changes. For AI to generate relevant tests, it needs a baseline: a snapshot of the app's expected behavior. This means:

1. **Pre-existing tests or context.** Even without a fixed test suite, the system needs a way to understand:

   This could come from:
2. **Understanding changes in context.**
3. **Generating targeted test cases.** Once the AI understands the context of the change, it can dynamically:
To summarize, a first-time setup flow is important for helping the agent succeed in ongoing flows. This step doesn't have to be manual (as it is currently); it can be automated. I outlined this in a scoping doc last week, summarizing:
This way, every subsequent `git diff` is relative to something, not just an isolated patch of code.
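To make the baseline idea concrete, here is a minimal sketch of how a diff could be interpreted relative to a baseline. It assumes a one-time setup pass has already produced a map from source files to the behaviors they cover; the `BASELINE` map and function names are illustrative, not part of any existing tool.

```python
# Hypothetical baseline built during first-time setup:
# source file -> behaviors the baseline run observed it affecting.
BASELINE = {
    "app/checkout/page.tsx": ["checkout flow", "payment validation"],
    "app/login/page.tsx": ["login flow"],
    "lib/cart.ts": ["cart totals"],
}

def affected_behaviors(changed_files):
    """Map changed files (e.g. output of `git diff --name-only`)
    to the baseline behaviors they may affect, preserving order."""
    hits = []
    for path in changed_files:
        for behavior in BASELINE.get(path, []):
            if behavior not in hits:
                hits.append(behavior)
    return hits

# A diff touching checkout and cart code points the agent at three behaviors.
changed = ["app/checkout/page.tsx", "lib/cart.ts"]
print(affected_behaviors(changed))
```

With a map like this, an isolated patch becomes a pointer into known expected behavior, which is exactly what the agent is missing without the setup step.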
-
A first MVP towards this is having tools that can detect the framework used (e.g. NextJS), analyze the code, write a test plan, write tests, and then execute the tests. High-level steps (actual commands may differ):
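The first step above, framework detection, could be as simple as inspecting `package.json`. A sketch of that heuristic, assuming the agent has the file's contents as text (the function name and returned labels are made up for illustration):

```python
import json

def detect_framework(package_json_text):
    """Guess the web framework from package.json dependencies.
    A deliberately simple heuristic for the MVP's first step."""
    pkg = json.loads(package_json_text)
    deps = {**pkg.get("dependencies", {}), **pkg.get("devDependencies", {})}
    if "next" in deps:
        return "nextjs"
    if "react" in deps:
        return "react"
    return "unknown"

sample = '{"dependencies": {"next": "14.2.0", "react": "18.3.1"}}'
print(detect_framework(sample))  # nextjs
```

Knowing the framework lets the later steps pick sensible defaults (dev-server command, routing conventions, where pages live) when writing the test plan.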
-
From some conversations last week, I was interested in exploring further the idea of not having QA as part of the codebase / build step at all. Instead, it would only be responsible for maintaining a high-quality production experience; business logic would still be tested by unit tests.
This is how a QA agent could work for production deployments on `main`.
For feature deployments, bugs are left as comments on the PR instead.
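The routing rule above (production findings go to a tracker, feature-deployment findings become PR comments) can be sketched in a few lines. The destination labels and the deployment dict shape are assumptions for illustration, not an existing API:

```python
def report_issue(deployment, issue_text):
    """Route a found issue based on deployment type.
    Sketch only; 'issue_tracker' / 'pr_comment' destinations are illustrative."""
    if deployment["type"] == "production":
        return {"destination": "issue_tracker", "body": issue_text}
    # Feature deployments: leave the bug as a comment on the PR.
    return {"destination": f"pr_comment:{deployment['pr']}", "body": issue_text}

print(report_issue({"type": "feature", "pr": 42}, "Broken nav on mobile"))
```

Keeping the routing decision in one place means the generator and runner prompts don't need to know where findings end up.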
Shortest generator prompt:
Shortest test runner prompt:
The resulting issues would be recorded with a single tool, `note issue`, which takes a text note and screenshots of the UX (ideally, we would support something like instant replay somehow).
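A minimal sketch of what that single tool's record and call shape could look like; the `IssueNote` type and in-memory `issues` list are hypothetical stand-ins for whatever store the agent actually writes to:

```python
from dataclasses import dataclass, field

@dataclass
class IssueNote:
    """Record produced by the hypothetical `note issue` tool:
    a free-text note plus screenshot paths of the UX at failure time."""
    note: str
    screenshots: list = field(default_factory=list)

issues = []  # stand-in for the agent's real issue store

def note_issue(note, screenshots=None):
    """The one tool the test runner is given for reporting findings."""
    issues.append(IssueNote(note, screenshots or []))

note_issue("Checkout button unresponsive after coupon applied",
           screenshots=["shots/checkout-01.png"])
print(len(issues), issues[0].note)
```

A replay artifact (e.g. a session recording path) could later be added as a third field without changing the tool's call signature for the model.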