Collect all test results and handle cancelled tests properly #58
base: main
Given that the unit tests failed (in the GitHub Actions build), I think this needs another look.
@ammolitor Can you tell what failed?
9c0aa5f to 435b390
@ammolitor CI passed. It would be great if you could take another quick look. Thanks!
I just realized the scaler needs to be aware of this change. Converting this to a draft for now.
…ents attempting to take over a dead agent's run
# TODO(qhoang) let's try this but there must be a better way
# When an agent is cancelled, it has already incremented the __started__ counter
# but will never get to increment the __ended__ counter
_decrement(tr, ensemble_id, "started")
This needs to be idempotent, see https://apple.github.io/foundationdb/developer-guide.html#transactions-with-unknown-results
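One common way to make a decrement like this idempotent is to record a unique operation ID in the same transaction as the counter change, and to skip the change when that ID is already present. The sketch below only illustrates the idea: a plain dict stands in for the FoundationDB transaction, and `decrement_once`, `op_id`, and the key layout are hypothetical names, not taken from this PR.

```python
def decrement_once(tr, ensemble_id, counter, op_id):
    """Decrement a counter at most once per op_id.

    `tr` is a dict standing in for a transaction (an assumption for this
    sketch). Returns True if the decrement was applied, False on a replay.
    """
    # Hypothetical marker key: one entry per logical operation.
    marker = ("ops", ensemble_id, op_id)
    if marker in tr:
        # The operation already took effect, so a retry must not repeat it.
        return False

    key = ("count", ensemble_id, counter)
    tr[key] = tr.get(key, 0) - 1
    # Written together with the counter change, so both land or neither does
    # when this runs inside a single real transaction.
    tr[marker] = True
    return True


store = {("count", "ens-1", "started"): 3}
first = decrement_once(store, "ens-1", "started", op_id="agent-7-cancel")
retry = decrement_once(store, "ens-1", "started", op_id="agent-7-cancel")
```

With this shape, a retry after a commit with an unknown result sees the marker and leaves the counter alone, so `started` is decremented once no matter how many times the call is replayed.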
This PR addresses 3 issues:
started can go beyond max_runs. In this scenario, the agent should wait for all tests to complete, rather than stop at max_runs and ignore the still-running jobs.
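The stopping rule described here can be sketched as a small predicate. This is a minimal illustration under assumptions, not the PR's actual code: `all_runs_finished` is a hypothetical name, and `started` and `ended` are assumed to be the current counter values read for one ensemble.

```python
def all_runs_finished(started, ended, max_runs):
    """Return True only when enough runs were started and none is in flight.

    Hypothetical helper: stopping at `ended >= max_runs` alone would ignore
    runs that started beyond max_runs and are still executing.
    """
    enough_started = started >= max_runs
    none_in_flight = ended >= started
    return enough_started and none_in_flight


# started overshot max_runs and two runs are still in flight: keep waiting.
still_running = all_runs_finished(started=12, ended=10, max_runs=10)
# every started run has ended: safe to stop.
done = all_runs_finished(started=12, ended=12, max_runs=10)
# not enough runs started yet.
too_early = all_runs_finished(started=5, ended=5, max_runs=10)
```

Under this rule, a cancelled agent that incremented started but never ended would block completion forever, which is why the started counter has to be corrected when an agent is cancelled.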