storage endpoint for deleting multiple runs in one query #25167
Conversation
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
How many runs at once do you imagine being deleted here?
The existing instance method that deletes a single run both deletes the run (which should generally be very fast) and deletes all the events for that run (which can potentially be quite slow / time out): https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/instance/__init__.py#L1861-L1868
I am worried that this problem will be exacerbated if we are deleting multiple runs at once in a single web request.
Hmm, it could be quite a lot of runs. The idea is to enable deleting backfills. I could scratch this approach and do some kind of batching delete in the
something like
I'm not sure what factors to consider when determining if that's a better approach, though.
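As a rough sketch of what a batching delete could look like (the helper and its names below are hypothetical, not code from this PR; it only assumes an object exposing `delete_run(run_id)`, as `DagsterInstance` does):

```python
# Hypothetical sketch: delete a backfill's runs in fixed-size batches, so no
# single storage operation has to remove every run (and its event logs) at once.
def delete_runs_in_batches(instance, run_ids, batch_size=25):
    for start in range(0, len(run_ids), batch_size):
        for run_id in run_ids[start : start + batch_size]:
            # delete_run removes the run row and all of its events;
            # the event deletion is the potentially slow part.
            instance.delete_run(run_id)
```

The batch boundaries would also be natural points to yield control or report progress if this ran somewhere other than a single web request.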
Is delete_backfill being called in a GraphQL endpoint? If so, I think that could still run into the same problem: ultimately we are subject to a 60-second timeout for an individual GraphQL request.

The way that we solved this for bulk terminating runs was to have the frontend manage the 'bulk' part and issue individual termination requests, each of which is fast.

Another, much more involved, solution would be to invest in some kind of async bulk action system where the request triggers work, that work happens asynchronously elsewhere (e.g. in a daemon or some other backend system), and the frontend can then poll for status updates until that work is finished.
Yeah.
For this specific case, I could imagine moving the backfill into a DELETING state; the backfill daemon then picks it up and performs the run deletions that need to happen before doing the actual deletion of the backfill itself?
That's an interesting idea.
Not a full impl, but it was easier to make the PR than write it out. Is this generally what you were thinking? Would deleting the runs one at a time (or in batches?) also run into constraints?
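The DELETING-state idea could be sketched like this (every name here is hypothetical and stands in for whatever the real storage layer exposes; it is not the actual Dagster API):

```python
from enum import Enum

class BackfillStatus(Enum):
    REQUESTED = "REQUESTED"
    DELETING = "DELETING"

def daemon_iteration(storage):
    # One iteration of a hypothetical daemon loop: find backfills that the
    # webserver marked DELETING, delete their runs first, then remove the
    # backfill record itself. The slow run/event deletion happens here,
    # outside any GraphQL request timeout.
    for backfill_id in storage.get_backfill_ids(status=BackfillStatus.DELETING):
        for run_id in storage.get_run_ids_for_backfill(backfill_id):
            storage.delete_run(run_id)
        storage.delete_backfill(backfill_id)
```

The GraphQL mutation would then only flip the status to DELETING and return immediately, and the frontend could poll the backfill's status until it disappears.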
Summary & Motivation
Allows for deleting multiple runs in a single query. This supports deleting backfills in the stacked PR, since we'll want to delete all associated runs as well.
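The "single query" shape is essentially one parameterized DELETE with an IN clause rather than one statement per run. A minimal standalone sketch using sqlite3 (the table and column names are illustrative, not Dagster's actual schema):

```python
import sqlite3

def delete_runs(conn, run_ids):
    # One parameterized DELETE covering all run ids, instead of issuing a
    # separate DELETE statement per run.
    placeholders = ", ".join("?" for _ in run_ids)
    conn.execute(f"DELETE FROM runs WHERE run_id IN ({placeholders})", run_ids)
    conn.commit()
```

Placeholders are generated rather than interpolating the ids directly, so the values still go through parameter binding.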
Associated internal PR: https://github.com/dagster-io/internal/pull/12089
How I Tested These Changes
New unit tests.