Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage endpoint for deleting multiple runs in one query #25167

Closed

Conversation

jamiedemaria
Copy link
Contributor

@jamiedemaria jamiedemaria commented Oct 9, 2024

Summary & Motivation

Allows for deleting multiple runs in a single query. This supports deleting backfills in the stacked PR since we'll want to delete all associated runs as well

associated internal pr https://github.com/dagster-io/internal/pull/12089

How I Tested These Changes

new unit tests

Copy link
Contributor Author

jamiedemaria commented Oct 9, 2024

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @jamiedemaria and the rest of your teammates on Graphite Graphite

@jamiedemaria jamiedemaria changed the title wip storage endpoint for deleting multiple runs in one query Oct 9, 2024
@jamiedemaria jamiedemaria force-pushed the jamie/delete-mmultiple-runs branch from b5ac4d7 to 2b17776 Compare October 10, 2024 15:05
@jamiedemaria jamiedemaria force-pushed the jamie/backfill-retries branch from 87f8c68 to 3e9cd0b Compare October 10, 2024 16:44
@jamiedemaria jamiedemaria force-pushed the jamie/delete-mmultiple-runs branch from 2b17776 to 00b56ce Compare October 10, 2024 16:45
@jamiedemaria jamiedemaria force-pushed the jamie/backfill-retries branch from 3e9cd0b to 16a4546 Compare October 11, 2024 13:39
@jamiedemaria jamiedemaria force-pushed the jamie/delete-mmultiple-runs branch from 00b56ce to f893974 Compare October 11, 2024 13:39
@jamiedemaria jamiedemaria marked this pull request as ready for review October 11, 2024 13:40
Copy link
Member

@gibsondan gibsondan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many runs at once do you imagine being deleted here?

The existing instance method that deletes a single run both deletes the run (which should generally be very fast) and deletes all the events for that run (which can potentially be quite slow / time out): https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/instance/__init__.py#L1861-L1868

I am worried that this problem will be exacerbated if we are deleting multiple runs at once in a single web request

@jamiedemaria
Copy link
Contributor Author

hmm it could be quite a lot of runs. the idea is to enable deleting backfills. I could scratch this approach and do some kind of batching delete in the delete_backfills method, rather than adding a method to do all the run deletion at once.

@jamiedemaria
Copy link
Contributor Author

something like

def delete_backfill(self, backfill_id):
   run_ids = ...
  iterate through run ids in batches of N and to delete runs 
  delete backfill 

I'm not sure what factors to consider when determining if that's a better approach though

@gibsondan
Copy link
Member

Is delete_backfill being called in a graphql endpoint? I think that could still run into the same problem if so - ultimately we are subject to a 60 second timeout for an individual graphql request. The way that we solved this for bulk terminating runs was to have the frontend manage the 'bulk' part and issue individual termination requests, each of which is fast

Another much more involved solution would be to invest in some kind of async bulk action system where the request triggers work, that work happens asynchronously elsewhere (e.g. in a daemon or some other backend system) and the frontend can then poll for status updates until that work is finished.

@jamiedemaria
Copy link
Contributor Author

yeah delete_backfill would be called by gql. I wanted to be able to support backfills with the multi-select action menu in the runs page, which would require ability to delete backfills. But it seems like adding that is much more complicated than i originally thought. I think it makes sense to punt on it and reconsider if we get lots of requests for the feature

@gibsondan
Copy link
Member

For this specific case i could imagine it moving the backfill into a DELETING state and then the backfill daemon picks it up and performs the actual deletions that need to happen before doing the actual deletion?

@jamiedemaria
Copy link
Contributor Author

that's an interesting idea

@jamiedemaria
Copy link
Contributor Author

not a full impl, but it was easier to make the pr than write it out - is this generally what you were thinking? would deleting the runs one at a time (or in batches?) also run into constraints?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants