How do I clean up older data from the database? #12047

Answered by alangenfeld
alangenfeld asked this question in Q&A

One option is to use the Python APIs on the DagsterInstance to query for older runs and delete them. This is a destructive operation that removes the events, tags, and run record from the database. It also removes Dagster's knowledge that the run ever occurred, which can be particularly impactful for partitioned jobs and assets.

Perform this operation with great care.

An example script would look something like this:

import datetime

from dagster import DagsterInstance, RunsFilter

instance = DagsterInstance.get()

# define the time threshold for what is old enough; this example uses 1 week
week_ago = datetime.datetime.now() - datetime.timedelta(days=7)

old_run_records = in…
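
The script above is cut off in this copy of the page, so what follows is not the original answer but a minimal sketch of the same idea: query for old runs in batches and delete them one by one. It assumes `DagsterInstance.get_run_records`, `DagsterInstance.delete_run`, the `created_before` argument of `RunsFilter`, and the `dagster_run` attribute on each returned record, all of which may differ between Dagster versions. The helper name `delete_old_runs`, its `batch_size` and `dry_run` parameters, and `main` are hypothetical names introduced for illustration.

```python
import datetime


def delete_old_runs(instance, runs_filter, batch_size=100, dry_run=True):
    """Delete runs matching runs_filter in batches and return the affected run IDs.

    `instance` is assumed to behave like a DagsterInstance. With dry_run=True
    nothing is deleted and only the first batch of matching run IDs is returned.
    """
    affected = []
    while True:
        # Fetch the oldest matching runs first, a limited number at a time.
        records = instance.get_run_records(
            filters=runs_filter, limit=batch_size, ascending=True
        )
        if not records:
            break
        for record in records:
            run_id = record.dagster_run.run_id
            if not dry_run:
                # Destructive: removes the events, tags, and run record.
                instance.delete_run(run_id)
            affected.append(run_id)
        if dry_run:
            # Nothing was deleted, so another query would return the same batch.
            break
    return affected


def main():
    # Imported here so the helper above can be exercised without Dagster installed.
    from dagster import DagsterInstance, RunsFilter

    week_ago = datetime.datetime.now() - datetime.timedelta(days=7)
    instance = DagsterInstance.get()
    run_ids = delete_old_runs(
        instance, RunsFilter(created_before=week_ago), dry_run=True
    )
    print(f"{len(run_ids)} runs would be deleted")
```

Running `main()` with `dry_run=True` first gives a chance to inspect what would be removed before switching to `dry_run=False`. Deleting in small batches keeps each query and delete short, which may matter on a large run table.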

Category
Q&A
Labels
area: storage Related to persistent storage