Replies: 1 comment 2 replies
-
I was searching for the exact same thing and could not find anything. Simple things like renaming an asset while keeping its materialization events intact would be really cool. Any serious data pipeline is going to evolve over time, and Dagster needs to be able to evolve with it. Ideally I should be able to write migrations that Dagster would run first, before doing anything else.

Thinking about it, though, we already have everything we need to do this before something more official is implemented. You could create a new job called migrations_job. In your deployment script, before actually starting Dagster, you would run that job from the command line.

Another option could be to use dynamic partitions in an asset called migrations. You would still run a job right before deployment, but that job would only detect which migrations need to run, create a dynamic partition in the migrations asset for each new migration, and then trigger materializations of those partitions. This way you could keep track of migrations as part of a Dagster asset, keep logs, etc.
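A minimal sketch of what the pre-deployment job body could look like, in plain Python so the run-once property is visible. All names here (MIGRATIONS, pending_migrations, run_migrations) are invented for illustration; in real Dagster the bookkeeping would live in a DynamicPartitionsDefinition and instance.add_dynamic_partitions rather than a local set:

```python
# Hypothetical sketch of a pre-deployment migrations job.
# None of these names come from the Dagster API.

MIGRATIONS = {
    "0001_rename_asset": lambda: print("renaming asset..."),
    "0002_backfill_column": lambda: print("backfilling column..."),
}

def pending_migrations(applied: set) -> list:
    """Return migration keys not yet applied, in sorted (chronological) order."""
    return sorted(k for k in MIGRATIONS if k not in applied)

def run_migrations(applied: set) -> list:
    """Run each pending migration exactly once and record it, mimicking the
    'one dynamic partition per migration' bookkeeping with a set."""
    ran = []
    for key in pending_migrations(applied):
        MIGRATIONS[key]()  # in Dagster: materialize the partitioned migrations asset
        applied.add(key)   # in Dagster: record the partition key on the instance
        ran.append(key)
    return ran
```

Running it twice against the same `applied` set executes each migration only once, which is the property the dynamic-partition approach gives you for free via the partition keys.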
-
I was wondering if anyone has attempted writing data migrations in dagster, similar to db migrations in Ruby on Rails or Django? The idea being that I have some data warehouse tables where I need to modify the columns or the table structure in non-trivial ways. I would only need to run this code once, but it might take hours to complete and certain new dagster processes shouldn't run until the migration is complete. I want to keep it in code so I have a history of it, but after it completes successfully, it should never be run again.
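For reference, the Rails/Django pattern described here boils down to a persisted ledger of applied migration ids that is checked before anything runs. A minimal sketch, assuming a JSON file as the ledger (a warehouse would typically use a tracking table instead; the file name and function names are made up for illustration):

```python
import json
from pathlib import Path

def load_applied(ledger: Path) -> set:
    """Read the set of already-applied migration ids (the schema_migrations analog)."""
    if ledger.exists():
        return set(json.loads(ledger.read_text()))
    return set()

def apply_once(ledger: Path, migration_id: str, run) -> bool:
    """Run `run()` only if migration_id was never recorded; return True if it ran."""
    applied = load_applied(ledger)
    if migration_id in applied:
        return False
    run()  # the hours-long table rewrite goes here
    applied.add(migration_id)
    ledger.write_text(json.dumps(sorted(applied)))  # record only after success
    return True
```

Downstream processes that must wait for the migration can simply check `migration_id in load_applied(ledger)` before starting.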