Time Based Partitioning (From Start Date + Custom Interval e.g. every 3 days) #22782
Unanswered
datapay-ai
asked this question in
Q&A
Replies: 0 comments
I saw this thread from a while ago that seems to describe the same problem I have ( #19647 ), but it doesn't look like it was ever solved. I tried using TimeWindowPartitionsDefinition with a start date of '2017-01-01' and a cron schedule of every 3 days. Rather than stepping forward in 3-day increments continuously from '2017-01-01', it reset the counter at the start of each new month, so the interval between the last partition of one month and the first partition of the next could be 4 or 2 days rather than the 3 I specified.
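To illustrate what I mean, here is a minimal sketch in plain Python (no Dagster involved). It assumes standard cron semantics for a day-of-month step such as `0 0 */3 * *`, i.e. days 1, 4, 7, ..., 31 of every month, and compares that with a true 3-day interval counted from the start date. The function names are made up for the example, and the exact gaps at month boundaries may differ from what Dagster showed me.

```python
from datetime import date, timedelta


def cron_step_dates(start, end, step=3):
    """Dates matched by a day-of-month step (assumed cron `*/3` semantics).

    A step in the day-of-month field restarts at day 1 every month,
    so it matches days 1, 4, 7, ..., regardless of the start date.
    """
    dates = []
    current = start
    while current <= end:
        if (current.day - 1) % step == 0:
            dates.append(current)
        current += timedelta(days=1)
    return dates


def fixed_interval_dates(start, end, step=3):
    """Dates spaced exactly `step` days apart, counted from `start`."""
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=step)
    return dates


def gaps(dates):
    """Number of days between each pair of consecutive dates."""
    return [(b - a).days for a, b in zip(dates, dates[1:])]


start, end = date(2017, 1, 1), date(2017, 3, 5)
cron_dates = cron_step_dates(start, end)
fixed_dates = fixed_interval_dates(start, end)

# The cron-style series has irregular gaps at month boundaries,
# while the fixed-interval series always steps by 3 days.
print(sorted(set(gaps(cron_dates))))
print(sorted(set(gaps(fixed_dates))))
```

The second series is the behaviour I am after: the interval never resets when a new month starts.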
I managed to get it to 'work' with static partitioning, but when I hold Shift and click to select multiple partitions for a backfill, it won't let me, so I have to click them one by one, which isn't tenable. Static partitions also have no time awareness, and this is an abuse of static partitioning for this use case anyway.
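For reference, the static workaround boils down to something like the sketch below: generate date-string keys every 3 days from the start date and hand the list to a static partitions definition (in Dagster that would be `StaticPartitionsDefinition(keys)`, which is not imported here so the snippet stays runnable on its own). The helper names `three_day_partition_keys` and `key_for` are hypothetical, and `key_for` is the kind of time awareness I have to bolt on by hand.

```python
from datetime import date, timedelta


def three_day_partition_keys(start, end, step_days=3):
    """ISO date keys spaced `step_days` apart, starting at `start`."""
    keys = []
    current = start
    while current <= end:
        keys.append(current.isoformat())
        current += timedelta(days=step_days)
    return keys


def key_for(day, start, step_days=3):
    """Key of the partition whose window contains `day`."""
    offset = (day - start).days
    if offset < 0:
        raise ValueError("day is before the first partition")
    # Round the offset down to the nearest multiple of the step.
    window_start = start + timedelta(days=offset - offset % step_days)
    return window_start.isoformat()


start = date(2017, 1, 1)
keys = three_day_partition_keys(start, date(2017, 1, 31))
print(keys)

# A date early in February still belongs to the window that began on 31 January.
print(key_for(date(2017, 2, 2), start))
```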
Using scheduling for this doesn't work for my use case, because I get dumps of the data on a monthly basis and want to process all missing partitions there and then. I also don't want to go that route because I want this to be a single job, which currently already has unpartitioned, daily partitioned and monthly partitioned dbt assets in it. I would like to keep it that way, so that I have one smooth monthly job that runs all of these at the end of the month.
Is there any good way to deal with this? It seems my only option at the moment is to use daily partitioning and let it run for days that I know will produce no data because of how I wrote the downstream dbt code, but this causes a lot of dead runs and wasted time.