[dagster-dbt] Make default backfill_policy=None always for @dbt_asset #22280
assert my_dbt_assets.backfill_policy == expected_backfill_policy
This whole test is no longer needed, as there is no special backfill policy logic in @dbt_assets anymore.
asset_job = define_asset_job("asset_job", [foo, my_dbt_assets])
assert Definitions(assets=[foo, my_dbt_assets], jobs=[asset_job]).get_job_def("asset_job")
I confirmed this test was failing before the changes.
This is a breaking change, but I think the original behavior is so counterintuitive that we should consider it a bug, unless I am missing something obvious. Want @rexledesma to weigh in ofc since he has more context.
Adding @sryza here for visibility.
The non-standard default for backfill_policy in @dbt_assets was added in #17822.
The context was that the initial introduction of backfill_policy was providing a broken experience for folks. Before, they could do a single-run backfill through the UI. Afterwards, they had to manually specify BackfillPolicy.single_run() to get that behavior.
This change would remove this bandaid for users.
Part of the problem was that, after the introduction of backfill_policy, the UI did not clearly indicate that the user now had to take action in code to accomplish a single-run backfill. Do we have that guidance in the UI now?
Furthermore, when separate assets with mismatched backfill_policy values are combined in a job, do we have an appropriate error message that tells the user to unify the backfill_policy of these definitions?
That makes sense, but is there something special about dbt and
The UI has some text notifying the user that backfill policies will be used, but I don't think it provides specific guidance like: "if you want a single-run backfill, set
The current message describes the definition error but does not prescribe an explicit remedy. For example, which definitions should I modify to ensure that the backfill policies are the same? I would love an error message that names the underlying node definitions violating the constraint, along with their backfill policies.
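To make the request above concrete, here is a minimal sketch of a message that names each definition next to its policy. It is not Dagster code: the helper name `describe_backfill_policy_mismatch` is hypothetical, and policies are modeled as plain strings (with None meaning no policy set) rather than real BackfillPolicy objects.

```python
from collections import defaultdict
from typing import Mapping, Optional


def describe_backfill_policy_mismatch(
    policies_by_asset: Mapping[str, Optional[str]],
) -> Optional[str]:
    """Return a message naming each conflicting definition, or None if all agree.

    Hypothetical helper: keys are definition names, values are a string
    stand-in for the backfill policy (None means "no policy set").
    """
    # Group definition names by the policy they declare.
    assets_by_policy = defaultdict(list)
    for asset_name, policy in policies_by_asset.items():
        assets_by_policy[policy].append(asset_name)

    # Zero or one distinct policy means there is nothing to report.
    if len(assets_by_policy) <= 1:
        return None

    lines = ["All assets in a job must share one backfill policy, but found:"]
    for policy, asset_names in assets_by_policy.items():
        lines.append(f"  {policy!r}: {', '.join(sorted(asset_names))}")
    lines.append("Set the same backfill_policy on each definition listed above.")
    return "\n".join(lines)
```

Listing names per policy tells the user exactly which definitions to edit, which is the remedy the current message leaves out.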
@schrockn Since @rexledesma laid out the reasoning for the current behavior, can you weigh in as to whether to proceed with this PR? Also, I've improved the error message for multiple backfill policies in an asset job here: #22433
@rexledesma I'm not familiar enough with dagster-dbt to know this without digging-- will partitioned assets created with
Asking because there are users using partitioned dbt assets (see here) who seem not to be using the single-run-backfill functionality, because they've been backfilling using jobs.
@smackesey They don't need a custom IO manager. Instead, they'll need to pass the partition information through to the dbt invocation in the body of their execution function for things to work properly. https://docs.dagster.io/integrations/dbt/reference#building-incremental-models-using-partitions
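A rough sketch of what passing the partition information into the dbt invocation could look like. The helper below only builds the CLI arguments from a time window; the function name and the var names `min_date` / `max_date` are assumptions for illustration, and the dbt models would have to read the same var names.

```python
import json
from datetime import datetime
from typing import List


def dbt_build_args_for_window(start: datetime, end: datetime) -> List[str]:
    """Build `dbt build` arguments that carry a partition time window as dbt vars.

    Hypothetical helper; the var names are illustrative, not required by dbt.
    """
    dbt_vars = {"min_date": start.isoformat(), "max_date": end.isoformat()}
    return ["build", "--vars", json.dumps(dbt_vars)]


# Inside a partitioned @dbt_assets function, the window would presumably come
# from the execution context and the arguments would go to the dbt CLI resource,
# along the lines of:
#
#     start, end = context.partition_time_window
#     yield from dbt.cli(dbt_build_args_for_window(start, end), context=context).stream()
```

With a single-run backfill the window spans the whole backfilled range, so the same code handles one partition or many.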
Not sure if we need this anymore -- did we end up fixing this behavior in the core framework?
q mgmt
Summary & Motivation
@dbt_assets currently sets a default backfill policy of BackfillPolicy.single_run() if there is a TimeWindowPartitionsDefinition. This is non-standard behavior, since all other assets have a default backfill policy of None.
With the recent fix of job backfills to respect backfill policies, this has caused problems for some users: all assets in a job need to have the same backfill policy, so users who mix dbt and regular assets in the same job without setting a backfill policy are now seeing failures.
This PR removes the non-standard default, which is a breaking change (though it should be noted that backfill policies are experimental).
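The failure mode can be illustrated with a small stand-alone model. This is a sketch of the behavior described in this summary, not the dagster or dagster-dbt implementation: `AssetModel`, `old_dbt_default`, `new_dbt_default`, and `job_is_valid` are all hypothetical names, and policies are modeled as strings.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AssetModel:
    """Hypothetical stand-in for an asset definition."""

    name: str
    time_partitioned: bool
    backfill_policy: Optional[str] = None  # None means no policy was set


def old_dbt_default(time_partitioned: bool) -> Optional[str]:
    """Default before this PR: single-run when time-window partitioned."""
    return "single_run" if time_partitioned else None


def new_dbt_default(time_partitioned: bool) -> Optional[str]:
    """Default after this PR: always None, like every other asset."""
    return None


def job_is_valid(assets: Sequence[AssetModel]) -> bool:
    """Model of the constraint that all assets in a job share one backfill policy."""
    return len({asset.backfill_policy for asset in assets}) <= 1


# A regular time-partitioned asset next to a dbt asset, with no explicit policies.
foo = AssetModel("foo", time_partitioned=True)
dbt_before = AssetModel("my_dbt_assets", True, old_dbt_default(True))
dbt_after = AssetModel("my_dbt_assets", True, new_dbt_default(True))

print(job_is_valid([foo, dbt_before]))  # old default: policies differ
print(job_is_valid([foo, dbt_after]))   # new default: both are None
```

Users who relied on the old default would need to set a single-run backfill policy explicitly on their dbt assets to keep that behavior.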
How I Tested These Changes
Added a test to make sure a job with @dbt_assets and @asset can be created.