Checks on AssetSelection #16610
79e0bc8 to 42ad214
Strongly feel that we should include checks by default. Not including them will be very surprising behavior.
def test_job_with_asset_and_all_its_checks():
    job_def = define_asset_job("job1", selection=AssetSelection.assets(asset1) & AssetSelection.asset_checks_for_assets(asset1))
Using set operators is super confusing since this doesn't act like a set '&'. This really is an OR/UNION, isn't it? Why don't we just do that instead of being cute?
Meaning that if this were really following set semantics, this would be a null job.
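To make the objection concrete, here is a minimal sketch that uses plain Python sets rather than Dagster's `AssetSelection`. The string keys are hypothetical stand-ins for asset and check keys. The only point is that assets and checks are disjoint, so a true intersection selects nothing while a union selects both.

```python
# Hypothetical stand-ins for asset and check keys; these are not Dagster objects.
assets = frozenset({"asset1"})
checks_for_asset1 = frozenset({"asset1:check1", "asset1:check2"})

# Under strict set semantics, assets and checks share no members,
# so '&' yields the empty set (the "null job" case).
intersection = assets & checks_for_asset1

# '|' is the operation that yields the asset together with its checks.
union = assets | checks_for_asset1

print(sorted(intersection))
print(sorted(union))
```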
job_def = define_asset_job(
    "job1", selection=AssetSelection.all_assets() - AssetSelection.all_asset_checks(asset1_check1)
)
As above, I'm skeptical of using set operations here, as it claims a level of formalism that just isn't there. This is a builder API, not a proper algebra.
Strongly recommend AssetSelection.all_assets().without_checks()
and leave the set operations to the sets.
Meaning if someone really wants to use set operations, they can create sets of AssetKeys and go nuts.
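A rough sketch of the builder-style alternative suggested here, next to set operations on plain key sets. `SketchSelection` and its `include_checks` field are hypothetical names invented for illustration; `without_checks()` is the method proposed in this comment, not an existing Dagster API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchSelection:
    """Hypothetical builder-style selection; not the Dagster API."""

    asset_keys: frozenset
    include_checks: bool = True  # assumed default, following the discussion above

    def without_checks(self) -> "SketchSelection":
        # Return a new selection with the same assets but no checks.
        return replace(self, include_checks=False)


# Builder style: the intent is spelled out by the method name.
selection = SketchSelection(frozenset({"asset1", "asset2"})).without_checks()

# Set style: real set algebra on plain sets of (hypothetical) keys.
all_keys = {"asset1", "asset2", "asset3"}
remaining = all_keys - {"asset3"}
```

The builder call carries no claim about algebraic laws, while the second half behaves exactly as Python sets do.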
42ad214 to db9ad33
@schrockn these test cases were wrong, they're fixed now. Under the hood
How do you feel about Agreed these set operations are maybe too fancy, they leave a lot of ways to do things. But they're already public so I'm not sure we can avoid supporting them if we're building in to AssetSelection
Yeah I guess we already do this so we can continue to. However in our successor selection API I do not think we should claim to support a formal set algebra.
1ccc7b2 to f0dc53c
f0dc53c to b00675a
Ok I think we move forward here. I've formally registered my concern with relying on set operators, but that decision predates this PR, so we must press on. I look forward to reassessing selection.
job_def = define_asset_job(
    "job1", selection=AssetSelection.assets(asset1) - AssetSelection.all_asset_checks()
)
Bit concerned about the discoverability of this one. I'm not sure how we are going to expect users to discover this. We may have to add additional helper methods for common use cases like this.
0f22746 to 603a84d
## Summary & Motivation

Build on #16610 and #16219 to enable asset check subselection.

This change was painful and a doozy. To wrangle this complexity in the future we need to:

- Encompass a succinct and interpretable version of `XXXSelection` to seamlessly enable the asset selection and asset check selection use case
- Standardize our usage of `None`, `[]`, `{}`, when indicating that a selection is defaulting to select all the objects (assets and asset checks).
  - Rather than using `None`, an explicit sentinel value should be used (e.g. `XXXSelection.ALL`, so that the value indicates its usage. Likewise, `[]` and `{}` should be replaced with `XXXSelection.EMPTY`).

## How I Tested These Changes

- When materializing a subsetted asset, there are four cases:
  - Materialize without a selection
  - Materialize with a selection of assets and checks
  - Materialize with only a selection of assets
  - Materialize with only a selection of checks
- Remove any usages of `AssetSelection` that break the new subsetting invariants (e.g. #16638)
- Existing pytest
- Stack on dbt implementation

---------

Co-authored-by: Johann Miller <[email protected]>
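The sentinel idea from the summary above could look roughly like the sketch below. `SelectionSentinel`, `resolve_selection`, and the string keys are hypothetical names chosen for illustration; the commit message only proposes `XXXSelection.ALL` and `XXXSelection.EMPTY` and does not specify an implementation.

```python
from enum import Enum
from typing import FrozenSet, Union


class SelectionSentinel(Enum):
    """Hypothetical explicit markers replacing None / [] / {}."""

    ALL = "ALL"
    EMPTY = "EMPTY"


Selection = Union[SelectionSentinel, FrozenSet[str]]


def resolve_selection(selection: Selection, known_keys: FrozenSet[str]) -> FrozenSet[str]:
    # The sentinel states the intent, so no caller has to guess what None means.
    if selection is SelectionSentinel.ALL:
        return known_keys
    if selection is SelectionSentinel.EMPTY:
        return frozenset()
    # An explicit set of keys is restricted to the keys that actually exist.
    return selection & known_keys


known = frozenset({"asset1", "asset2"})
everything = resolve_selection(SelectionSentinel.ALL, known)
nothing = resolve_selection(SelectionSentinel.EMPTY, known)
subset = resolve_selection(frozenset({"asset1"}), known)
```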
Implementation of #16185, with some modifications
`resolve` method for assets, and `resolve_checks` for checks
This means that most of the existing callsites that only care about selected assets (e.g. https://github.com/dagster-io/dagster/blob/johann/09-18-checks_on_AssetSelection/python_modules/dagster/dagster/_core/definitions/asset_graph.py#L118, https://github.com/dagster-io/dagster/blob/johann/09-18-checks_on_AssetSelection/python_modules/dagster/dagster/_core/definitions/data_time.py#L164) can keep doing what they're doing, while the callsites that care can get the checks.
We'll expand the number of callsites to `resolve_checks` in the future, e.g. for sensors that return run requests with lists of asset keys.
Question: Should we default to including or excluding checks? E.g. in `AssetSelection.all()` and `AssetSelection.assets(...)`
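The split between `resolve` and `resolve_checks` described above can be pictured with a toy model. Everything here is an assumption made for illustration: the `ToySelection` class, the `checks_by_asset` mapping that stands in for the asset graph, and the `include_checks` flag that stands in for the open question about the default. It is not the code in this PR.

```python
from typing import Dict, FrozenSet, List


class ToySelection:
    """Hypothetical model of a selection with separate asset and check resolution."""

    def __init__(self, asset_keys: FrozenSet[str], include_checks: bool = True):
        self.asset_keys = asset_keys
        self.include_checks = include_checks  # stand-in for the default under discussion

    def resolve(self, checks_by_asset: Dict[str, List[str]]) -> FrozenSet[str]:
        # Existing callsites only need the selected assets that exist in the graph.
        return frozenset(k for k in self.asset_keys if k in checks_by_asset)

    def resolve_checks(self, checks_by_asset: Dict[str, List[str]]) -> FrozenSet[str]:
        # Callsites that care about checks ask for them separately.
        if not self.include_checks:
            return frozenset()
        return frozenset(
            f"{asset}:{check}"
            for asset in self.resolve(checks_by_asset)
            for check in checks_by_asset[asset]
        )


graph = {"asset1": ["asset1_check1"], "asset2": []}
selection = ToySelection(frozenset({"asset1"}))

selected_assets = selection.resolve(graph)         # unchanged callsite behavior
selected_checks = selection.resolve_checks(graph)  # new, opt-in callsite
```

With this shape, a callsite that never calls `resolve_checks` sees the same result whichever default is chosen.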