Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checks on AssetSelection #16610

Merged
merged 6 commits into from
Sep 20, 2023
Merged

Conversation

johannkm
Copy link
Contributor

@johannkm johannkm commented Sep 19, 2023

Implementation of #16185, with some modifications

  • Use AssetSelection, rather than introducing AssetCheckSelection
  • externally use Tuple[AssetKey, str] rather than making AssetCheckHandle public
  • internally use the existing resolve method for assets, and resolve_checks for checks

This means that most of the existing callsites that only care about selected assets (e.g. https://github.com/dagster-io/dagster/blob/johann/09-18-checks_on_AssetSelection/python_modules/dagster/dagster/_core/definitions/asset_graph.py#L118, https://github.com/dagster-io/dagster/blob/johann/09-18-checks_on_AssetSelection/python_modules/dagster/dagster/_core/definitions/data_time.py#L164) can keep doing what they're doing, while the callsites that care can get the checks.

We'll expand the number of callsites to resolve_checks in the future- e.g. for sensors that return run requests with lists of asset keys.

Question:
Should we default to including or excluding checks? E.g. in AssetSelection.all() and AssetSelection.assets(...)

  • we've gone with including

@johannkm
Copy link
Contributor Author

johannkm commented Sep 19, 2023

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@johannkm johannkm force-pushed the johann/09-18-checks_on_AssetSelection branch 2 times, most recently from 79e0bc8 to 42ad214 Compare September 19, 2023 02:47
@johannkm johannkm changed the title checks on AssetSelection [rfc] checks on AssetSelection Sep 19, 2023
@schrockn
Copy link
Member

Strongly feel that we should includes checks by default. Not including them will be very surprising behavior.



def test_job_with_asset_and_all_its_checks():
job_def = define_asset_job("job1", selection=AssetSelection.assets(asset1) & AssetSelection.asset_checks_for_assets(asset1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using set operators is super confusing since this doesn't act like a set '&'. This really is an OR/UNION isn't it. Why don't we just do that instead of being cute.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meaning that if this really following set semantics this wolud be a null job

job_def = define_asset_job(
"job1", selection=AssetSelection.all_assets() - AssetSelection.all_asset_checks(asset1_check1)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above I'm skeptical of using set operations here as it claims a level for formalism that just isn't there. This is a builder API, not a proper algebra.

Strongly recommend AssetSelection.all_assets().without_checks() and let the set operations to the sets.

Meaning if someone really wants to use set operations then can create sets of AssetKeys and go nuts.

@johannkm johannkm force-pushed the johann/09-18-checks_on_AssetSelection branch from 42ad214 to db9ad33 Compare September 19, 2023 18:52
@johannkm
Copy link
Contributor Author

@schrockn these test cases were wrong, they're fixed now. Under the hood selection1 & selection2 does a set operation on the asset keys, plus now the asset check handles.

@johannkm
Copy link
Contributor Author

How do you feel about AssetSelection.all() - AssetSelection.all_asset_checks instead of AssetSelection.all_assets().without_checks()?

Agreed these set operations are maybe too fancy, they leave a lot of ways to do things. But they're already public so I'm not sure we can avoid supporting them if we're building in to AssetSelection

@schrockn
Copy link
Member

Yeah I guess we already do this so we can continue to. However in our successor selection API I do not think we should claim to support a formal set algebra.

@johannkm johannkm changed the title [rfc] checks on AssetSelection Checks on AssetSelection Sep 19, 2023
@johannkm johannkm force-pushed the johann/09-18-checks_on_AssetSelection branch from 1ccc7b2 to f0dc53c Compare September 19, 2023 20:54
@github-actions
Copy link

Deploy preview for dagit-storybook ready!

✅ Preview
https://dagit-storybook-fh9ivon3e-elementl.vercel.app
https://johann-09-18-checks-on-AssetSelection.components-storybook.dagster-docs.io

Built with commit f0dc53c.
This pull request is being automatically deployed with vercel-action

@github-actions
Copy link

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-19tuogqoq-elementl.vercel.app
https://johann-09-18-checks-on-AssetSelection.core-storybook.dagster-docs.io

Built with commit f0dc53c.
This pull request is being automatically deployed with vercel-action

@github-actions
Copy link

github-actions bot commented Sep 19, 2023

Deploy preview for dagster-docs ready!

Preview available at https://dagster-docs-oe8kailyf-elementl.vercel.app
https://johann-09-18-checks-on-AssetSelection.dagster.dagster-docs.io

Direct link to changed pages:

@johannkm johannkm force-pushed the johann/09-18-checks_on_AssetSelection branch from f0dc53c to b00675a Compare September 20, 2023 01:36
Copy link
Member

@schrockn schrockn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think we move forward here. I've formally registered my concern with relying on set operators, but that decision predates this PR, so we must press on. I look forward to reassessing selection.

Comment on lines +110 to +112
job_def = define_asset_job(
"job1", selection=AssetSelection.assets(asset1) - AssetSelection.all_asset_checks()
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit concerned about the discoverability of this one. I'm not sure how we are going to expect users to discover this. We may have to add additional helper methods for common use cases like this.

@johannkm johannkm force-pushed the johann/09-18-checks_on_AssetSelection branch from 0f22746 to 603a84d Compare September 20, 2023 02:55
@johannkm johannkm merged commit 190fb99 into master Sep 20, 2023
1 check failed
@johannkm johannkm deleted the johann/09-18-checks_on_AssetSelection branch September 20, 2023 02:55
johannkm added a commit that referenced this pull request Sep 27, 2023
## Summary & Motivation
Build on #16610 and
#16219 to enable asset check
subselection. This change was painful and a doozy.

To wrangle this complexity in the future we need to:
- Encompass a succinct and interpretable version of `XXXSelection` to
seamlessly enable the asset selection and asset check selection use case
- Standardize our usage of `None`, `[]`, `{}`, when indicating that a
selection is defaulting to select all the objects (assets and asset
checks).
- Rather than using `None`, an explicit sentinel value should be used
(e.g. `XXXSelection.ALL`, so that the value indicates its usage.
Likewise, `[]` and `{}` should be replaced with `XXXSelection.EMPTY`).

## How I Tested These Changes
- When materializing a subsetted asset, there are four cases:
  - Materialize without a selection
  - Materialize with a selection of assets and checks
  - Materialize with only a selection of assets
  - Materialize with only a selection of checks
- Remove any usages of `AssetSelection` that break the new subsetting
invariants (e.g. #16638)
- Existing pytest
- Stack on dbt implementation

---------

Co-authored-by: Johann Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants