Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] tests to specify APIs for constructing jobs with check selections #16185

Closed
wants to merge 1 commit into from

Conversation

sryza
Copy link
Contributor

@sryza sryza commented Aug 29, 2023

Summary & Motivation

This is a set of unit tests that aim to specify behavior for how you construct jobs where you want to exercise fine-grained control over checks that execute in them.

Examples of fine-grained control:

  • Materialize assets without executing their checks
  • Execute checks without materializing the assets they target

Some things to point out here:

  • You can imagine all instances here of AssetSelection replaced with Selection. But I think we should keep AssetSelection working at least for backcompat.
  • This sticks with define_asset_job and UnresolvedAssetJobDefinition rather than introducing a new API. The idea is that a job that executes only asset checks is still an "asset job", because it operates on assets.

How I Tested These Changes

These changes are tests.

@sryza sryza requested review from johannkm and schrockn August 29, 2023 19:46
@schrockn
Copy link
Member

You can imagine all instances here of AssetSelection replaced with Selection. But I think we should keep AssetSelection working at least for backcompat.

I agree we need to keep AssetSelection working for backcompat, but I think all these examples should be written in terms of Selection and all of our public examples should be in terms of Selection. Effectively AssetSelection is de facto deprecated.

@schrockn
Copy link
Member

This sticks with define_asset_job and UnresolvedAssetJobDefinition rather than introducing a new API. The idea is that a job that executes only asset checks is still an "asset job", because it operates on assets.

100% agree with this call 👍🏻

@schrockn
Copy link
Member

I think this is ready for wider consumption. I've heard mixed feedback about our entire selection regime, so folks might have an opinions on changing this API without doing other changes. You only get so many shots as introducing a class with an central-sounding name like Selection, so I think this is worth writing up an socializing as a discussion.

@sryza
Copy link
Contributor Author

sryza commented Aug 30, 2023

I think all these examples should be written in terms of Selection

@schrockn I pushed an update that uses Selection in all the examples.

all of our public examples should be in terms of Selection. Effectively AssetSelection is de facto deprecated.

I think we should get here eventually, but I would lean against targeting this change for launch week / 1.5 release. To do so, we'd need to either (a) mark Selection unexperimental or (b) make a bunch of our public examples that currently depend on a stable API depend on an experimental API.

I think (a) is risky and we don't have the bandwidth to mitigate that risk. I think (b) is pretty undesirable.

@sryza
Copy link
Contributor Author

sryza commented Aug 30, 2023

You only get so many shots as introducing a class with an central-sounding name like Selection, so I think this is worth writing up an socializing as a discussion.

Good point - created a discussion: https://github.com/dagster-io/internal/discussions/6621

johannkm added a commit that referenced this pull request Sep 20, 2023
Implementation of #16185, with
some modifications

- Use AssetSelection, rather than introducing AssetCheckSelection
- externally use Tuple[AssetKey, str] rather than making
AssetCheckHandle public
- internally use the existing `resolve` method for assets, and
`resolve_checks` for checks

This means that most of the existing callsites that only care about
selected assets (e.g.
https://github.com/dagster-io/dagster/blob/johann/09-18-checks_on_AssetSelection/python_modules/dagster/dagster/_core/definitions/asset_graph.py#L118,
https://github.com/dagster-io/dagster/blob/johann/09-18-checks_on_AssetSelection/python_modules/dagster/dagster/_core/definitions/data_time.py#L164)
can keep doing what they're doing, while the callsites that care can get
the checks.

We'll expand the number of callsites to `resolve_checks` in the future-
e.g. for sensors that return run requests with lists of asset keys.

Question:
Should we default to including or excluding checks? E.g. in
`AssetSelection.all()` and `AssetSelection.assets(...)`
- we've gone with including
@johannkm
Copy link
Contributor

implemented in #16610

@johannkm johannkm closed this Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants