-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
@observable_source_asset adapter to generate and SensorDefinition #16716
@observable_source_asset adapter to generate and SensorDefinition #16716
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
# calling these helpers could be in the Definitions object itself | ||
defs = Definitions( | ||
assets=[create_assets_def_from_source_asset(an_asset)], | ||
sensors=[sensor_def_from_observable_source_asset(an_asset)], | ||
) | ||
|
||
instance = DagsterInstance.ephemeral() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sryza relevant test case here
b03b357
to
ee9c797
Compare
f07cd20
to
111599d
Compare
ee9c797
to
b7d2e1b
Compare
111599d
to
5d5ce37
Compare
b7d2e1b
to
e66d383
Compare
1b7fa84
to
ae483e1
Compare
e66d383
to
d8a6069
Compare
int(observable_source_asset.auto_observe_interval_minutes * 60) | ||
if observable_source_asset.auto_observe_interval_minutes is not None | ||
else 5 | ||
* 60 # I could not find the default value (undocumented) so guessing it is 5 minutes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@OwenKephart I was trying to find the default value for auto_observe_interval_minutes
. I couldn't find it in the codebase. I'll update the docblock for SourceAsset
once I know what it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no default value for this (other than `None) -- if someone wants an observable source asset to be automatically observed they need to pass in an explicit number.
5 minutes seems like a fine default here, although I think I'm missing some context for the overall PR. Tying this sensor's frequency to the value set on auto_observe_interval_minutes
would require that we also turn off any auto-observing functionality in the asset daemon to avoid two daemons trying to do the same job.
The machinery for auto-observing source assets is very different from the machinery for auto-materializing regular assets (for one, there's no system of rules for when source assets should be observed, it's just based on elapsed time), so in general I think it's reasonable to model this as just traditional sensor if we don't think there will be any need for more complex logic for when to kick these off (I'd buy that).
Some bits of trepidation:
- we'd probably want to rename this property to something that's less similar to
auto_materialize
. as proposed, this is less declarative and "automatic" than the existing solution (as you need to add code in two places in order for your assets to actually be observed automatically). as you mention, this could be fixed by adding things to the Definitions machinery. - for users with tons of observable source assets, it might be offputting to end up with an equal number of sensors. this might be something that clever UI could handle (potentially grouping together all these auto-generated sensors).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for users with tons of observable source assets, it might be offputting to end up with an equal number of sensors. this might be something that clever UI could handle (potentially grouping together all these auto-generated sensors).
Absolutely. We'll want a version of this which can take multiple observable source assets. In fact @OwenKephart I think you should request changes here and then I should do exactly that.
we'd probably want to rename this property to something that's less similar to auto_materialize. as proposed, this is less declarative and "automatic" than the existing solution (as you need to add code in two places in order for your assets to actually be observed automatically). as you mention, this could be fixed by adding things to the Definitions machinery.
We talked about this live, but the goal here is to provide a backwards compatibility layer. Agree that any new APIs should avoid the language.
ae483e1
to
a93f78b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should request changes here
phabricator plan changes
, i miss you so
Miss phab every day. |
d8a6069
to
3211da4
Compare
3211da4
to
c429cc0
Compare
feedabck Extract out observe_fn_to_op_compute_fn to it can be reused in other contexts cp feedback experimental cp cp cp cleanup make sensor handle multiple assets
dc9eb75
to
fde2c96
Compare
Summary & Motivation
This PR adds the ability to generate a sensor from a list
observable_source_asset
declarations.The goal here is to demonstrate and have a story for executing observations within sensors when we consolidate
@observable_source_asset
and into the mainlineAssetsDefinition
ontology by having the compute function able to generate an observation, not just a materialization.So now without having to opt-in to AMPs, you can easily generate a sensor to refresh source assets at a regular cadence without having the write a sensor.
observable_source_asset
will become a deprecated API in this world, almost certainly.In a test case we have the following code:
If we so chose, we could push down calling those functions against the source asset to within the Definitions
__init__
method machinery to make this completely automatic. Whether or not that is a good idea is a different discussion.How I Tested These Changes
BK