-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[module-loaders] Delete extra_source_assets param from assets_from_modules #26494
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see -- sadly this is a public API so we can't actually delete this arg, right?
we could at least make it a hidden param
@OwenKephart i don't think this is a public api actually. Searched for reference at top level and couldn't find. Was added pre 1.0 |
@dpeng817 huh I swear I tried |
4f6accf
to
2219a42
Compare
Deploy preview for dagster-docs ready! Preview available at https://dagster-docs-fmnqhegjp-elementl.vercel.app Direct link to changed pages: |
2219a42
to
2853ef1
Compare
8e93c2e
to
0291fcc
Compare
1b50ebe
to
5ddc747
Compare
## Summary & Motivation This code path was highly crufty - code smell here is "iterating through big list and doing a ton of things at once". Solution is to create a lookup table of cacheable properties and imperatively call properties as you need them (stole this pattern from @schrockn when we were working on airlift stuff) I think the code is a lot clearer now. I also got rid of an error message we were devoting a lot of lines to that felt very outdated; nobody is using with_resources, and repository_def is misleading. I think we also end up catching a few hidden error states; for example, we weren't checking assetsdefs against sourceassets with the same key, sourceasset collision checking felt more like a side effect. ## How I Tested These Changes Existing tests
## Summary & Motivation We can further simplify the load_asset_from_x code path by making the asset checks loader return AssetsDefinitions, and changing the assertions to just make sure that the returned assets only contain checks. ## How I Tested These Changes Altered existing tests
#26529) ## Summary & Motivation This was one of the most confusing parts of the original code path; that asset check key changing was happening _implicitly_ within with_attributes when you changed output_asset_keys. Instead, provide a top-level, explicit way to remap asset check keys. I think the old behavior was highly mysterious. I found myself questioning what the heck was happening. This is much more; "what it says on the tin." ## How I Tested These Changes Existing tests.
## Summary & Motivation Consolidate module loader code and tests into a single directory for easier reuse between them.
## Summary & Motivation The object list utils were previously defining behavior mostly in terms of assets. Genericize the utilities to instead refer to "DagsterObjects" where appropriate; in preparation for handling other types of dagster definitions. ## How I Tested These Changes Existing tests
0405939
to
ca38954
Compare
…sensors, jobs, schedules (#26545) ## Summary & Motivation Actually genericize `ModuleScopedDagsterObjects` to handle taking in all types of Dagster objects - sensors, schedules, and jobs. I explicitly left out resources and loggers, because I don't think it makes sense to scoop those up at module load - since they need to be associated with a key. ## How I Tested These Changes Added a new test file `test_module_loaders` which scaffolds out a module fake for a given test spec, and ensures that either an error is thrown or the correct number of attributes are accessible on the underlying list.
e9365bb
to
f160acc
Compare
## Summary & Motivation This PR is a prototype of a top-level API to load dagster definitions from across a module into a single returned `Definitions` object. It is not intended for landing yet. Just want to align on direction. ### Why add this API? For any user who is bought into a module structure making use of `load_assets_from_x`, we are currently making their life unnecessarily more difficult by not bringing in other objects that are scoped at module-load time. Those users agree - we've received requests for an API to do that numerous times. In absence of a compelling replacement for our current project structure, I think the existence of this API is a good stopgap to improve module ergonomics. ### Why are resources, loggers, and executor provided as args? It seems like the most straightforward way to support these objects without some sort of additional magic. Since we force you to provide a key in addition to the class itself, there's not currently a module-scoped pattern that matches how these objects are defined on a `Definitions` argument. So it seems reasonable to accept these as parameters. The other approach would be to allow users to configure resources as variables and use the key as a variable. But I figure the more conservative approach here leaves us room to also do that in the future (can imagine some sort of combination step). ```python dbt = DbtCliResource(...) # equivalent of {"dbt": DbtCliResource(...)} ``` For executor, the fact that we only accept one per module is kind of unique. I don't think a user would define an executor in the same way that they define a sensor, schedule, or job, for example, where the definition is inlined to the module. I think it makes more sense for the user to need to provide this explicitly. ### Why does this return a Definitions object I don't think there is any reasonable alternative. The existing `load_assets_from_x` is nice because that whole list can be provided directly to a `Definitions` object - this is not the case for if we provided a flat list of heterogeneous types of Dagster objects - this could not be provided directly to a `Definitions` object. Unless we added some sort of `from_list` constructor or something. But I think this is more straightforward. It also gives us the opportunity to potentially paper over some stuff; if we so choose - we can, for example, automatically coerce source assets and AssetSpec objects into resolved AssetsDefinitions. ### Should this thing take in other `Definitions` objects? Right now, I think no. While `Definitions.merge` exists, it's obscured and not documented, I think most users think of a single `Definitions` object as being synonymous with a code location. ### What's the intended design pattern for using this? I think the intended use case for this would be to provide one call at the top level of the module of a given code location; and automatically load in all of your dagster defs. ``` defs = load_definitions_from_module(current_module, resources=..., loggers=...) ``` ## How I Tested These Changes I added a new test which operates on all test specs and calls this fxn, and also one for handling the resource and logger cases.
f160acc
to
cf8c840
Compare
This PR was used to accumulate the upstack.
Summary & Motivation
How I Tested These Changes
Changelog