[module-loaders] [rfc] Include specs in asset module loaders #26524

Closed
wants to merge 5 commits into from

@dpeng817 dpeng817 commented Dec 16, 2024

Summary & Motivation

This PR is the table-stakes approach for adding spec support to our asset module loaders. It enables asset spec loading in the code path and gates it behind a bool so that behavior does not change for existing users.

One intricacy here is that loading the following module will error:

import dagster as dg

spec = dg.AssetSpec("my_asset")

@dg.multi_asset(specs=[spec])
def uses_spec():
    ...

At this point in time, I think this is fine. It will just basically push users towards a pattern like this:

import dagster as dg

def specs():
    return [dg.AssetSpec("my_asset")]

@dg.multi_asset(specs=specs())
def uses_spec():
    ...
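
The description does not spell out why the first module errors while the second loads, so the following is only a guess at the mechanism, modeled in standard-library Python rather than in Dagster itself. It assumes the loader scans module-level attributes and rejects two objects that claim the same key. `Spec`, `Definition`, `scan_module`, and `make_module` are hypothetical stand-ins, not Dagster APIs.

```python
import types
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for dg.AssetSpec."""
    key: str


@dataclass(frozen=True)
class Definition:
    """Hypothetical stand-in for the object a @dg.multi_asset function becomes."""
    keys: tuple


def scan_module(module):
    """Map each key to the module attribute that defines it (assumed behavior)."""
    seen = {}
    for name, value in vars(module).items():
        if isinstance(value, Spec):
            keys = (value.key,)
        elif isinstance(value, Definition):
            keys = value.keys
        else:
            continue  # plain functions, strings, dunder attributes, etc.
        for key in keys:
            if key in seen:
                raise ValueError(
                    f"key {key!r} is defined by both {seen[key]!r} and {name!r}"
                )
            seen[key] = name
    return seen


def make_module(name, **attrs):
    """Build a throwaway module object holding the given attributes."""
    module = types.ModuleType(name)
    vars(module).update(attrs)
    return module


def specs():
    return [Spec("my_asset")]


# First pattern: the spec is a module-level attribute, so the scan sees the key twice.
colliding = make_module(
    "colliding", spec=Spec("my_asset"), uses_spec=Definition(keys=("my_asset",))
)

# Second pattern: the spec lives behind a function, so only the definition is found.
safe = make_module("safe", specs=specs, uses_spec=Definition(keys=("my_asset",)))
```

Under these assumptions, `scan_module(colliding)` raises a `ValueError`, while `scan_module(safe)` finds a single owner for `my_asset`, because a function that returns specs is not itself a spec.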

How I tested this

Added additional testing for spec loading case, ensured existing test behavior did not change.

Changelog

Our asset loading functions can now load AssetSpec objects when called with the include_specs parameter. This includes load_assets_from_package_name, load_assets_from_package_module, load_assets_from_modules, and load_assets_from_current_module.
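
To illustrate the opt-in gating, here is a minimal sketch in standard-library Python. It is not the Dagster implementation. Only the `include_specs` name comes from this PR; `AssetsDef`, `Spec`, and `load_from_module` are hypothetical stand-ins.

```python
import types


class AssetsDef:
    """Hypothetical stand-in for an assets definition."""


class Spec:
    """Hypothetical stand-in for dg.AssetSpec."""


def load_from_module(module, include_specs=False):
    """Return module-level asset objects; specs are returned only when opted in."""
    wanted = (AssetsDef, Spec) if include_specs else (AssetsDef,)
    return [value for value in vars(module).values() if isinstance(value, wanted)]


demo = types.ModuleType("demo")
demo.an_asset = AssetsDef()
demo.a_spec = Spec()

default_result = load_from_module(demo)                    # specs ignored, as before
opt_in_result = load_from_module(demo, include_specs=True)  # specs included
```

Because the flag defaults to `False`, callers that never pass it get the same result as before, which is the compatibility property the summary describes.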

@dpeng817 dpeng817 changed the title Include specs in asset module loaders [rfc] Include specs in asset module loaders Dec 16, 2024

dpeng817 commented Dec 16, 2024

@yuhan yuhan left a comment

open to ideas - i wonder which option is better:

  1. this PR - add an additional param include_specs to existing APIs
  2. or brand new load_asset_specs_from_x

hmmm 🤔 actually as i'm typing this out, i think this PR does feel cleaner, with less cognitive load than remembering both load_asset_ and load_asset_specs.

yuhan commented Dec 16, 2024

OK I think this is a good step forward and it's not too risky or irreversible.

Could you update the relevant docstrings to communicate this addition in a non-confusing way? (btw the current docstring is pretty busted 😓 https://docs-preview.dagster.io/api/python-api/assets#code-locations)

dpeng817 commented

Yea I think that ideally we don't introduce the cross product of these APIs - make the existing ones work sensibly, and then eventually introduce a centralized defs loader

@dpeng817 dpeng817 force-pushed the dpeng817/use_new_loader_in_checks branch from aa8efe8 to 0309dba Compare December 18, 2024 03:35
@dpeng817 dpeng817 force-pushed the dpeng817/include_specs branch from 203c228 to 1fff846 Compare December 18, 2024 03:35
@dpeng817 dpeng817 changed the title [rfc] Include specs in asset module loaders [module-loaders] [rfc] Include specs in asset module loaders Dec 18, 2024
@dpeng817 dpeng817 requested a review from yuhan December 18, 2024 16:58
@OwenKephart OwenKephart left a comment

nice

@dpeng817 dpeng817 force-pushed the dpeng817/use_new_loader_in_checks branch from 0309dba to 5be4d04 Compare December 18, 2024 18:36
@dpeng817 dpeng817 force-pushed the dpeng817/include_specs branch from 1fff846 to 2219a42 Compare December 18, 2024 18:36
## Summary & Motivation
This code path was highly crufty - the code smell here is "iterating through
a big list and doing a ton of things at once". The solution is to create a
lookup table of cacheable properties and imperatively call properties as
you need them (stole this pattern from @schrockn when we were working on
airlift stuff). I think the code is a lot clearer now.

I also got rid of an error message we were devoting a lot of lines to
that felt very outdated; nobody is using with_resources, and
repository_def is misleading.

I think we also end up catching a few hidden error states; for example,
we weren't checking assets defs against source assets with the same key,
and source asset collision checking felt more like a side effect.

## How I Tested These Changes
Existing tests
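
The "lookup table of cacheable properties" pattern mentioned in the summary can be sketched roughly as follows. This is an illustrative guess using `functools.cached_property`, not the code from this change; `LoadedObjects`, `by_key`, `colliding_keys`, and the record layout are all hypothetical.

```python
from collections import defaultdict
from functools import cached_property


class LoadedObjects:
    """Hypothetical holder for (module_name, key, kind) records found by a loader."""

    def __init__(self, records):
        self.records = list(records)

    @cached_property
    def by_key(self):
        # Built once, on first access, instead of inside one large loop.
        grouped = defaultdict(list)
        for module_name, key, kind in self.records:
            grouped[key].append((module_name, kind))
        return dict(grouped)

    @cached_property
    def colliding_keys(self):
        # Derived from by_key; any key with more than one owner is a collision,
        # regardless of which kinds of objects are involved.
        return sorted(key for key, owners in self.by_key.items() if len(owners) > 1)


loaded = LoadedObjects(
    [
        ("a", "orders", "assets_def"),
        ("b", "orders", "source_asset"),
        ("a", "users", "assets_def"),
    ]
)
```

Each property is computed only when a caller asks for it and is then cached on the instance, so checks such as `loaded.colliding_keys` read as single lookups rather than as branches inside one big iteration.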
@dpeng817 dpeng817 force-pushed the dpeng817/use_new_loader_in_checks branch from 5be4d04 to 292e31e Compare December 18, 2024 18:41
@dpeng817 dpeng817 force-pushed the dpeng817/include_specs branch from 2219a42 to 2b63520 Compare December 18, 2024 18:41
Base automatically changed from dpeng817/use_new_loader_in_checks to dpeng817/delete_extra_source_assets December 18, 2024 19:06
@dpeng817 dpeng817 changed the base branch from dpeng817/delete_extra_source_assets to master December 18, 2024 19:09
@dpeng817
Closing because all the commits from this PR are now contained within #26494 - ideally this would be merged, but gt fold failed to resolve things properly for me, and now this PR is in an unresolvable state.

@dpeng817 dpeng817 closed this Dec 18, 2024