handle "unexecutable" assets in job building and cross process representation #16637

Open: alangenfeld wants to merge 1 commit into master from al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs

@alangenfeld (Member) commented Sep 20, 2023

building forward from
#16575
#16617

This propagates the notion of "unexecutable" assets into job building and the inter-process representation.

  • unexecutable assets defs are no longer put into jobs
  • unexecutable assets are transformed into SourceAssets when we build jobs, allowing them to replace SourceAsset at the definitions level
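
As a rough illustration of the two bullets above, the split at job-building time could look something like the following. This is a sketch under stated assumptions, not the code in this PR: `FakeAssetsDef`, `FakeSourceAsset`, and `split_for_job_building` are hypothetical stand-ins, and the `MATERIALIZATION` enum member is assumed for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence, Tuple


class AssetExecutionType(Enum):
    # UNEXECUTABLE is named in this PR; MATERIALIZATION is an assumption for the sketch.
    MATERIALIZATION = "MATERIALIZATION"
    UNEXECUTABLE = "UNEXECUTABLE"


@dataclass(frozen=True)
class FakeAssetsDef:
    """Hypothetical stand-in for an assets definition."""

    keys: Tuple[str, ...]
    execution_type: AssetExecutionType


@dataclass(frozen=True)
class FakeSourceAsset:
    """Hypothetical stand-in for a SourceAsset."""

    key: str


def split_for_job_building(
    assets_defs: Sequence[FakeAssetsDef],
) -> Tuple[List[FakeAssetsDef], List[FakeSourceAsset]]:
    """Keep executable defs for jobs; turn unexecutable ones into source assets."""
    executable: List[FakeAssetsDef] = []
    source_assets: List[FakeSourceAsset] = []
    for assets_def in assets_defs:
        if assets_def.execution_type == AssetExecutionType.UNEXECUTABLE:
            # One source-asset stand-in per key; nothing is added to a job.
            source_assets.extend(FakeSourceAsset(key) for key in assets_def.keys)
        else:
            executable.append(assets_def)
    return executable, source_assets
```

Only the `executable` list would be handed to job construction in this sketch; the `source_assets` list plays the role that SourceAsset instances play at the definitions level.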

How I Tested These Changes

  • pytest python_modules/dagster/dagster_tests/definitions_tests/test_observable_assets.py
  • loading
table_a = AssetSpec("table_A")
table_b = AssetSpec("table_B", deps=[table_a])
table_c = AssetSpec("table_C", deps=[table_a])
table_d = AssetSpec("table_D", deps=[table_b, table_c])

those_assets = create_unexecutable_observable_assets_def(
    specs=[table_a, table_b, table_c, table_d]
)

defs = Definitions(assets=[those_assets])

in via dagster-webserver -f, then using instance.report_runless_asset_event to update materializations for the associated assets. Verify that the asset graph renders the nodes and their most recent materializations.

@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from f25fe58 to 47435a4 Compare September 20, 2023 02:03

@alangenfeld (Member Author) commented Sep 20, 2023

tue night progress:

  • jammed unexecutable assets defs into source_assets_by_key
  • got a sign of life of things rendering in the webserver with this approach, using dagster-webserver -f this file:
table_a = AssetSpec("table_A")
table_b = AssetSpec("table_B", deps=[table_a])
table_c = AssetSpec("table_C", deps=[table_a])
table_d = AssetSpec("table_D", deps=[table_b, table_c])

those_assets = create_unexecutable_observable_assets_def(
    specs=[table_a, table_b, table_c, table_d]
)

defs = Definitions(assets=[those_assets])

@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch 2 times, most recently from ce171b1 to 8ef4522 Compare September 20, 2023 16:42
github-actions bot commented Sep 20, 2023

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-nm72izvj9-elementl.vercel.app
https://al-09-19--exploration-unexecutable-assets-not-backed-by-jobs.core-storybook.dagster-docs.io

Built with commit 71b697f.
This pull request is being automatically deployed with vercel-action

@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from 8ef4522 to 28340ae Compare September 21, 2023 18:52
github-actions bot commented Sep 21, 2023

Deploy preview for dagit-storybook ready!

✅ Preview
https://dagit-storybook-njmw2ahdd-elementl.vercel.app
https://al-09-19--exploration-unexecutable-assets-not-backed-by-jobs.components-storybook.dagster-docs.io

Built with commit 71b697f.
This pull request is being automatically deployed with vercel-action

@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from 28340ae to 954a961 Compare September 21, 2023 19:10
@alangenfeld alangenfeld changed the title from "[exploration] unexecutable assets not backed by jobs" to 'handle "unexecutable" assets in job building and cross process representation' Sep 21, 2023
@alangenfeld alangenfeld requested a review from sryza September 21, 2023 19:12
@schrockn (Member) left a comment

This mostly makes sense. I think this introduces a bunch of nomenclature issues that we can address for now with super verbose but clear names. See comments inline.

-    len(self._external_asset_node.job_names) >= 1,
-    "Asset must be part of at least one job",
+    len(self._external_asset_node.job_names) >= 1
+    or not self._external_asset_node.is_executable,
Member

So IMO we should not have helper bools here and should instead test against the execution type directly. My argument is that this is confusing on its face, as a source asset is also not executable. A code reader with less context might find this very confusing.

I think or self._external_asset_node.execution_type == AssetExecutionType.UNEXECUTABLE is more explicit and clear than not self._external_asset_node.is_executable in this way.
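
To make the suggestion concrete, here is a minimal sketch of the check written against the execution type rather than a helper bool. `FakeExternalAssetNode` and `has_valid_job_membership` are hypothetical names for illustration, and the `MATERIALIZATION` member is assumed; this is not the resolver code under review.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AssetExecutionType(Enum):
    MATERIALIZATION = "MATERIALIZATION"  # assumed member, for illustration only
    UNEXECUTABLE = "UNEXECUTABLE"


@dataclass
class FakeExternalAssetNode:
    """Hypothetical stand-in carrying only the two fields the check reads."""

    job_names: List[str] = field(default_factory=list)
    execution_type: AssetExecutionType = AssetExecutionType.MATERIALIZATION


def has_valid_job_membership(node: FakeExternalAssetNode) -> bool:
    """An asset must be in at least one job unless it is explicitly UNEXECUTABLE."""
    return (
        len(node.job_names) >= 1
        or node.execution_type == AssetExecutionType.UNEXECUTABLE
    )
```

Comparing against the enum member names the exact case being exempted, so a reader does not have to work out whether "not executable" also covers source assets.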

@@ -581,7 +582,7 @@ def resolve_assetObservations(
         ]

     def resolve_configField(self, _graphene_info: ResolveInfo) -> Optional[GrapheneConfigTypeField]:
-        if self.is_source_asset():
+        if self.is_source_or_unexecutable_asset():
Member

In a similar spirit, I think is_source_or_defined_in_unexecutable_assets_def is clearer than is_source_or_unexecutable_asset. In weird dual or transitional states in a code base, I think being super clear is worth it, even at the expense of seemingly super verbose internal code paths.

Comment on lines 888 to 897
def is_unexecutable_source(self):
    """Returns True if this definition represents unexecutable source assets.
    Assumption: either all or none contained assets are unexecutable source assets.
    """
    for key in self.keys:
        if not self.is_unexecutable_source_asset(key):
            return False
    return True

def is_unexecutable_source_asset(self, asset_key: AssetKey) -> bool:
Member

I am now confused as to the precise difference between is_unexecutable_source and is_unexecutable_source_asset.
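
One reading of the snippet, sketched below with hypothetical stand-ins: the `_asset` variant answers the question for a single key, while the other answers it for the whole definition by requiring every key to pass. `FakeAssetsDef` and its `unexecutable_by_key` mapping are assumptions made for this example, not the real class layout.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeAssetsDef:
    """Hypothetical stand-in: maps each asset key to an 'unexecutable source' flag."""

    unexecutable_by_key: Dict[str, bool]

    @property
    def keys(self) -> List[str]:
        return list(self.unexecutable_by_key)

    def is_unexecutable_source_asset(self, asset_key: str) -> bool:
        # Per-key question: is this one asset an unexecutable source asset?
        return self.unexecutable_by_key[asset_key]

    def is_unexecutable_source(self) -> bool:
        # Definition-level question: do all contained keys pass the per-key check?
        return all(self.is_unexecutable_source_asset(key) for key in self.keys)
```

Under the stated all-or-none assumption the two would agree for any key of a given definition; a mixed definition is the case where they diverge.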

-    **{SYSTEM_METADATA_KEY_ASSET_EXECUTION_TYPE: AssetExecutionType.UNEXECUTABLE.value},
+    **{
+        SYSTEM_METADATA_KEY_ASSET_EXECUTION_TYPE: (
+            AssetExecutionType.UNEXECUTABLE_SOURCE.value
Member

Hmmm, worth talking about what this represents. Given this repurposing, the original notion of AssetVarietal or AssetType might make more sense.

@@ -193,7 +193,7 @@ def build_caching_repository_data_from_list(

     if assets_defs or source_assets or asset_checks_defs:
         for job_def in get_base_asset_jobs(
-            assets=assets_defs,
+            assets_defs=assets_defs,
Member

❤️

@@ -1341,6 +1343,11 @@ def external_repository_data_from_def(
     asset_graph = external_asset_graph_from_defs(
         jobs,
         source_assets_by_key=repository_def.source_assets_by_key,
+        unexecutable_assets={
Member

s/unexecutable_assets/unexecutable_assets_defs/

I think using "assets_defs" is very important in code paths that co-exist with SourceAsset instances.

@alangenfeld alangenfeld changed the base branch from master to al/09-25-add_defensive_check_for_multiple_ExternalAssetNodes_for_same_key September 25, 2023 19:12
@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from b1222d0 to aa652da Compare September 25, 2023 19:12
Comment on lines 1691 to 1707
# build nodes for remaining asset keys that were referred in other assets but not defined
asset_keys_without_definitions = (
    all_upstream_asset_keys.difference(node_defs_by_asset_key.keys())
    .difference(source_assets_by_key.keys())
    .difference(unexecutable_keys)
)
for asset_key in asset_keys_without_definitions:
    asset_nodes.append(
        ExternalAssetNode(
            asset_key=asset_key,
            dependencies=list(deps[asset_key].values()),
            depended_by=list(dep_by[asset_key].values()),
            job_names=[],
            group_name=group_name_by_asset_key.get(asset_key),
            code_version=code_version_by_asset_key.get(asset_key),
        )
    )
Member Author

this part of the building (moved from above) dates back to the addition of ExternalAssetNode https://github.com/dagster-io/dagster/pull/4742/files

not sure under what conditions it still fires, given the source asset resolution happening above
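
For reference, the branch in question only produces nodes for keys that survive all three set differences. A small self-contained sketch with plain string keys shows when that can happen; `keys_without_definitions` and the sample keys are hypothetical, standing in for the variables in the snippet above.

```python
from typing import Iterable, Set


def keys_without_definitions(
    all_upstream_keys: Iterable[str],
    node_def_keys: Iterable[str],
    source_asset_keys: Iterable[str],
    unexecutable_keys: Iterable[str],
) -> Set[str]:
    """Return upstream keys that no definition of any kind accounts for."""
    return (
        set(all_upstream_keys)
        .difference(node_def_keys)
        .difference(source_asset_keys)
        .difference(unexecutable_keys)
    )


# "orphan" is referenced as an upstream dependency but defined nowhere,
# so it is the only key that would get a placeholder node.
leftover = keys_without_definitions(
    all_upstream_keys={"raw_events", "table_A", "table_B", "orphan"},
    node_def_keys={"table_B"},
    source_asset_keys={"raw_events"},
    unexecutable_keys={"table_A"},
)
```

In this sketch the loop body runs only for a key that is depended on but appears in none of the three collections, which suggests the branch fires for dangling dependency references.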

@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from aa652da to 372453a Compare September 25, 2023 19:26
Base automatically changed from al/09-25-add_defensive_check_for_multiple_ExternalAssetNodes_for_same_key to master September 25, 2023 21:49
@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch 3 times, most recently from cf7b1b5 to 73878a1 Compare September 25, 2023 22:13
@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from 73878a1 to 71b697f Compare October 13, 2023 18:54
@github-actions

Deploy preview for dagster-university ready!

✅ Preview
https://dagster-university-ozk1zjw13-elementl.vercel.app
https://al-09-19--exploration-unexecutable-assets-not-backed-by-jobs.dagster-university.dagster-docs.io

Built with commit 71b697f.
This pull request is being automatically deployed with vercel-action

@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from 71b697f to 6e332f0 Compare October 13, 2023 19:02
@alangenfeld alangenfeld force-pushed the al/09-19-_exploration_unexecutable_assets_not_backed_by_jobs branch from 6e332f0 to 1edbf01 Compare October 13, 2023 19:12
@alangenfeld alangenfeld requested a review from smackesey October 16, 2023 20:45