Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

partition methods on AssetExecutionContext #16625

Closed

Conversation

jamiedemaria
Copy link
Contributor

@jamiedemaria jamiedemaria commented Sep 19, 2023

Summary & Motivation

Adds partition_key_range_for_asset_key to AssetExecutionContext. Also does some minor reorganizing of partition methods on StepExecutionContext to support passing AssetKeys rather than input names

To Do:

  • update tests to use new methods, instead of deprecated ones
  • determine if we need to add a partition_keys_for_asset_key method to support materializing multiple static partitions

How I Tested These Changes

"""TODO - implement in stacked pr."""
pass
def partition_key_range_for_asset_key(
self, asset_key: CoercibleToAssetKey, is_dependency: bool = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still not super happy with the is_dependency solution to the self dependent asset problem. here's my comment from the first PR exploring this:


the is_dependency bool param here is to handle the self-dependent partition case. Basically when you have a self-dependent partition, if you said context.partition_key_for_asset_key("my_asset") it's unclear if you want the partition key of my_asset as a dependency, or as is being currently materialized. I'm handling this (for now at least) by allowing users to specify is_dependency to indicate they want it loaded as a dependency.

In all other cases, the function should be able to figure out the correct partition key to load given the current asset

I'm not set on this solution, so if you have any concerns or other ideas let me know

def partition_key_range_for_asset_key(self, asset_key: AssetKey) -> PartitionKeyRange:
"""TODO - implement in stacked pr."""
pass
def partition_key_range_for_asset_key(
Copy link
Contributor Author

@jamiedemaria jamiedemaria Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to have a discussion of what partition methods should be on the AssetExecutionContext. Currently we have:

  • partition_key property
  • partition_key_range property
  • partition_time_window property
  • partition_key_range_for_asset_key method

This leaves out the ability to:

  • Directly get a TimeWindow for a partition of an upstream asset, or any asset in a multi asset, bc there's no partition_time_window_for_asset_key method. The user can construct this info from the key range and the partitions def, so we may just need to document this process. It is pretty convoluted though, so we may be better off adding a key_range -> time_window helper method to the partition definitions
  • Directly get a list of all partition keys being materialized for a static partitioned asset. The user can construct this info by using key range and the partitions def, so we may just need to document this process
  • Get the PartitionsDefinition for an asset from the context. This seems pretty important to add, especially if we want the users to use the PartitionsDefinition to get time windows/key lists/etc

I just want to make sure this is the direction we want to go. We can always add more methods in at a later date, but if we decide we need them now, i'll go ahead and add them in

@jamiedemaria jamiedemaria changed the base branch from jamie/context-pyright to jamie/subclass-asset-context September 20, 2023 16:43
@jamiedemaria jamiedemaria force-pushed the jamie/partition-method branch 2 times, most recently from c92fea1 to e665260 Compare September 20, 2023 17:08
@jamiedemaria jamiedemaria force-pushed the jamie/subclass-asset-context branch from 9d2c497 to 92457f9 Compare September 20, 2023 18:23
@jamiedemaria jamiedemaria force-pushed the jamie/partition-method branch 2 times, most recently from 9355c3e to 34377b4 Compare September 20, 2023 21:25
@jamiedemaria
Copy link
Contributor Author

Discussing and aligning on partition method APIs here https://github.com/dagster-io/internal/discussions/6704. then will implement in this PR

@jamiedemaria jamiedemaria force-pushed the jamie/subclass-asset-context branch from efa6880 to 0788e4e Compare September 20, 2023 22:06
@jamiedemaria jamiedemaria force-pushed the jamie/subclass-asset-context branch from 27152b9 to d01d5d2 Compare September 22, 2023 14:27
@jamiedemaria jamiedemaria force-pushed the jamie/subclass-asset-context branch 2 times, most recently from 1448509 to 5ba005b Compare September 26, 2023 14:06
Copy link
Member

@alangenfeld alangenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whats the overlap with this and #16480?

back to your queue

@jamiedemaria
Copy link
Contributor Author

closing for now. I think determining the partition method APIs will be best done in a discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants