
[single-implicit-asset-job] methods on CodeLocation to get job partitions, partition tags, and partition config, with asset selection #23491

Merged: sryza merged 5 commits into master from partition-set-to-job/grpc on Aug 20, 2024

Conversation

@sryza (Contributor) commented Aug 7, 2024:

## Summary & Motivation

This builds on top of #23527 to do two related things:

- On the host process side, refactors `CodeLocation` methods and call sites to use job names and selected asset keys instead of partition set names.
- On the user code process side, updates the implementations of grpc APIs that fetch partition information to pass selected asset keys to the `JobDefinition`.

## How I Tested These Changes

@sryza changed the title from "grpc methods on CodeLocation for get job partitions, partition tags, and partition config" to "methods on CodeLocation for get job partitions, partition tags, and partition config" on Aug 7, 2024
@sryza changed the title from "methods on CodeLocation for get job partitions, partition tags, and partition config" to "methods on CodeLocation to get job partitions, partition tags, and partition config" on Aug 7, 2024
@sryza changed the title from "methods on CodeLocation to get job partitions, partition tags, and partition config" to "methods on CodeLocation to get job partitions, partition tags, and partition config, with asset selection" on Aug 7, 2024
@sryza force-pushed the partition-set-to-job/grpc branch 3 times, most recently from f2cc723 to 0fdc2d3 on August 7, 2024 23:08
@sryza force-pushed the partition-set-to-job/grpc branch 4 times, most recently from 41e2efe to 9e9936a on August 8, 2024 21:11
@sryza changed the base branch from master to partition-set-to-job/grpc-params on August 8, 2024 21:14
@sryza force-pushed the partition-set-to-job/grpc-params branch from 43f584a to 34a855a on August 8, 2024 21:15
@sryza force-pushed the partition-set-to-job/grpc branch 3 times, most recently from 93579be to 8500be5 on August 8, 2024 23:03
@gibsondan (Member) left a comment:

sandy can you say more about what we are doing here to ensure that selected_asset_keys is only used/needed when there are newer versions of user code? how does that requirement get threaded through?

@sryza force-pushed the partition-set-to-job/grpc branch from 8500be5 to 28f571b on August 9, 2024 03:01
@sryza (Contributor, Author) commented Aug 9, 2024:

> sandy can you say more about what we are doing here to ensure that selected_asset_keys is only used/needed when there are newer versions of user code? how does that requirement get threaded through?

Why it's only needed when there are newer versions of user code:

1.8.1 will be the first Dagster version where it's possible for a job to have multiple PartitionsDefinitions "inside" it. Prior to this, every job either had a single partitions_def or had no partitioning at all.

In 1.8.1, it will be possible for an asset job to target assets with different PartitionsDefinitions. Even though those jobs don't have a single PartitionsDefinition, it's sensible to ask "what partitions can I launch a run of this job with if I'm only selecting asset1?" This is what the new selected_asset_keys parameter allows us to do.

selected_asset_keys is not needed and can be safely ignored prior to 1.8.1 because the answer to "what partitions can I launch a run of this job with?" doesn't depend on what assets are selected, because the assets in the job will never have different PartitionsDefinitions.
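
To make that concrete, here's a rough sketch of the kind of setup involved (the asset names and start dates here are illustrative, not from this PR):

```python
from dagster import AssetKey, DailyPartitionsDefinition, asset, define_asset_job

# Two assets whose PartitionsDefinitions differ (here, only in start date).
partitions_2023 = DailyPartitionsDefinition(start_date="2023-01-01")
partitions_2024 = DailyPartitionsDefinition(start_date="2024-01-01")


@asset(partitions_def=partitions_2023)
def asset1(): ...


@asset(partitions_def=partitions_2024)
def asset2(): ...


# In 1.8.1+ an asset job can target both assets, so the job as a whole has no
# single PartitionsDefinition.
multi_partitioned_job = define_asset_job(
    "multi_partitioned_job", selection=["asset1", "asset2"]
)

# Passing selected_asset_keys={AssetKey("asset1")} conceptually narrows the question
# "what partitions can I launch?" to partitions_2023.get_partition_keys(); selecting
# only asset2 narrows it to partitions_2024.get_partition_keys().
```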

Why it's only used when there are newer versions of user code:

This might reveal a naive misunderstanding of how things work, but I thought the answer to this was basically "only versions of user code on or after 1.8.1 will know about selected_asset_keys, so only those versions can/will use it".

@gibsondan (Member):

"selected_asset_keys is not needed and can be safely ignored prior to 1.8.1 " answers my question - this scheme works as long as the grpc client isn't relying on the selected_asset_keys to do filtering, even on older versions. (For example, if this were like the asset_selection argument which causes certain runs to subset the assets that are being materialized - it would be bad if that argument was sometimes dropped on the floor by the server and ignored, because then you would launch a run for every asset instead of just the assets being selected - but it sounds like this argument is not like that)

@sryza force-pushed the partition-set-to-job/grpc-params branch from 34a855a to 5ed7b66 on August 9, 2024 14:45
@sryza changed the title from "methods on CodeLocation to get job partitions, partition tags, and partition config, with asset selection" to "[single-implicit-asset-job] methods on CodeLocation to get job partitions, partition tags, and partition config, with asset selection" on Aug 9, 2024
@sryza force-pushed the partition-set-to-job/grpc branch from 28f571b to fc88adb on August 9, 2024 18:01
@gibsondan (Member) left a comment:

I'm not actually seeing anything set the new selected_asset_keys field other than deriving it from the legacy partition set - is there still a change coming that has the frontend vary that field in some way that it didn't before?

Comment on lines +111 to +113:

```python
job_name: str,
partition_name: str,
selected_asset_keys: Optional[AbstractSet[AssetKey]],
```
@gibsondan (Member): Is there any other validation that you could do here to validate that it's a valid job name / selected asset keys combo? Like checking that, if selected_asset_keys is None, the job name has some property that only partition set jobs have?

Comment on lines -512 to -514:

```python
partitions_def.validate_partition_key(
    partition_name, dynamic_partitions_store=instance
)
```
@gibsondan (Member): Was removing this part intentional / related to this PR?

@sryza (Contributor, Author) commented Aug 20, 2024:

It was intentional, though I should have flagged it and could be easily nudged to break it out into its own PR.

If you look at _instance_from_ref_for_dynamic_partitions, there's this weirdness right now where we only construct an instance ref if the PartitionsDefinition is a DynamicPartitionsDefinition, because otherwise we want the user to not need a dagster.yaml in their code server.

HOWEVER, part of what's happening in this PR is that we abstract away the PartitionsDefinition from this code path (this allows a future where someone executes a run for "2024-04-23" for asset1 and asset2, but those assets have different daily PartitionsDefinitions with different start dates). So now we don't have the PartitionsDefinition to look at and see if it's dynamic.

We could add some sort of JobDefinition.partitions_def_is_dynamic(selected_asset_keys) method to retain the existing behavior. HOWEVER HOWEVER, the reason I deleted this code is that I question whether this validation is really important. It results in an extra round trip to the DB. And this weirdness with the instance ref. And the backend code paths that invoke it already know the partition exists.
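
For reference, the pattern in question is roughly this (a simplified sketch, not the actual `_instance_from_ref_for_dynamic_partitions` code path; the standalone helper and bare `instance_ref` argument are illustrative):

```python
from typing import Optional

from dagster import DagsterInstance, DynamicPartitionsDefinition, PartitionsDefinition


def validate_partition(
    partitions_def: PartitionsDefinition,
    partition_name: str,
    instance_ref,  # only usable when the code server has a dagster.yaml
) -> None:
    # Only construct an instance when the partitions definition is dynamic,
    # since only then does validation need a dynamic-partitions store (a DB read).
    instance: Optional[DagsterInstance] = (
        DagsterInstance.from_ref(instance_ref)
        if isinstance(partitions_def, DynamicPartitionsDefinition)
        else None
    )
    partitions_def.validate_partition_key(
        partition_name, dynamic_partitions_store=instance
    )
```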

```python
        ],
    )
):
    def __new__(
        cls,
        repository_origin: RemoteRepositoryOrigin,
        job_name: str,
```
@gibsondan (Member):

This will break when users upgrade their dagit/daemon versions without also upgrading their user code - we should make sure that we are prepared for user reports of that, and maybe ensure that our docs are clear about exactly what our back-compat guarantees are in OSS. In the past we have been kind of wishy-washy about it, and I think we even have tests in our backcompat suite that exercise new versions of OSS with old versions of dagit.

Maybe it's something we could detect in dagit and warn about so that the resulting error is less cryptic? Code servers tell us what dagster versions they are using.

@sryza (Contributor, Author) replied:

> This will break when users upgrade their dagit/daemon versions without also upgrading their user code

Do you mean it will break when users upgrade their user code without upgrading their dagit/daemon versions?

@gibsondan (Member):

Yes sorry, I had that reversed

@sryza (Contributor, Author) replied:

> maybe ensure that our docs are clear about exactly what our back-compat guarantees are in OSS

Here's a PR that adds a section on this: #23734.

> Maybe it's something we could detect in dagit and warn about so that the resulting error is less cryptic?

Thoughts on where we would add this? The gRPC call that (re)loads the definitions? Do you think we should add this before merging this PR?

@gibsondan (Member) replied:

I don’t think it needs to block the PR - if there’s a known exception it will emit, we could catch it and add “this error may be due to version skew” to the error message.
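
Something roughly like this, for example (just a sketch - the wrapper and the exception type it catches are made up for illustration, not an existing dagster API):

```python
def call_with_version_skew_hint(grpc_call, *args, **kwargs):
    """Invoke a host-side gRPC call and annotate known failures with a version-skew hint."""
    try:
        return grpc_call(*args, **kwargs)
    except TypeError as exc:  # e.g. an older code server rejecting a newer parameter
        raise RuntimeError(
            f"{exc} (this error may be due to version skew between the webserver/daemon "
            "and the code server - upgrading the code location's dagster version may "
            "resolve it)"
        ) from exc
```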

@sryza force-pushed the partition-set-to-job/grpc-params branch from 5ed7b66 to af98201 on August 19, 2024 20:23
@sryza force-pushed the partition-set-to-job/grpc branch from fc88adb to 1f24881 on August 19, 2024 20:38
sryza added a commit that referenced this pull request Aug 19, 2024
…to grpc partition api params (#23527)

## Summary & Motivation

"Partition set"s are an eliminated concept that lives on like a ghost
inside our host processes and gRPC APIs. This PR is the bottom of a
stack that makes a dent at removing it.

When a job has a `PartitionsDefinition`, we also automatically generate
a `PartitionSetSnap` whose name is f"{job_name}_partition_set". This is
a relic from a past era when jobs could have multiple partition sets.
This partition set name is then referenced from host processes when they
need to get partition-related information for the job:
- The set of partitions
- The tags corresponding to a particular partition
- The config corresponding to a particular partition

This is problematic for two reasons:

- The redundant and vestigial "partition set" concept is threaded
through a bunch of our host process code
- When an asset job targets assets with different PartitionsDefinitions,
there is no partition set for it. However, we still need to be able to
get the partitions, tags, and config for subsets of assets, e.g. to
populate the Launchpad. This functionality in the Launchpad has been
broken since the single-implicit-asset-job PR was merged.

### Backwards compatibility

After this change (and some of the changes stacked on top), there will
be two kinds of code locations:
1. Code locations that don't know about these new params and never have
jobs that include assets with different `PartitionsDefinition`s.
- These code locations will continue to use the `partition_set_name`
param, which continues to be supplied by host processes, to find the
associated job and fetch the associated data.
2. Code locations that do know about these new params and may have jobs
that include assets with different `PartitionsDefinition`s.
   -  These code locations will make use of the new params.
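
Roughly, a newer code location can resolve the job from either set of params along these lines (an illustrative sketch only, not the actual gRPC handler):

```python
from typing import AbstractSet, Optional, Tuple

from dagster import AssetKey

LEGACY_PARTITION_SET_SUFFIX = "_partition_set"


def resolve_job_name_and_selection(
    partition_set_name: Optional[str],
    job_name: Optional[str],
    asset_keys: Optional[AbstractSet[AssetKey]],
) -> Tuple[str, Optional[AbstractSet[AssetKey]]]:
    # Newer hosts send job_name (and, optionally, selected asset keys) directly.
    if job_name is not None:
        return job_name, asset_keys
    # Older hosts only send the legacy f"{job_name}_partition_set" name.
    assert partition_set_name is not None
    return partition_set_name[: -len(LEGACY_PARTITION_SET_SUFFIX)], None
```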


## How I Tested These Changes

Up-stack: #23491
Base automatically changed from partition-set-to-job/grpc-params to master August 19, 2024 20:51
@sryza (Contributor, Author) commented Aug 20, 2024:

> I'm not actually seeing anything set the new selected_asset_keys field other than deriving it from the legacy partition set - is there still a change coming that has the frontend vary that field in some way that it didn't before?

Oops yes, I should have linked here: #23494. And then @bengotow has a branch for the frontend changes.

@sryza merged commit 18993a4 into master on Aug 20, 2024
1 check failed
@sryza deleted the partition-set-to-job/grpc branch on August 20, 2024 18:05
sryza added a commit that referenced this pull request Aug 20, 2024
…` to `GrapheneJob` (#23494)

## Summary & Motivation

This enables the frontend to query for partition information with
respect to an asset selection within a job. This enables a world where
jobs can contain assets with different `PartitionsDefinition`s.

It's used by @bengotow 's `partition-set-to-job/graphql-ui` branch.

Sits atop #23491.

## How I Tested These Changes
jmsanders pushed a commit that referenced this pull request Aug 21, 2024
…itions, partition tags, and partition config, with asset selection (#23491)

## Summary & Motivation

This builds on top of #23527
to do two related things:
- On the host process side, refactors `CodeLocation` methods and call
sites to use job names and selected asset keys instead of partition set
names.
- On the user code process side, updates the implementations of grpc
APIs that fetch partition information to pass selected asset keys to the
`JobDefinition`.

## How I Tested These Changes

(cherry picked from commit 18993a4)
jmsanders pushed a commit that referenced this pull request Aug 21, 2024
…` to `GrapheneJob` (#23494)

## Summary & Motivation

This enables the frontend to query for partition information with
respect to an asset selection within a job. This enables a world where
jobs can contain assets with different `PartitionsDefinition`s.

It's used by @bengotow 's `partition-set-to-job/graphql-ui` branch.

Sits atop #23491.

## How I Tested These Changes

(cherry picked from commit 3bd53c0)
jmsanders added a commit that referenced this pull request Aug 21, 2024
…job partitions, partition tags, and partition config, with asset selection (#23491)"

This reverts commit de78d95.
jmsanders pushed a commit that referenced this pull request Aug 22, 2024
…itions, partition tags, and partition config, with asset selection (#23491)

## Summary & Motivation

This builds on top of #23527
to do two related things:
- On the host process side, refactors `CodeLocation` methods and call
sites to use job names and selected asset keys instead of partition set
names.
- On the user code process side, updates the implementations of grpc
APIs that fetch partition information to pass selected asset keys to the
`JobDefinition`.

## How I Tested These Changes

(cherry picked from commit 18993a4)
jmsanders pushed a commit that referenced this pull request Aug 22, 2024
…` to `GrapheneJob` (#23494)

## Summary & Motivation

This enables the frontend to query for partition information with
respect to an asset selection within a job. This enables a world where
jobs can contain assets with different `PartitionsDefinition`s.

It's used by @bengotow 's `partition-set-to-job/graphql-ui` branch.

Sits atop #23491.

## How I Tested These Changes

(cherry picked from commit 3bd53c0)
sryza added a commit that referenced this pull request Aug 22, 2024
…3807)

## Summary & Motivation

While working on #23491 and #23494, I wrote a set of tests that validate operations like `get_tags_for_partition` over GraphQL and gRPC when an asset job includes assets with different `PartitionsDefinition`s and an asset selection is provided. However, I had to take them out due to the way the stack was sequenced.

This PR adds them back in.

## How I Tested These Changes

## Changelog

NOCHANGELOG