Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ui] Upgrade yaml library, improve config merge behavior #20055

Merged
merged 2 commits into from
Feb 26, 2024

Conversation

bengotow
Copy link
Collaborator

Summary & Motivation

This PR aims to address #19668 (but doesn't entirely fix it):

Context: In the launchpad, there are several scenarios where we parse the current YAML, modify it as a JS object tree, and serialize it back to YAML. This is fundamentally undesirable because some YAML values (eg: 5.0) experience type coercion (eg: to 5) because JS does not have a Float type, etc. Fully fixing this would require changing the YAML merge logic to use YAML ASTs, but that's complex and we express config defaults as JS objects, so it wouldn't entirely resolve the "5.0 to 5" issue.

This PR fixes two of the more straightforward bugs in the "parse, merge, stringify" workflow:

  • The deepmerge library changed in v3+ to concatenate arrays by default rather than combining them (so [a,b,c] + [c,d,e] = [a,b,c,c,d,e]). This changes it to the expected ([a,b,c,d,e]), and means that merging two identical arrays does nothing to them.

  • The yaml stringify method tries to be "lean" with quotes, but can un-quote some values that other YAML parsers parse as other types. The example called out in Incorrect parsing of string-type sexagesimal values from launchpad #19668 was "5:30" becoming 5:30, which is a Sexagesimal base-60 number in python. (This has been called out as awful behavior before: https://news.ycombinator.com/item?id=26679229). I changed the serialize call to explicitly maintain quotes on all string values, which looks a bit ugly if you're a YAML purist but will avoid these cases.

How I Tested These Changes

I broke out a yamlMerge helper and wrote tests for the cases above, verified they failed and then fixed them

Copy link

Deploy preview for dagit-storybook ready!

✅ Preview
https://dagit-storybook-dgonpjs4a-elementl.vercel.app
https://bengotow-2024-02-19668.components-storybook.dagster-docs.io

Built with commit 05ddc84.
This pull request is being automatically deployed with vercel-action

Copy link

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-1ufttr8pk-elementl.vercel.app
https://bengotow-2024-02-19668.core-storybook.dagster-docs.io

Built with commit 05ddc84.
This pull request is being automatically deployed with vercel-action

const mergedYaml = yaml.stringify(updatedRunConfigData);

onSaveSession({runConfigYaml: mergedYaml});
onSaveSession({runConfigYaml: mergeYaml(rootDefaultYaml, currentSession.runConfigYaml)});
Copy link
Contributor

@salazarm salazarm Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the set state action instead: onSaveSession((currentSession) => ({runConfigYaml: mergeYaml(rootDefaultYaml, currentSession.runConfigYaml)}));

This helps avoid bugs where multiple things try to write to the session in the same tick and end up overwriting each other meaning the last write wins.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it looks like this should work (onSaveSession is a useSetStateUpdateCallback function reference, and that mimics SetStateAction), but we don't do it anywhere in the LaunchpadSession currently. Going to do do this in a small follow-up PR and change all the callsites at once

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit: it looks like we're in a weird spot with this because onSaveSession takes a partial state (our previous solution to ensure it didn't get stale values), which means that the type on the current state is also the partial state, not the full state 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just set the full state and any fields we want to get rid of we set as undefined/null ? I do agree this should just take the full state though, we can update callsites to pass in the full state and they can preserve the old behavior by doing onSaveSession({...currentSession, {rootConfigYaml: ...}})

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we can -- we just need to read/write the full state using that spread operator everywhere. It actually looks like the function-setter won't work until we do that because the "current value" is just the last partial that was set currently. Was trying to see if I could change that, but just setting the entire state each time would be better.

Going to do this in a follow-up tonight, it's more lines of code than I expected!

Copy link
Contributor

@salazarm salazarm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for digging into this!

@bengotow bengotow merged commit 0759886 into master Feb 26, 2024
3 checks passed
@bengotow bengotow deleted the bengotow-2024-02/19668 branch February 26, 2024 22:57
cmpadden pushed a commit that referenced this pull request Feb 28, 2024
## Summary & Motivation

This PR aims to address #19668 (but doesn't entirely fix it):

Context: In the launchpad, there are several scenarios where we parse
the current YAML, modify it as a JS object tree, and serialize it back
to YAML. This is fundamentally undesirable because some YAML values (eg:
5.0) experience type coercion (eg: to 5) because JS does not have a
Float type, etc. Fully fixing this would require changing the YAML merge
logic to use YAML ASTs, but that's complex and we express config
defaults as JS objects, so it wouldn't entirely resolve the "5.0 to 5"
issue.

This PR fixes two of the more straightforward bugs in the "parse, merge,
stringify" workflow:

- The deepmerge library changed in v3+ to concatenate arrays by default
rather than combining them (so [a,b,c] + [c,d,e] = [a,b,c,c,d,e]). This
changes it to the expected ([a,b,c,d,e]), and means that merging two
identical arrays does nothing to them.

- The yaml stringify method tries to be "lean" with quotes, but can
un-quote some values that other YAML parsers parse as other types. The
example called out in #19668 was "5:30" becoming 5:30, which is a
Sexagesimal base-60 number in python. (This has been called out as awful
behavior before: https://news.ycombinator.com/item?id=26679229). I
changed the serialize call to explicitly maintain quotes on all string
values, which looks a bit ugly if you're a YAML purist but will avoid
these cases.

## How I Tested These Changes

I broke out a yamlMerge helper and wrote tests for the cases above,
verified they failed and then fixed them

---------

Co-authored-by: bengotow <[email protected]>
PedramNavid pushed a commit that referenced this pull request Mar 28, 2024
## Summary & Motivation

This PR aims to address #19668 (but doesn't entirely fix it):

Context: In the launchpad, there are several scenarios where we parse
the current YAML, modify it as a JS object tree, and serialize it back
to YAML. This is fundamentally undesirable because some YAML values (eg:
5.0) experience type coercion (eg: to 5) because JS does not have a
Float type, etc. Fully fixing this would require changing the YAML merge
logic to use YAML ASTs, but that's complex and we express config
defaults as JS objects, so it wouldn't entirely resolve the "5.0 to 5"
issue.

This PR fixes two of the more straightforward bugs in the "parse, merge,
stringify" workflow:

- The deepmerge library changed in v3+ to concatenate arrays by default
rather than combining them (so [a,b,c] + [c,d,e] = [a,b,c,c,d,e]). This
changes it to the expected ([a,b,c,d,e]), and means that merging two
identical arrays does nothing to them.

- The yaml stringify method tries to be "lean" with quotes, but can
un-quote some values that other YAML parsers parse as other types. The
example called out in #19668 was "5:30" becoming 5:30, which is a
Sexagesimal base-60 number in python. (This has been called out as awful
behavior before: https://news.ycombinator.com/item?id=26679229). I
changed the serialize call to explicitly maintain quotes on all string
values, which looks a bit ugly if you're a YAML purist but will avoid
these cases.

## How I Tested These Changes

I broke out a yamlMerge helper and wrote tests for the cases above,
verified they failed and then fixed them

---------

Co-authored-by: bengotow <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants