Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[break] Render enums as structs instead of string aliases #162

Merged
merged 6 commits into from
Nov 5, 2020

Conversation

bmoylan
Copy link
Contributor

@bmoylan bmoylan commented Nov 3, 2020

Before this PR

In #139 we removed the UNKNOWN variant in favor of round-tripping unknown variants. This was difficult to adopt for some consumers because they depend on the "UNKNOWN" constant in wire API. When using a string alias type, there was no way to both store the original value and the fact that it is unknown.

After this PR

==COMMIT_MSG==
Render enums as structs instead of string aliases
==COMMIT_MSG==

Possible downsides?

This is a breaking change (identified by the compiler) that consumers will need to handle.


This change is Reviewable

@bmoylan bmoylan requested a review from asanderson15 November 3, 2020 22:14
@changelog-app
Copy link

changelog-app bot commented Nov 3, 2020

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

Render enums as structs instead of string aliases

Check the box to generate changelog(s)

  • Generate changelog entry

Copy link
Contributor

@asanderson15 asanderson15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good

}

func NewErrorCode(value ErrorCodeValue) ErrorCode {
return ErrorCode{val: value}
}

// IsUnknown returns false for all known variants of ErrorCode and true otherwise.
func (e ErrorCode) IsUnknown() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we're making a break here, does it makes sense to make this method private? I'm fine keeping it, but it seems a little redundant after reinstating EnumUnknown values. Just a thought.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think if v.IsUnknown() { is cleaner than if v.Value() == api.EnumUnknown so would prefer to keep it

return e.val
}

func (e ErrorCode) String() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to mirror the behavior of the java impl, right? (returning the "corrupted" string value rather than unknown)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep exactly

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in sync with the general approach, but wanted to dig in on 2 things:

  • Whether we need to define and export the "Unknown" value per-enum type
  • Consideration for ensuring that name collisions do not occur due to newly generated code

I think we should make efforts to ensure that none of the newly generated code can create collisions -- although I think it may also be worth revisiting the old decision for enum value names to ensure that collisions can't occur there as well (if we're going to make a breaking change for enums that requires client code change, might as well consolidate all the breaks in that area if possible), I don't feel as strongly about that.

Reviewed 18 of 18 files at r1.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @asanderson15 and @bmoylan)


conjure/conjure_test.go, line 703 at r1 (raw file):

}

type DaysValue string

This method of appending "Value" to auto-generate the type is a bit dangerous/prone to collisions -- a valid Conjure definition could define both "Days" and "DaysValue" as enums, in which case the generated code won't compile. "Value" is also a common enough suffix that I don't think it's crazy that it might exist.

Only surefire way to avoid this is to have the generated type contain a value that can't exist in the naming (like Days_Value, although this would obviously not be great from a Go style perspective).


conjure/conjure_test.go, line 706 at r1 (raw file):

const (
	DaysFriday   DaysValue = "FRIDAY"

This issue existed before as well, but if we're doing an enum-related break anyway, then wonder if it's worth making the enum value names collision-proof... Although the instances in which we encounter this will likely be rare, it's always concerning to know that there can exist valid Conjure for which conjure-go can create code that cannot compile.


conjure/conjure_test.go, line 708 at r1 (raw file):

	DaysFriday   DaysValue = "FRIDAY"
	DaysSaturday DaysValue = "SATURDAY"
	DaysUnknown  DaysValue = "UNKNOWN"

What's the consideration for exporting this value? Since the IsUnkown() function is exported and provided, wondering if it might be cleaner to not declare this "UNKNOWN" value for every enum (and the generated code that returns DaysUnknown could just return DaysValue("UNKNOWN") instead).


conjure/conjure_test.go, line 716 at r1 (raw file):

}

func NewDays(value DaysValue) Days {

Similar concern to the above in terms of namespace collision -- this declaration will clash with anything else that's declared in this package as "New{EnumName}". This collision is probably less likely than "{EnumName}Value", but may still be worth considering making collision-proof like Days_Values.

Copy link
Contributor

@asanderson15 asanderson15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @asanderson15, @bmoylan, and @nmiyake)


conjure/conjure_test.go, line 708 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

What's the consideration for exporting this value? Since the IsUnkown() function is exported and provided, wondering if it might be cleaner to not declare this "UNKNOWN" value for every enum (and the generated code that returns DaysUnknown could just return DaysValue("UNKNOWN") instead).

I'd argue for keeping this for a few reasons. First, including an enum value for "UNKNOWN" more closely mirrors the java implementation, which includes UNKNOWN in the native enum type. Since go doesn't have an equivalent native enum, including it as a defined constant is as close as we can get.

Second, and relatedly, "UNKNOWN" is used not only as a fallback, but as an explicit value returned by a number of existing APIs, consumed both by Java and Go service implementations. While I don't think this is a great practice, I also think the ship has sailed here unfortunately. If we remove this constant, existing Go API implementations will now need to return DaysValue("UNKNOWN") rather than simply continuing to return DaysUnknown, which feels pretty ugly.

Finally, we frequently switch on enum value, and it's much more convenient to have a defined value for UNKNOWN when writing one, especially when UNKNOWN is a defined case and not simply a fallback. IsUnknown() wouldn't help there.

Copy link
Contributor

@asanderson15 asanderson15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @asanderson15 and @nmiyake)


conjure/conjure_test.go, line 703 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

This method of appending "Value" to auto-generate the type is a bit dangerous/prone to collisions -- a valid Conjure definition could define both "Days" and "DaysValue" as enums, in which case the generated code won't compile. "Value" is also a common enough suffix that I don't think it's crazy that it might exist.

Only surefire way to avoid this is to have the generated type contain a value that can't exist in the naming (like Days_Value, although this would obviously not be great from a Go style perspective).

Agree this is a valid concern, especially since "Value" is pretty common. I'm open to using the underscore, though I agree it's pretty unfortunately in terms of go style. That said, since use of the Days_Value type by consumers should be rare (generally I'd expect the enum struct rather than the enum value to be passed around), this is probably worth doing.


conjure/conjure_test.go, line 706 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

This issue existed before as well, but if we're doing an enum-related break anyway, then wonder if it's worth making the enum value names collision-proof... Although the instances in which we encounter this will likely be rare, it's always concerning to know that there can exist valid Conjure for which conjure-go can create code that cannot compile.

Agree with this concern as well. It's going to break existing code in many more places because these constants get used all over, but also trivial enough to fix. I guess if there's ever a time to eat the break, it's probably now.


conjure/conjure_test.go, line 716 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

Similar concern to the above in terms of namespace collision -- this declaration will clash with anything else that's declared in this package as "New{EnumName}". This collision is probably less likely than "{EnumName}Value", but may still be worth considering making collision-proof like Days_Values.

Agree on this as well.

@nmiyake
Copy link
Contributor

nmiyake commented Nov 5, 2020


conjure/conjure_test.go, line 708 at r1 (raw file):

Previously, asanderson15 (Adam Anderson) wrote…

I'd argue for keeping this for a few reasons. First, including an enum value for "UNKNOWN" more closely mirrors the java implementation, which includes UNKNOWN in the native enum type. Since go doesn't have an equivalent native enum, including it as a defined constant is as close as we can get.

Second, and relatedly, "UNKNOWN" is used not only as a fallback, but as an explicit value returned by a number of existing APIs, consumed both by Java and Go service implementations. While I don't think this is a great practice, I also think the ship has sailed here unfortunately. If we remove this constant, existing Go API implementations will now need to return DaysValue("UNKNOWN") rather than simply continuing to return DaysUnknown, which feels pretty ugly.

Finally, we frequently switch on enum value, and it's much more convenient to have a defined value for UNKNOWN when writing one, especially when UNKNOWN is a defined case and not simply a fallback. IsUnknown() wouldn't help there.

Thanks for the context -- the points that you raised make sense, and have convinced me that it is worth generating these UNKNOWN values.

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the current implementation isn't quite sufficient to avoid collisions because of some namespace leak.

Let's formalize a bit:

We can have {EnumName} and {EnumName}.{EnumValues} are user-specifiable. {EnumName} must be defined as camel-case in Conjure definition, while {EnumValues} must be defined as [A-Z][A-Z0-9]*(_[A-Z0-9]+)*.

For a given {EnumName}, this PR generates the top-level identifiers:

  • {EnumName}
  • {EnumName}_Value
  • {EnumName}_Values
  • {EnumName}_{EnumValues} (if the enum value has an underscore, it is removed and the next letter is capitalized)

The Conjure spec/IR ensures that {EnumName} doesn't clash with anything else, and I believe that all Conjure-spec-generated names at this level are CamelCase only (no underscores).

With the current design, the generated code will break if a given {EnumName} has a value named "VALUE" or "VALUES", since the generated {EnumName}_{EnumValues} will clash with the generated functions.

I also found another case that previously fails -- right now, we translate values to Go identifiers by removing underscores and capitalizing the next letter -- so "FOO_BAR" becomes "FooBar". However, this can break for numbers: "FOO_3_3" and "FOO_33" both become "Foo33".

If we're messying up identifiers with underscores anyway, I wonder if we should just bite the bullet and preserve the input exactly -- that is, use "SCREAMING_SNAKE_CASE". If we did this, then we would be guaranteed that {EnumName}_Value} and {EnumNames_Values} don't conflict (since actual values would be "VALUE" and "VALUES"), and would solve the number collision issue as well.

Reviewed 15 of 15 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @asanderson15)

@@ -44,6 +44,12 @@ types:
chan: map<string, string>
Enum:
values:
- VALUE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added VALUE and VALUES to the test objects to prove these are now handled without collisions.

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Although it's a bit of an eyesore, I do think that this is ultimately the right way to go just so we can ensure correctness.

Had one question to make sure that we weren't missing something, and a suggestion for adding a test case.

Reviewed 24 of 24 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @asanderson15 and @bmoylan)


conjure/errorwriter.go, line 410 at r3 (raw file):

	switch errorCode.Value() {
	case spec.ErrorCode_PERMISSION_DENIED:
		varName = "PermissionDenied"

What is this code block used for again? Want to make sure that the varName still being CamelCase is fine.


integration_test/testgenerated/objects/objects.yml, line 47 at r3 (raw file):

Previously, asanderson15 (Adam Anderson) wrote…

I added VALUE and VALUES to the test objects to prove these are now handled without collisions.

Great!

Can you also add "VALUE_1_1" and "VALUE_11" to also catch the case that would previously cause collisions?

Copy link
Contributor

@asanderson15 asanderson15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 23 of 26 files reviewed, 3 unresolved discussions (waiting on @asanderson15 and @nmiyake)


conjure/errorwriter.go, line 410 at r3 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

What is this code block used for again? Want to make sure that the varName still being CamelCase is fine.

This is used to get the http status code for an error using conjure-go-runtime errors pkg: https://github.com/palantir/conjure-go-runtime/blob/develop/conjure-go-contract/errors/error_code.go#L38-L61. So this is still correct.


integration_test/testgenerated/objects/objects.yml, line 47 at r3 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

Great!

Can you also add "VALUE_1_1" and "VALUE_11" to also catch the case that would previously cause collisions?

Done.

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r4.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @asanderson15)

@bulldozer-bot bulldozer-bot bot merged commit 66166bb into develop Nov 5, 2020
@bulldozer-bot bulldozer-bot bot deleted the bm/enum-struct branch November 5, 2020 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants