Fix: handle column name collisions when combining UNION logical inputs & nested Column expressions in maybe_fix_physical_column_name #16064
expr.and_then(|e| {
    e.transform_down(|node| {
Sometimes Columns can be nested inside other types of expressions (so they are not at the "top level"), for example:
BinaryExpr {
    left: IsNotNull(
        Column(
            Column {
                relation: Some(
                    Bare {
                        table: "left",
                    },
                ),
                name: "people_column",
            },
        ),
    ),
    op: Or,
    right: IsNotNull(
        Column(
            Column {
                relation: Some(
                    Bare {
                        table: "left",
                    },
                ),
                name: "people_column:1",
            },
        ),
    ),
},
In that case the current fix won't apply, because it only looks at top-level Columns. This change handles those cases by walking the expression with transform_down; the renaming logic itself remains the same.
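To make the nested case concrete, here is a minimal sketch of a top-down rewrite that reaches Columns wrapped in other expressions. It does not use DataFusion's types: `Expr`, `transform_down`, and `fix_column_names` below are hypothetical stand-ins, and the renaming rule (a Column whose name ends in a numeric `:N` suffix takes the name found at its index in the input schema) is an assumption made for illustration.

```rust
// Toy expression tree; hypothetical stand-ins, not DataFusion's types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, index: usize },
    IsNotNull(Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Apply `f` to a node first, then recurse into the children of the result.
fn transform_down(expr: Expr, f: &dyn Fn(Expr) -> Expr) -> Expr {
    match f(expr) {
        Expr::IsNotNull(inner) => Expr::IsNotNull(Box::new(transform_down(*inner, f))),
        Expr::Or(left, right) => Expr::Or(
            Box::new(transform_down(*left, f)),
            Box::new(transform_down(*right, f)),
        ),
        leaf => leaf,
    }
}

// Assumed rule: a column named `something:N` (N numeric) is renamed to the
// schema name at its index. Every other node is returned unchanged.
fn fix_column_names(expr: Expr, schema: &[&str]) -> Expr {
    transform_down(expr, &|node| match node {
        Expr::Column { name, index } => {
            let has_numeric_suffix = name
                .rsplit_once(':')
                .map_or(false, |(_, suffix)| suffix.parse::<u32>().is_ok());
            match schema.get(index) {
                Some(field) if has_numeric_suffix => Expr::Column {
                    name: (*field).to_string(),
                    index,
                },
                _ => Expr::Column { name, index },
            }
        }
        other => other,
    })
}
```

A check that only matches a top-level `Expr::Column` would return the `Or` expression from the example above untouched. Walking the tree applies the same rule to both `IsNotNull` children.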
👍 nice! LGTM
@berkaysynnada since you reviewed #15580 that added
Makes sense to me, thank you @LiaCastaneda
Thank you also @gabotechs, I'll merge this after the checks pass.
Thanks for the reviews @gabotechs & @berkaysynnada!
…s & nested Column expressions in maybe_fix_physical_column_name (apache#16064)
* Fix union schema name coercion
* Address renaming for columns that are not in the top level as well
* Add unit test
* Format
* Use insta tests properly
* Address review - comment + minor simplification change
---------
Co-authored-by: Berkay Şahin <[email protected]>
(cherry picked from commit e5f596b)
Which issue does this PR close?
Rationale for this change
This PR handles column names that do not match expression names in two cases: UnionExec inputs (see the issue for a wider explanation) and Column expressions nested inside other expressions.
What changes are included in this PR?
A workaround in union_schema that keeps the field names of the first input, plus an integration test with a reproducer.
Are these changes tested?
Yes. For the union coercion case, a test was added to the substrait_consumer tests. For columns nested inside other types of expressions, I added a unit test.
Are there any user-facing changes?
No.
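For readers who want the union_schema workaround spelled out, the idea of keeping the first input's field names can be sketched as follows. This is only an illustration under assumptions: `Field` and `union_schema_sketch` are hypothetical and much simpler than the Arrow and DataFusion types, and the nullability merge is included only to show that other attributes can still be combined across inputs while names are not.

```rust
// Hypothetical, simplified field type; not arrow's `Field`.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

// Build the output schema of a union: names come from the first input only,
// so differing names in later inputs cannot cause a mismatch.
fn union_schema_sketch(inputs: &[Vec<Field>]) -> Vec<Field> {
    let first = match inputs.first() {
        Some(schema) => schema,
        None => return Vec::new(),
    };
    first
        .iter()
        .enumerate()
        .map(|(i, field)| Field {
            // Keep the first input's name for position `i`.
            name: field.name.clone(),
            // Assumed merge rule: nullable if any input is nullable at `i`.
            nullable: inputs
                .iter()
                .any(|schema| schema.get(i).map_or(false, |other| other.nullable)),
        })
        .collect()
}
```

With inputs whose first columns are named `a` and `b`, the sketch produces a single output column named `a`, regardless of what the second input calls it.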