ENH #61033: Add coalesce_keys option to DataFrame.join for preserving join keys #61678
+337
−55
Add coalesce_keys option to DataFrame.join for preserving join keys
This adds a coalesce_keys keyword to DataFrame.join. Passing
coalesce_keys=False keeps both join key columns (id and id_right) in the
result instead of automatically coalescing them into a single column.
This is especially useful in full outer joins, where a coalesced key column
no longer shows which side each unmatched key came from.
Example:
df1.join(df2, on="id", coalesce_keys=False)
This will result in both id and id_right columns being preserved, rather
than merged into a single id.
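To make the difference concrete, here is a minimal pure-Python sketch of a full outer join over lists of dicts. It is not the pandas implementation from this PR: the helper `outer_join`, its `right_suffix` parameter, and the sample rows are hypothetical and only mirror the semantics described above. It assumes keys are unique on the right side.

```python
def outer_join(left, right, key, coalesce_keys=True, right_suffix="_right"):
    """Full outer join of two lists of dicts on `key` (illustrative only).

    Hypothetical helper, not pandas code. Assumes `key` is unique in `right`.
    """
    right_by_key = {row[key]: row for row in right}
    left_keys = set()
    out = []

    # Rows from the left side, matched against the right where possible.
    for lrow in left:
        k = lrow[key]
        left_keys.add(k)
        rrow = right_by_key.get(k)
        merged = dict(lrow)
        if rrow is not None:
            merged.update({c: v for c, v in rrow.items() if c != key})
        if not coalesce_keys:
            # Keep the right key separately; None marks "no match on the right".
            merged[key + right_suffix] = k if rrow is not None else None
        out.append(merged)

    # Rows that exist only on the right side.
    for rrow in right:
        k = rrow[key]
        if k in left_keys:
            continue
        merged = {c: v for c, v in rrow.items() if c != key}
        if coalesce_keys:
            merged[key] = k  # the right key fills the single key column
        else:
            merged[key] = None  # no left key for this row
            merged[key + right_suffix] = k
        out.append(merged)
    return out


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}]

coalesced = outer_join(left, right, "id")
preserved = outer_join(left, right, "id", coalesce_keys=False)
```

In `coalesced`, the key 3 appears in the single id column and nothing records that it came only from the right side. In `preserved`, that row has id set to None and id_right set to 3, so the origin of each unmatched key stays visible.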
Includes:
- Modifications to join internals (core/reshape/merge.py)
- A dedicated test file (test_merge_coalesce.py) covering:
All code checks passed.