Fix GH-61477: Prevent spurious sort warning in concat with unorderable MultiIndex #61490

Open: wants to merge 1 commit into base: main

Neer-Pathak

Fix GH-61477: Stop Spurious Warning When concat(..., sort=False) on Mixed-Type MultiIndex

Overview

When you do something like:

pd.concat([df1, df2], axis=1, sort=False)

and your two DataFrames have MultiIndex columns that mix tuples and integers, pandas used to try to sort those labels under the hood. Since Python cannot compare tuple < int, you’d see:

RuntimeWarning: '<' not supported between instances of 'int' and 'tuple', sort order is undefined for incomparable objects with multilevel columns

This warning is confusing, and worse, you explicitly asked not to sort (sort=False), so pandas should never even try.
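The underlying failure can be reproduced without pandas. The snippet below is a minimal illustration, not pandas code: it sorts a plain list that mixes integers and a tuple, the same kind of comparison that a sort over mixed-type labels would need.

```python
# Python refuses to order an int against a tuple, so any sort over
# mixed-type labels ends in a TypeError.
labels = [1, (2, 3), 4]

try:
    sorted(labels)
    error_message = None
except TypeError as exc:
    # Keep the message so it can be inspected or logged.
    error_message = str(exc)

print(error_message)
```

pandas catches this TypeError and reports it as the RuntimeWarning quoted above, which is why the warning text mentions '<' even though no user code performs a comparison.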

What Changed

  1. Short-circuit Index.union when sort=False
    Before: Even with sort=False, pandas would call its normal union logic, which might attempt to compare labels.

Now: If you pass sort=False, we simply concatenate the two index arrays with:

np.concatenate([self._values, other._values])

and wrap that in a new Index. No comparisons, no warnings, and your original order is preserved.

  2. Guard sorting in MultiIndex._union
    Before: pandas would call result.sort_values() when sort wasn’t False, and if labels were unorderable it would warn you.

Now: We only call sort_values() when sort is truthy (True), and we wrap it in a try/except TypeError that silently falls back to the existing order on failure. No warning is emitted.

  3. New Regression Test
    A pytest test reproduces the original bug scenario, concatenating two small DataFrames with mixed-type MultiIndex columns and sort=False. The test asserts:

No RuntimeWarning is raised

Column order is exactly “first DataFrame’s columns, then second DataFrame’s columns”
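The two union changes listed under What Changed can be sketched in plain Python. This is only an illustration under stated assumptions: `union_labels` is a hypothetical stand-in, not the patched pandas method, and plain list concatenation stands in for `np.concatenate`. The labels share a first element so that a sort is forced to compare an int with a tuple.

```python
def union_labels(left, right, sort=False):
    """Hypothetical stand-in for the patched union logic (not pandas code)."""
    # Plain concatenation: no label is ever compared with another.
    combined = list(left) + list(right)
    if not sort:
        # sort=False: return immediately, original order preserved.
        return combined
    try:
        return sorted(combined)
    except TypeError:
        # Unorderable labels: keep the existing order, emit no warning.
        return combined


left = [("A", 1), ("A", (2, 3))]
right = [("A", 4)]

unsorted_result = union_labels(left, right, sort=False)
sorted_attempt = union_labels(left, right, sort=True)
orderable = union_labels([("B", 2)], [("A", 1)], sort=True)

print(unsorted_result)
print(sorted_attempt)
print(orderable)
```

With `sort=False` the function never reaches a comparison. With `sort=True` the orderable input comes back sorted, while the mixed-type input comes back in its original order.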

Why this helps:

Respects sort=False: If a user explicitly disables sorting, pandas won’t try.

Silences spurious warnings: No more confusing messages about comparing tuples to ints.

Keeps existing behavior for sort=True: You still get a sort or a real error if the labels truly can’t be ordered.

To test the change, try:

import numpy as np
import pandas as pd

left = pd.DataFrame(
    np.random.rand(5, 2),
    columns=pd.MultiIndex.from_tuples([("A", 1), ("B", (2, 3))])
)
right = pd.DataFrame(
    np.random.rand(5, 1),
    columns=pd.MultiIndex.from_tuples([("C", 4)])
)

# No warning, order preserved:
out = pd.concat([left, right], axis=1, sort=False)
print(out.columns)  # [("A", 1), ("B", (2, 3)), ("C", 4)]

# Sorting still works if requested:
sorted_out = pd.concat([left, right], axis=1, sort=True)
print(sorted_out.columns)  # sorted order or TypeError if impossible
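To check the "no warning" claim programmatically rather than by eye, one option is to record warnings around the call. The helper below is a sketch using only the standard library. `assert_no_runtime_warning`, `noisy_union`, and `quiet_union` are hypothetical names introduced for illustration; the two union functions merely stand in for the old and patched behaviour.

```python
import warnings


def assert_no_runtime_warning(func, *args, **kwargs):
    """Run func and fail if it emits a RuntimeWarning (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = func(*args, **kwargs)
    offenders = [w for w in caught if issubclass(w.category, RuntimeWarning)]
    assert not offenders, f"unexpected RuntimeWarning: {offenders[0].message}"
    return result


def noisy_union(left, right):
    """Stand-in for the old behaviour: warns, then returns unsorted labels."""
    warnings.warn("sort order is undefined", RuntimeWarning)
    return left + right


def quiet_union(left, right):
    """Stand-in for the patched behaviour: plain concatenation, no warning."""
    return left + right


checked = assert_no_runtime_warning(quiet_union, [("A", 1)], [("C", 4)])

try:
    assert_no_runtime_warning(noisy_union, [("A", 1)], [("C", 4)])
    noisy_detected = False
except AssertionError:
    noisy_detected = True
```

The same helper could wrap the real call, for example `assert_no_runtime_warning(pd.concat, [left, right], axis=1, sort=False)`, since it forwards positional and keyword arguments unchanged.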

…nstances of 'int' and 'tuple', sort order is undefined for incomparable objects with multilevel columns pandas-dev#61477

Fix: concat on mixed-type MultiIndex columns with sort=False should not warn