api: move `polars.from_dataframe` to `polars.interchange.from_dataframe`, or even consider removing?
#20065
Labels
A-api
Area: changes to the public API
A-interchange
Area: Python dataframe interchange protocol
enhancement
New feature or an improvement of an existing feature
`polars.from_dataframe` uses the dataframe interchange protocol to convert between dataframes. However, users aren't necessarily aware of this, nor are they aware of the limitations of the interchange protocol. Proof: at EuroSciPy this year I saw someone use `pl.from_dataframe` to convert from pandas to Polars because they thought that was just the recommended way of doing it.

The reason this matters is that the interchange protocol is tied down to the pandas/Polars implementations. And the pandas implementation had some severe (critical?) bugs before 2.2, which users wouldn't necessarily be aware of. A new Polars user could well use `pl.from_dataframe` with a pandas dataframe, get nonsense data back, and blame Polars even though the bug was on the pandas side.

So, my first inclination would be to move `polars.from_dataframe` to `polars.interchange.from_dataframe`.
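
For illustration, a move like this is usually staged with a forwarding shim that keeps the old name working while warning about it. The sketch below is a rough, standard-library-only mock-up and not Polars' actual deprecation machinery: `_from_dataframe_impl` and the `interchange` namespace object are hypothetical stand-ins for the real function and submodule.

```python
import warnings
from types import SimpleNamespace


def _from_dataframe_impl(df):
    """Hypothetical stand-in for the real conversion; returns its input unchanged."""
    return df


# Hypothetical new home for the function, mimicking a `polars.interchange` submodule.
interchange = SimpleNamespace(from_dataframe=_from_dataframe_impl)


def from_dataframe(df):
    """Old top-level name: emit a DeprecationWarning, then forward to the new location."""
    warnings.warn(
        "`from_dataframe` has moved to `interchange.from_dataframe`; "
        "the top-level name is deprecated.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this shim
    )
    return interchange.from_dataframe(df)
```

With a shim of this shape, existing callers keep working for a deprecation period, and the namespaced name makes it more visible that the interchange protocol is involved.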
But second, we should probably have a conversation about whether we should keep it at all. Polars now supports the PyCapsule Interface for both import and export, so anyone wishing to convert between dataframes in a library-agnostic way can already use that. It also opens the door to agnostically accessing the underlying data from, say, C or Rust.
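
To make the two routes concrete, here is a minimal sketch of a converter that prefers the PyCapsule Interface and only falls back to the interchange protocol. `pick_conversion_path` and `to_polars` are hypothetical helper names, not part of Polars. The sketch assumes that a recent Polars accepts objects exposing `__arrow_c_stream__` directly in the `pl.DataFrame` constructor, and that the producing library (for pandas, a version with PyCapsule export plus pyarrow) implements that method.

```python
def pick_conversion_path(obj) -> str:
    """Report which protocol `obj` advertises, preferring the Arrow PyCapsule Interface."""
    if hasattr(obj, "__arrow_c_stream__"):
        return "pycapsule"
    if hasattr(obj, "__dataframe__"):
        return "interchange"
    return "unsupported"


def to_polars(obj):
    """Convert `obj` to a Polars DataFrame via the preferred available protocol."""
    # Imported lazily so that pick_conversion_path has no third-party dependencies.
    import polars as pl

    path = pick_conversion_path(obj)
    if path == "pycapsule":
        # Assumption: the constructor consumes the object's Arrow C stream.
        return pl.DataFrame(obj)
    if path == "interchange":
        # Fallback: the dataframe interchange protocol, with its known limitations.
        return pl.from_dataframe(obj)
    raise TypeError(f"cannot convert object of type {type(obj).__name__!r}")
```

A call such as `to_polars(pandas_df)` would then take the PyCapsule route whenever the pandas object exposes `__arrow_c_stream__`, and never touch the interchange implementation in that case.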
In terms of ecosystem:
What I feel bad about is that Stijn was the only person to have read the interchange protocol spec carefully enough to have come up with a correct and useful implementation, and it would be a pity to see that effort go to waste. Do we take ownership of the interchange protocol and drive it forwards, or just let it sink and encourage the PyCapsule Interface for the same use cases?
TL;DR: