Consider Narwhals to enable multiple dataframe types for free #96

Open
AdrianDAlessandro opened this issue Oct 1, 2024 · 2 comments

@AdrianDAlessandro
Contributor

Currently, pycsvy manually implements separate read and write functions for a few different libraries (pandas, polars, numpy). A tool already exists that unifies the API across some of these libraries: narwhals.

An easy first step would be to use narwhals' write_csv for the write functions.
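As a rough illustration of that first step, here is a minimal sketch of a single write helper that accepts whatever dataframe type narwhals supports. The name write_frame and the header_text parameter are hypothetical and not part of pycsvy; the header is assumed to be already rendered as text by the caller. narwhals is imported lazily so it stays an optional dependency.

```python
def write_frame(path, frame, header_text=""):
    """Write an already-rendered header followed by the frame as CSV.

    Hypothetical helper: `write_frame` and `header_text` are illustrative
    names, not existing pycsvy API.
    """
    import narwhals as nw  # lazy import keeps narwhals optional

    # Wrap the native (pandas, polars, ...) frame in a narwhals DataFrame.
    df = nw.from_native(frame, eager_only=True)

    # With no file argument, write_csv returns the CSV as a string.
    csv_body = df.write_csv()

    # newline="" avoids translating the line endings already in csv_body.
    with open(path, "w", newline="", encoding="utf-8") as fh:
        fh.write(header_text)
        fh.write(csv_body)
```

The point of the design is that one function would replace the per-library write paths, since narwhals dispatches to the native library's own CSV writer.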

While there currently isn't a read_csv function in narwhals, there is an open issue discussing whether to implement one.

Another useful feature of narwhals is its tooling for checking array types and for checking whether imports succeeded, both of which pycsvy currently does manually: https://narwhals-dev.github.io/narwhals/api-reference/dependencies/#narwhals.dependencies.is_numpy_array
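To make the type-checking point concrete, here is a small sketch that uses the narwhals.dependencies helpers to work out which library an object came from. The function name detect_backend and the returned strings are assumptions for illustration only; how pycsvy would actually consume the result is not decided in this thread.

```python
def detect_backend(data):
    """Return a label for the library that `data` belongs to.

    Hypothetical helper: the name and the returned labels are illustrative.
    """
    from narwhals import dependencies as nw_deps  # lazy, optional import

    # Each check is a plain predicate on the object passed in.
    if nw_deps.is_numpy_array(data):
        return "numpy"
    if nw_deps.is_pandas_dataframe(data):
        return "pandas"
    if nw_deps.is_polars_dataframe(data):
        return "polars"
    raise TypeError(f"Unsupported data type: {type(data).__name__}")
```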

@MarcoGorelli

MarcoGorelli commented Oct 2, 2024

Hey @AdrianDAlessandro, thanks for your interest here.

Before determining whether to use Narwhals, do you have an API in mind? I see you currently have:

  • read_to_dataframe (uses pandas.read_csv)
  • read_to_polars (uses polars.read_csv)

Would the idea be that users could call something like the following?

read_to_dataframe("important_data.csv", backend='cudf')

backend would default to 'pandas' (to preserve backwards compatibility for you), and users could specify something else if they like.
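For reference, a minimal sketch of what such a signature could look like. It is not the pycsvy implementation: it skips the header handling that the real read functions perform, and the SUPPORTED_BACKENDS tuple is an assumption. It only relies on pandas, polars and cudf each exposing a top-level read_csv function.

```python
import importlib

# Assumed set of backends; each of these libraries provides `read_csv`.
SUPPORTED_BACKENDS = ("pandas", "polars", "cudf")


def read_to_dataframe(filename, backend="pandas", **kwargs):
    """Read a CSV file with the requested dataframe library.

    Sketch only: header parsing is deliberately left out.
    """
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    try:
        # Import on demand so that only the chosen library must be installed.
        module = importlib.import_module(backend)
    except ImportError as err:
        raise ImportError(f"Backend {backend!r} is not installed") from err
    return module.read_csv(filename, **kwargs)
```

Defaulting to "pandas" keeps existing calls such as read_to_dataframe("important_data.csv") behaving as they do today, while the other backends are opt-in.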

@AdrianDAlessandro
Contributor Author

Hi @MarcoGorelli

To be honest, I hadn't thought much further than what I've already written here. But the kind of API you've suggested is along the lines of what I was imagining.

There's possibly a broader question about what the best API for this tool is in general, but for the sake of keeping the current behaviour consistent, your suggestion is what I was initially thinking.

@dalonsoa dalonsoa added the enhancement New feature or request label Dec 20, 2024
@dalonsoa dalonsoa added this to the v1.1.0 milestone Dec 20, 2024