[WIP] Generate_data with pyarrow dtypes #320


Draft: dougbrn wants to merge 1 commit into main

dougbrn (Collaborator) commented Jul 31, 2025

Fixes #252.

| Before [fe17e19] <v0.4.7> | After [5c0d2c1] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 881M | 984M | 1.12 | benchmarks.ReadFewColumnsS3.peakmem_run |
| 10.2±0.04ms | 11.4±0.1ms | 1.11 | benchmarks.NestedFrameQuery.time_run |
| 66.0±0.7ms | 67.1±2ms | 1.02 | benchmarks.CountNestedBy.time_run |
| 10.8±0.2ms | 11.0±0.5ms | 1.02 | benchmarks.NestedFrameAddNested.time_run |
| 2.71±0.01s | 2.75±0.02s | 1.01 | benchmarks.ReadFewColumnsHTTPS.time_run |
| 249M | 250M | 1 | benchmarks.AssignSingleDfToNestedSeries.peakmem_run |
| 101M | 101M | 1 | benchmarks.NestedFrameAddNested.peakmem_run |
| 1.20±0.01ms | 1.20±0ms | 1 | benchmarks.NestedFrameReduce.time_run |
| 270M | 270M | 1 | benchmarks.ReassignHalfOfNestedSeries.peakmem_run |
| 42.4±0.4ms | 42.3±0.3ms | 1 | benchmarks.ReassignHalfOfNestedSeries.time_run |



Successfully merging this pull request may close these issues.

Change generate_data to use pd.ArrowDtype