Poor performance for large allocations in native-lib mode #4357

Open
3 tasks
nia-e opened this issue May 29, 2025 · 4 comments
Labels
A-native Area: calling native functions via FFI C-bug Category: This is a bug. I-slow Impact: Makes Miri even slower than it already is

Comments

@nia-e
Contributor

nia-e commented May 29, 2025

As of #4343, the benchmark results for huge_allocs are significantly worse (~20x) when the isolated allocator is enabled, i.e. in native-lib mode. This could be alleviated by using mmap internally instead, which did not show this slowdown, but that may require overallocating if the requested alignment is greater than the system page size.

Todos / open questions:

  • Why is calling alloc::alloc() so much slower when asking for page-aligned memory?
  • If it's fixable, bug the relevant people / open a PR to fix this
  • If it's not fixable, consider switching the isolated allocator over to mmapping its memory instead
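
The overallocation mentioned above can be sketched as plain arithmetic. This is only an illustration, not Miri's actual allocator code: the helper names `align_up`, `mapping_len`, and `carve` are hypothetical, and the sketch assumes that a mapping's base address is page-aligned and that both the page size and the requested alignment are powers of two.

```rust
/// Round `addr` up to the next multiple of `align`.
/// Assumes `align` is a power of two (hypothetical helper, not a Miri API).
fn align_up(addr: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}

/// Number of bytes to request from a page-granular mapping so that a block of
/// `size` bytes aligned to `align` is guaranteed to fit somewhere inside it.
fn mapping_len(size: usize, align: usize, page_size: usize) -> usize {
    let size = align_up(size, page_size);
    if align <= page_size {
        // A page-aligned base already satisfies the requested alignment.
        size
    } else {
        // The base is only page-aligned, so in the worst case the first
        // suitably aligned address is `align - page_size` bytes into the mapping.
        size + (align - page_size)
    }
}

/// For a page-aligned mapping starting at `base`, return the unused bytes
/// before and after the aligned block as `(head, tail)`. Both are multiples
/// of the page size, so they could be unmapped again.
fn carve(base: usize, size: usize, align: usize, page_size: usize) -> (usize, usize) {
    let len = mapping_len(size, align, page_size);
    let start = align_up(base, align);
    let head = start - base;
    let tail = len - head - align_up(size, page_size);
    (head, tail)
}
```

With a 4096-byte page and a 16384-byte alignment, a 10000-byte request would need a 24576-byte mapping, of which 12288 bytes end up as slack split between the head and the tail depending on where the mapping lands.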
@RalfJung
Member

@bjorn3 @lqd @nnethercote do you have any idea why page-aligned multiple-of-page-size allocations are slowing down jemalloc so much?

@RalfJung RalfJung added C-bug Category: This is a bug. I-slow Impact: Makes Miri even slower than it already is A-native Area: calling native functions via FFI labels May 29, 2025
@nnethercote
Contributor

> @bjorn3 @lqd @nnethercote do you have any idea why page-aligned multiple-of-page-size allocations are slowing down jemalloc so much?

Nope, but I will try summoning @glandium, who has forgotten more about jemalloc than I will ever know, in case he feels like answering a random question... (Hi Mike!)

@glandium

I'm not sure what might be going on here, especially regarding the scale of the mentioned difference. I'm also not that familiar with very recent versions of jemalloc. I would advise looking at profiles (and maybe also comparing the difference across platforms).

If I were to venture a guess, it could be the kernel zeroing fresh pages in the process of those allocations.

(Hey Nick!)

@nia-e
Contributor Author

nia-e commented May 31, 2025

I doubt it's the kernel. mmapping fresh pages (which are definitely zeroed) was almost exactly tied with jemallocing 16-byte-aligned memory when both were requested as zeroed. The perf hit only appeared when the size was left unchanged but the alignment passed to jemalloc was raised to the system page size; even hardcoding 4096 in the align field caused the same perf hit. It also seemed to vary a lot between runs: I saw ~8.5x slowdown on some and almost 30x on others, whereas both mmap and low-alignment jemalloc had very consistent times (+/- 5% or so on any given run).
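
A comparison along these lines could be timed with something like the sketch below. It is not the huge_allocs benchmark itself: the function name `time_zeroed_allocs` is hypothetical, and the code goes through whatever global allocator the binary uses, so it only exercises jemalloc if jemalloc has been installed as the global allocator (an assumption this snippet does not set up).

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::time::{Duration, Instant};

/// Time `iters` zeroed allocations of `size` bytes at alignment `align`.
/// Hypothetical helper for illustration only.
fn time_zeroed_allocs(size: usize, align: usize, iters: usize) -> Duration {
    assert!(size > 0, "zero-sized allocations are not allowed here");
    let layout = Layout::from_size_align(size, align).expect("invalid layout");
    let start = Instant::now();
    for _ in 0..iters {
        // SAFETY: the layout has non-zero size, the pointer is checked for
        // null before use, and it is freed with the same layout.
        unsafe {
            let ptr = alloc_zeroed(layout);
            assert!(!ptr.is_null(), "allocation failed");
            assert_eq!(ptr as usize % align, 0);
            // Touch the memory so the allocation cannot be optimized away.
            std::ptr::write_volatile(ptr, 1);
            dealloc(ptr, layout);
        }
    }
    start.elapsed()
}
```

Calling it once with `align = 16` and once with `align = 4096` at the same `size` gives the two timings to compare; given the run-to-run variance described above, several repetitions of each would be needed before drawing conclusions.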


4 participants