Isolate MiriMachine memory from Miri's #4343
@rustbot ready |
6cbc283 to b53ed38
Thanks for the PR! I left some first comments, but this is not a full review. I'd rather not reverse-engineer the invariants of `MachineAlloc` myself, so I'll wait for you to document them, which will make review a lot easier.
Furthermore, all `pub fn` in `discrete_alloc` should have proper doc comments, not just a safety comment. Please also add some basic unit tests -- we don't use them much in Miri, but this is one of the cases where they would make sense.
On Zulip you mentioned some benchmarks. Can you put benchmark results for the variant that you ended up going for here into the PR?
@rustbot author |
Reminder, once the PR becomes ready for a review, use |
I'll post benchmarks in a bit! I realised there might be some speed gains to be made with very simple changes, so I'll just experiment a little first. Thanks for the comments ^^ |
Baseline is set to having the allocator fully disabled. It's only marginally slower in most cases, though it struggles with large allocations it seems. I wonder how much work it would be to improve that, but if we go down the "only machines using this will touch it" path I hope it's not too bad? I got a slight (~4%) improvement from calling
|
Yes. A 32x slowdown with big allocations is hefty.^^ Shouldn't those just forward to |
It does mostly do that, which is what's confusing me... I'll try to fix it, I assume I just missed something really obvious. |
I checked; seems like the
I might be able to squeeze a bit more perf out by actually making the functions generic instead of just passing in a function pointer but shrug, unsure if it's necessary |
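For readers unfamiliar with the distinction mentioned above, here is a minimal sketch of passing a function pointer versus making the function generic. The names `fill_with_ptr`, `fill_generic` and `low_byte` are hypothetical and are not taken from this PR; whether the generic form is measurably faster depends on what the optimizer already does with the pointer version.

```rust
// Hypothetical helpers for illustration only; these are not Miri's functions.

/// Takes a plain function pointer. The call goes through the pointer
/// unless the optimizer manages to see through it.
fn fill_with_ptr(buf: &mut [u8], f: fn(usize) -> u8) {
    for (i, b) in buf.iter_mut().enumerate() {
        *b = f(i);
    }
}

/// Generic over the callable. Each distinct `F` gets its own monomorphized
/// copy, so the compiler sees a direct call that it can inline.
fn fill_generic<F: Fn(usize) -> u8>(buf: &mut [u8], f: F) {
    for (i, b) in buf.iter_mut().enumerate() {
        *b = f(i);
    }
}

/// Example callback: the low byte of the index.
fn low_byte(i: usize) -> u8 {
    (i % 256) as u8
}
```

Both variants accept `low_byte` at the call site, so switching between them is mostly a signature change.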
Ah yes, that is exactly why we added that particular benchmark. :) |
What kind of unit tests do you think belong here? I assumed functionality is covered by the usual tests, but I'll happily add in some stuff if you think it's relevant |
Similar to |
Opened the PR on the main repo, |
Expecting the build to fail for now since it's adapted to the changes from the PR (but also Miri seems to be having trouble on the current upstream master commit, so I guess it's pending that being fixed too) |
Tests added :D let me know if there's anything more to do |
c4ad33f to b832def
32672ff to 1539771
My takeaway so far is that I need to get better at double-checking myself after a refactor. I hope I addressed everything brought up, though |
interpret/allocation: Fixup type for `alloc_bytes` This can be `FnOnce`, which helps us avoid an extra clone in rust-lang/miri#4343 r? RalfJung
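A rough illustration of why an `FnOnce` bound can save a clone. The functions below are hypothetical stand-ins and do not reproduce the real signature of `alloc_bytes` in rustc; they only show the general effect of the bound on what the closure is allowed to do with captured data.

```rust
// Hypothetical stand-ins; not the rustc interpreter API.

/// With an `Fn` bound the callee may invoke the closure repeatedly,
/// so the closure cannot give away data it captured.
fn build_with_fn<F: Fn() -> Vec<u8>>(alloc_bytes: F) -> Vec<u8> {
    alloc_bytes()
}

/// With an `FnOnce` bound the closure is called at most once,
/// so it may move captured data out.
fn build_with_fn_once<F: FnOnce() -> Vec<u8>>(alloc_bytes: F) -> Vec<u8> {
    alloc_bytes()
}

fn demo() -> (Vec<u8>, Vec<u8>) {
    let params = vec![1u8, 2, 3];
    // `Fn`: the closure only borrows `params`, so it has to clone.
    let a = build_with_fn(|| params.clone());
    // `FnOnce`: the closure takes ownership of `params` and returns it as is.
    let b = build_with_fn_once(move || params);
    (a, b)
}
```

Replacing `move || params` in the first call does not compile, because a closure that moves a captured value out only implements `FnOnce`.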
This changed quite a bit since the last benchmark run -- could you re-run the benchmark to see whether that |
Rollup merge of #141682 - nia-e:fixup-alloc, r=RalfJung interpret/allocation: Fixup type for `alloc_bytes` This can be `FnOnce`, which helps us avoid an extra clone in rust-lang/miri#4343 r? RalfJung
Ok! So there's a small problem, namely that the jemalloc implementation rust uses by default is really slow with page-aligned allocations it seems; the reason perf was good before is that I forgot to actually enforce page-alignment in the However, if we do the huge allocs by directly calling
And this is what I got calling
|
What alignment did you set before?
It's definitely legal, and it's also possible to guarantee. You "just" have to allocate more pages than needed and then round up the pointer you got to the needed alignment. jemalloc likely does that if you ask for a high alignment. It's non-trivial to implement this correctly though, in particular regarding deallocation, so I'd rather not have this in the codebase. Given that the regression only affects native-libs mode, I am also fine with just taking the hit for now and having an issue to track the problem. We can then ping some people there that might know more about this; I am fairly clueless when it comes to allocators. |
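The over-allocate-and-round-up approach described above could look roughly like this. This is a sketch under stated assumptions, not code from the PR and not what jemalloc does internally: `OverAligned` and `align_up` are hypothetical names, and the buffer is requested through `std::alloc` with alignment 1 purely to make the rounding step visible.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Rounds `addr` up to the next multiple of `align` (must be a power of two).
fn align_up(addr: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}

/// Hypothetical wrapper: owns an over-sized buffer and hands out an
/// aligned pointer into it.
struct OverAligned {
    base: *mut u8,    // pointer actually returned by the allocator
    layout: Layout,   // layout used for the allocation, needed to free it
    aligned: *mut u8, // `base` rounded up to the requested alignment
}

impl OverAligned {
    fn new(size: usize, align: usize) -> Option<Self> {
        assert!(size > 0 && align.is_power_of_two());
        // Ask for `align - 1` spare bytes so an aligned start always fits.
        let total = size.checked_add(align - 1)?;
        let layout = Layout::from_size_align(total, 1).ok()?;
        // SAFETY: `layout` has non-zero size.
        let base = unsafe { alloc(layout) };
        if base.is_null() {
            return None;
        }
        let offset = align_up(base as usize, align) - base as usize;
        // SAFETY: `offset <= align - 1`, so the result stays inside the buffer.
        let aligned = unsafe { base.add(offset) };
        Some(Self { base, layout, aligned })
    }

    fn as_ptr(&self) -> *mut u8 {
        self.aligned
    }
}

impl Drop for OverAligned {
    fn drop(&mut self) {
        // The tricky part: free with the original pointer and layout,
        // never with the rounded-up pointer.
        unsafe { dealloc(self.base, self.layout) };
    }
}
```

The extra bookkeeping in `Drop` is the deallocation concern raised above: the aligned pointer alone is not enough to free the memory, so the original pointer and layout have to be kept somewhere.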
When refactoring I accidentally just left in whatever align was passed in to us (oops). But if you prefer it, I can just leave it as it was before and take the hit, or try to get the |
I'm pretty sure the offending allocations in big-allocs are the big ones, so that wouldn't help. I'll be a lot more busy soon than I was recently and thus have less review capacity, so I'd rather land this PR today or tomorrow (in particular the glue code outside the new allocator impl). So I would propose we stick to jemalloc for now, you file an issue for the perf problem, and then if you want to make a follow-up PR that uses |
6b43f8f to 9028dfa
Hopefully this addresses everything then? I also removed the extra clone in |
Gah I messed up, one sec |
There. Is this fine? |
Please squash, I'll take a look later :) |
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
allow multiple seeds use bitsets fix xcompile listened to reason and made my life so much easier fmt
Update src/machine.rs
Co-authored-by: Ralf Jung <[email protected]>
fixups avoid some clones
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
address review
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
fixup comment
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
address review pt 2 nit rem fn
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
Update src/alloc/isolated_alloc.rs
Co-authored-by: Ralf Jung <[email protected]>
address review unneeded unsafe
I have done some lite refactoring, could you take a look if it makes sense to you? |
Everything checks out, ty! Should I squash it also? |
Thanks! No need to :) |
Based on discussion surrounding #4326, this merges in the (very simple) discrete allocator that the MiriMachine will have. See the design document linked there for considerations, but in brief: we could pull in an off-the-shelf allocator for this, but performance isn't a massive worry and doing it this way might make it easier to enable support for doing multi-seeded runs in the future (without a lot more extraneous plumbing, at least).