Ensure `ptr::read` gets all the same LLVM `load` metadata that dereferencing does #109035
Merged
Commits:
b2c717f `MaybeUninit::assume_init_read` should have `noundef` load metadata (scottmcm)
0b96fee Add a codegen test to confirm this fixes 106369 (scottmcm)
1f70bb8 Add a codegen test to confirm this fixes 73258 (scottmcm)
87696fd Add a better approach comment in `ptr::read` to justify the intrinsic (scottmcm)
e7c6ad8 Improved implementation and comments after code review feedback (scottmcm)
dfc3377 Split the mem-replace codegen test (scottmcm)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1135,27 +1135,58 @@ pub const unsafe fn replace<T>(dst: *mut T, mut src: T) -> T { | |
#[rustc_const_unstable(feature = "const_ptr_read", issue = "80377")] | ||
#[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces | ||
pub const unsafe fn read<T>(src: *const T) -> T { | ||
// We are calling the intrinsics directly to avoid function calls in the generated code | ||
// as `intrinsics::copy_nonoverlapping` is a wrapper function. | ||
extern "rust-intrinsic" { | ||
#[rustc_const_stable(feature = "const_intrinsic_copy", since = "1.63.0")] | ||
fn copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: usize); | ||
} | ||
// It would be semantically correct to implement this via `copy_nonoverlapping` | ||
// and `MaybeUninit`, as was done before PR #109035. Calling `assume_init` | ||
// provides enough information to know that this is a typed operation. | ||
|
||
let mut tmp = MaybeUninit::<T>::uninit(); | ||
// SAFETY: the caller must guarantee that `src` is valid for reads. | ||
// `src` cannot overlap `tmp` because `tmp` was just allocated on | ||
// the stack as a separate allocated object. | ||
// However, as of March 2023 the compiler was not capable of taking advantage | ||
// of that information. Thus the implementation here switched to an intrinsic, | ||
// which lowers to `_0 = *src` in MIR, to address a few issues: | ||
// | ||
// Also, since we just wrote a valid value into `tmp`, it is guaranteed | ||
// to be properly initialized. | ||
// - Using `MaybeUninit::assume_init` after a `copy_nonoverlapping` was not | ||
// turning the untyped copy into a typed load. As such, the generated | ||
// `load` in LLVM didn't get various metadata, such as `!range` (#73258), | ||
// `!nonnull`, and `!noundef`, resulting in poorer optimization. | ||
|
||
// - Going through the extra local resulted in multiple extra copies, even | ||
// in optimized MIR. (Ignoring StorageLive/Dead, the intrinsic is one | ||
// MIR statement, while the previous implementation was eight.) LLVM | ||
Review comment:

To show my work for these numbers,

With this PR:

bb0: {
StorageLive(_2); // scope 0 at C:\src\rust\tests\codegen\aaaaaa-mir-dump-demo.rs:9:5: 9:23
_0 = (*_1); // scope 2 at C:\src\rust\library\core\src\ptr\mod.rs:1167:13: 1167:50
StorageDead(_2); // scope 0 at C:\src\rust\tests\codegen\aaaaaa-mir-dump-demo.rs:9:5: 9:23
return; // scope 0 at C:\src\rust\tests\codegen\aaaaaa-mir-dump-demo.rs:10:2: 10:2
}

Previously:

bb0: {
StorageLive(_6); // scope 0 at C:\src\rust\tests\codegen\aaaaaa-mir-dump-demo.rs:12:5: 12:27
StorageLive(_2); // scope 2 at C:\src\rust\library\core\src\ptr\mod.rs:1195:17: 1195:24
StorageLive(_7); // scope 5 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:314:31: 314:33
_2 = MaybeUninit::<T> { uninit: move _7 }; // scope 5 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:314:9: 314:35
StorageDead(_7); // scope 5 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:314:34: 314:35
StorageLive(_3); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:38: 1196:54
StorageLive(_4); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:38: 1196:54
_4 = &mut _2; // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:38: 1196:54
StorageLive(_8); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:42: 1196:54
_8 = &raw mut (*_4); // scope 6 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:569:9: 569:13
_3 = _8 as *mut T (PtrToPtr); // scope 6 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:569:9: 569:33
StorageDead(_8); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:42: 1196:54
StorageDead(_4); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:53: 1196:54
copy_nonoverlapping(dst = move _3, src = _1, count = const 1_usize); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:13: 1196:58
StorageDead(_3); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1196:57: 1196:58
StorageLive(_5); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1197:13: 1197:16
_5 = move _2; // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1197:13: 1197:16
StorageLive(_9); // scope 8 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:627:38: 627:48
_9 = move (_5.1: std::mem::ManuallyDrop<T>); // scope 8 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:627:38: 627:48
_0 = move (_9.0: T); // scope 9 at C:\src\rust\library\core\src\mem\manually_drop.rs:89:9: 89:19
StorageDead(_9); // scope 8 at C:\src\rust\library\core\src\mem\maybe_uninit.rs:627:48: 627:49
StorageDead(_5); // scope 3 at C:\src\rust\library\core\src\ptr\mod.rs:1197:29: 1197:30
StorageDead(_2); // scope 2 at C:\src\rust\library\core\src\ptr\mod.rs:1198:9: 1198:10
StorageDead(_6); // scope 0 at C:\src\rust\tests\codegen\aaaaaa-mir-dump-demo.rs:12:5: 12:27
return; // scope 0 at C:\src\rust\tests\codegen\aaaaaa-mir-dump-demo.rs:13:2: 13:2
}

That's of course not a semantic problem, as both work fine, but I think it makes a nice demonstration of how this ends up being practically useful, especially in conjunction with MIR inlining.
||
// could sometimes optimize them away, but because `read` is at the core | ||
// of so many things, not having them in the first place improves what we | ||
// hand off to the backend. For example, `mem::replace::<Big>` previously | ||
// emitted 4 `alloca` and 6 `memcpy`s, but is now 1 `alloca` and 3 `memcpy`s. | ||
// - In general, this approach keeps us from getting any more bugs (like | ||
// #106369) that boil down to "`read(p)` is worse than `*p`", as this | ||
// makes them look identical to the backend (or other MIR consumers). | ||
// | ||
// Future enhancements to MIR optimizations might well allow this to return | ||
// to the previous implementation, rather than using an intrinsic. | ||
|
||
|
||
// SAFETY: the caller must guarantee that `src` is valid for reads. | ||
unsafe { | ||
assert_unsafe_precondition!( | ||
"ptr::read requires that the pointer argument is aligned and non-null", | ||
[T](src: *const T) => is_aligned_and_not_null(src) | ||
); | ||
copy_nonoverlapping(src, tmp.as_mut_ptr(), 1); | ||
tmp.assume_init() | ||
|
||
#[cfg(bootstrap)] | ||
{ | ||
// We are calling the intrinsics directly to avoid function calls in the | ||
// generated code as `intrinsics::copy_nonoverlapping` is a wrapper function. | ||
extern "rust-intrinsic" { | ||
#[rustc_const_stable(feature = "const_intrinsic_copy", since = "1.63.0")] | ||
fn copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: usize); | ||
} | ||
|
||
// `src` cannot overlap `tmp` because `tmp` was just allocated on | ||
// the stack as a separate allocated object. | ||
let mut tmp = MaybeUninit::<T>::uninit(); | ||
copy_nonoverlapping(src, tmp.as_mut_ptr(), 1); | ||
tmp.assume_init() | ||
} | ||
#[cfg(not(bootstrap))] | ||
{ | ||
crate::intrinsics::read_via_copy(src) | ||
} | ||
|
||
} | ||
} | ||
|
||
|
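For anyone who wants to compare the two strategies outside the standard library, the pre-#109035 approach can be approximated in user code. This is only a sketch under stated assumptions: `read_via_maybe_uninit` is a hypothetical name, it calls the public `ptr::copy_nonoverlapping` function rather than the intrinsic, and it leaves out `assert_unsafe_precondition!`. All three reads below produce the same value; what the PR changes is the quality of the MIR and LLVM IR, not the result.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical user-land version of the previous `ptr::read` strategy:
/// an untyped copy into a `MaybeUninit` local, followed by `assume_init`.
///
/// # Safety
/// `src` must be valid for reads, properly aligned, and point to an
/// initialized `T`.
unsafe fn read_via_maybe_uninit<T>(src: *const T) -> T {
    let mut tmp = MaybeUninit::<T>::uninit();
    // SAFETY: the caller guarantees `src` is valid for reads. `tmp` is a
    // fresh local, so it cannot overlap `src`. After the copy, `tmp` holds
    // a valid `T`, so `assume_init` is sound.
    unsafe {
        ptr::copy_nonoverlapping(src, tmp.as_mut_ptr(), 1);
        tmp.assume_init()
    }
}

fn main() {
    let value = 0x1234_5678_u32;
    let p: *const u32 = &value;

    // SAFETY: `p` is derived from a live reference to an initialized `u32`.
    let (old_style, via_read, via_deref) =
        unsafe { (read_via_maybe_uninit(p), ptr::read(p), *p) };

    assert_eq!(old_style, via_read);
    assert_eq!(via_read, via_deref);
}
```

Building something like this with `--emit=mir` or `--emit=llvm-ir` is one way to inspect how each form is lowered on a given toolchain.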
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
// compile-flags: -O | ||
// ignore-debug (the extra assertions get in the way) | ||
|
||
#![crate_type = "lib"] | ||
|
||
// From <https://github.com/rust-lang/rust/issues/106369#issuecomment-1369095304> | ||
|
||
// CHECK-LABEL: @issue_106369( | ||
#[no_mangle] | ||
pub unsafe fn issue_106369(ptr: *const &i32) -> bool { | ||
// CHECK-NOT: icmp | ||
// CHECK: ret i1 true | ||
// CHECK-NOT: icmp | ||
Some(std::ptr::read(ptr)).is_some() | ||
} |
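The reason this test can demand `ret i1 true` is that a `&i32` is never null, and `Option<&i32>` uses the null pointer to represent `None`. A runnable sketch of the same expression follows; `read_is_some` is a hypothetical safe wrapper added here so it can be called from `main`, not something from the PR.

```rust
use std::mem::size_of;
use std::ptr;

/// Same expression as the `issue_106369` codegen test, behind a safe
/// (hypothetical) wrapper that takes a reference instead of a raw pointer.
fn read_is_some(slot: &&i32) -> bool {
    let ptr: *const &i32 = slot;
    // SAFETY: `ptr` comes from a live reference, so it is valid, aligned,
    // and points at an initialized `&i32`.
    unsafe { Some(ptr::read(ptr)).is_some() }
}

fn main() {
    let x = 7;
    let r = &x;

    // `Option<&T>` is guaranteed to be pointer-sized because null is the
    // niche for `None`.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());

    // With `!nonnull` on the load, the backend can fold this to `true`.
    assert!(read_is_some(&r));
}
```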
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
// compile-flags: -O | ||
// ignore-debug (the extra assertions get in the way) | ||
|
||
#![crate_type = "lib"] | ||
|
||
// Adapted from <https://github.com/rust-lang/rust/issues/73258#issue-637346014> | ||
|
||
#[derive(Clone, Copy)] | ||
#[repr(u8)] | ||
pub enum Foo { | ||
A, B, C, D, | ||
} | ||
|
||
// CHECK-LABEL: @issue_73258( | ||
#[no_mangle] | ||
pub unsafe fn issue_73258(ptr: *const Foo) -> Foo { | ||
// CHECK-NOT: icmp | ||
// CHECK-NOT: call | ||
// CHECK-NOT: br | ||
// CHECK-NOT: select | ||
|
||
// CHECK: %[[R:.+]] = load i8 | ||
// CHECK-SAME: !range ! | ||
|
||
// CHECK-NOT: icmp | ||
// CHECK-NOT: call | ||
// CHECK-NOT: br | ||
// CHECK-NOT: select | ||
|
||
// CHECK: ret i8 %[[R]] | ||
|
||
// CHECK-NOT: icmp | ||
// CHECK-NOT: call | ||
// CHECK-NOT: br | ||
// CHECK-NOT: select | ||
let k: Option<Foo> = Some(ptr.read()); | ||
return k.unwrap(); | ||
} |
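As a runnable illustration of what this test exercises: `Foo` has four variants with discriminants 0 through 3, so a typed load of `Foo` can carry that range, and wrapping the result in `Some` and unwrapping it again should need no comparison at all. The sketch below uses a hypothetical safe helper `roundtrip` and adds `Debug` and `PartialEq` derives purely so the assertions compile.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
pub enum Foo {
    A, B, C, D,
}

/// Same body as the `issue_73258` test, taking a reference so that the
/// (hypothetical) helper itself is safe to call.
fn roundtrip(foo: &Foo) -> Foo {
    let ptr: *const Foo = foo;
    // SAFETY: `ptr` is derived from a live reference to an initialized `Foo`.
    let k: Option<Foo> = Some(unsafe { ptr.read() });
    k.unwrap()
}

fn main() {
    for foo in [Foo::A, Foo::B, Foo::C, Foo::D] {
        // The value survives the `Some(..).unwrap()` round trip unchanged.
        assert_eq!(roundtrip(&foo), foo);
        // Every valid discriminant is below 4, which is the fact a `!range`
        // annotation on the `i8` load communicates to LLVM.
        assert!((foo as u8) < 4);
    }
}
```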
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
// This test ensures that `mem::replace::<T>` only ever calls `@llvm.memcpy` | ||
// with `size_of::<T>()` as the size, and never goes through any wrapper that | ||
// may e.g. multiply `size_of::<T>()` with a variable "count" (which is only | ||
// known to be `1` after inlining). | ||
|
||
// compile-flags: -C no-prepopulate-passes -Zinline-mir=no | ||
// ignore-debug: the debug assertions get in the way | ||
|
||
#![crate_type = "lib"] | ||
|
||
#[repr(C, align(8))] | ||
pub struct Big([u64; 7]); | ||
pub fn replace_big(dst: &mut Big, src: Big) -> Big { | ||
// Before the `read_via_copy` intrinsic, this emitted six `memcpy`s. | ||
std::mem::replace(dst, src) | ||
} | ||
|
||
// NOTE(eddyb) the `CHECK-NOT`s ensure that the only calls of `@llvm.memcpy` in | ||
// the entire output, are the direct calls we want, from `ptr::replace`. | ||
|
||
// CHECK-NOT: call void @llvm.memcpy | ||
|
||
// For a large type, we expect exactly three `memcpy`s | ||
// CHECK-LABEL: define internal void @{{.+}}mem{{.+}}replace{{.+}}sret(%Big) | ||
// CHECK-NOT: alloca | ||
// CHECK: alloca %Big | ||
// CHECK-NOT: alloca | ||
// CHECK-NOT: call void @llvm.memcpy | ||
// CHECK: call void @llvm.memcpy.{{.+}}({{i8\*|ptr}} align 8 %{{.*}}, {{i8\*|ptr}} align 8 %{{.*}}, i{{.*}} 56, i1 false) | ||
// CHECK-NOT: call void @llvm.memcpy | ||
// CHECK: call void @llvm.memcpy.{{.+}}({{i8\*|ptr}} align 8 %{{.*}}, {{i8\*|ptr}} align 8 %{{.*}}, i{{.*}} 56, i1 false) | ||
// CHECK-NOT: call void @llvm.memcpy | ||
// CHECK: call void @llvm.memcpy.{{.+}}({{i8\*|ptr}} align 8 %{{.*}}, {{i8\*|ptr}} align 8 %{{.*}}, i{{.*}} 56, i1 false) | ||
// CHECK-NOT: call void @llvm.memcpy | ||
|
||
// CHECK-NOT: call void @llvm.memcpy |
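To make the connection to `ptr::read` concrete, here is a minimal sketch of a replace operation built from one typed read and one write. `replace_sketch` is a hypothetical name and is not claimed to be the library's exact code; it only shows why the quality of `read` feeds directly into what `mem::replace` hands to the backend. The `56` in the `CHECK` lines is `size_of::<Big>()`.

```rust
use std::{mem, ptr};

#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(8))]
pub struct Big([u64; 7]);

/// Hypothetical stand-in for a replace operation: read the old value out,
/// then write the new one in.
fn replace_sketch<T>(dst: &mut T, src: T) -> T {
    // SAFETY: `dst` is a valid, aligned, initialized `&mut T`. The old value
    // is moved out by `read` and immediately overwritten by `write`, so it
    // is neither duplicated nor dropped twice.
    unsafe {
        let old = ptr::read(dst);
        ptr::write(dst, src);
        old
    }
}

fn main() {
    // 7 * 8 bytes, matching the length argument expected by the test.
    assert_eq!(mem::size_of::<Big>(), 56);

    let mut a = Big([1; 7]);
    let mut b = Big([1; 7]);
    let old_a = replace_sketch(&mut a, Big([2; 7]));
    let old_b = mem::replace(&mut b, Big([2; 7]));

    // The sketch and the standard function agree on both results.
    assert_eq!(old_a, old_b);
    assert_eq!(a, b);
}
```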
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
// compile-flags: -O -Z merge-functions=disabled | ||
// no-system-llvm | ||
// ignore-debug (the extra assertions get in the way) | ||
|
||
#![crate_type = "lib"] | ||
|
||
// Ensure that various forms of reading pointers correctly annotate the `load`s | ||
// with `!noundef` and `!range` metadata to enable extra optimization. | ||
|
||
use std::mem::MaybeUninit; | ||
|
||
// CHECK-LABEL: define noundef i8 @copy_byte( | ||
#[no_mangle] | ||
pub unsafe fn copy_byte(p: *const u8) -> u8 { | ||
// CHECK-NOT: load | ||
// CHECK: load i8, ptr %p, align 1 | ||
// CHECK-SAME: !noundef ! | ||
// CHECK-NOT: load | ||
*p | ||
} | ||
|
||
// CHECK-LABEL: define noundef i8 @read_byte( | ||
#[no_mangle] | ||
pub unsafe fn read_byte(p: *const u8) -> u8 { | ||
// CHECK-NOT: load | ||
// CHECK: load i8, ptr %p, align 1 | ||
// CHECK-SAME: !noundef ! | ||
// CHECK-NOT: load | ||
p.read() | ||
} | ||
|
||
// CHECK-LABEL: define i8 @read_byte_maybe_uninit( | ||
#[no_mangle] | ||
pub unsafe fn read_byte_maybe_uninit(p: *const MaybeUninit<u8>) -> MaybeUninit<u8> { | ||
// CHECK-NOT: load | ||
// CHECK: load i8, ptr %p, align 1 | ||
// CHECK-NOT: noundef | ||
// CHECK-NOT: load | ||
p.read() | ||
} | ||
|
||
// CHECK-LABEL: define noundef i8 @read_byte_assume_init( | ||
#[no_mangle] | ||
pub unsafe fn read_byte_assume_init(p: &MaybeUninit<u8>) -> u8 { | ||
// CHECK-NOT: load | ||
// CHECK: load i8, ptr %p, align 1 | ||
// CHECK-SAME: !noundef ! | ||
// CHECK-NOT: load | ||
p.assume_init_read() | ||
} | ||
|
||
// CHECK-LABEL: define noundef i32 @copy_char( | ||
#[no_mangle] | ||
pub unsafe fn copy_char(p: *const char) -> char { | ||
// CHECK-NOT: load | ||
// CHECK: load i32, ptr %p | ||
// CHECK-SAME: !range ![[RANGE:[0-9]+]] | ||
// CHECK-SAME: !noundef ! | ||
// CHECK-NOT: load | ||
*p | ||
} | ||
|
||
// CHECK-LABEL: define noundef i32 @read_char( | ||
#[no_mangle] | ||
pub unsafe fn read_char(p: *const char) -> char { | ||
// CHECK-NOT: load | ||
// CHECK: load i32, ptr %p | ||
// CHECK-SAME: !range ![[RANGE]] | ||
// CHECK-SAME: !noundef ! | ||
// CHECK-NOT: load | ||
p.read() | ||
} | ||
|
||
// CHECK-LABEL: define i32 @read_char_maybe_uninit( | ||
#[no_mangle] | ||
pub unsafe fn read_char_maybe_uninit(p: *const MaybeUninit<char>) -> MaybeUninit<char> { | ||
// CHECK-NOT: load | ||
// CHECK: load i32, ptr %p | ||
// CHECK-NOT: range | ||
// CHECK-NOT: noundef | ||
// CHECK-NOT: load | ||
p.read() | ||
} | ||
|
||
// CHECK-LABEL: define noundef i32 @read_char_assume_init( | ||
#[no_mangle] | ||
pub unsafe fn read_char_assume_init(p: &MaybeUninit<char>) -> char { | ||
// CHECK-NOT: load | ||
// CHECK: load i32, ptr %p | ||
// CHECK-SAME: !range ![[RANGE]] | ||
// CHECK-SAME: !noundef ! | ||
// CHECK-NOT: load | ||
p.assume_init_read() | ||
} | ||
|
||
// CHECK: ![[RANGE]] = !{i32 0, i32 1114112} |
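The final `CHECK` line pins the range to `!{i32 0, i32 1114112}`. The upper bound is exclusive and equals `char::MAX as u32 + 1`. The sketch below checks that arithmetic and exercises the three kinds of read the test distinguishes; the helper names mirror the test functions but are otherwise illustrative, and nothing here inspects the generated IR.

```rust
use std::mem::MaybeUninit;

/// Typed read: the result must be a valid `char`.
unsafe fn read_char(p: *const char) -> char {
    unsafe { p.read() }
}

/// Read as `MaybeUninit<char>`: no validity claim is made about the bits,
/// which is why the test expects no `!range` or `!noundef` here.
unsafe fn read_char_maybe_uninit(p: *const MaybeUninit<char>) -> MaybeUninit<char> {
    unsafe { p.read() }
}

/// `assume_init_read` asserts initialization, so it behaves like a typed read.
unsafe fn read_char_assume_init(p: &MaybeUninit<char>) -> char {
    unsafe { p.assume_init_read() }
}

fn main() {
    // 0x10FFFF + 1 == 0x110000 == 1114112.
    assert_eq!(char::MAX as u32 + 1, 1_114_112);

    let slot = MaybeUninit::new('R');
    let c = 'R';

    // SAFETY: both pointers come from live, initialized values.
    unsafe {
        assert_eq!(read_char(&c), 'R');
        assert_eq!(read_char_maybe_uninit(&slot).assume_init(), 'R');
        assert_eq!(read_char_assume_init(&slot), 'R');
    }
}
```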