Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Experimental] Split out an UnsafeCoerceUnsized trait from CoerceUnsized #88239

Closed
wants to merge 2 commits into from

Conversation

crlf0710
Copy link
Member

@crlf0710 crlf0710 commented Aug 22, 2021

This is taking a more aggressive but more neutral approach than #88010.

Currently this doesn't fully work, and it's just a prototype meant for discussion.

Motivation

In current rust, this code compiles:

#![feature(unsize)]

use std::marker::Unsize;

fn unsize<A, B>(v: *mut A) -> *mut B
where
    B: ?Sized,
    A: Unsize<B>,
{
    v
}

However, there's no guarentee at all that v points to a valid A typed value. What's more, when v is a fat pointer, it's pointee metadata might be any scalar value, if the pointer is created with something like std::ptr::from_raw_parts.

On the other hand, the unsizing coercion itself is based on the proof of A: Unsize<B>, basically saying that this is a type A that can be “unsized” to a dynamically-sized type B. It's reasonable to base this on A value is valid.

In recently implemented trait upcasting coercion feature(see #65991) we especially need this assumption to hold, because during the actual conversioin, we'll read the vtable metadata to retrieve the new metadata after the conversion. This is impossible if the input is not a valid pointer for A type (we need the metadata be valid, including when the pointer itself is a null pointer).

So i'm experimentally make this change, by marking raw pointer unsizing operations unsafe. It's up to the user to guarentee the pointer validity before performing the unsizing conversion. The code fragment above will need an unsafe block to pass compilation.

Changes in this PR

  • Splitted the CoerceUnsized trait into trait UnsafeCoerceUnsized{} and unsafe trait CoerceUnsized: UnsafeCoerceUnsized{}, the former for typechecking and the later for user-facing usages and unsafety checking.
  • Migrated all typechecking usages to use UnsafeCoerceUnsized
  • Removed implementation of CoerceUnsized for raw pointers, leaving only UnsafeCoerceUnsized implementation for them.
  • During unsafety checking, check whether CoerceUnsized obligation is fulfilled, if not, raise unsafety violation, so if it's not in an unsafe block environment, it will create a compilation error.

Status

Remaining issues if this approach is taken:

  • Trait name might need to be changed: Currently UncheckedCoerceUnsized is suggested in review.
  • Maybe a future-incompat-lint is needed for migrating all the existing code that performs unsizing on raw pointers.
  • There is a problem in unsafety checking code that for code in this pattern:
{
    other_statements;
    unsafe { ... }
}

the coercion cast of the tail expression should be considered "inside" the unsafe block, but it doesn't, and report an unsafe block needed error. (This is blocking CI).

  • Unnecessary unsizing operations:
trait Foo {}

fn foo(v: * mut dyn Foo) -> * mut dyn Foo { v }

is currently an unsizing cast at MIR level: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e99e4181ea4edcfbb74e56352b899572

  • Rename CoerceUnsized to UnsafeCoerceUnsized in the ir code that chalk is using.
  • Tests and error code documentation update.

I think this will need a broader discussion first.

r? @RalfJung

@crlf0710 crlf0710 added A-DSTs Area: Dynamically-sized types (DSTs) F-trait_upcasting `#![feature(trait_upcasting)]` F-coerce_unsized The `CoerceUnsized` trait labels Aug 22, 2021
@rust-highfive
Copy link
Collaborator

Some changes occured to rustc_codegen_cranelift

cc @bjorn3

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 22, 2021
@bjorn3
Copy link
Member

bjorn3 commented Aug 22, 2021

cc @ojeda I believe rust-for-linux implements CoerceUnsized somewhere inside the kernel crate. Or maybe it was just Receiver that was implemented?

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-10 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
   Compiling std_detect v0.1.5 (/checkout/library/stdarch/crates/std_detect)
   Compiling miniz_oxide v0.4.0
   Compiling object v0.26.1
   Compiling addr2line v0.16.0
error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
495 |             Box::into_raw(Box::new(contents))
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid

error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
    |
508 |             unsafe { Box::into_raw(Box::new(self.0)) }
    |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid

error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
    |
508 |             unsafe { Box::into_raw(Box::new(self.0)) }
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid

error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
    |
570 |             unsafe { Box::into_raw(data) }
    |                      ^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid

error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
    |
570 |             unsafe { Box::into_raw(data) }
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid

error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
    |
657 |             unsafe { Box::into_raw(mem::replace(&mut self.0, Box::new(()))) }
    |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid

error[E0133]: performing unsizing coercion is unsafe and requires unsafe block
    |
    |
657 |             unsafe { Box::into_raw(mem::replace(&mut self.0, Box::new(()))) }
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ performing unsizing coercion
    |
    = note: unsizing coercion performed on a type that did not implement `CoerceUnsized` will cause undefined behavior if data is invalid
For more information about this error, try `rustc --explain E0133`.
error: could not compile `std` due to 7 previous errors
Build completed unsuccessfully in 0:05:09

@crlf0710 crlf0710 added S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 22, 2021
/// See the [DST coercion RFC][dst-coerce] and [the nomicon entry on coercion][nomicon-coerce]
/// for more details.
///
/// When this trait is implemented in addition to [`UnsafeCoerceUnsized`](unsafe-coerce-unsized),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand the reasoning behind it, I think it's confusing to haveCoerceUnsized be an unsafe trait, and UnsafeCoerceUnsized be a safe trait.

We've used 'unchecked' in the names of unsafe functions (e.g. get_unchecked) - I think it might be clearer to rename UnsafeCoerceUnsized to UncheckedCoerceUnsized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks!

@Aaron1011
Copy link
Member

It seems to me that CoerceUnsized/UnsafeCoerceUnsized has two related purposes:

  1. Indicate to the compiler that an 'unsizing' is allowed to be performed. User code cannot perform any of these 'unsizings' without language support (e.g. converting &[u8; N] to &[u8]), since there are no stable guarantees about the format of fat pointers.
  2. Make it convenient to perform the above 'unsizing' operations. In safe code, this is very useful (e.g. passing a reference to an array to a function expecting a slice, or a Box<T> to a function expecting a Box<dyn Trait>).

However, I'm not sure that combining these two purposes is a good idea in unsafe code:

  1. Writing unsafe { local_variable } looks very odd - at first glance, it seems like the read itself from local_variable is unsafe, which isn't true.
  2. Without explicit type annotations (e.g. let val: &dyn Foo = unsafe { some_trait_object }), the actual coercion performed might accidentally change if another part of the function is modified. In some ways, this is similar to using std::mem::transmute without any explicit type arguments.
  3. When dealing with raw pointers and unsafe code, I think there's much less value to having syntactic sugar. Similar to how we don't have something like 'unsafe Deref coercions' for raw pointers (e.g. allowing raw_pointer.field_from_deref inside an unsafe block), I think it would be better for any casting/coercions to be explicit if they may directly trigger undefined behavior.

Instead, I think it would be better to require an explicit opt-in at the call site when using UnsafeCoerceUnsized. One way would be to use the as keyword - it can already be used for stable unsizing coercions (e.g. Box::new(val) as Box<dyn Trait>). 'Fully safe' coercions defined via implementations of CoerceUnsized could still happen 'automatically', since they cannot directly cause undefined behavior.

@ojeda
Copy link
Contributor

ojeda commented Aug 23, 2021

cc @ojeda I believe rust-for-linux implements CoerceUnsized somewhere inside the kernel crate. Or maybe it was just Receiver that was implemented?

Yeah, we implement it in our refcounted pointer:

// This is to allow coercion from `Ref<T>` to `Ref<U>` if `T` can be converted to the
// dynamically-sized type (DST) `U`.
impl<T: ?Sized + Unsize<U>, U: ?Sized> core::ops::CoerceUnsized<Ref<U>> for Ref<T> {}

@bors
Copy link
Contributor

bors commented Aug 25, 2021

☔ The latest upstream changes (presumably #87875) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Aug 25, 2021
@RalfJung
Copy link
Member

I'm sorry I will not be able to review this PR. I am not familiar with most of the code it touches, and I am moving which keeps me rather busy currently.

@RalfJung
Copy link
Member

RalfJung commented Aug 25, 2021

I think this will need a broader discussion first.

Cc @rust-lang/lang for this. I think this might need to go through some t-lang process (MCP or so); this sounds like something where a bit of design work is needed.

There's also a broader question here: what do the validity and safety invariants of *const dyn A say about the metadata ptr? Some possible choices:

  • nothing, i.e., it can be anything that's valid/safe for *const ()
  • validity: it must be non-null; safety: it must be actually a valid vtable for A

If we go with the latter choice, all these casts could actually be safe I think. "Non-null" is currently definitely part of the validity invariant, based on the metadata that rustc generates for LLVM (but do we really want this?). The size_of_val_raw functions are unsafe to ensure future compatibility with the first choice, but here there are also possible considerations around user-defined unsized types, e.g. a *const MyCStr where the size is determined by strlen. I am not sure what the ptr metadata APIs do (Cc @SimonSapin).

But my thinking is that we should make a choice here for the validity and safety of raw ptr metadata, and only adjust the unsizing trait family if that is still needed after we made a choice.

@crlf0710
Copy link
Member Author

@RalfJung indeed! I discussed this a bit with niko today on Zulip. we decide to create a separate repo for this initiative: https://github.com/nikomatsakis/dyn-upcasting-coercion-initiative to collect the design questions and discussions.

There's a dedicated page on this: https://github.com/nikomatsakis/dyn-upcasting-coercion-initiative/blob/master/design-questions/upcast-safety.md and we're trying to collect thoughts here.

We also want to write this up into something like a lang-team design meeting proposal after some preparation.

@crlf0710
Copy link
Member Author

crlf0710 commented Aug 27, 2021

This experiment has fulfilled its goals, and this is likely not the preferred approach. But the feedbacks are valuable. Closing this PR for now, and will reopen if needed later.

@crlf0710 crlf0710 closed this Aug 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-DSTs Area: Dynamically-sized types (DSTs) F-coerce_unsized The `CoerceUnsized` trait F-trait_upcasting `#![feature(trait_upcasting)]` S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants