Make TypeId const comparable #142789

Open: oli-obk wants to merge 8 commits into master

Conversation

oli-obk
Contributor

@oli-obk oli-obk commented Jun 20, 2025

This should unblock stabilizing const TypeId::of and allow us to progress into any possible future we want to take TypeId to.

To achieve that, TypeId now contains 16 / size_of::<usize>() pointers, each of which is actually just size_of::<usize>() bytes of the stable hash. At compile time, the first of these pointers cannot be dereferenced or otherwise inspected (at present, doing so might ICE the compiler). Preventing inspection of this data allows us to refactor TypeId to any other scheme in the future without breaking anyone who was tempted to transmute TypeId to obtain the hash at compile time.
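
As a rough illustration of the layout described above, the following sketch splits a 16-byte hash into 16 / size_of::<usize>() pointer-sized chunks and reassembles it. It is only a model: the names `ChunkedId`, `split_hash` and `join_hash` are hypothetical, plain `usize` values stand in for the pointers, and nothing here reproduces the provenance that makes the first chunk uninspectable at compile time.

```rust
use std::mem::size_of;

/// Number of pointer-sized chunks needed to hold a 16-byte hash.
const CHUNKS: usize = 16 / size_of::<usize>();

/// Hypothetical stand-in for the chunked representation (not the real `TypeId`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ChunkedId {
    data: [usize; CHUNKS],
}

/// Splits a 16-byte hash into pointer-sized chunks, in native byte order.
fn split_hash(hash: u128) -> ChunkedId {
    let bytes = hash.to_ne_bytes();
    let mut data = [0usize; CHUNKS];
    for (i, chunk) in bytes.chunks_exact(size_of::<usize>()).enumerate() {
        let mut buf = [0u8; size_of::<usize>()];
        buf.copy_from_slice(chunk);
        data[i] = usize::from_ne_bytes(buf);
    }
    ChunkedId { data }
}

/// Reassembles the original 16-byte hash from the chunks.
fn join_hash(id: ChunkedId) -> u128 {
    let mut bytes = [0u8; 16];
    for (i, word) in id.data.iter().enumerate() {
        let start = i * size_of::<usize>();
        bytes[start..start + size_of::<usize>()].copy_from_slice(&word.to_ne_bytes());
    }
    u128::from_ne_bytes(bytes)
}
```

On a 64-bit target this yields two chunks, on a 32-bit target four; `join_hash(split_hash(h))` returns `h` in both cases.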

cc @eddyb for their previous work on #95845 (which we still can do in the future if we want to get rid of the hash as the final thing that declares two TypeIds as equal).

r? @RalfJung

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 20, 2025
@rustbot
Collaborator

rustbot commented Jun 20, 2025

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rust-log-analyzer

This comment has been minimized.

@oli-obk oli-obk force-pushed the constable-type-id branch from 3cddd21 to 1fd7b66 Compare June 21, 2025 10:20
@RalfJung
Member

It will be a while until I have the capacity to review a PR of this scale.

Meanwhile, could you say a bit more about the architecture of the change? It seems you went for the "new kind of allocation" approach, but it's not clear from the PR description how exactly that shows up in TypeId.

Also, I am definitely not comfortable landing this by myself; I can only review the const-eval parts. Changing the representation of TypeId has ramifications well beyond those, which I do not feel qualified to evaluate -- I think an MCP would be justified.

@rustbot
Collaborator

rustbot commented Jun 21, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@oli-obk
Contributor Author

oli-obk commented Jun 21, 2025

Well, I got private feedback yesterday that instead of encoding a 16-byte value as an 8-byte pointer to the 16-byte value plus an 8-byte hash, I should just split the type id internally into pointer-sized chunks and have codegen turn them back into a hash.

TLDR: no changes to runtime type id anymore in the latest revision of this PR. Only compile-time type id is now a bit funny
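
Since the runtime representation is left alone in this revision, code that only uses the stable API should observe no difference. A minimal sketch of that stable surface follows; `same_type` and `is_string` are illustrative helper names, not part of this PR.

```rust
use std::any::{Any, TypeId};

/// Returns true when `A` and `B` are the same type.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

/// Checks the dynamic type behind a `&dyn Any`.
fn is_string(value: &dyn Any) -> bool {
    value.type_id() == TypeId::of::<String>()
}
```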

@oli-obk
Contributor Author

oli-obk commented Jun 21, 2025

> It will be a while until I have the capacity to review a PR of this scale.

I'm splitting unrelated parts out, so the high level feedback is already useful and I'll look for libs and codegen ppl to review the appropriate parts

jdonszelmann added a commit to jdonszelmann/rust that referenced this pull request Jun 23, 2025
Make `PartialEq` a `const_trait`

r? `@fee1-dead` or `@compiler-errors`

something generally useful but also required for rust-lang#142789
rust-timer added a commit that referenced this pull request Jun 23, 2025
Rollup merge of #142822 - oli-obk:const-partial-eq, r=fee1-dead

Make `PartialEq` a `const_trait`

r? ``@fee1-dead`` or ``@compiler-errors``

something generally useful but also required for #142789
@bors
Collaborator

bors commented Jun 23, 2025

☔ The latest upstream changes (presumably #142906) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the constable-type-id branch 2 times, most recently from b8a7a10 to 1c47a64 Compare June 24, 2025 09:25
@oli-obk oli-obk force-pushed the constable-type-id branch from 1c47a64 to bcb4aa2 Compare June 24, 2025 13:02
@oli-obk oli-obk force-pushed the constable-type-id branch from afa74de to 6fea6c3 Compare June 27, 2025 13:58
@rustbot
Collaborator

rustbot commented Jun 27, 2025

The Miri subtree was changed

cc @rust-lang/miri

@oli-obk oli-obk force-pushed the constable-type-id branch from 6fea6c3 to 2e8cece Compare June 27, 2025 14:24
@bors
Collaborator

bors commented Jun 27, 2025

☔ The latest upstream changes (presumably #143091) made this pull request unmergeable. Please resolve the merge conflicts.

@RalfJung
Member

@rustbot author

@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 28, 2025
@rustbot
Collaborator

rustbot commented Jun 28, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@rustbot rustbot added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 28, 2025
@oli-obk oli-obk force-pushed the constable-type-id branch from 2e8cece to aa4c156 Compare June 30, 2025 13:19
@oli-obk oli-obk force-pushed the constable-type-id branch from aa4c156 to 8a42250 Compare June 30, 2025 15:34
@oli-obk
Contributor Author

oli-obk commented Jun 30, 2025

Having provenance on some bytes still means we'd appropriately error out if someone just transmutes the entire thing to u128 in const-eval, or something like that. We'd only be truly exposing the "tail", i.e., everything after the first ptr-sized chunk.

I implemented this in 8a42250

It made things a lot cleaner and simpler

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 30, 2025
@bors
Collaborator

bors commented Jul 3, 2025

☔ The latest upstream changes (presumably #143350) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the constable-type-id branch from 82aa2a9 to 871e4eb Compare July 3, 2025 08:22
Member

@RalfJung RalfJung left a comment

Okay yes I think I can live with this approach. :)

Comment on lines +35 to +37
let size = Size::from_bytes(16);
let align = tcx.data_layout.pointer_align;
let mut alloc = Allocation::new(size, *align, AllocInit::Uninit, ());
Member

Any chance we can assert that this matches the size and align of the real TypeId?

@@ -29,6 +31,37 @@ pub(crate) fn alloc_type_name<'tcx>(tcx: TyCtxt<'tcx>, ty: Ty<'tcx>) -> ConstAll
tcx.mk_const_alloc(alloc)
}

pub(crate) fn alloc_type_id<'tcx>(tcx: TyCtxt<'tcx>, ty: Ty<'tcx>) -> AllocId {
Member

Suggested change
pub(crate) fn alloc_type_id<'tcx>(tcx: TyCtxt<'tcx>, ty: Ty<'tcx>) -> AllocId {
/// This returns the `AllocId` of a place where a [`TypeId`](https://doc.rust-lang.org/nightly/std/any/struct.TypeId.html) for the given `ty` is stored.
pub(crate) fn alloc_type_id<'tcx>(tcx: TyCtxt<'tcx>, ty: Ty<'tcx>) -> AllocId {

I think ideally this would return an OpTy rather than an AllocId, makes it conceptually much more clear -- we don't even want a place here.

Or, if we want to optimize for performance, this function should take as argument an MPlaceTy for where to store the TypeId.

Contributor Author

I have a local commit for making it write to a dest directly. Which is a change I want to do for all the intrinsics that are currently generating a ConstValue and writing that from within the interpreter

I can land that first on master if you prefer. Or just move the commit for typeid into this PR


// Give the first pointer-size bytes provenance that knows about the type id

let alloc_id = tcx.reserve_and_set_type_id_alloc(ty);
Member

There are two AllocIds in this function, this one and the one being returned. Please give it a more clearly distinguished name.

}
}
}
// These are controlled by rustc and not available for CTFE
GlobalAlloc::Type { .. } => skip_recursive_check = true,
Member

Why is this needed for Type but not for functions and vtables?

GlobalAlloc::Type { .. } => {
    // Drop the provenance, the offset contains the bytes of the hash
    let llval = self.const_usize(offset.bytes());
    return unsafe { llvm::LLVMConstIntToPtr(llval, llty) };
Member

Why do we have to emit an int-to-ptr cast here? I think we should avoid that if at all possible...

@@ -461,6 +461,28 @@ pub trait EvalContextExt<'tcx>: crate::MiriInterpCxExt<'tcx> {
// Make these a NOP, so we get the better Miri-native error messages.
}

"type_id_eq" => {
Member

I think what this should do is actually look up the two first pointers, get the corresponding Ty, and compare those. That also ensures that these are indeed (looking like) TypeId.

We can then also compare the raw bytes to catch programs doing nasty things, but comparing the actual true types is the main operation IMO. I'd rather not compare provenance as that's abstract data that cannot exist at runtime (as opposed to the Type allocations which we actually could manifest).

Please add a Miri test that transmutes two u128 to TypeId and compares them; that should be UB.
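
The comparison order suggested here can be sketched with a toy model. Everything in it is an assumption made for illustration: `ModelTypeId`, `EqError` and `model_type_id_eq` are invented names, a string stands in for the `Ty` that the interpreter would recover from the first pointer, and treating a disagreement between types and hash bytes as an error is one possible reading of "catch programs doing nasty things", not the interpreter's actual behavior.

```rust
/// Toy model of a compile-time type id (not the interpreter's real data structures).
#[derive(Clone, Copy, Debug)]
struct ModelTypeId {
    /// Stand-in for the `Ty` recovered from the first pointer;
    /// `None` models bytes that were never produced by a real type id.
    ty: Option<&'static str>,
    /// Stand-in for the 16 bytes of the stable hash.
    hash: u128,
}

#[derive(Debug, PartialEq, Eq)]
enum EqError {
    /// One operand does not look like a type id at all.
    NotATypeId,
    /// The types and the raw hash bytes disagree.
    HashMismatch,
}

/// Compares the true types first, then cross-checks the raw bytes.
fn model_type_id_eq(a: ModelTypeId, b: ModelTypeId) -> Result<bool, EqError> {
    let (Some(ta), Some(tb)) = (a.ty, b.ty) else {
        return Err(EqError::NotATypeId);
    };
    let same_ty = ta == tb;
    if same_ty != (a.hash == b.hash) {
        return Err(EqError::HashMismatch);
    }
    Ok(same_ty)
}
```

In this model, two values built from raw integers (no `ty`) are rejected before any byte comparison happens, which mirrors the requested test where transmuting two u128 to TypeId and comparing them should be flagged.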

Comment on lines 8 to 23
 const fn type_id_of_val<T: 'static>(_: &T) -> u128 {
-    std::intrinsics::type_id::<T>()
+    let name = std::intrinsics::type_name::<T>();
+    let len = name.len() as u64;
+    let len = u64::to_be_bytes(len);
+    let mut ret = [0; 16];
+    let mut i = 0;
+    while i < 8 {
+        ret[i] = len[i];
+        i += 1;
+    }
+    while i < 16 {
+        ret[i] = name.as_bytes()[i - 8];
+        i += 1;
+    }
+    u128::from_be_bytes(ret)
 }
Member

What is this doing...? This definitely cannot stand without a comment.^^
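
Read literally, the new body seems to build a 16-byte value from the type's name rather than from the type_id intrinsic: the first 8 bytes are the big-endian length of the name, and the last 8 bytes are the first 8 bytes of the name. The sketch below restates that reading using the stable, runtime `std::any::type_name`. The function name `name_based_id` is made up here, and unlike the snippet under review it zero-pads names shorter than 8 bytes instead of indexing out of bounds.

```rust
use std::any::type_name;

/// Builds a 128-bit value from a type's name: 8 bytes of length, then up to
/// 8 bytes of the name itself (zero-padded if the name is shorter).
fn name_based_id<T: 'static>() -> u128 {
    let name = type_name::<T>().as_bytes();
    let mut ret = [0u8; 16];
    // Upper half: length of the name as a big-endian u64.
    ret[..8].copy_from_slice(&(name.len() as u64).to_be_bytes());
    // Lower half: the first 8 bytes of the name, or fewer if the name is short.
    let prefix = name.len().min(8);
    ret[8..8 + prefix].copy_from_slice(&name[..prefix]);
    u128::from_be_bytes(ret)
}
```

Such a value is only as unique as the name prefix and length, and the exact string returned by `type_name` is not guaranteed to be stable, so this is suitable for a test fixture at most.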


const _: () = {
let id = TypeId::of::<u8>();
let id: u8 = unsafe { (&id as *const TypeId).cast::<u8>().read() };
Member

Suggested change
let id: u8 = unsafe { (&id as *const TypeId).cast::<u8>().read() };
let id: u8 = unsafe { (&raw const id).cast::<u8>().read() };

@@ -403,6 +403,22 @@ impl<'tcx> interpret::Machine<'tcx> for CompileTimeMachine<'tcx> {
let cmp = ecx.guaranteed_cmp(a, b)?;
ecx.write_scalar(Scalar::from_u8(cmp), dest)?;
}
sym::type_id_eq => {
Member

I think const-eval and Miri can use the same implementation.


#[inline]
fn rt(a: &TypeId, b: &TypeId) -> bool {
    a.data == b.data
Member

This means Miri never calls the intrinsic, right?

Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
9 participants