-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cpu/percpu: fix reentrant mutable references #372
Conversation
@roy-hopkins requested your review for the task and gdbstub changes. |
At the moment, a mutable reference may be acquired into the GHCB from any point in code via the `current_ghcb()` function. This function returns a `GHCBRef`, which implements `DerefMut<Target=GHCB>`. Since mutable references must be exclusive, this can cause undefined behavior, as a function may acquire a mutable reference to the GHCB while another one is held in a function up the call stack. Aliased mutable references are always undefined behavior, so we must give out shared references from GHCBRef. In order to be able to use the GHCB, we must mutate it though. Thus, introduce interior mutability in the GHCB through the use of the Cell type. This informs the compiler that the type may be mutated even through `&self`, and allows for sound nested references to the GHCB. This still does not add a safeguard against a function down the call stack altering the GHCB in an unexpected way for the first caller. While this is not undefined behavior, it might cause unexpected errors. This will be fixed in a future change. Signed-off-by: Carlos López <[email protected]>
The contents of this page are atomic and thus we never require a mutable reference to it. As such, store a const pointer instead. Signed-off-by: Carlos López <[email protected]>
The PerCpuUnsafe struct is reentrant, meaning multiple references can be acquired to it from several callsites in a single call stack. Since mutable references must always be unique, use only shared references and add interior mutability to the structure via Cell and RefCell. We use Cell for trivial Copy types, and RefCell for more complex types that we need to give references to. This allows us to remove dynamic borrow checking on the PerCpu struct via RefCell as a whole, as we never give out mutable references, making every reference automatically safe. Signed-off-by: Carlos López <[email protected]>
Check that the size of the per-cpu structures fit within a page at compile instead of emitting a runtime panic. Signed-off-by: Carlos López <[email protected]>
Now that the PerCpu and PerCpuUnsafe structures have interior mutability, it's always safe to acquire a reference to them, since they are always CPU-local and never modified outside the current CPU. Thus, we can remove PerCpuUnsafe and inline everything into a single PerCpu structure. This also removes a lot of unnecessary unsafe code, simplifying the codebase. Signed-off-by: Carlos López <[email protected]>
Now that the PerCpu is has interior mutability, and that the GHCB does as well, it is safe to hold references to them, as they can never mutably aliased. Thus, there is no need to store a const pointer and the unsafe code around it. This obsoletes the GHCBRef type, as there is no longer a need for a wrapper. Signed-off-by: Carlos López <[email protected]>
The PerCpu structure can only be accessed from the current CPU, so there is no need for locking. Moreover, if the pagetable is acquired more than once this will cause a deadlock. Replace the spinlock instead with a RefCell, and return a RefMut from `PerCpu::get_pgtable()`. This way, if we attempt to acquire more than one coexisting reference to the per-cpu pagetable more than once, we will panic immediately. Signed-off-by: Carlos López <[email protected]>
The PerCpu structure can only be accessed from the current CPU, so there is no need for locking. Moreover, if the runqueue is acquired with write access more than once this will cause a deadlock. Replace the RWLock with a RefCell. This way, if we attempt to acquire the runqueue with write access more than once we will panic immediately. Signed-off-by: Carlos López <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes look good to me. Testing on KVM was also successful.
svsm_vmsa: Cell<Option<VmsaRef>>, | ||
reset_ip: Cell<u64>, | ||
/// PerCpu Virtual Memory Range | ||
vm_range: VMR, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be really helpful to have comments about why some of these fields are not protected with a Cell
. While I suspect it has to do with which structures support interior mutability and can always be accessed through a shared reference, this fact is not immediately clear from reading the code here.
#[derive(Debug)] | ||
pub struct PerCpuUnsafe { | ||
pub struct PerCpu { | ||
shared: PerCpuShared, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is PerCpuShared
still necessary if the entire structure is shared-only? I pushed this initially so that cross-CPU access (via shared references) was always possible even if the owning CPU held a mutable reference, but now that this is no longer possible, it seems like we no longer need a separate structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think about it more, I think we will conclude that PerCpuShared
is Sync
while PerCpu
is !Sync
. This may be enough of a critical distinction for how these fields are seprated. If so, we should at least comment this fact if not enforce it through trait declarations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly, PerCpuShared
needs to be Sync
, while PerCpu
doesn't (the fact that it doesn't is what allows us to use Cell
or RefCell
). This is also why I removed locks from the PerCpu
, as references never leave the current CPU.
I will add some documentation to the structure.
apic_id: u32, | ||
pgtbl: RefCell<PageTableRef>, | ||
tss: Cell<X86Tss>, | ||
svsm_vmsa: Cell<Option<VmsaRef>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually need a Cell
? I believe this is only ever written to while the CPU is being constructed, at which point a mutable reference should be safe (because it cannot possibly be referenced by anyone other than the constructing code). Once the CPU is constructed, and becomes visible in multiple placesk, this should just be constant and should not require interior mutability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also working on this on a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does need to have interior mutability though because VMSA set up needs to happen after calling PerCpu::setup()
(which sets up several PerCpu
fields that are used later), which means that the VMSA needs to be None
in PerCpu::new()
and later initialized via calls from the SMP startup code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, the only place the svsm_vmsa
field ever needs to be mutable is after PerCpu::new()
and before it is added to the global PERCPU_AREAS
. If you follow the construction phase comment I wrote, this would eliminate the need for a Cell
because the field would only require mutation during the construction phase - while the construction code holds a mutable reference - and would become constant once it is published into the global.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PerCpu::alloc()
(callsPerCpu::new()
)PerCpu::setup()
(sets up page table)PerCpu::allocate_svsm_vmsa()
(allocates the VMSA, mutatesself.svsm_vmsa
to set it).PerCpu::prepare_svsm_vmsa()
(modifiesself.svsm_vmsa
with the CR3 value from page table set in step 2).
Even if you move step 3 into step 1 you still need to modify svsm_vmsa
afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need call vmsa.enable()
later which requires mutable access. This is now hidden by VmsaRef
but the type needs to disappear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If vmsa.enable()
requires mutable access, then we want RefCell
and not Cell
because we don't want to have to take and replace the VMSA every time we want to modify it.
I thought VmsaRef
was only used for guest VMSAs, not the SVSM VMSA. And that exists because the guest VMSA is accessible across CPUs and therefore must be Sync
-safe. I want to change this once we can rely on IPI delivery (where cross-CPU modification will be sent as IPI work instead of cross-thread memory references).
For the four-step flow you have above, what I am trying to describe involves PerCpu::alloc()
returning a mutable reference, which permits mutability through the remaining steps. This eliminates the need for interior mutability because the code holds a mutable reference - and is the only code that could be holding any reference because the structure is not yet globally visible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If
vmsa.enable()
requires mutable access, then we wantRefCell
and notCell
because we don't want to have to take and replace the VMSA every time we want to modify it.
Yes, that's what my upcoming PR will use.
I thought
VmsaRef
was only used for guest VMSAs, not the SVSM VMSA. And that exists because the guest VMSA is accessible across CPUs and therefore must beSync
-safe. I want to change this once we can rely on IPI delivery (where cross-CPU modification will be sent as IPI work instead of cross-thread memory references).For the four-step flow you have above, what I am trying to describe involves
PerCpu::alloc()
returning a mutable reference, which permits mutability through the remaining steps. This eliminates the need for interior mutability because the code holds a mutable reference - and is the only code that could be holding any reference because the structure is not yet globally visible.
I'll look into it but it seems to be at first glance that it will make code a bit less readable and add some unsafe in places where we say "we know we are not mutably aliasing the PerCpu because its under construction".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into it but it seems to be at first glance that it will make code a bit less readable and add some unsafe in places where we say "we know we are not mutably aliasing the PerCpu because its under construction".
I was hoping the compiler would do that for us by ensuring that no alias could be created while construction was underway. Key to this idea is moving the PERCPU_AREAS
insertion out of the allocation and into some subsequent step, so that PerCpu::alloc()
can provably guarantee that no alias could exist because no reference was ever handed out elsewhere.
@@ -16,27 +15,22 @@ use crate::utils::immut_after_init::immut_after_init_set_multithreaded; | |||
|
|||
fn start_cpu(platform: &dyn SvsmPlatform, apic_id: u32, vtom: u64) { | |||
let start_rip: u64 = (start_ap as *const u8) as u64; | |||
let mut percpu = PerCpu::alloc(apic_id).expect("Failed to allocate AP per-cpu data"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced this is the right approach. During construction, when the CPU is not visible to other threads, it seems perfectly reasonable for this reference to be mutable to facilitate construction. The existing code makes this unnecessarily messy by having PerCpu::alloc()
take responsibility for inserting the newly constructed CPU into the PERCPU_AREAS
table, which I think is not appropriate. It would be much better to define an explicit method like PerCpu::insert(cpu: PerCpu)
which would insert the CPU into the global table, and which takes an actual PerCpu
(not a reference) to force a move to invalidate any reference that the caller has prior to insertion. This pattern maintains reference safety without introducing interior mutability on fields that really should be constant following construction. That said, I'm not certain this will work because the PerCpu
allocation requires specific memory allocation attributes and therefore the move may not be compatible, but I still think we could some up with something (like a PerCpuConstructor
or something that holds the actual reference) that would permit mutability during construction and only during construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this would look like or what it would solve. Sure, it's a bit strange to allocate a page and then get a pointer and cast it, but that's precisely what I intend to solve with the PageBox
type in my other PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is less about pointer casting and more about the concept of atomic construction. In general, it is preferable to have a design where an object is fully constructed before it can be visible to other entities. I think we would prefer a pattern like this:
let mut cpu = PerCpu::new(); // object exists
cpu.setup_some_field(); // requires mutable reference
cpu.setup_some_other_field(); // also requires mutable reference
PERCPU_AREAS.push(cpu); // makes object visible, e.g. Send, and drops local reference due to move - after this, no mutable reference exists
This pattern guarantees that the object is not visible (sent) until all of the fields that require construction have been fully populated. And if we follow this pattern, it means that none of the fields requiring mutability during construction would require interior mutability, because they would all be constant after construction is complete.
This pattern would be easy to achieve on PerCpu
if PerCpu
didn't have the unusual allocation pattern that it uses today, because the compiler would take care of all of the data movement as required to fulfill the move and send patterns that are required. But because PerCpu
is allocated by PerCpu::new()
at a specific virtual address, the subsequent move that causes the initial mutable reference to drop and be replaced by an entry in the global table is impossible.
My suggestion was to implement some other object which wraps the reference to the CPU allocation. It would simply hold a mutable reference to the CPU, and it would implement Deref
and DerefMut
so that it could be used like a PerCpu
during the construction phase, and then when the final move happens, the only thing the compiler is moving is the wrapper (not the actual CPU allocation). This would permit us to keep the unusual allocation pattern that's present today in PerCpu::new()
, would permit the CPU to be mutable during the construction phase, and would permit a clean hand-off when the CPU is made visible in the global array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to go through your comment more carefully, but the thing about moving is that we have code like this:
fn vmsa_tr_segment(&self) -> VMSASegment {
VMSASegment {
...
base: ptr::addr_of!(self.tss) as u64,
}
}
That code is called during VMSA initialization. If we move the PerCpu
after initialization, then the used address is invalid, as it points to the stack. So implementing your change in principle is possible but it would require further changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move the
PerCpu
after initialization, then the used address is invalid, as it points to the stack. So implementing your change in principle is possible but it would require further changes.
That's precisely the "unusual allocation" pattern that I'm talking about, which is why I'm saying that moving a PerCpu
isn't going to work - we have to wrap the reference in some other object which is safely movable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we should make use of the Pin
type to guarantee that the code I pasted above does not cause problems in the future. Either that or add a PhantomPinned
marker to the PerCpu
;
There are several per-cpu structures (the
PerCpu
andPerCpuUnsafe
structures themselves, as well as theGHCB
and the#HV
doorbell page) to which we can obtain mutable coexistant references, e.g.:This is undefined behavior, as the compiler is allowed to assume that one and only one mutable reference exists to the same memory location.
The way to solve this is to only give shared references (i.e.
&T
as opposed to&mut T
) to these structures. However, we need to mutate these structures for SVSM operation. This can be solved by using the types in thecore::cell
module:Cell<T>
allows setting aT
by moving values in and out of theCell
through a shared reference to theCell
(&Cell<T>
). We can't get a&T
, but we're allowed to mutate it in a single step (e.g.cell.set(value)
).RefCell<T>
gives out shared and mutable references to the innerT
, but checks at runtime that mutable references do not coexist with shared references.Thus, update the
PerCpu
andPerCpuUnsafe
types (merged through this PR into a singlePerCpu
struct) to use these types. This removes many unsafe code blocks, as there is no longer need to use raw pointers.This fixes some points in #359.
Remaining work
Nested updates
This PR does not solve the following case:
This is completely safe (does not produce undefined behavior), but can produce unexpected results. A follow-up PR will fix this.
VMSA
I think the changes for the VMSA (e.g.
PerCpu.svsm_vmsa
) require more intrusive work. I'll open a follow-up PR to fix it as well, although it does not seem as problematic at the moment.For the record, the approach used in this PR is how the Rust standard library solves the use of thread local data (which suffers from the same problem as our
PerCpu
type). This is the only way to safely have multiple intra-thread references to a structure that may be mutated.