Deprecate all floating-point status register functions across architectures #1479
In particular, does anyone know if ARM/aarch64 has intrinsics that read or write something like a "floating point status register", something that alters the behavior of floating-point instructions or can detect whether a floating-point instruction ran that caused a particular side-effect? If so, we should deprecate them as well. (And same if there's anything else in the ARM intrinsics that does more than just work on some values locally in some registers. Anything that affects or is affected by previous or later operations is suspicious.) Let's see if ping groups work here... @rustbot ping arm
Ah, bummer. Let's do it by hand then.
Sorry, we didn't see this! Arm architectures support this behaviour using special registers (e.g. FPCR/FPSR on AArch64). However, neither ACLE (C) nor Rust's implementation expose these directly. For example, in C, implementations that want to configure rounding are supposed to do so using I think the comments in #1454 about assembly blocks apply to Arm too; people may need to change FP behaviours in I've noticed one notable issue: the existing transactional memory extension intrinsics (#855) rely on start/commit bracketing around other accesses. I think such accesses may even be implicit (e.g. stack accesses), so this could cause problems if the transaction is cancelled. This probably needs a closer look; I suspect this might need to be wrapped in a single assembly block, in which case we should deprecate the TME intrinsics.
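For readers unfamiliar with FPCR, here is a minimal sketch of what "reading the control register from an assembly block" could look like on AArch64. The function names `read_fpcr` and `rounding_mode_bits` are hypothetical and are not part of `stdarch`. The sketch only reads the register; it does not show a sanctioned way of changing it. On other targets the reader simply returns `None`.

```rust
/// Extracts FPCR.RMode (bits 23:22). A value of 0b00 means
/// round to nearest, ties to even.
pub fn rounding_mode_bits(fpcr: u64) -> u64 {
    (fpcr >> 22) & 0b11
}

/// Hypothetical helper: reads FPCR with inline assembly on AArch64.
#[cfg(target_arch = "aarch64")]
pub fn read_fpcr() -> Option<u64> {
    let value: u64;
    // SAFETY: `mrs` from FPCR only reads a system register that is
    // accessible from user space; it touches neither memory nor the stack.
    unsafe {
        std::arch::asm!(
            "mrs {v}, fpcr",
            v = out(reg) value,
            options(nomem, nostack, preserves_flags),
        );
    }
    Some(value)
}

/// Fallback for targets without FPCR (an assumption made so the
/// sketch compiles everywhere).
#[cfg(not(target_arch = "aarch64"))]
pub fn read_fpcr() -> Option<u64> {
    None
}
```

In a program that has not touched the floating-point environment, one would expect `read_fpcr().map(rounding_mode_bits)` to be `Some(0)` on AArch64, since Rust code assumes the default rounding mode.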
That's a rather more general concern. If I understand correctly, the main problem here is with intrinsics that change the implicit behaviour of other instructions that Rust might use itself. Is that accurate? I'm not aware of anything (other than FPCR) that does that today. However, we will have intrinsics that affect the behaviour of other intrinsics. Examples:
We also have intrinsics that need to remain ordered with respect to others, such as barriers.
Ah! Good catch. If these intrinsics can "un-do" other memory accesses then indeed they can only be used from inline asm. This ofc applies to transactional memory intrinsics across all targets. I opened an issue for that: #1521.
Yes, that sounds like a good characterization of "FP control bits"-style problems. There's also "intrinsics that make observable the implicit behavior of other instructions Rust might use itself", which are "FP status bits"-style problems. Those are less problematic since they cannot cause UB, but they still produce unreliable results so we need to watch out for them.
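To illustrate why "FP control bits" matter, here is a small sketch in plain Rust. It compares a division evaluated by the compiler with the same division evaluated at run time. Under the default floating-point environment the two agree bit for bit, and the compiler is free to replace one with the other, which is exactly the assumption that changing the rounding mode at run time would violate. The item names are made up for this example.

```rust
use std::hint::black_box;

/// Evaluated by the compiler, which assumes round-to-nearest-even.
pub const THIRD_AT_COMPILE_TIME: f64 = 1.0 / 3.0;

/// Evaluated at run time; `black_box` discourages constant folding.
pub fn third_at_run_time() -> f64 {
    black_box(1.0_f64) / black_box(3.0_f64)
}

/// True when both evaluations produce the same bit pattern, which is
/// what Rust code may rely on as long as nobody alters the FP control bits.
pub fn results_agree() -> bool {
    THIRD_AT_COMPILE_TIME.to_bits() == third_at_run_time().to_bits()
}
```

If some code switched the hardware rounding mode before calling `third_at_run_time`, the two values could differ, and whether they do would depend on what the optimizer happened to fold.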
I know ~nothing about how SVE works on the asm level, so I can't say unfortunately. Is there a good explanation for people without a hardware/ISA background -- something focusing on the abstract high-level behavior in a programming language, and relating the asm-level behavior to that?
Ordering constraints should be fine, as long as LLVM respects the ordering as well. However, this might still be a symptom of an actual issue. For instance, if you use Rust atomics, I am not sure how much we guarantee about how exactly they get turned into assembly -- and in particular, we are allowed to optimize them away entirely in some conditions. So mixing Rust atomics with direct use of target-specific hardware intrinsics is probably a bad idea. As another example of an ordering-related problem, "non-temporal / streaming stores" on x86 are causing major headaches.
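As a point of comparison, here is a sketch of ordering expressed purely with Rust atomics, with no target-specific barrier intrinsics involved. The function `publish_and_read` is a made-up example; it shows the standard release/acquire hand-off, where all ordering guarantees come from the Rust memory model rather than from a particular instruction sequence.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes `value` from a producer thread and reads it back on the
/// calling thread, using only portable Rust atomics.
pub fn publish_and_read(value: u32) -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(value, Ordering::Relaxed);
            // Release: everything written above becomes visible to a
            // thread that observes `ready == true` with Acquire.
            ready.store(true, Ordering::Release);
        })
    };

    // Acquire pairs with the Release store in the producer.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);
    producer.join().unwrap();
    seen
}
```

Because the compiler may lower or optimize these operations in ways that are not specified at the instruction level, combining them with a hand-placed hardware barrier would rest on assumptions the language does not promise.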
I think that's the crux of my point: we might have intrinsics that affect the behaviour of other intrinsics (or assembly), but can't affect the behaviour of normal Rust code. However, I now wonder if that ever actually happens reliably.
Do we need to go as far as to say "do not modify any global state"? Probably not, because most architectures have global condition flags, and setting them is often unavoidable. Do we need to go as far as describing exactly what can and can't be modified for each architecture? Maybe, but it might be a big, difficult task. At least for Arm, the ABI describes whether or not these modal registers are caller- or callee-saved. For AArch64 Linux, it'd be in AAPCS64. I'd like to think that there's a suitable middle-ground, or a statement of policy that we can make, but I haven't thought of one yet.
The ACLE (C/C++) documentation might help, though it's hard to completely get away from the ISA-level details. There's a section specifically about FFR; actually I think I need to read through that myself, to understand that FFRT mechanism and how (or if) it can help us in Rust.
Does the ABI guarantee a specific value for those control registers? If not, inline asm can't assume a specific value either. For things like flags inline asm can't assume a specific input value either and can produce any output value unless you explicitly declare that you don't touch flags.
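The point about flags can be sketched with Rust's `asm!` options. The example below is x86_64-only, with a plain-Rust fallback elsewhere so it compiles on any target, and both function names are hypothetical. By default an `asm!` block is assumed to clobber the condition flags; `options(preserves_flags)` is the explicit declaration that it does not.

```rust
/// `add` writes the x86 flags, so the block must NOT claim
/// `preserves_flags`; the compiler then treats the flags as clobbered.
#[cfg(target_arch = "x86_64")]
pub fn add_may_clobber_flags(a: u64, b: u64) -> u64 {
    let mut acc = a;
    // SAFETY: pure register arithmetic, no memory or stack access.
    unsafe {
        std::arch::asm!(
            "add {0}, {1}",
            inout(reg) acc,
            in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    acc
}

/// `lea` computes the same sum without writing the flags, so the
/// block can truthfully declare `preserves_flags`.
#[cfg(target_arch = "x86_64")]
pub fn add_preserving_flags(a: u64, b: u64) -> u64 {
    let out: u64;
    // SAFETY: pure register arithmetic, no memory or stack access.
    unsafe {
        std::arch::asm!(
            "lea {0}, [{1} + {2}]",
            out(reg) out,
            in(reg) a,
            in(reg) b,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    out
}

// Fallbacks for other targets (an assumption made for portability).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_may_clobber_flags(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn add_preserving_flags(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

Neither block can rely on the flags having any particular value on entry. The analogous question for FP control registers is whether any such declaration or guarantee exists at all.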
Strictly, no, and (if I understand correctly) that's OK for C because it doesn't precisely define FP behaviours, at least not like Rust does. However, Rust does require specific FPCR settings, and it would make sense that it would want to extend that requirement to the
C requires a fixed fpenv unless you use the pragma to respect the fpenv, right?
AFAIK we allow the asm block to change the condition flags arbitrarily, and for intrinsics LLVM needs to just know which of them clobber these flags. But yes ideally this would be documented in the
Intrinsics (and
Yeah, good point -- which state is "Rust-observable" can change with future versions of Rust. Without an explicit statement that certain state is definitely not affecting Rust code, there's always a risk that modifying some state can cause issues in the future.
We did this for x86 in #1454 and #1471 (thanks @eduardosm for catching that!), and for RISC-V in #1478. Chances are other architectures will have similar functions; they should all be treated the same way.