-
Notifications
You must be signed in to change notification settings - Fork 308
Upgrade to new asm
syntax
#306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is going to be stabilized soon: rust-lang/rust#72016 (comment) I'd like to check whether the strategy in this playground would allow for a cleaner impl (eg: getting rid of the funky macro for constructing the Copy of playground code for posterity#![feature(asm)]
mod intrinsics {
#[inline(always)]
pub unsafe fn adcx(a: &mut u64, b: u64) {
asm! {
"adcx {0}, {1}",
inout(reg) *a,
in(reg) b
};
}
#[inline(always)]
pub unsafe fn adox(a: &mut u64, b: u64) {
asm! {
"adox {0}, {1}",
inout(reg) *a,
in(reg) b
};
}
#[inline(always)]
pub unsafe fn clear_flags() {
asm! {
"xor {0}, {0}",
out(reg) _
}
}
// doesn't inline properly
// pub use std::arch::x86_64::_mulx_u64 as mulx;
#[inline(always)]
pub unsafe fn mulx(a: u64, b: u64, lo: &mut u64, hi: &mut u64) {
asm!(
"mulx {2}, {1}, {0}",
in(reg) b,
lateout(reg) *lo,
lateout(reg) *hi,
in("rdx") a
);
}
}
/// doesn't do actual work, just demonstration
pub fn carry_chain(a: [u64; 2], b: [u64; 2]) -> u64 {
use intrinsics::*;
unsafe {
// clear flags CF and OF
intrinsics::clear_flags();
let mut hi = 0;
let mut lo = 0;
mulx(a[0], b[0], &mut lo, &mut hi);
adox(&mut lo, hi);
adcx(&mut hi, b[0]);
mulx(a[1], b[1], &mut lo, &mut hi);
adox(&mut lo, hi);
adcx(&mut hi, b[1]);
lo
}
}
#[test]
fn test_adcx() {
let mut a = 5;
let b = 3;
unsafe {
intrinsics::adcx(&mut a, b);
}
assert_eq!(a, 8);
}
#[test]
fn test_mulx() {
let a = 5;
let b = 3;
let mut hi = 0;
let mut lo = 0;
unsafe {
intrinsics::mulx(a, b, &mut lo, &mut hi);
}
assert_eq!(lo, 5 * 3);
assert_eq!(hi, 0);
} |
There is a decision to make:
Ideas? We can still have the assembly in nightly. |
Based on the CI, there might be a hard-to-find problem:
|
1f6eeca
to
a80a82a
Compare
One thing to check is whether the code can compile in non-x86-64 machines (aka arm). In the build.rs, I use |
For the previous comment---I think check_std does that job. |
ff/build.rs
Outdated
let rust_version = rustc_version::version().unwrap(); | ||
let is_asm_stable = rust_version >= Version::new(1, 59, 0); | ||
|
||
let asm_enabled = cfg!(feature = "asm"); | ||
|
||
let is_x86_64 = target_arch == "x86_64"; | ||
let mut bmi2_enabled = false; | ||
let mut adx_enabled = false; | ||
if is_x86_64 { | ||
bmi2_enabled = is_x86_feature_detected!("bmi2"); | ||
adx_enabled = is_x86_feature_detected!("adx"); | ||
} | ||
|
||
let should_use_asm = is_asm_stable && asm_enabled && bmi2_enabled && adx_enabled && is_x86_64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm should we just get rid of this, and instead change our minimum supported rust version to 1.59? And in the places that we use the assembly backend, we can directly gate it with a
#[cfg(
feature = "asm",
target_feature = "bmi2",
target_feature = "adx",
target_arch = "x86_64"
)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is only one question: do we want to bump up the minimal supported rust version to 1.59?
That would clearly be ideal, and we can cut the build.rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine personally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend doing that in a separate PR---the current stable is 1.58.1, and it would need 20+ days to be 1.59.0.
So, if we drop it, for the next three weeks, we may have some trouble.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add an issue and a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We write cfg!(all( use_asm, target_feature = "bmi2", target_feature = "adx", target_arch = "x86_64" ))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another complication---actually this is a mistake:
https://github.com/arkworks-rs/algebra/blob/migrate-to-asm/ff/src/biginteger/mod.rs#L242
It just detects the asm
feature flag but not other things.
Seems to be complicated.
Should we do cfg_attr!
in the lib.rs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That one is fine, as it doesn't require 1.59, nor bmi2
or adx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the logic. It appears that rust version has to be queried in build.rs. The rest can be moved out.
I use inline_asm_stable
since not all asm requires 1.59.0, as you pointed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cfg(version(1.59))
is not yet stable, though, we can drop it when we decided that the minimal supported rust version is >=1.59.
Co-authored-by: Pratyush Mishra <[email protected]>
Co-authored-by: Pratyush Mishra <[email protected]>
Thanks for getting this over the line, @weikengchen! There's one remaining TODO that we should file an issue for: in the previous version of the backend, Unfortunately making |
Talking about which, since we have already used the ugly There is an issue---if we hardcode that as a constant, does it mean that Rust would generate different assembly codes for each different field? Or, is that already happening? |
My feeling is that |
We can't quite hardcode the constants, because we don't know what they are until instantiation time in some downstream crate. We could provide a macro that's the equivalent of https://crates.io/crates/ff_derive, which people invoke on the |
...and this approach should also allow us to hardcode |
Yeah, it would allow us to hardcode modulus also. |
With that, I think we are reaching the destination of single-field-element asm optimization. To improve further, we will need to do batching and SIMD. |
@weikengchen is this ready to merge? I can do it now if so. |
It is really to merge. I am waiting you (technically the person who created the PR did the final decision right?) |
Description
closes: #305
closes: #68
There is one change in the assembly here. Earlier, we used to declare
P::INV
as an immediate input for the assembly block, but due to limitations ofconst
in Rust, we can no longer do that, and must declare it as an input. We can probably work around this manually, but it would make things slightly uglier. I would benchmark the performance difference, but unfortunately I don't have access to an x86 machine atm. Could one of @jon-chuang or @ValarDragon benchmark this PR?Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.
Pending
section inCHANGELOG.md
Files changed
in the Github PR explorer