Skip to content

Upgrade to new asm syntax #306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 68 commits into from
Feb 5, 2022
Merged

Upgrade to new asm syntax #306

merged 68 commits into from
Feb 5, 2022

Conversation

Pratyush
Copy link
Member

@Pratyush Pratyush commented Aug 12, 2021

Description

closes: #305
closes: #68

There is one change in the assembly here. Earlier, we used to declare P::INV as an immediate input for the assembly block, but due to limitations of const in Rust, we can no longer do that, and must declare it as an input. We can probably work around this manually, but it would make things slightly uglier. I would benchmark the performance difference, but unfortunately I don't have access to an x86 machine atm. Could one of @jon-chuang or @ValarDragon benchmark this PR?


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (master)
  • Linked to Github issue with discussion and accepted design OR have an explanation in the PR that describes this work.
  • Wrote unit tests
  • Updated relevant documentation in the code
  • Added a relevant changelog entry to the Pending section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer

@Pratyush Pratyush added P-medium Priority: medium T-performance Type: performance improvements T-refactor Type: cleanup/refactor labels Sep 9, 2021
@Pratyush Pratyush self-assigned this Sep 9, 2021
@Pratyush
Copy link
Member Author

Pratyush commented Nov 9, 2021

This is going to be stabilized soon: rust-lang/rust#72016 (comment)

I'd like to check whether the strategy in this playground would allow for a cleaner impl (eg: getting rid of the funky macro for constructing the asm block), but otherwise we should move quickly to update to the new syntax.

Copy of playground code for posterity
#![feature(asm)]

mod intrinsics {
    #[inline(always)]
    pub unsafe fn adcx(a: &mut u64, b: u64) {
        asm! {
            "adcx {0}, {1}",
            inout(reg) *a,
            in(reg) b
        };
    }

    #[inline(always)]
    pub unsafe fn adox(a: &mut u64, b: u64) {
        asm! {
            "adox {0}, {1}",
            inout(reg) *a,
            in(reg) b
        };
    }

    #[inline(always)]
    pub unsafe fn clear_flags() {
        asm! {
            "xor {0}, {0}",
            out(reg) _
        }
    }

    // doesn't inline properly
    // pub use std::arch::x86_64::_mulx_u64 as mulx;

    #[inline(always)]
    pub unsafe fn mulx(a: u64, b: u64, lo: &mut u64, hi: &mut u64) {
        asm!(
            "mulx {2}, {1}, {0}",
            in(reg) b,
            lateout(reg) *lo,
            lateout(reg) *hi,
            in("rdx") a
        );
    }
}

/// doesn't do actual work, just demonstration
pub fn carry_chain(a: [u64; 2], b: [u64; 2]) -> u64 {
    use intrinsics::*;
    unsafe {
        // clear flags CF and OF
        intrinsics::clear_flags();
        let mut hi = 0;
        let mut lo = 0;

        mulx(a[0], b[0], &mut lo, &mut hi);
        adox(&mut lo, hi);
        adcx(&mut hi, b[0]);

        mulx(a[1], b[1], &mut lo, &mut hi);
        adox(&mut lo, hi);
        adcx(&mut hi, b[1]);

        lo
    }
}


#[test]
fn test_adcx() {
    let mut a = 5;
    let b = 3;
    unsafe {
        intrinsics::adcx(&mut a, b);
    }
    assert_eq!(a, 8);
}


#[test]
fn test_mulx() {
    let a = 5;
    let b = 3;
    let mut hi = 0;
    let mut lo = 0;
    unsafe {
        intrinsics::mulx(a, b, &mut lo, &mut hi);
    }
    assert_eq!(lo, 5 * 3);
    assert_eq!(hi, 0);
}

@weikengchen
Copy link
Member

There is a decision to make:

  • Since 1.59.0, asm! is now stable.
  • However, asm_const is not stable, but we do use const in asm in certain places.

Ideas? We can still have the assembly in nightly.

@weikengchen
Copy link
Member

Based on the CI, there might be a hard-to-find problem:

Error: test failed, to rerun pass '-p ark-ff --lib'
Caused by:
  process didn't exit successfully: `/home/runner/work/algebra/algebra/target/debug/deps/ark_ff-a63347d5ec55c228` (signal: 11, SIGSEGV: invalid memory reference)

@weikengchen
Copy link
Member

One thing to check is whether the code can compile in non-x86-64 machines (aka arm). In the build.rs, I use is_x86_feature_detected!. I worried that it would be undefined in arm.

@weikengchen
Copy link
Member

For the previous comment---I think check_std does that job.

ff/build.rs Outdated
Comment on lines 8 to 21
let rust_version = rustc_version::version().unwrap();
let is_asm_stable = rust_version >= Version::new(1, 59, 0);

let asm_enabled = cfg!(feature = "asm");

let is_x86_64 = target_arch == "x86_64";
let mut bmi2_enabled = false;
let mut adx_enabled = false;
if is_x86_64 {
bmi2_enabled = is_x86_feature_detected!("bmi2");
adx_enabled = is_x86_feature_detected!("adx");
}

let should_use_asm = is_asm_stable && asm_enabled && bmi2_enabled && adx_enabled && is_x86_64;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm should we just get rid of this, and instead change our minimum supported rust version to 1.59? And in the places that we use the assembly backend, we can directly gate it with a

#[cfg(
    feature = "asm",
    target_feature = "bmi2",
    target_feature = "adx",
    target_arch = "x86_64"
)]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is only one question: do we want to bump up the minimal supported rust version to 1.59?

That would clearly be ideal, and we can cut the build.rs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine personally

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend doing that in a separate PR---the current stable is 1.58.1, and it would need 20+ days to be 1.59.0.

So, if we drop it, for the next three weeks, we may have some trouble.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add an issue and a separate PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We write cfg!(all( use_asm, target_feature = "bmi2", target_feature = "adx", target_arch = "x86_64" ))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another complication---actually this is a mistake:
https://github.com/arkworks-rs/algebra/blob/migrate-to-asm/ff/src/biginteger/mod.rs#L242

It just detects the asm feature flag but not other things.

Seems to be complicated.

Should we do cfg_attr! in the lib.rs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That one is fine, as it doesn't require 1.59, nor bmi2 or adx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the logic. It appears that rust version has to be queried in build.rs. The rest can be moved out.

I use inline_asm_stable since not all asm requires 1.59.0, as you pointed out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cfg(version(1.59)) is not yet stable, though, we can drop it when we decided that the minimal supported rust version is >=1.59.

Co-authored-by: Pratyush Mishra <[email protected]>
Co-authored-by: Pratyush Mishra <[email protected]>
@Pratyush
Copy link
Member Author

Pratyush commented Feb 5, 2022

Thanks for getting this over the line, @weikengchen!

There's one remaining TODO that we should file an issue for: in the previous version of the backend, Self::INV was a constant in the generated assembly, but in the current version it's passed in a register. This decreases the number of available registers, increasing register and hence memory pressure.

Unfortunately making Self::INV into a constant requires stabilization of the asm_const and (part of) the generic_const_exprs features, so it might take a while.

@weikengchen
Copy link
Member

Talking about which, since we have already used the ugly movq_zero!, would it make sense to just hardcode the constants?

There is an issue---if we hardcode that as a constant, does it mean that Rust would generate different assembly codes for each different field? Or, is that already happening?

@weikengchen
Copy link
Member

My feeling is that asm_const is mostly a Rust side problem because it needs to be implemented in a general manner, but we are doing a very specific and simple case (movq).

@Pratyush
Copy link
Member Author

Pratyush commented Feb 5, 2022

We can't quite hardcode the constants, because we don't know what they are until instantiation time in some downstream crate. We could provide a macro that's the equivalent of https://crates.io/crates/ff_derive, which people invoke on the MontConfig in the appropriate downstream crate, and that macro could generate the assembly with full knowledge of the various involved constants.

@weikengchen
Copy link
Member

weikengchen commented Feb 5, 2022

...and this approach should also allow us to hardcode modulus, which at this moment is a memory access (while, it must be in the cache)?

@Pratyush
Copy link
Member Author

Pratyush commented Feb 5, 2022

Yeah, it would allow us to hardcode modulus also.

@weikengchen
Copy link
Member

With that, I think we are reaching the destination of single-field-element asm optimization.

To improve further, we will need to do batching and SIMD.

@Pratyush
Copy link
Member Author

Pratyush commented Feb 5, 2022

@weikengchen is this ready to merge? I can do it now if so.

@weikengchen
Copy link
Member

weikengchen commented Feb 5, 2022

It is really to merge. I am waiting you (technically the person who created the PR did the final decision right?)

@weikengchen weikengchen merged commit 26eb65f into master Feb 5, 2022
@weikengchen weikengchen deleted the migrate-to-asm branch February 5, 2022 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-medium Priority: medium T-performance Type: performance improvements T-refactor Type: cleanup/refactor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate away from llvm_asm to asm Explore integration of new_asm! API
2 participants