sha2 crate = runtime error #207

Open
brandonros opened this issue Apr 27, 2025 · 12 comments

Comments

@brandonros

https://github.com/RustCrypto/hashes/blob/master/sha2/Cargo.toml vs https://github.com/brandonros/rust-ed25519-compact

$ cargo run --release -- aa $BLOCKS_PER_GRID $THREADS_PER_BLOCK
   Compiling ed25519_vanity v0.1.0 (/home/brandon/ed25519-vanity-rs)
    Finished `release` profile [optimized] target(s) in 0.90s
     Running `target/release/ed25519_vanity aa 128 128`
Found 1 CUDA devices
Starting device 0
[0] Loading module...
[0] Starting search loop...

thread '<unnamed>' panicked at /home/brandon/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cudarc-0.16.0/src/driver/safe/core.rs:470:36:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_LAUNCH_FAILED, "unspecified launch failure")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread '<unnamed>' panicked at /home/brandon/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cudarc-0.16.0/src/driver/safe/core.rs:246:58:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_LAUNCH_FAILED, "unspecified launch failure")
stack backtrace:
   0:     0x55e313ca18e3 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hdbd106d724e72c20
   1:     0x55e313cc3493 - core::fmt::write::h861eecc74abebf7a
   2:     0x55e313c9f003 - std::io::Write::write_fmt::h493b3152b071fba0
   3:     0x55e313ca1732 - std::sys::backtrace::BacktraceLock::print::h71f315c25fc266cb
   4:     0x55e313ca26ca - std::panicking::default_hook::{{closure}}::h8019dc6a2c6c0fe7
   5:     0x55e313ca253a - std::panicking::default_hook::h497f769686a88dd6
   6:     0x55e313ca2fd2 - std::panicking::rust_panic_with_hook::h98fc165e90ef379e
   7:     0x55e313ca2e6a - std::panicking::begin_panic_handler::{{closure}}::h2c1a60d0a908eaec
   8:     0x55e313ca1dd9 - std::sys::backtrace::__rust_end_short_backtrace::he8aba8f9b7ddf304
   9:     0x55e313ca2afd - rust_begin_unwind
  10:     0x55e313cc2010 - core::panicking::panic_fmt::hcbf39f8c1e585f84
  11:     0x55e313cc23a6 - core::result::unwrap_failed::haf1491c6d679786d
  12:     0x55e313c7bc98 - <cudarc::driver::safe::core::CudaEvent as core::ops::drop::Drop>::drop::heca398ef781d7d06
  13:     0x55e313c60323 - core::ptr::drop_in_place<core::option::Option<cudarc::driver::safe::core::CudaEvent>>::h0f0fd586ce97c35f
  14:     0x55e313c601f6 - core::ptr::drop_in_place<cudarc::driver::safe::core::CudaSlice<u8>>::he9b6484b4f5db6c9
  15:     0x55e313c61cae - ed25519_vanity::device_main::h99b65f8ab2f8c263
  16:     0x55e313c65f2b - std::sys::backtrace::__rust_begin_short_backtrace::he6efac01710238fd
  17:     0x55e313c656f1 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h486bd94640b67ff5
  18:     0x55e313ca4d8b - std::sys::pal::unix::thread::Thread::new::thread_start::h20288ab9ea215a81
  19:     0x7fe278c381f5 - <unknown>
  20:     0x7fe278cb889c - <unknown>
  21:                0x0 - <unknown>

thread '<unnamed>' panicked at library/core/src/panicking.rs:226:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
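Aborted

use sha2::Digest as _;
use ed25519_compact::ge_scalarmult_base;
use rand_core::{SeedableRng, RngCore};
use rand_xorshift::XorShiftRng;
use bs58;

// Fails: a kernel that calls this sha2-based hash ends in CUDA_ERROR_LAUNCH_FAILED
fn sha512(input: &[u8]) -> [u8; 64] {
    let mut hasher = sha2::Sha512::new();
    hasher.update(input);
    // finalize() returns a generic array; convert it into a plain [u8; 64]
    hasher.finalize().into()
}

// Works: the same hash computed with the ed25519_compact SHA-512 implementation
fn sha512_compact(input: &[u8]) -> [u8; 64] {
    let mut hasher = ed25519_compact::sha512::Hash::new();
    hasher.update(input);
    hasher.finalize()
}
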
@adamcavendish
Contributor

Hi @brandonros, sha2 has a lot of CPU-specific optimizations (AVX2, etc.), so crates like this are not available for direct use in a CUDA kernel. To use them directly in a CUDA kernel, we would need to write an implementation inside these crates and gate it behind a CUDA-like feature flag.
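A minimal sketch of the kind of gating described above, assuming the GPU build can be identified with `target_os = "cuda"` (the OS component of the `nvptx64-nvidia-cuda` target). The function names are hypothetical and this is not how sha2 is actually organized; it only shows one implementation being selected at compile time.

```rust
/// Portable path: plain integer arithmetic with no CPU-specific intrinsics.
/// This is the SHA-512 "big sigma 1" mixing function.
fn big_sigma1_portable(x: u64) -> u64 {
    x.rotate_right(14) ^ x.rotate_right(18) ^ x.rotate_right(41)
}

/// Host build (hypothetical): a real crate could place SIMD intrinsics here.
/// This sketch reuses the portable code so that both paths agree.
#[cfg(not(target_os = "cuda"))]
fn big_sigma1(x: u64) -> u64 {
    big_sigma1_portable(x)
}

/// GPU build: only compiled when the target OS is "cuda".
#[cfg(target_os = "cuda")]
fn big_sigma1(x: u64) -> u64 {
    big_sigma1_portable(x)
}
```

Callers use `big_sigma1` everywhere, and the compiler picks exactly one definition per target, so no host-only code reaches the GPU build.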

@brandonros
Author

I would have guessed the compiler would be able to tell AVX2 was not available and not try to include them.

I believe even with this non-AVX2 implementation (soft) the issue still occurs: https://github.com/RustCrypto/hashes/blob/master/sha2/src/sha512/soft.rs

https://github.com/RustCrypto/hashes/blob/master/sha2/src/sha512.rs#L2-L4

Any suggestions on how to debug exactly what the problem is or tell the compiler those options aren't available? Are you saying host CPU features are accidentally used when compiling with the CUDA GPU compiler?

@jorge-ortega
Collaborator

The error shown here happens at runtime, so I'm assuming that your GPU crate compiled successfully with the nvvm codegen. If so, then there shouldn't be an issue with the use of the SHA crate. You are, however, using cudarc, which is a different crate than the one we maintain here and is where the error originates. While in theory these should be identical bindings to the CUDA driver API, and the PTX generated should be loadable by any program that can load and launch kernels, I've only ever used the bindings provided through cust to launch kernels compiled by the nvvm backend. If this issue is in how cudarc launches the kernel, then it might be better to open an issue with them so they can help pinpoint why the kernel is failing to launch, and whether it has something to do with the PTX generated by our backend. If you have the same issue launching the kernel with cust, I can look further.
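For context on the log above: the first panic comes from `Result::unwrap()` on a failed driver call, and the second one happens inside a `Drop` implementation while the thread is already unwinding, which is why the process aborts. A minimal sketch of reporting a launch error instead of unwrapping it; `LaunchError`, `launch_kernel`, and `device_main` are hypothetical stand-ins, not the cudarc or cust API.

```rust
use std::fmt;

/// Hypothetical error type standing in for a driver error.
#[derive(Debug, PartialEq)]
enum LaunchError {
    LaunchFailed,
}

impl fmt::Display for LaunchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LaunchError::LaunchFailed => write!(f, "unspecified launch failure"),
        }
    }
}

/// Hypothetical launch call; `simulate_failure` stands in for a kernel that traps.
fn launch_kernel(simulate_failure: bool) -> Result<(), LaunchError> {
    if simulate_failure {
        Err(LaunchError::LaunchFailed)
    } else {
        Ok(())
    }
}

/// Propagates the error with `?` so the caller decides how to report it.
fn device_main(device: usize, simulate_failure: bool) -> Result<String, LaunchError> {
    launch_kernel(simulate_failure)?;
    Ok(format!("[{device}] search finished"))
}
```

A caller can then match on the result and print the error for that device rather than letting the worker thread panic.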

@brandonros
Author

cudarc replaced with cust: brandonros/ed25519-vanity-rs@2b04c7e

The _compact functions work; the non-compact (sha2) ones do not.

PTX:

//
// Generated by NVIDIA NVVM Compiler
//
// Compiler Build ID: CL-35059454
// Cuda compilation tools, release 12.6, V12.6.85
// Based on NVVM 7.0.1
//

.version 8.5
.target sm_61
.address_size 64

        // .globl       find_vanity_private_key

.visible .entry find_vanity_private_key(
        .param .u64 find_vanity_private_key_param_0,
        .param .u64 find_vanity_private_key_param_1,
        .param .u64 find_vanity_private_key_param_2,
        .param .u64 find_vanity_private_key_param_3,
        .param .u64 find_vanity_private_key_param_4,
        .param .u64 find_vanity_private_key_param_5,
        .param .u64 find_vanity_private_key_param_6
)
{



        bar.sync        0;
        bar.sync        0;
        bar.sync        0;
        trap;

}
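In the PTX above, the body of the entry point is only three barriers followed by `trap`, and `trap` aborts execution, which is consistent with the launch failing under both cudarc and cust. A minimal sketch of a host-side check that flags such output before it is loaded; `ptx_contains_trap` is a hypothetical helper, and the line-based scan is an assumption that strips `//` comments but does not really parse PTX.

```rust
/// Returns true if any statement in the PTX text is a bare `trap;`.
/// Everything after `//` on a line is treated as a comment and ignored.
fn ptx_contains_trap(ptx: &str) -> bool {
    ptx.lines()
        .map(|line| line.split("//").next().unwrap_or("").trim())
        .any(|statement| statement == "trap;")
}
```

A `trap` can also appear legitimately on an error path inside a larger kernel, so a hit is a prompt to inspect the PTX rather than proof of a miscompile.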

@jorge-ortega
Collaborator

Thanks for the extra context. I'll look further.

@jorge-ortega
Collaborator

Thanks again for all the reports. I won't have as much availability to look into this as I thought, but I will as soon as I can. Anyone else should feel free to look further.

@brandonros
Author

@LegNeato

Give me your 30-second take on this please, something I can dive into and try to help with.

I get that the crate has a bunch of different backends and tries to use AVX and SIMD where present, but I would think it knows to fall back to not doing that when they are unavailable.

@LegNeato
Contributor

I had glanced at the sha2 code, and it does indeed fall back to soft.

I'm a bit busy at the moment but will probably start poking at this in an hour 👍

@LegNeato
Contributor

@brandonros I am not seeing this error, it seems to work? https://github.com/LegNeato/ed25519-vanity-rs/. Did I screw something up?

@brandonros
Author

@LegNeato

https://docs.rs/sha2/latest/sha2/?search=output

  error[E0432]: unresolved import `sha2::Output`
   --> src/lib.rs:5:20
    |
  5 | use sha2::{Digest, Output, Sha512};
    |                    ^^^^^^ no `Output` in the root

Are you sure that compiles? That script will silently fail to compile but still run the old binary: it does a (not smart) cargo build and then runs ./target/ separately. You could make it cargo run if you want.
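The stale-binary problem described above comes from running the old executable even when the build step failed. A minimal sketch of gating the run step on the build result; `build_then_run` is a hypothetical helper, and `cargo_build` assumes a plain `cargo build --release` rather than the project's actual script.

```rust
use std::process::Command;

/// Runs `run` only if `build` succeeded; otherwise returns the build error.
fn build_then_run<B, R, T>(build: B, run: R) -> Result<T, String>
where
    B: FnOnce() -> Result<(), String>,
    R: FnOnce() -> T,
{
    build()?; // stop here on failure so a stale binary is never executed
    Ok(run())
}

/// Assumed build step: invokes cargo and turns a non-zero exit into an error.
fn cargo_build() -> Result<(), String> {
    let status = Command::new("cargo")
        .args(&["build", "--release"])
        .status()
        .map_err(|e| format!("could not start cargo: {e}"))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("cargo build failed: {status}"))
    }
}
```

Passing `cargo_build` as the first argument and a closure that spawns the binary as the second gives the same guarantee as `cargo run`, which also refuses to run when compilation fails.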

@brandonros
Author

Try this

fn sha512_hash(input: &[u8]) -> [u8; 64] {
    use sha2::{Digest, Sha512};
    let mut hasher = Sha512::new();
    hasher.update(input);
    hasher.finalize().into()
}

@LegNeato
Contributor

LegNeato commented May 27, 2025

Ugh, I don't understand why it isn't always using the latest code; sometimes the build fails and it continues to run the previous binary, making me look like an idiot 😅. I can repro now. Going to bed, will look tomorrow.
