Missed optimization when looping over bytes of a value #133528

Open
theemathas opened this issue Nov 27, 2024 · 3 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@theemathas
Contributor

I tried this code, which contains three functions that each check whether all the bits in a u64 are ones:

#[no_mangle]
fn ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    bytes.iter().all(|x| *x == !0)
}

#[no_mangle]
fn black_box_ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    let bytes = std::hint::black_box(bytes);
    bytes.iter().all(|x| *x == !0)
}

#[no_mangle]
fn direct(input: u64) -> bool {
    input == !0
}

I expected to see this happen: ne_bytes() should be optimized to the same code as direct(), while black_box_ne_bytes() should be optimized slightly worse.
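That expectation rests on the three functions computing the same result for every input. A minimal sketch of a check for this follows. The helpers `all_agree` and `sample_inputs` are hypothetical names introduced only for illustration and are not part of the report, and `#[no_mangle]` is omitted because it affects only symbol names, not results.

```rust
/// Byte-wise check from the report: true when every byte of `input` is 0xFF.
pub fn ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    bytes.iter().all(|x| *x == !0)
}

/// Same check, but the byte array is passed through `black_box` first.
pub fn black_box_ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    let bytes = std::hint::black_box(bytes);
    bytes.iter().all(|x| *x == !0)
}

/// Direct comparison against the all-ones value.
pub fn direct(input: u64) -> bool {
    input == !0
}

/// Hypothetical helper: true when all three functions return the same answer.
pub fn all_agree(input: u64) -> bool {
    let expected = direct(input);
    ne_bytes(input) == expected && black_box_ne_bytes(input) == expected
}

/// Hypothetical sample set: a few edge cases plus every value that differs
/// from all-ones in exactly one bit.
pub fn sample_inputs() -> Vec<u64> {
    let mut inputs = vec![0, u64::MAX, 0x00FF_FFFF_FFFF_FFFF, 0xFFFF_FFFF_FFFF_FF00];
    inputs.extend((0..64).map(|bit| u64::MAX & !(1u64 << bit)));
    inputs
}
```

Calling `sample_inputs().into_iter().all(all_agree)` exercises the check. It only confirms that the functions agree on these inputs; it says nothing about the machine code generated for them.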

Instead, this happened: I got the following assembly, where ne_bytes() is somehow optimized worse than black_box_ne_bytes():

ne_bytes:
        mov     rax, rdi
        not     rax
        shl     rax, 8
        sete    cl
        shr     rdi, 56
        cmp     edi, 255
        setae   al
        and     al, cl
        ret

black_box_ne_bytes:
        mov     qword ptr [rsp - 8], rdi
        lea     rax, [rsp - 8]
        cmp     qword ptr [rsp - 8], -1
        sete    al
        ret

direct:
        cmp     rdi, -1
        sete    al
        ret

Godbolt
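To make the extra work in the ne_bytes listing easier to see, here is a rough Rust transcription of those instructions. It is a reading of the assembly above, not compiler output, and the function name `ne_bytes_as_compiled` is hypothetical. The listing appears to split the comparison in two: one test that the low 56 bits are all ones and one test that the top byte is 255.

```rust
/// Hypothetical transcription of the `ne_bytes` assembly, one step per comment.
pub fn ne_bytes_as_compiled(input: u64) -> bool {
    // mov rax, rdi / not rax / shl rax, 8 / sete cl
    // Inverting and shifting out the top byte leaves zero only when the
    // low 56 bits of `input` are all ones.
    let low_56_all_ones = (!input) << 8 == 0;

    // shr rdi, 56 / cmp edi, 255 / setae al
    // The top byte, compared unsigned against 255.
    let top_byte_all_ones = ((input >> 56) as u32) >= 255;

    // and al, cl
    low_56_all_ones & top_byte_all_ones
}

/// The single comparison that `direct` compiles to.
pub fn direct(input: u64) -> bool {
    input == !0
}
```

Both functions return the same result for every input, so the two-part test is correct but redundant: a single `cmp rdi, -1` would cover it.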

Meta

Reproducible on godbolt with stable rustc 1.82.0 (f6e511eec 2024-10-15) and nightly rustc 1.85.0-nightly (7db7489f9 2024-11-25)

@theemathas theemathas added the C-bug Category: This is a bug. label Nov 27, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 27, 2024
@workingjubilee workingjubilee added C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed C-bug Category: This is a bug. labels Nov 27, 2024
@purplesyringa

purplesyringa commented Nov 27, 2024

As far as I can see, something very similar is at least partially fixed on LLVM trunk: https://godbolt.org/z/MzqG7rf9d. There's also another similar issue: llvm/llvm-project#117853, but I'm not sure if it's relevant to this particular issue.

@rustbot rustbot added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Nov 27, 2024
@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 27, 2024
@clubby789
Contributor

Looks like all three functions optimize to the same thing with LLVM trunk opt.

@clubby789 clubby789 added the llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes label Nov 27, 2024