the variant of rand! for an array of Float64 is slowed down by the explicit SIMD when the array is small enough #57114

Open
nsajko opened this issue Jan 21, 2025 · 2 comments
Labels: arrays, performance, randomness

Comments

nsajko commented Jan 21, 2025

While benchmarking #57101, I noticed that the implementation with explicit SIMD slows down rand!(::Memory{Float64}) for arrays shorter than about a thousand elements. A cutoff might help, so that the SIMD implementation is only used for large enough arrays. I will investigate more deeply later.

@giordano

There are already two different implementations, one for fewer than 32 elements and one for 32 or more: #55997 (comment)
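As a rough illustration of what a length-based cutoff looks like, here is a minimal sketch. The names `fill_rand!` and `CUTOFF` and the value 32 are assumptions made for this example; this is not the code in `XoshiroSimd.jl`, and the scalar loop only stands in for a non-SIMD path.

```julia
using Random

# Hypothetical threshold, in elements; the real value would come from benchmarks.
const CUTOFF = 32

# Hypothetical helper: choose a fill strategy based on the buffer length.
function fill_rand!(rng::AbstractRNG, buf::AbstractVector{Float64})
    if length(buf) < CUTOFF
        # Short buffers: draw one value at a time.
        for i in eachindex(buf)
            buf[i] = rand(rng)
        end
    else
        # Long buffers: defer to the bulk method provided by Random.
        rand!(rng, buf)
    end
    return buf
end

small = fill_rand!(Xoshiro(1), Vector{Float64}(undef, 8))
large = fill_rand!(Xoshiro(1), Vector{Float64}(undef, 1024))
```

Both branches fill the buffer with values in [0, 1); only the strategy differs, so the cutoff is purely a performance knob.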

bbrehm commented Jan 22, 2025

To be more precise, https://github.com/JuliaLang/julia/blame/f91436eae7265a01bff17e35887cd9b8e15c8fdc/stdlib/Random/src/XoshiroSimd.jl#L169 and https://github.com/JuliaLang/julia/blame/f91436eae7265a01bff17e35887cd9b8e15c8fdc/stdlib/Random/src/XoshiroSimd.jl#L14.

The current SIMD thresholds were determined in #40546 (comment). I originally set them much higher in https://github.com/JuliaLang/julia/pull/34852/files# (tuned for a crappy Broadwell laptop), before vanishing for some time for personal reasons (big thanks to Jeff for finishing up the work I abandoned).

I think a re-tuning is in order. Ideally one would plot performance curves for all of: Apple silicon, AVX2, AVX-512, and Raspberry Pi.
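A sweep like the following could produce the raw numbers for such curves on each machine. It is only a sketch: `scalar_fill!` and `ns_per_element` are hypothetical helpers, the sizes and repetition count are arbitrary, it uses `Vector{Float64}` rather than `Memory{Float64}`, and a real measurement would probably use a proper benchmarking package instead of a hand-rolled timing loop.

```julia
using Random

# Hypothetical baseline: fill the buffer one element at a time.
function scalar_fill!(rng, buf)
    for i in eachindex(buf)
        buf[i] = rand(rng)
    end
    return buf
end

# Hypothetical helper: average time per element, in nanoseconds,
# for calling `f!(rng, buf)` on a buffer of length `n`.
function ns_per_element(f!, rng, n; reps = 2_000)
    buf = Vector{Float64}(undef, n)
    f!(rng, buf)                      # warm-up call, so compilation is not timed
    t0 = time_ns()
    for _ in 1:reps
        f!(rng, buf)
    end
    return (time_ns() - t0) / (reps * n)
end

rng = Xoshiro(1)
sizes = [4, 8, 16, 32, 64, 128, 256, 512, 1024, 4096]
results = [(n, ns_per_element(rand!, rng, n), ns_per_element(scalar_fill!, rng, n))
           for n in sizes]

println(rpad("n", 8), rpad("bulk", 10), "scalar   (ns/element)")
for (n, bulk, scalar) in results
    println(rpad(n, 8), rpad(round(bulk; digits = 2), 10), round(scalar; digits = 2))
end
```

The smallest `n` at which the bulk column drops below the scalar column gives a first guess for a cutoff on that machine.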
