Optimisation of Runge-Kutta MultiFab allocation #4263

Open
the-florist opened this issue Dec 13, 2024 · 1 comment

Comments

@the-florist

While optimising GRTeclyn on NVIDIA GPUs, we found that at the beginning of each call to the Runge-Kutta function in AMReX, the computational load switched to the CPU, causing a slow-down. We traced this switch to the beginning of the RK4 function (line 244 in AMReX_RungeKutta.H), where a new set of MultiFabs is created to store the RK steps each time the function is called (lines 251-255 in AMReX_RungeKutta.H). We suspect that a speed-up could be achieved if this memory were instead allocated once, at the creation of the amrex::RungeKutta class, and passed as an array of aliased MultiFabs to each RungeKutta sub-routine.
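To make the proposal concrete, here is a minimal sketch of the "allocate once, reuse on every call" pattern in plain C++. It does not use AMReX: `State` is a `std::vector<double>` standing in for a MultiFab, and `RK4Workspace`, `rk4_step` and `decay_error` are hypothetical names invented for this illustration, not part of AMReX_RungeKutta.H.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Toy stand-in for a field; this is NOT an amrex::MultiFab.
using State = std::vector<double>;
// Right-hand side: writes du/dt into the first argument.
using Rhs = std::function<void(State&, const State&, double)>;

// Hypothetical workspace owning the four stage buffers and one scratch
// state. It is built once and handed to every RK4 call.
struct RK4Workspace {
    std::array<State, 4> k;
    State tmp;
    explicit RK4Workspace(std::size_t n) : tmp(n, 0.0) {
        for (auto& stage : k) { stage.assign(n, 0.0); }
    }
};

// One classical RK4 step. All temporaries live in ws, so this function
// performs no allocation of its own.
void rk4_step(State& u, double t, double dt, const Rhs& f, RK4Workspace& ws)
{
    const std::size_t n = u.size();
    assert(ws.tmp.size() == n);  // workspace must match the state size

    f(ws.k[0], u, t);
    for (std::size_t i = 0; i < n; ++i) { ws.tmp[i] = u[i] + 0.5 * dt * ws.k[0][i]; }
    f(ws.k[1], ws.tmp, t + 0.5 * dt);
    for (std::size_t i = 0; i < n; ++i) { ws.tmp[i] = u[i] + 0.5 * dt * ws.k[1][i]; }
    f(ws.k[2], ws.tmp, t + 0.5 * dt);
    for (std::size_t i = 0; i < n; ++i) { ws.tmp[i] = u[i] + dt * ws.k[2][i]; }
    f(ws.k[3], ws.tmp, t + dt);
    for (std::size_t i = 0; i < n; ++i) {
        u[i] += dt / 6.0 * (ws.k[0][i] + 2.0 * ws.k[1][i] + 2.0 * ws.k[2][i] + ws.k[3][i]);
    }
}

// Integrates du/dt = -u from u(0) = 1 and returns the absolute error
// against the exact solution exp(-T), reusing one workspace throughout.
double decay_error(double dt, int nsteps)
{
    State u{1.0};
    RK4Workspace ws(u.size());  // allocated once, outside the time loop
    const Rhs f = [](State& dudt, const State& v, double) {
        for (std::size_t i = 0; i < v.size(); ++i) { dudt[i] = -v[i]; }
    };
    double t = 0.0;
    for (int s = 0; s < nsteps; ++s) {
        rk4_step(u, t, dt, f, ws);
        t += dt;
    }
    return std::abs(u[0] - std::exp(-dt * nsteps));
}
```

In an AMReX setting the workspace would presumably hold MultiFabs built with the same BoxArray and DistributionMapping as the state, and would need rebuilding after regridding; that part is an assumption and is not shown here.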

I have included snapshots of the Nsight timeline, displayed using the Nsight Systems viewer, which show the jump in computational load between the CPU and GPU, as well as the corresponding NVTX diagnostic noting the beginning of the RungeKutta4 function. I am happy to share the .nsys-rep file itself, but due to its size I cannot upload it to this issue, so please contact me if you would like a copy.

[Screenshots: cpu-gpu-load, nvtx-flags]
@the-florist the-florist changed the title Optimisation of Runge-Kutta class Optimisation of Runge-Kutta MultiFab allocation Dec 13, 2024
@WeiqunZhang
Member

Have you changed the default parameters of arenas such as amrex.the_arena_release_threshold? If not, MultiFab memory allocation is usually a one-time cost. After a few steps, the cost should be very small.
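The "one-time cost" point can be illustrated with a toy caching allocator in plain C++. This is only a sketch of the general idea and not AMReX's Arena implementation: `CachingPool`, `system_allocs_after` and the simplified threshold rule (freed blocks are cached only while the cached total stays at or below the threshold) are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Toy caching allocator; NOT AMReX's Arena. Freed blocks are kept on
// per-size free lists and handed back out on the next matching request.
class CachingPool {
public:
    explicit CachingPool(std::size_t release_threshold)
        : m_threshold(release_threshold) {}
    CachingPool(const CachingPool&) = delete;
    CachingPool& operator=(const CachingPool&) = delete;
    ~CachingPool() {
        for (auto& entry : m_free) {
            for (void* p : entry.second) { std::free(p); }
        }
    }

    void* alloc(std::size_t nbytes) {
        auto it = m_free.find(nbytes);
        if (it != m_free.end() && !it->second.empty()) {
            void* p = it->second.back();  // cache hit: no system call
            it->second.pop_back();
            m_cached -= nbytes;
            return p;
        }
        ++m_system_allocs;                // cache miss: go to the system
        return std::malloc(nbytes);
    }

    void dealloc(void* p, std::size_t nbytes) {
        if (m_cached + nbytes > m_threshold) {
            std::free(p);                 // over the threshold: release
            return;
        }
        m_free[nbytes].push_back(p);      // otherwise keep for reuse
        m_cached += nbytes;
    }

    int system_allocs() const { return m_system_allocs; }

private:
    std::size_t m_threshold;
    std::size_t m_cached = 0;
    int m_system_allocs = 0;
    std::map<std::size_t, std::vector<void*>> m_free;
};

// Mimics a time loop that creates and destroys five temporaries per step
// (five is an arbitrary illustrative count) and reports how many requests
// actually reached the system allocator.
int system_allocs_after(int nsteps, std::size_t release_threshold)
{
    const std::size_t nbytes = 1024;
    CachingPool pool(release_threshold);
    for (int s = 0; s < nsteps; ++s) {
        std::vector<void*> temps;
        for (int i = 0; i < 5; ++i) { temps.push_back(pool.alloc(nbytes)); }
        for (void* p : temps) { pool.dealloc(p, nbytes); }
    }
    return pool.system_allocs();
}
```

With a generous threshold, only the first step reaches the system allocator and later steps are served from the cache. With a threshold of zero, every step pays the full allocation cost again, which is the kind of behaviour a changed arena parameter could plausibly produce.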
