Failure when using more than 1 GPU in STRUMPACK MPI #126
Comments
The OMP deprecation message is probably coming from the SLATE library. I believe the invalid resource handle message appears because multiple MPI processes are using the same GPU, so the run uses more CUDA streams than are allowed per GPU. |
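One common way to keep MPI processes from piling onto the same GPU is to pick a device per node-local rank. The following is only a minimal sketch of that idea, not STRUMPACK's or SLATE's own device-selection logic. The helper names `device_for_local_rank` and `local_rank_from_env` are hypothetical, and reading `OMPI_COMM_WORLD_LOCAL_RANK` assumes the job is launched with Open MPI's `mpirun`; other launchers export a different variable.

```cpp
#include <cstdlib>

// Hypothetical helper: map a node-local MPI rank to a GPU index round-robin.
// Returns -1 if the inputs are invalid (no devices, or a negative rank).
int device_for_local_rank(int local_rank, int num_devices) {
  if (num_devices <= 0 || local_rank < 0) return -1;
  return local_rank % num_devices;
}

// Hypothetical helper: read the node-local rank from the environment.
// Assumption: Open MPI's mpirun sets OMPI_COMM_WORLD_LOCAL_RANK.
// Falls back to 0 when the variable is not set.
int local_rank_from_env() {
  const char* value = std::getenv("OMPI_COMM_WORLD_LOCAL_RANK");
  return value ? std::atoi(value) : 0;
}
```

In a real program, the resulting index would typically be passed to `cudaSetDevice` before any solver object is created. Whether the library keeps that choice or overrides it later is something to verify for the installed version. With more processes than GPUs on a node, this mapping still places several processes on one device, so it only avoids sharing when the process count per node does not exceed the GPU count.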
This changes the |
When you run with |
Hmm, I'm not sure. STRUMPACK/src/dense/CUDAWrapper.cpp Line 330 in 115b152
This is called from the SparseSolver constructor, so perhaps that changes what you specify. But it should not use all GPUs. Maybe SLATE is doing that? You could try to set the |
Hi, Dr. Ghysels,
I have seen some issues when using the multi-GPU feature of STRUMPACK to solve a sparse matrix. I built STRUMPACK successfully with SLATE and MAGMA support.
It passes when I run with one GPU:
OMP_NUM_THREADS=1 mpirun -n 1 test_structure_reuse_mpi pde900.mtx
However, when I try using 2 GPUs:
a) sometimes it passes
(Why is GPU = 1 here? Does it mean that it only uses one GPU, but two processes are run on each of the GPUs I request?)
b) sometimes it fails with an error message
Do you know what could be causing these issues and how I should resolve them?
Best,
-Jing