Wrong nvidia driver used by docker #244

Open
bigjazzsound opened this issue Mar 26, 2025 · 3 comments

@bigjazzsound

Docker is using the wrong nvidia libraries:

➜ docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: detection error: open failed: /usr/lib64/libnvidia-pkcs11-openssl3.so.570.124.04: no such file or directory: unknown

Run 'docker run --help' for more information

But my system has 570.133.07, not 570.124.04:

$ ls -l /lib64/libnvidia-*
lrwxrwxrwx. 1 root root       27 Jan  1  1970 /lib64/libnvidia-cfg.so.1 -> libnvidia-cfg.so.570.133.07
-rwxr-xr-x. 1 root root   403416 Jan  1  1970 /lib64/libnvidia-cfg.so.570.133.07
lrwxrwxrwx. 1 root root       32 Jan  1  1970 /lib64/libnvidia-container-go.so.1 -> libnvidia-container-go.so.1.17.5
-rwxr-xr-x. 1 root root  2959416 Jan  1  1970 /lib64/libnvidia-container-go.so.1.17.5
lrwxrwxrwx. 1 root root       29 Jan  1  1970 /lib64/libnvidia-container.so.1 -> libnvidia-container.so.1.17.5
-rwxr-xr-x. 1 root root   205272 Jan  1  1970 /lib64/libnvidia-container.so.1.17.5
lrwxrwxrwx. 1 root root       30 Jan  1  1970 /lib64/libnvidia-encode.so -> libnvidia-encode.so.570.133.07
lrwxrwxrwx. 1 root root       30 Jan  1  1970 /lib64/libnvidia-encode.so.1 -> libnvidia-encode.so.570.133.07
-rwxr-xr-x. 1 root root   297688 Jan  1  1970 /lib64/libnvidia-encode.so.570.133.07
lrwxrwxrwx. 1 root root       26 Jan  1  1970 /lib64/libnvidia-ml.so.1 -> libnvidia-ml.so.570.133.07
-rwxr-xr-x. 1 root root  2217912 Jan  1  1970 /lib64/libnvidia-ml.so.570.133.07
lrwxrwxrwx. 1 root root       28 Jan  1  1970 /lib64/libnvidia-nvvm.so.4 -> libnvidia-nvvm.so.570.133.07
-rwxr-xr-x. 1 root root 81978912 Jan  1  1970 /lib64/libnvidia-nvvm.so.570.133.07
lrwxrwxrwx. 1 root root       30 Jan  1  1970 /lib64/libnvidia-opencl.so.1 -> libnvidia-opencl.so.570.133.07
-rwxr-xr-x. 1 root root 65758768 Jan  1  1970 /lib64/libnvidia-opencl.so.570.133.07
lrwxrwxrwx. 1 root root       35 Jan  1  1970 /lib64/libnvidia-opticalflow.so.1 -> libnvidia-opticalflow.so.570.133.07
-rwxr-xr-x. 1 root root    67704 Jan  1  1970 /lib64/libnvidia-opticalflow.so.570.133.07
-rwxr-xr-x. 1 root root    10176 Jan  1  1970 /lib64/libnvidia-pkcs11-openssl3.so.570.133.07
lrwxrwxrwx. 1 root root       38 Jan  1  1970 /lib64/libnvidia-ptxjitcompiler.so.1 -> libnvidia-ptxjitcompiler.so.570.133.07
-rwxr-xr-x. 1 root root 38251952 Jan  1  1970 /lib64/libnvidia-ptxjitcompiler.so.570.133.07
lrwxrwxrwx. 1 root root       36 Jan  1  1970 /lib64/libnvidia-sandboxutils.so.1 -> libnvidia-sandboxutils.so.570.133.07
-rwxr-xr-x. 1 root root    78888 Jan  1  1970 /lib64/libnvidia-sandboxutils.so.570.133.07
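
The comparison above can be sketched as a small shell helper that pulls the driver version out of library file names, so the versions on disk can be put side by side with the one named in the error message. The function name driver_versions is made up for illustration, and the pattern assumes driver versions start with three digits (as 570.133.07 does), which keeps the 1.17.5 of libnvidia-container out of the result.

```shell
# Print the distinct NVIDIA driver versions found in library file names on stdin.
# Assumption: a driver version looks like 570.133.07 (three leading digits).
driver_versions() {
  grep -oE 'libnvidia-[a-z0-9-]+\.so\.[0-9]{3}\.[0-9]+\.[0-9]+' \
    | sed -E 's/.*\.so\.//' \
    | sort -u
}

# Versions of the files actually on disk
ls /lib64/libnvidia-*.so.* 2>/dev/null | driver_versions

# Versions recorded in the dynamic linker cache
ldconfig -p 2>/dev/null | driver_versions
```

If the second command prints a version that the first one does not, the linker cache refers to a library name that is not present on disk.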

Running nvidia-ctk runtime configure does not fix the issue.

Also tried running:

$ sudo ldconfig 
ldconfig: Can't link /lib64/libnvidia-pkcs11-openssl3.so.570.124.04 to libnvidia-pkcs11-openssl3.so.570.133.07
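
One possible reading of this ldconfig message, not confirmed in this thread: ldconfig names each link after the SONAME embedded in the library, so the message suggests that the 570.133.07 file carries libnvidia-pkcs11-openssl3.so.570.124.04 as its SONAME, and that the link could not be created (for example because /lib64 is read-only). A minimal sketch for checking that, assuming readelf from binutils is installed; soname_of and check_soname are hypothetical helper names.

```shell
# Extract the SONAME from `readelf -d` output supplied on stdin.
soname_of() {
  sed -n 's/.*Library soname: \[\(.*\)\].*/\1/p'
}

# Print a library's file name next to its embedded SONAME.
check_soname() {
  lib=$1
  soname=$(readelf -d "$lib" 2>/dev/null | soname_of)
  printf '%s -> SONAME %s\n' "$(basename "$lib")" "${soname:-<none>}"
}

# Path taken from the listing in this issue
check_soname /lib64/libnvidia-pkcs11-openssl3.so.570.133.07
```

If the printed SONAME ends in 570.124.04, that would account for both the ldconfig message and the path nvidia-container-cli is looking for.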

If you need more details please let me know. I'm not too familiar with the nvidia container runtime.

dosubot (bot) added the nvidia label Mar 26, 2025
@m2Giles
Member

m2Giles commented Mar 26, 2025

Please run rpm -qa | grep nvidia and provide the output.

We had a Fedora repo package creep into other images, which has since been resolved.

@bigjazzsound
Author

$ rpm -qa | grep nvidia | sort
kmod-nvidia-570.133.07-1.fc41.x86_64
libnvidia-cfg-570.133.07-1.fc41.x86_64
libnvidia-container-tools-1.17.5-1.x86_64
libnvidia-container1-1.17.5-1.x86_64
libnvidia-ml-570.133.07-1.fc41.x86_64
nvidia-container-toolkit-1.17.5-1.x86_64
nvidia-container-toolkit-base-1.17.5-1.x86_64
nvidia-driver-cuda-570.133.07-1.fc41.x86_64
nvidia-driver-cuda-libs-570.133.07-1.fc41.x86_64
nvidia-gpu-firmware-20241210-1.fc41.noarch
nvidia-kmod-common-570.133.07-1.fc41.noarch
nvidia-modprobe-570.133.07-1.fc41.x86_64
nvidia-persistenced-570.133.07-1.fc41.x86_64
ublue-os-ucore-nvidia-0.3-1.fc41.noarch
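
To see at a glance whether the installed packages agree on a version, the same list can be grouped by version. A rough sketch, assuming rpm's --qf query format option; versions_summary is a hypothetical helper name.

```shell
# Count packages per version; input lines are "NAME<TAB>VERSION".
versions_summary() {
  awk -F'\t' '{ count[$2]++ } END { for (v in count) print count[v], v }' \
    | sort -k2
}

# Same filter as the command in this comment, with name and version split out
rpm -qa --qf '%{NAME}\t%{VERSION}\n' 2>/dev/null | grep nvidia | versions_summary
```

For the list above, all driver packages would fall under 570.133.07 and the container toolkit packages under 1.17.5, so a stray 570.124.04 line here would point at a leftover package.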

@m2Giles
Member

m2Giles commented Mar 26, 2025

You have the correct nvidia-ctk. Hmmm
