update Nvidia device driver docs to link to list of supported cards and newer versions #25531
…newer versions of CUDA-based containers
Thanks for the update! I left a few style suggestions.
Also removed the x86_64 qualifier for Linux because the driver runs successfully (even if it can't fully fingerprint the unsupported card) on arm64 in my testing, and the underlying library and its Go bindings are arm64 compatible (NVML and nvml-go).
I'm hesitant to remove this unless we've actually tested it on arm64 and shown it works end-to-end. I don't see any issues in the driver repo that show someone using it on arm64. Do you know of anyone who's done so?
Co-authored-by: Aimee Ukasick <[email protected]>
@tgross I've run it on arm64, kind of, and it works well enough to identify the NVIDIA GPU, but it fails to get a device handle to check power/memory because the card in question is an iGPU, which isn't supported by NVML. We can give it a spin on an AWS G5g (Graviton ARM CPU + NVIDIA T4 GPU, which is supported by NVML).
If you could, that'd be great. It'd at least give us a reasonable smoke test before we attest to it.
@tgross gave it a spin on an AWS G5g.2xlarge, the driver compiles like a charm, and everything I could think of works:
Output of a docker job with
Anything else you can think of that I should test?
Awesome, that's great @sofixa. Let's ship it. I'll mark this for re-review.
Description
NVIDIA Device Driver docs had a few older references (e.g. nvidia/cuda:11.0-base doesn't exist anymore), and didn't have a link to the list of compatible NVIDIA devices (e.g. Jetsons aren't compatible).

Also removed the x86_64 qualifier for Linux because the driver runs successfully (even if it can't fully fingerprint the unsupported card) on arm64 in my testing, and the underlying library and its Go bindings are arm64 compatible (NVML and nvml-go).