A user reported that the L40 GPUs in their cluster are shown as L4 GPUs in sky show-gpus.
sky show-gpus:
Kubernetes GPUs (context: default)
GPU  REQUESTABLE_QTY_PER_NODE  TOTAL_GPUS  TOTAL_FREE_GPUS
L4   1, 2, 4                   4           4

Kubernetes per node GPU availability
NODE_NAME  GPU_NAME  TOTAL_GPUS  FREE_GPUS
node       L4        4           4
Labels:
nvidia.com/cuda.driver-version.major=535
nvidia.com/cuda.driver-version.minor=104
nvidia.com/cuda.driver-version.revision=12
nvidia.com/cuda.driver.major=535
nvidia.com/cuda.driver.minor=104
nvidia.com/cuda.driver.rev=12
nvidia.com/cuda.runtime-version.full=12.2
nvidia.com/cuda.runtime-version.major=12
nvidia.com/cuda.runtime-version.minor=2
nvidia.com/cuda.runtime.major=12
nvidia.com/cuda.runtime.minor=2
nvidia.com/gfd.timestamp=1732465495
nvidia.com/gpu-driver-upgrade-state=upgrade-done
nvidia.com/gpu.compute.major=8
nvidia.com/gpu.compute.minor=9
nvidia.com/gpu.count=4
nvidia.com/gpu.deploy.container-toolkit=true
nvidia.com/gpu.deploy.dcgm=true
nvidia.com/gpu.deploy.dcgm-exporter=true
nvidia.com/gpu.deploy.device-plugin=true
nvidia.com/gpu.deploy.driver=pre-installed
nvidia.com/gpu.deploy.gpu-feature-discovery=true
nvidia.com/gpu.deploy.node-status-exporter=true
nvidia.com/gpu.deploy.operator-validator=true
nvidia.com/gpu.family=ampere
nvidia.com/gpu.memory=46068
nvidia.com/gpu.mode=compute
nvidia.com/gpu.present=true
nvidia.com/gpu.product=NVIDIA-L40
nvidia.com/gpu.replicas=1
nvidia.com/gpu.sharing-strategy=none
nvidia.com/mig.capable=false
nvidia.com/mig.strategy=single
nvidia.com/mps.capable=false
nvidia.com/vgpu.present=false
The canonical_name in value check here is likely the culprit. It is a substring test, so the canonical name L4 matches inside the label value NVIDIA-L40:
skypilot/sky/provision/kubernetes/utils.py
Lines 338 to 352 in ef2233b
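The suspected behavior can be reproduced outside SkyPilot with a minimal sketch. The list CANONICAL_GPU_NAMES and the functions match_by_substring and match_by_token below are hypothetical names made up for illustration; they are not the code in utils.py, and the token-boundary check is only one possible way to avoid the false match.

```python
import re

# Hypothetical canonical names; the real list and its order are assumptions.
CANONICAL_GPU_NAMES = ['A100', 'H100', 'L4', 'L40', 'T4']


def match_by_substring(value):
    """Return the first canonical name that is a substring of the label value."""
    for canonical_name in CANONICAL_GPU_NAMES:
        if canonical_name in value:  # 'L4' is a substring of 'NVIDIA-L40'
            return canonical_name
    return None


def match_by_token(value):
    """Return the first canonical name that appears as a whole token."""
    for canonical_name in CANONICAL_GPU_NAMES:
        # The lookarounds reject a match that is preceded or followed by
        # another letter or digit, so 'L4' no longer matches inside 'L40'.
        pattern = (r'(?<![A-Za-z0-9])' + re.escape(canonical_name) +
                   r'(?![A-Za-z0-9])')
        if re.search(pattern, value):
            return canonical_name
    return None


label_value = 'NVIDIA-L40'  # value of nvidia.com/gpu.product from the labels above
print(match_by_substring(label_value))  # L4
print(match_by_token(label_value))      # L40
```

With the substring test the result depends on the order of the canonical names, which is why a shorter name such as L4 can shadow L40. Matching on token boundaries removes that dependence in this sketch.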