[k8s] Fix show-gpus availability map when nvidia drivers are not installed #4429

Merged 2 commits on Dec 3, 2024

romilbhardwaj

When NVIDIA drivers are not installed, sky show-gpus --cloud kubernetes fails with:

AssertionError: Keys of counts ([]), capacity ([]), and available (['T4']) must be same.

This happened because we added a GPU to the available table whenever its label was present on a node, even if the GPU resource did not exist there. As a result, available could contain keys that counts and capacity did not. This PR fixes it by adding a GPU to available only if both conditions are satisfied: the resource exists and the labels are present.
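
For readers unfamiliar with the failure mode, here is a minimal sketch of the "label and resource must both be present" check. It is not the code from this PR: the function name `gpu_availability`, the plain-dict node shape, and the `gpus_in_use` field are hypothetical, chosen only for illustration. The label key `cloud.google.com/gke-accelerator` and the resource name `nvidia.com/gpu` are the ones GKE and the NVIDIA device plugin normally use, and are assumed here. Mapping label values to canonical names such as T4 is omitted.

```python
# Illustrative sketch only; names and data shapes are assumptions, not SkyPilot's API.
GPU_LABEL_KEY = "cloud.google.com/gke-accelerator"  # assumed GKE accelerator label
GPU_RESOURCE_KEY = "nvidia.com/gpu"  # assumed resource advertised once drivers are installed


def gpu_availability(nodes):
    """Return (capacity, available) dicts keyed by the GPU label value.

    Each node is a plain dict with "labels", "allocatable", and an optional
    hypothetical "gpus_in_use" count.
    """
    capacity = {}
    available = {}
    for node in nodes:
        gpu_name = node.get("labels", {}).get(GPU_LABEL_KEY)
        allocatable = node.get("allocatable", {})
        # Skip the node unless BOTH the label and the GPU resource are present.
        # A labelled node without drivers does not expose the resource.
        if gpu_name is None or GPU_RESOURCE_KEY not in allocatable:
            continue
        total = int(allocatable[GPU_RESOURCE_KEY])
        in_use = int(node.get("gpus_in_use", 0))
        capacity[gpu_name] = capacity.get(gpu_name, 0) + total
        available[gpu_name] = available.get(gpu_name, 0) + total - in_use
    # Both tables are updated together, so their keys always match.
    assert capacity.keys() == available.keys()
    return capacity, available


# A labelled node without drivers contributes nothing to either table.
no_driver_node = {"labels": {GPU_LABEL_KEY: "nvidia-tesla-t4"}, "allocatable": {}}
driver_node = {
    "labels": {GPU_LABEL_KEY: "nvidia-tesla-t4"},
    "allocatable": {GPU_RESOURCE_KEY: "2"},
    "gpus_in_use": 1,
}
print(gpu_availability([no_driver_node]))
print(gpu_availability([no_driver_node, driver_node]))
```

Because every table is updated under the same condition, a node that has only the label can no longer introduce a key into one table that is missing from the others.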

Tested:

  • GKE cluster with and without GPU drivers.


@Michaelvll Michaelvll left a comment

Nice! Thanks for the quick fix @romilbhardwaj!

@romilbhardwaj romilbhardwaj added this pull request to the merge queue Dec 3, 2024
Merged via the queue into master with commit 747382a Dec 3, 2024
19 checks passed
@romilbhardwaj romilbhardwaj deleted the k8s_fix_show_gpus_availability branch December 3, 2024 09:44