support scheduler plugins #3612


Open
wants to merge 4 commits into master from feat/support-scheduler-plugins
Conversation

@KunWuLuan (Contributor) commented May 16, 2025

Support PodGroup of scheduler-plugins

#3611
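
For context, a minimal sketch of how a user would opt in, based only on the option and label names discussed in the review below; the sample shipped with the PR may differ:

# Helm values: select the scheduler-plugins integration by name.
batchScheduler:
  name: kube-scheduler
---
# RayCluster: opt into gang scheduling with the existing label.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: test-podgroup-0
  labels:
    ray.io/gang-scheduling-enabled: "true"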

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@KunWuLuan changed the title from "support scheduler plugins" to "(WIP)support scheduler plugins" May 16, 2025
@KunWuLuan force-pushed the feat/support-scheduler-plugins branch 6 times, most recently from 55d376d to a054ab9 on May 21, 2025 04:03
@KunWuLuan changed the title from "(WIP)support scheduler plugins" to "support scheduler plugins" May 27, 2025
@KunWuLuan (Contributor, author)

@kevin85421 Hi, please have a look when you have time. Thanks!

@kevin85421 (Member)

@KunWuLuan can you resolve the conflict?

@MortalHappiness (Member) left a comment

Please make sure your PR can run successfully. I tried to run your PR, but it failed to create the PodGroup.

You also need to update the following places:

  • # Enable customized Kubernetes scheduler integration. If enabled, Ray workloads will be scheduled
    # by the customized scheduler.
    #  * "enabled" is the legacy option and will be deprecated soon.
    #  * "name" is the standard option, expecting a scheduler name, supported values are
    #    "default", "volcano", and "yunikorn".
    #
    # Note: "enabled" and "name" should not be set at the same time. If both are set, an error will be thrown.
    #
    # Examples:
    #  1. Use volcano (deprecated)
    #       batchScheduler:
    #         enabled: true
    #
    #  2. Use volcano
    #       batchScheduler:
    #         name: volcano
    #
    #  3. Use yunikorn
    #       batchScheduler:
    #         name: yunikorn
    #
    batchScheduler:
      # Deprecated. This option will be removed in the future.
      # Note, for backwards compatibility. When it sets to true, it enables volcano scheduler integration.
      enabled: false
      # Set the customized scheduler name, supported values are "volcano" or "yunikorn", do not set
      # "batchScheduler.enabled=true" at the same time as it will override this option.
      name: ""
  • if config.BatchScheduler == volcano.GetPluginName() || config.BatchScheduler == yunikorn.GetPluginName() {
  • Add a sample config named ray-cluster.kube-scheduler.yaml to the ray-operator/config/samples/ folder.
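
For the second bullet, a minimal sketch of how the existing check could be extended. The scheduler-plugins package and its GetPluginName() helper are assumed to mirror the existing volcano and yunikorn packages and are not confirmed by this PR:

import (
	// Import paths follow the existing batchscheduler package layout; the
	// third package is hypothetical and its path/name may differ in the PR.
	"github.com/ray-project/kuberay/ray-operator/controllers/ray/batchscheduler/volcano"
	"github.com/ray-project/kuberay/ray-operator/controllers/ray/batchscheduler/yunikorn"
	schedulerplugins "github.com/ray-project/kuberay/ray-operator/controllers/ray/batchscheduler/schedulerplugins"
)

// isSupportedBatchScheduler sketches the extended branch inside
// ValidateBatchSchedulerConfig: accept the new plugin name alongside
// the existing volcano and yunikorn names.
func isSupportedBatchScheduler(name string) bool {
	return name == volcano.GetPluginName() ||
		name == yunikorn.GetPluginName() ||
		name == schedulerplugins.GetPluginName()
}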

Follow-up:

@KunWuLuan (Contributor, author)

Hi, I have updated the helm chart and made sure it works in my local environment. Please have a look when you have time. Thanks!
@MortalHappiness

@KunWuLuan force-pushed the feat/support-scheduler-plugins branch from 34d3f44 to a8048b0 on May 29, 2025 15:15
metadata:
  name: test-podgroup-0
  labels:
    ray.io/gang-scheduling-enabled: "true"

Member:

I don't think we should be using ray.io/* labels for any of the scheduler integrations. The label prefix should be specific to the integration (see other examples with Volcano, YuniKorn, and Kueue).

Contributor (author):

Hi, ray.io/gang-scheduling-enabled is used in the YuniKorn sample, and ray.io/scheduler-name: volcano is used when Volcano is selected.
And these labels are defined here:

RaySchedulerName = "ray.io/scheduler-name"
RayPriorityClassName = "ray.io/priority-class-name"
RayClusterGangSchedulingEnabled = "ray.io/gang-scheduling-enabled"

We didn't create a new label for the new scheduler.

if !k.isGangSchedulingEnabled(rc) {
	return nil
}
replica := int32(1)

Member:

Why start at 1? Is it for the head pod? If so, can you add a comment?

Contributor (author):

Yes, it is for the head pod.
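
For context, a minimal sketch of how the PodGroup's minimum member count can be derived. calculateMinMember is a hypothetical helper name, not taken from this PR; only the "start at 1 for the head pod" part is confirmed by the quoted code, and adding the worker replicas on top is an assumption:

package kubescheduler

import (
	rayv1 "github.com/ray-project/kuberay/ray-operator/apis/ray/v1"
)

// calculateMinMember shows why the count starts at 1: the head pod is always
// part of the gang, and each worker group then contributes its desired
// replica count.
func calculateMinMember(cluster *rayv1.RayCluster) int32 {
	minMember := int32(1) // the head pod
	for _, workerGroup := range cluster.Spec.WorkerGroupSpecs {
		if workerGroup.Replicas != nil {
			minMember += *workerGroup.Replicas
		}
	}
	return minMember
}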

// AddMetadataToPod adds essential labels and annotations to the Ray pods
// the scheduler needs these labels and annotations in order to do the scheduling properly
func (k *KubeScheduler) AddMetadataToPod(_ context.Context, app *rayv1.RayCluster, groupName string, pod *corev1.Pod) {
	// when gang scheduling is enabled, extra annotations need to be added to all pods

Member:

annotations or labels?

Contributor (author):

I made a mistake. Thanks.


// AddMetadataToPod adds essential labels and annotations to the Ray pods
// the scheduler needs these labels and annotations in order to do the scheduling properly
func (k *KubeScheduler) AddMetadataToPod(_ context.Context, app *rayv1.RayCluster, groupName string, pod *corev1.Pod) {

Member:

s/app/cluster

Contributor (author):

Could you please explain the comment? Thank you!

// when gang scheduling is enabled, extra annotations need to be added to all pods
if k.isGangSchedulingEnabled(app) {
	// the group name for the head and each of the worker group should be different
	pod.Labels[KubeSchedulerPodGroupLabelKey] = app.Name

Member:

Should PodGroups be scheduled at the worker group level or the RayCluster level? I feel like the worker group level could make more sense in some cases.

Contributor (author):

If the head is not available, the workers will be blocked, which is unacceptable for my customers. So I think the head should be counted together with the workers.
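
For illustration, this is roughly what a cluster-level gang looks like under the upstream scheduler-plugins coscheduling conventions: one PodGroup per RayCluster, with minMember covering the head plus all workers, and every pod labeled into that group. The exact names this PR generates may differ:

apiVersion: scheduling.x-k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: test-podgroup-0      # typically the RayCluster name
spec:
  minMember: 3               # example: 1 head pod + 2 worker replicas
---
# Each head and worker pod carries the coscheduling group label:
metadata:
  labels:
    scheduling.x-k8s.io/pod-group: test-podgroup-0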

@MortalHappiness (Member) commented May 30, 2025

Please also resolve conflicts. We now use a single go.mod in the repo root. Thanks.

@MortalHappiness (Member) left a comment

You also need to add permission for creating PodGroups here.
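
For reference, a sketch of the kind of ClusterRole rule that would grant this; where exactly it lands in the chart and generated RBAC, and the exact verb list, are assumptions:

- apiGroups: ["scheduling.x-k8s.io"]
  resources: ["podgroups"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]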

KunWuLuan added 3 commits May 30, 2025 15:43
Signed-off-by: kunwuluan <[email protected]>
update ValidateBatchSchedulerConfig()

update helm chart

Rename the function.

Signed-off-by: kunwuluan <[email protected]>
@KunWuLuan force-pushed the feat/support-scheduler-plugins branch from a8048b0 to 21f4f88 on May 30, 2025 08:59
Comment on lines +70 to +73
#  4. Use PodGroup
#       batchScheduler:
#         name: kube-scheduler
#

Member:

You also need to update L53 to:

#    "default", "volcano", "yunikorn", and "kube-scheduler".

Comment on lines +79 to +94
func TestCalculateDesiredResources(t *testing.T) {
	a := assert.New(t)

	cluster := createTestRayCluster(1)

	totalResource := utils.CalculateDesiredResources(&cluster)

	// 256m * 3 (requests, not limits)
	a.Equal("768m", totalResource.Cpu().String())

	// 256Mi * 3 (requests, not limits)
	a.Equal("768Mi", totalResource.Memory().String())

	// 2 GPUs total
	a.Equal("2", totalResource.Name("nvidia.com/gpu", resource.BinarySI).String())
}

Member:

What are you testing here? It seems that you didn't test anything specific to the kube scheduler plugin.
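
Purely as an illustration of what a more targeted test could cover, the sketch below reuses createTestRayCluster and KubeSchedulerPodGroupLabelKey from the quoted hunks and assumes gang scheduling is enabled through the ray.io/gang-scheduling-enabled label on the RayCluster; helper names and setup details may not match the PR exactly:

import (
	"context"
	"testing"

	"github.com/stretchr/testify/assert"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func TestAddMetadataToPodSetsPodGroupLabel(t *testing.T) {
	a := assert.New(t)

	cluster := createTestRayCluster(1)
	// Assumed opt-in: the same label already used by the YuniKorn integration.
	cluster.Labels = map[string]string{"ray.io/gang-scheduling-enabled": "true"}

	pod := &corev1.Pod{ObjectMeta: metav1.ObjectMeta{Labels: map[string]string{}}}

	scheduler := &KubeScheduler{}
	scheduler.AddMetadataToPod(context.Background(), &cluster, "small-group", pod)

	// The pod should be labeled into the cluster-level PodGroup.
	a.Equal(cluster.Name, pod.Labels[KubeSchedulerPodGroupLabelKey])
}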

@MortalHappiness (Member)

Please also install pre-commit locally and then fix the CI lint error. Thanks.

@kevin85421 (Member)

I discussed this with @MortalHappiness offline. Since this PR still requires some work before it can be merged, I've removed the release-blocker label. We can consider cherry-picking this PR after the branch cut.

@KunWuLuan force-pushed the feat/support-scheduler-plugins branch from 325c140 to 54f786f on May 31, 2025 04:56
@KunWuLuan force-pushed the feat/support-scheduler-plugins branch from 54f786f to 7c572f7 on May 31, 2025 04:57
Signed-off-by: KunWuLuan <[email protected]>
@KunWuLuan force-pushed the feat/support-scheduler-plugins branch from 7c572f7 to 1e40ceb on May 31, 2025 06:31