KEP-5224: Node Resource Discovery #5319
Conversation
marquiz commented May 19, 2025
- One-line PR description: Node Resource Discovery KEP
- Issue link: Node Resource Discovery #5224
- Other comments:
@marquiz: GitHub didn't allow me to request PR reviews from the following users: Tal-or. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.
```proto
// GetDynamicRuntimeConfig is a streaming interface for receiving dynamically
// changing runtime and node configuration
rpc GetDynamicRuntimeConfig(DynamicRuntimeConfigRequest) returns (stream DynamicRuntimeConfigResponse) {}
```
hmm this doesn't really seem like a runtime config..
This name came into existence with a background thought of creating a more generic "channel" for the runtime to inform kubelet about changes (without kubelet polling). But I agree, the name is bad and the background idea probably too.
Related, @ffromani suggested having a completely separate service (e.g. `DiscoveryService`, in addition to `RuntimeService` and `ImageService`) for handling this.
Thoughts?
yeah or maybe `ResourceService` or `NodeService` or something?
but big +1 on separate CRI service
Check, I will change this in the next update. I don't have strong opinions on this. One detail is that with a separate service there is the possibility to connect to a 3rd-party NodeService/ResourceService agent. That may be a good thing for some special users(?)
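For illustration, a minimal sketch (not the KEP's actual API) of how a separate CRI service could be modelled on the kubelet side, mirroring how RuntimeService and ImageService appear as Go interfaces. All names here (`NodeResourcesService`, `NodeResources`, `GetNodeResources`, `WatchNodeResources`) are hypothetical:

```go
package cri

import "context"

// NodeResources is a placeholder for the payload the runtime would return:
// resource capacity plus the attributes of the resource topology tree.
type NodeResources struct {
	Capacity   map[string]int64  // e.g. "cpu", "memory", "hugepages-2Mi"
	Attributes map[string]string // e.g. "MachineID", "BootID", "SystemUUID"
}

// NodeResourcesService is a standalone service, separate from RuntimeService
// and ImageService, so the kubelet could also dial a 3rd-party agent that
// implements only resource discovery.
type NodeResourcesService interface {
	// GetNodeResources returns a point-in-time snapshot of node resources.
	GetNodeResources(ctx context.Context) (*NodeResources, error)
	// WatchNodeResources streams updates when capacity changes, replacing
	// the generic GetDynamicRuntimeConfig streaming RPC quoted above.
	WatchNodeResources(ctx context.Context) (<-chan *NodeResources, error)
}
```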
```go
const (
	ResourceTopologyZoneCore = "Core"
```
these currently are a subset of the cAdvisor MachineInfo fields. is this all the kubelet uses?
Yup, picked the ones that the kubelet is interested in. Of course we can add all that we can imagine (e.g. the current CPU topology levels from the Linux kernel, i.e. package, die, cluster, core).
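Illustrative only: what the zone-type constants could look like if the full set of Linux CPU topology levels mentioned above were added. Apart from `ResourceTopologyZoneCore` (quoted from the KEP), these names are assumptions, not part of the proposal:

```go
package cri

const (
	ResourceTopologyZoneNode    = "Node"    // NUMA node
	ResourceTopologyZonePackage = "Package" // physical package/socket
	ResourceTopologyZoneDie     = "Die"     // die within a package
	ResourceTopologyZoneCluster = "Cluster" // cluster of cores
	ResourceTopologyZoneCore    = "Core"    // physical core
)
```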
> The kubelet reads MachineID, BootID and SystemUUID from the attributes of the
> resource topology tree. If this information is not present the kubelet uses the
> cAdvisor MachineInfo as a fallback.
forever? or will it stop after GA? What if this info is never available? should the kubelet exit?
oops I should read the section above: set the node not ready for the last question
I think this is a good question in general: whether MachineID, BootID and SystemUUID should come from the runtime or whether the kubelet should figure these out itself. I put them here to be able to ditch cAdvisor MachineInfo completely. Glad to hear opinions on this.
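A minimal sketch of the fallback described in the quoted text, assuming a hypothetical `attrs` map extracted from the resource topology tree; `MachineID`, `BootID` and `SystemUUID` are real fields on cAdvisor's MachineInfo, but the per-field fallback shown here is an assumption:

```go
package kubelet

import cadvisorapi "github.com/google/cadvisor/info/v1"

// machineIdentity prefers runtime-provided attributes and falls back to
// cAdvisor MachineInfo for any field the runtime did not supply.
func machineIdentity(attrs map[string]string, mi *cadvisorapi.MachineInfo) (machineID, bootID, systemUUID string) {
	machineID, bootID, systemUUID = attrs["MachineID"], attrs["BootID"], attrs["SystemUUID"]
	if machineID == "" {
		machineID = mi.MachineID
	}
	if bootID == "" {
		bootID = mi.BootID
	}
	if systemUUID == "" {
		systemUUID = mi.SystemUUID
	}
	return machineID, bootID, systemUUID
}
```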
> A rollout could fail e.g. because of a bug in the CRI runtime, the runtime
> returning data that the kubelet cannot consume. In this case the node will be
> set into NotReady state. Existing workloads should not be affected but new pods
What should the cluster admin do in this case? is NotReady the correct signal?
The cluster admin can rollback. Regarding NotReady, I'm open to guidance, suggestions. I think the node should not be ready as new stuff shouldn't be scheduled there. But what else? Events, conditions?
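A sketch of one option raised above: an explicit node condition the kubelet could publish in addition to flipping NodeReady, so admins can tell why the node is not ready and know to roll back. The condition type and reason strings are hypothetical:

```go
package kubelet

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// resourceDiscoveryFailedCondition builds a node condition describing why
// resource discovery failed; the type/reason names are illustrative only.
func resourceDiscoveryFailedCondition(err error) v1.NodeCondition {
	return v1.NodeCondition{
		Type:               "ResourceDiscoveryFailed", // hypothetical condition type
		Status:             v1.ConditionTrue,
		Reason:             "InvalidRuntimeResourceData",
		Message:            err.Error(),
		LastTransitionTime: metav1.Now(),
	}
}
```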
> TBD.

> - [ ] Events
we probably want a metric if the kubelet falls back to cAdvisor for a certain resource
You mean a metric for falling back to cAdvisor? The thinking in this proposal is that it's all or nothing: either all resources (native, i.e. cpu, memory, hugepages and swap) come from cAdvisor, or none do.
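A hypothetical sketch of the suggested metric, using the component-base metrics machinery the kubelet already uses. The metric name and registration point are assumptions, not existing kubelet code; given the all-or-nothing semantics described in the reply, a single counter (rather than a per-resource one) would suffice:

```go
package metrics

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

// nodeResourceDiscoveryFallbackTotal counts how many times node resource
// discovery fell back to cAdvisor MachineInfo. Name is illustrative only.
var nodeResourceDiscoveryFallbackTotal = metrics.NewCounter(
	&metrics.CounterOpts{
		Subsystem:      "kubelet",
		Name:           "node_resource_discovery_cadvisor_fallback_total",
		Help:           "Number of times node resource discovery fell back to cAdvisor MachineInfo.",
		StabilityLevel: metrics.ALPHA,
	},
)

func init() {
	legacyregistry.MustRegister(nodeResourceDiscoveryFallbackTotal)
}
```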
> If Node Resource Hotplug ([KEP-3953][kep-3953]) is enabled in tandem, the node
this creates a dependency on this KEP for 3953. @Karthik-K-N do you +1 this?
I can also remove this snippet. This proposal (KEP-5224) alone does not cause this. But with both features enabled at the same time, this is what will happen.
Yeah, these two KEPs complement each other if enabled together; otherwise each works as expected independently.
- fix typos
> A rollout could fail e.g. because of a bug in the CRI runtime, the runtime
> returning data that the kubelet cannot consume. In this case the node will be
> set into NotReady state. Existing workloads should not be affected but new pods
When a node becomes NotReady, leaving pods in the Running state is considered controversial behavior. We may correct this in future releases.
ref: kubernetes/kubernetes#125618
Thanks @HirazawaUi for pointing this out. I'll add this detail in the next update.
> ### Goals
>
> - Ability for kubelet to get node resources (capacity) from the CRI runtime
Do your goals include maintaining the availability of the kubelet's /metrics/resource endpoint after migrating to CRI-based node resource discovery?
Yes, this is completely independent of the stats/metrics stuff
I think there should be a correlation here.
There is an ambiguity: in the future, will the data returned by the `/stats` and `/metrics/resource` endpoints represent the hardware resources owned by the Kubernetes Node resource object, or the actual physical hardware resources of the node?
Prior to this KEP, these were nearly equivalent. However, once a user enables the feature gate proposed in this KEP, it might allocate only a subset of the node's resources (e.g., a portion of CPU cores) to the kubelet through extensible mechanisms. This could create ambiguity in the reported metrics.
/cc
> - Ability for kubelet to get node resources (capacity) from the CRI runtime
> - Retain current functionality of cpu, memory and topology managers
> - API that can support dynamic node capacity changes
Explicitly state that the CPU topology info we plan to expose is enough for, and compatible with, Slurm-like HPC workloads.