[KEP-5246] Migrate to systemd's cgroup v1 CPU shares to v2 CPU weight formula #5247

Open: iholder101 wants to merge 2 commits into master from kep/systemd_cpu_cgroup_conversion

Contributor

@iholder101 iholder101 commented Apr 16, 2025

Signed-off-by: Itamar Holder <[email protected]>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Apr 16, 2025
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 16, 2025
@iholder101 iholder101 force-pushed the kep/systemd_cpu_cgroup_conversion branch from 9285303 to 12f5084 Compare April 16, 2025 09:32
Signed-off-by: Itamar Holder <[email protected]>
@iholder101 iholder101 force-pushed the kep/systemd_cpu_cgroup_conversion branch from 12f5084 to 8020ae0 Compare April 16, 2025 09:37
@iholder101
Contributor Author

FYI @vladikr

@pacoxu
Member

pacoxu commented Apr 27, 2025

/cc @giuseppe
for #2254
/assign @yujuhong @dchen1107 @derekwaynecarr

@k8s-ci-robot k8s-ci-robot requested a review from giuseppe April 27, 2025 03:38
Member

@giuseppe giuseppe left a comment

LGTM

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: giuseppe, iholder101
Once this PR has been reviewed and has the lgtm label, please assign dchen1107, soltysh for approval. For more information see the Code Review Process.


- A significant amount of the work would need to land in other layers, mainly OCI runtimes and the CRI.
- We'll probably need a CRI configuration to coordinate the CRI and OCI runtime implementations,
and to ensure the change lands in the same version across them, as suggested
Contributor

Should we expand this in the design section to understand what needs to be changed and how the rollout will happen?


That being said, the formula is entirely an implementation detail, and users are most probably not relying on it
to produce specific concrete values. In any case, we should ensure that the new formula is well documented
and that the change is properly communicated to the users.
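
For readers following the thread, the difference between a linear and a proportional shares-to-weight mapping can be sketched as below. This is a minimal illustration, not the kubelet's or any OCI runtime's code: both function names are hypothetical, the linear variant assumes the conversion currently used for cgroup v2 maps [2, 262144] onto [1, 10000], and the proportional variant assumes systemd-style scaling in which the v1 default of 1024 shares corresponds to the v2 default weight of 100.

```python
# Illustrative sketch only. Function names are hypothetical and the two
# formulas are assumptions about the mappings discussed in this thread.

V1_MIN_SHARES, V1_MAX_SHARES = 2, 262144   # cgroup v1 cpu.shares range
V2_MIN_WEIGHT, V2_MAX_WEIGHT = 1, 10000    # cgroup v2 cpu.weight range


def linear_shares_to_weight(shares: int) -> int:
    """Map the v1 range linearly onto the v2 range (assumed current behavior)."""
    shares = max(V1_MIN_SHARES, min(shares, V1_MAX_SHARES))
    return 1 + ((shares - 2) * 9999) // 262142


def proportional_shares_to_weight(shares: int) -> int:
    """Scale so that the v1 default (1024) lands on the v2 default (100)."""
    weight = shares * 100 // 1024
    return max(V2_MIN_WEIGHT, min(weight, V2_MAX_WEIGHT))


if __name__ == "__main__":
    for shares in (2, 102, 1024, 262144):
        print(shares,
              linear_shares_to_weight(shares),
              proportional_shares_to_weight(shares))
```

Under these assumptions, 1024 shares become weight 39 with the linear mapping and weight 100 with the proportional one, which is the kind of shift in concrete values that would need to be documented and communicated.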
Contributor

Will there be a way for users to switch back to the previous behavior in case the change causes unexpected issues?

Member

I don't think this mechanism will be implemented by the OCI runtimes; the entire mechanism is a workaround for using cgroup v1 settings in a cgroup v2 environment.

If someone wants full control of the cgroup v2 values, then they must use the native cgroup v2 unified map to pass the correct value down the stack.

Contributor

I mainly asked because simply switching the implementation may have an unexpected impact on user workloads. Having an option to preserve the original behavior can be important.

@cartermckinnon

/cc

6 participants