Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for setting active processor argument #319

Merged
merged 8 commits into from
Mar 29, 2023

Conversation

andybradshaw
Copy link
Contributor

Before this PR

Most of the context exists in #313, but to briefly summarize here, there are certain scenarios where using a very low CPU request can cause the JVM to make some assumptions around how many threads should be used for various tasks. We would like to move the CPU share detection logic into the launcher and pass that information to the JVM, so we have a little bit more control in those situations.

After this PR

==COMMIT_MSG==
Add support for setting the -XX:ActiveProcessorCount argument driven by detecting cgroup cpu.shares and some heuristics around behavior when running with too few processors. This feature is enabled within a new experimental configuration block, with the intent of indicating the stability of the feature.
==COMMIT_MSG==

Possible downsides?

Some initial attempts to implement this used the containerd/cgroups library. Maybe I was doing something silly, but the API was a little bit awkward for our use case (reading cpu.shares), so I opted to parse the relevant files instead. This will probably need some changes if/when we need to add cgroup v2 support.

@andybradshaw andybradshaw self-assigned this Mar 28, 2023
@changelog-app
Copy link

changelog-app bot commented Mar 28, 2023

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

Add support for setting the -XX:ActiveProcessorCount argument driven by detecting cgroup cpu.shares and some heuristics around behavior when running with too few processors. This feature is enabled within a new experimental configuration block, with the intent of indicating the stability of the feature.

Check the box to generate changelog(s)

  • Generate changelog entry

@@ -0,0 +1,148 @@
// Copyright 2016 Palantir Technologies, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the limit of my ability to review golang code ;-)

Suggested change
// Copyright 2016 Palantir Technologies, Inc.
// Copyright 2023 Palantir Technologies, Inc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated... looks like the default godel license configuration handles the "year", so probably shouldn't have been hardcoding 2016.

if err != nil {
return 0, errors.New("unable to convert cpu.shares value to expected type")
}
return uint(math.Max(2.0, math.Floor(float64(cpuShares/1024)))), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly worth a link to #313 in a comment for context on the 2.0 minimum value.

Copy link
Contributor

@carterkozak carterkozak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, I really appreciate it @andybradshaw :-)

@carterkozak carterkozak requested a review from CRogers March 29, 2023 12:50
fs: os.DirFS("/"),
}

type CGroupV1ProcessorCounter struct {
Copy link
Contributor

@CRogers CRogers Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should support cgroups v2 from the get go? Devs use docker-for-mac for dev-env tests, and docker-for-mac uses cgroups v2. It might be a little odd for it to behave differently on CI vs locally. The difference between the two is not that great, I had some code here which shows how cgroupsv2 works for memory at least: CRogers/gradle@3fb0f7d.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, in dev-env we restrict the active processors to 1 via a JVM arg by default, so I guess this doesn't really matter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range of valid cpu.weight values is quite a bit lower (1-10000) and I'm not entirely sure how this gets set/mapped in some situations. For example, it looks like crun attempts to keep the same "shares" API, and does a conversion behind the scenes.

Looking through the kubernetes interactions now to see how requests end up getting translated, but want to know what we should treat as the value for "1 core"... should we use the cgroup v2 default weight (100)? should we use the translated cgroup v1 cpu shares weight (39)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be consistency (avoiding special case values). Cgroups are a tool applied to a broad set of problems, where share/weight values are meant (in the context of pure cgroups) to provide relative weights as opposed to mapping to physical cores.
However, our use case targets container environments more specifically, where we have the additional context that the weights are based on millicore values. If container mode is used in go-java-launcher, we should assume a k8s (or similar) environment.

return CGroupV1ProcessorCounter{fs: filesystem}
}

func (c CGroupV1ProcessorCounter) ProcessorCount() (uint, error) {
Copy link
Contributor

@CRogers CRogers Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was doing this for memory in gradle, there's a case where cgroups is installed but there is no limit for the current container/environment. An example of this is Circle machine executors (I wonder if PCloud is like this too?). Under cgroupsv1 you get a really big value (2^63) which means "no limit" but you can end up thinking there's more memory available than there is. Is this a problem as well for cpu shares/active processors implementation?

Copy link
Contributor

@CRogers CRogers Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I want to avoid here is thinking we have 2^63 cpu shares, so tell java we have 2^63/1024 = 2^54 processors then witchcraft tries to make a thread pool with minimum 8 * 2^54 = 2^57 threads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As best as I can tell, groups get 1024 as their cpu shares by default, so if someone has a very large value here, it would've had to be set manually. Maybe a safeguard here would be to detect the hosts number of processors? So something like:

max(2, min(runtime.NumCPUs(), floor(cpu.shares / 1024)))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that cpu shares values shouldn't default to an extraordinarily high value (the jvm cgroups v1 implementation only appears to special case the 1024 default). Using runtime.NumCPU as a minimum sounds very reasonable (even taking precedence over the 2-processor minimum we assume based on shares, if we have confidence that the system has a single physical core).

The NumCPU implementation doesn't so anything exciting with cgroups, although I've only reviewed the linux implementation, the docs suggest that the intent is to describe the current process where GOMAXPROCS can be configured to hint go internals: https://github.com/golang/go/blob/f57f02fcd519a86683c47a444e0e24a086fea8e0/src/runtime/os_linux.go#L97-L122

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/uber-go/automaxprocs provides some similar functionality on the golang side for setting GOMAXPROCS

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw I checked on an internal machine Circle executor and the cpu shares does appear to be 1024 and not some huge number (where as the memory limit is 2^63).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andybradshaw I'm happy merging this now, and handling cgroupsv2 in a separate PR. Whichever way you prefer.

@CRogers
Copy link
Contributor

CRogers commented Mar 29, 2023

👍

@andybradshaw andybradshaw merged commit c9a6c45 into develop Mar 29, 2023
@andybradshaw andybradshaw deleted the ab/active-processor branch March 29, 2023 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants