Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for /sys/devices/system/node and subfiles #542

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Hunter1016
Copy link

No description provided.

huteng.ht added 6 commits May 31, 2022 13:46
also implement functions called by get_cpuset_mems(),
 - cgfsng_get_cpuset_mems
 - readat_cpuset_mems

Signed-off-by: huteng.ht <[email protected]>
Signed-off-by: huteng.ht <[email protected]>
Copy link
Member

@brauner brauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I think this is fine but should we maybe introduce a new command line switch so users can choose whether or not they want this emulation? //Cc @stgraber

@Hunter1016
Copy link
Author

Hunter1016 commented Jun 7, 2022

In general, I think this is fine but should we maybe introduce a new command line switch so users can choose whether or not they want this emulation? //Cc @stgraber

For whether or not want this emulation, like /sys/devices/system/cpu emulation, users can choose not to use by just not mounting it into the container.
Do you mean a parameter like '--enable-cfs'? Different from cpu, which can be controlled by both cpuset.cpus and cpu quota/period, node only lies on cpuset, cpuset.cpus and cpuset.mems, therefore, I can't think of a similar parameter.
THX.

@stgraber
Copy link
Member

stgraber commented Jun 7, 2022

@brauner I can't think of a good reason to disable this one. If someone comes up with a good reason to turn it off, I think I'd prefer we switch to a --disable-emulation=KEYWORD,... type list so one can disable each file/section independently without us having to add a ton of new flags.

@brauner
Copy link
Member

brauner commented Jun 7, 2022

@brauner I can't think of a good reason to disable this one. If someone comes up with a good reason to turn it off, I think I'd prefer we switch to a --disable-emulation=KEYWORD,... type list so one can disable each file/section independently without us having to add a ton of new flags.

I'm just a bit concerned that we're too lax with new features here and might introduce regressions. That's the main driver for me requesting this.

@stgraber
Copy link
Member

stgraber commented Jun 7, 2022

I'm just a bit concerned that we're too lax with new features here and might introduce regressions. That's the main driver for me requesting this.

Well, this isn't going to be in stable-5.0 as it's a new feature, so I'm not too worried about that.

@Hunter1016
Copy link
Author

@brauner @stgraber Based on above discussions, I think disabling this emulation can be easily and flexibly done by just not mount it into the container. Each container has the right to choose.
Conversely, if we introduce a new flag, it becomes a global option that all containers should follow.

@Hunter1016 Hunter1016 requested a review from brauner August 26, 2022 01:37
@mihalicyn
Copy link
Member

mihalicyn commented Mar 26, 2024

Hi @Hunter1016

Really sorry for a long delay with reviewing this. Could you rebase it?

We have just discussed with Stéphane and we would like to merge it and have it in an upcoming LXCFS release.

Kind regards,
Alex

@stgraber stgraber added this to the lxcfs-6.0.0 milestone Mar 26, 2024
@stgraber
Copy link
Member

We'll go with a --enable-cpu-nodes flag for the time being and then replace that with the per-container config stuff a bit later (after 6.0 LTS).

@mihalicyn
Copy link
Member

JFYI: Right now I'm working on rebasing this on top of main branch.

@mihalicyn
Copy link
Member

replaced by #632

@mihalicyn mihalicyn closed this Mar 27, 2024
@mihalicyn mihalicyn reopened this Mar 27, 2024
@mihalicyn
Copy link
Member

This thing is tightly coupled with /sys/devices/system/cpu virtualization which was removed recently in scope of #557

If I fully understand the idea of this patchset is to hide NUMA nodes in accordance with what we have in cpuset in cgroup.

Taking into account the problem described by @tych0 in #557 I'm not sure that it's a good idea in general, because user space application may not expect this.

It can lead to weird configurations like this (manually crafted listing example, it's not from a real system):

$ ls -la /sys/devices/system/node/
total 0
drwxr-xr-x  4 root root    0 Mar 27 08:39 .
drwxr-xr-x 10 root root    0 Mar 27 08:39 ..
<...>
drwxr-xr-x  6 root root    0 Mar 27 08:39 node0
drwxr-xr-x  6 root root    0 Mar 27 08:39 node2
-r--r--r--  1 root root 4096 Mar 27 12:47 online
<..>

Dear @Hunter1016, couldn't you provide a bit more context why this was implemented? Do you have any use-case example for this?

@stgraber stgraber removed this from the lxcfs-6.0.0 milestone Mar 27, 2024
@tych0
Copy link
Member

tych0 commented Mar 27, 2024

I agree; we should only emulate what the kernel does. Anything beyond that will lead to problems.

@Hunter1016
Copy link
Author

This thing is tightly coupled with /sys/devices/system/cpu virtualization which was removed recently in scope of #557

If I fully understand the idea of this patchset is to hide NUMA nodes in accordance with what we have in cpuset in cgroup.

Taking into account the problem described by @tych0 in #557 I'm not sure that it's a good idea in general, because user space application may not expect this.

It can lead to weird configurations like this (manually crafted listing example, it's not from a real system):

$ ls -la /sys/devices/system/node/
total 0
drwxr-xr-x  4 root root    0 Mar 27 08:39 .
drwxr-xr-x 10 root root    0 Mar 27 08:39 ..
<...>
drwxr-xr-x  6 root root    0 Mar 27 08:39 node0
drwxr-xr-x  6 root root    0 Mar 27 08:39 node2
-r--r--r--  1 root root 4096 Mar 27 12:47 online
<..>

Dear @Hunter1016, couldn't you provide a bit more context why this was implemented? Do you have any use-case example for this?

@mihalicyn I'm sorry for a bit late, the most important motivation is running applications depending on libhwloc meets segmentaion fault because it reads /sys/devices/system/node/node%u/cpumap and /sys/devices/system/cpu/online to analyze hardware topology.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants