Feature Request: Allow Configuration of TreeWidth #6366

Open
isc-lee opened this issue Jul 25, 2024 · 2 comments

Comments

@isc-lee

isc-lee commented Jul 25, 2024

From a cursory read-through of the codebase, my understanding is that TreeWidth is on the CustomSlurmSettings deny list because of clusters deployed with the ec2hostnames DNS configuration. In that case, TreeWidth must be set to MAX_VALUE so that slurmctld keeps communicating directly with each node (as opposed to utilizing fanout).

As a result, Managed/Domain DNS deployments have TreeWidth forcibly set to 30. On the one hand, this value seems too low for small clusters that may want to avoid faults introduced by relying on dynamic node fanout (which has caused issues as recently as pcluster 3.9.0-3.9.2 due to cloud_dns bug(s) in Slurm 23.11.4-23.11.7). On the other hand, it seems too high for clusters with fewer than 4000 nodes that want to use hierarchical messaging efficiently (slurmctld ends up spinning up more threads and using more resources than necessary).
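To make the trade-off concrete, here is a rough sketch of how TreeWidth relates to the number of forwarding hops. It assumes a simplified model in which the controller contacts at most TreeWidth nodes and each of those forwards to at most TreeWidth more; real Slurm behavior may differ in detail. The function names are hypothetical, and `suggested_tree_width` encodes a rule of thumb as I recall it from the slurm.conf man page (square root of the node count up to 2500 nodes, cube root above that), so please verify it against the documentation for your Slurm version.

```python
def fanout_levels(node_count: int, tree_width: int) -> int:
    """Forwarding levels needed to reach node_count nodes (simplified model).

    Level 1 reaches tree_width nodes, level 2 reaches tree_width**2 more, etc.
    One level means the controller talks to every node directly.
    """
    if node_count < 0 or tree_width < 1:
        raise ValueError("node_count must be >= 0 and tree_width >= 1")
    levels = 0
    reached = 0
    layer = 1  # the controller itself
    while reached < node_count:
        layer *= tree_width  # nodes contacted at this level
        reached += layer
        levels += 1
    return levels


def smallest_root(n: int, k: int) -> int:
    """Smallest integer w with w**k >= n (avoids float rounding)."""
    w = 1
    while w ** k < n:
        w += 1
    return w


def suggested_tree_width(node_count: int) -> int:
    """ASSUMPTION: sqrt(N) for N <= 2500, cbrt(N) otherwise (recalled heuristic)."""
    k = 2 if node_count <= 2500 else 3
    return smallest_root(node_count, k)


if __name__ == "__main__":
    for nodes in (16, 500, 4000):
        print(
            f"{nodes} nodes: TreeWidth=30 -> {fanout_levels(nodes, 30)} level(s); "
            f"heuristic TreeWidth={suggested_tree_width(nodes)}"
        )
```

Under this model, any cluster larger than 30 nodes already depends on node-to-node forwarding with TreeWidth at 30, while a TreeWidth at or above the node count keeps every message direct.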

Allowing users to configure TreeWidth (for Managed/Domain DNS configured clusters) would let them balance fanout against their cluster's fault-tolerance and scalability needs.

(Note: TreeWidth can currently be set by suppressing the CustomSlurmSettingsValidator, but turning that validator off carries a fair amount of risk.)

@joehellmersNOAA

+1 on this one.

@hanwen-pcluste
Contributor

Sorry for the late reply.

We've recorded this feature request and will discuss it within the team.


3 participants