-
Notifications
You must be signed in to change notification settings - Fork 349
Refactor the way we handle Hook Creation #1090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
31d7b72
to
45a6c02
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the hook creation logic to use a centralized HookCreator interface in place of the direct use of nvidiaCDIHookPath.
- Centralizes hook creation by introducing the HookCreator interface and updating its implementation.
- Updates multiple discoverers, constructors, and tests across the codebase to use the new abstraction.
- Simplifies and consolidates redundant hook creation functions.
Reviewed Changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
pkg/nvcdi/driver-nvml.go | Replaces nvidiaCDIHookPath with hookCreator for hook creation. |
pkg/nvcdi/common-nvml.go | Updates NewGraphicsMountsDiscoverer to use hookCreator. |
internal/platform-support/tegra/*.go | Adjusts Tegra options and symlinks to leverage hookCreator. |
internal/platform-support/dgpu/*.go | Replaces direct hook path usage with hookCreator in DGPU discoverers. |
internal/modifier/*.go | Updates modifiers to instantiate hookCreator from configuration. |
internal/discover/*.go | Centralizes hook creation in discoverers and adds error-checking. |
internal/discover/hooks.go | Refactors Hook methods with pointer receivers and defines HookCreator. |
internal/discover/graphics_test.go | Updates tests to follow the new HookCreator interface. |
Comments suppressed due to low confidence (1)
internal/discover/hooks.go:39
- [nitpick] Ensure that consistent nil-checks are applied in all pointer receiver methods (Devices, Mounts, Hooks) to avoid unexpected behavior when a nil Hook is encountered.
func (h *Hook) Hooks() ([]Hook, error) { if h == nil { return nil, nil } }
81fd6ff
to
7724f11
Compare
.gitignore
Outdated
@@ -12,3 +12,6 @@ | |||
/shared-* | |||
/release-* | |||
/bin | |||
/nvidia-cdi-hook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could we replace the nvidia-*
binaries with a /nvidia-*
rule?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this line as well.
internal/modifier/graphics.go
Outdated
@@ -35,12 +35,12 @@ func NewGraphicsModifier(logger logger.Interface, cfg *config.Config, containerI | |||
return nil, nil | |||
} | |||
|
|||
nvidiaCDIHookPath := cfg.NVIDIACTKConfig.Path | |||
hookCreator := discover.NewHookCreator(cfg.NVIDIAContainerCLIConfig.Path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to also pass this in as an argument to the function instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is ok to leave it as is, as the value of NVIDIACTKConfig.Path
of o.nvidiaCDIHookPath
it's taken when the pkg's are initialized from the CMD side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about how this would look from a point of view of a unified config for disabling hooks. At present we contruct this hook creator in multiple places and we would want to ensure that we do that ONCE per code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a unified config for disabling hooks could be passed as an optional argument when creating the HookCreator
, meaning we add a new entry to the internal/config/config
to allow users defining disabled hooks even on the config file. and then we have a WithDisabledHooks
option that takes the config.DisabledHooks
what ever type we decide there, as input.
That's what I think would be a nice long-term goal for this discussion. But not in scope for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not what I mean in the context of this PR. Here, we are constructing a HookCreator
in NewFeatureGatedModifier
as well as in NewGraphicsModifier
. (We also separately construct this in the CDI and CSV modifiers and the latter definitely needs to be udpated for consistency, but that is definitely out of scope).
My suggestion is to create a HookModifier
once in the newSpecModifier
function and the pass it down as required. When we DO add config options, we could then make the change in one place and add the relevant args to the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get it now, will work on the change
pkg/nvcdi/driver-nvml.go
Outdated
cudaCompatLibHookDiscoverer := discover.NewCUDACompatHookDiscoverer(l.logger, l.nvidiaCDIHookPath, l.driver) | ||
discoverers = append(discoverers, cudaCompatLibHookDiscoverer) | ||
} | ||
// TODO: The following should use the version directly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: if we're just refactoring the hook creation, should we not still check HookIsSupported
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, since Create
will check if the hook is disabled as first thing, I think we could deprecate, HookIsSupported
func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm not saying we should keep the function. I'm saying that for the first refactoring step, we ONLY add a HookCreator
and then as a follow-up we pull in the ability to disable hooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update this diff so that the only modification is the chane from l.nvidiaCDIHookPath
-> l.hookCreator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
7724f11
to
7c530e1
Compare
internal/discover/hooks.go
Outdated
return nil | ||
} | ||
|
||
links := make([]string, 0, len(args)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the point of defining the capacity here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
internal/discover/hooks.go
Outdated
) | ||
type HookCreator interface { | ||
Create(string, ...string) *Hook | ||
DisableHook(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like having DisableHook
as a function here. I know we discussed it, but I think constructing the proper may beforehand and not having this change state is more easy to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean that the list of disabled hooks will be passed down during NewHookCreator
and not having a method to allow adding more hooks after that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That is what I mean. We may need to move the contrstructor around in lib.go
a bit, but I think this is a better long-term solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we can have that as a follow-up, os as part of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
but I think constructing the proper may beforehand and not having this change state is more easy to reason about.
Will be addressed in a followup PR
pkg/nvcdi/lib.go
Outdated
@@ -145,7 +150,7 @@ func New(opts ...Option) (Interface, error) { | |||
l.vendor = "management.nvidia.com" | |||
} | |||
// Management containers in general do not require CUDA Forward compatibility. | |||
l.disabledHooks[HookEnableCudaCompat] = true | |||
l.hookCreator.DisableHook(string(HookEnableCudaCompat)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We either need to keep this as is, or properly move the handling of disabled hooks to the discover
package. I would recommend handling skipping here in the first commit / PR and then adding proper disabled functionality in a follow-up commit or PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
7c530e1
to
6704cd9
Compare
6704cd9
to
96c57fd
Compare
"enable-cuda-compat", | ||
args..., | ||
) | ||
return hookCreator.Create("enable-cuda-compat", args...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for a follow-up: Create
should be modified to accept a HookName
instead of arbitrary strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but we would need to define the HookName
in a new Package, as pkg/nvcdi
calls internal/discover
, so we can not reference pkg/nvcdi
from internal/discover
without running into an import loop.
I think...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now have a better idea for this after our conversation, this will be addressed in a followup PR
internal/discover/symlinks.go
Outdated
@@ -73,8 +73,12 @@ func (d *additionalSymlinks) Hooks() ([]Hook, error) { | |||
return hooks, nil | |||
} | |||
|
|||
hook := CreateCreateSymlinkHook(d.nvidiaCDIHookPath, links).(Hook) | |||
return append(hooks, hook), nil | |||
hook, err := d.hookCreator.Create("create-symlinks", links...).Hooks() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the return value is a slice, so we may want to rename the variable createSymlinkHooks
:
hook, err := d.hookCreator.Create("create-symlinks", links...).Hooks() | |
createSymlinkHooks, err := d.hookCreator.Create("create-symlinks", links...).Hooks() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
internal/discover/symlinks_test.go
Outdated
@@ -308,10 +308,12 @@ func TestWithWithDriverDotSoSymlinks(t *testing.T) { | |||
|
|||
for _, tc := range testCases { | |||
t.Run(tc.description, func(t *testing.T) { | |||
hookCreator := NewHookCreator("/path/to/nvidia-cdi-hook") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be defined outside of the for
loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/nvcdi/driver-nvml.go
Outdated
// TODO: The following should use the version directly. | ||
|
||
// replace l.nvidiaCDIHookPath for the l.cdiHook to use the inferface | ||
if !l.disabledHooks[HookEnableCudaCompat] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !l.disabledHooks[HookEnableCudaCompat] { | |
if l.HookIsSupported(HookEnableCudaCompat) { |
It should be clear from the change set that this behaviour is NOT affected by this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/nvcdi/driver-nvml.go
Outdated
cudaCompatLibHookDiscoverer := discover.NewCUDACompatHookDiscoverer(l.logger, l.nvidiaCDIHookPath, l.driver) | ||
// TODO: The following should use the version directly. | ||
|
||
// replace l.nvidiaCDIHookPath for the l.cdiHook to use the inferface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// replace l.nvidiaCDIHookPath for the l.cdiHook to use the inferface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/nvcdi/lib.go
Outdated
} | ||
|
||
// New creates a new nvcdi library | ||
func New(opts ...Option) (Interface, error) { | ||
l := &nvcdilib{ | ||
disabledHooks: make(disabledHooks), | ||
disabledHooks: make(map[HookName]bool), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disabledHooks: make(map[HookName]bool), | |
disabledHooks: make(disabledHooks), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
96c57fd
to
8458290
Compare
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
8458290
to
e7c25f4
Compare
This pull request refactors the codebase to introduce a new
HookCreator
interface, replacing the previous direct use ofnvidiaCDIHookPath
for creating hooks. This change improves modularity and makes the code more extensible by centralizing hook creation logic. The changes span multiple files and functions, focusing on updating constructors, methods, and tests to use the newHookCreator
abstraction.Refactoring to Use
HookCreator
:Centralized Hook Creation:
HookCreator
interface and its implementation (CDIHook
) to encapsulate hook creation logic. This replaces the direct use ofnvidiaCDIHookPath
for creating hooks. (internal/discover/hooks.go
, internal/discover/hooks.goL28-R102)createLDCacheUpdateHook
,Create("create-symlinks")
) to useHookCreator
. (internal/discover/hooks.go
, internal/discover/hooks.goL28-R102)Function and Constructor Updates:
nvidiaCDIHookPath
parameter withHookCreator
in constructors and functions such asNewCUDACompatHookDiscoverer
,NewGraphicsMountsDiscoverer
, andNewLDCacheUpdateHook
. (internal/discover/compat_libs.go
, [1];internal/discover/graphics.go
, [2] [3]hookCreator
instead ofnvidiaCDIHookPath
. (internal/discover/graphics.go
, [1] [2]Code Simplification and Cleanup:
Removed Redundant Hook Creation Logic:
CreateNvidiaCDIHook
andCreateCreateSymlinkHook
, consolidating their behavior into theHookCreator
implementation. (internal/discover/hooks.go
, internal/discover/hooks.goL28-R102)Simplified Hook Handling:
Hooks()
methods across various discoverers to delegate hook creation toHookCreator
. (internal/discover/graphics.go
, [1];internal/discover/ldconfig.go
, [2]Tests and Validation:
HookCreator
abstraction, ensuring compatibility with the refactored code. (internal/discover/graphics_test.go
, [1];internal/discover/ldconfig_test.go
, [2]These changes enhance the maintainability and scalability of the codebase by abstracting hook creation logic, reducing redundancy, and improving testability.