From 19dfa9453166dca50a9154546aab89ee3a44494a Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 31 Jul 2024 17:29:51 +0200 Subject: [PATCH 01/15] Initial commit 95% done --- keps/1618-optional-gc-of-workloads/README.md | 391 +++++++++++++++++++ keps/1618-optional-gc-of-workloads/kep.yaml | 25 ++ 2 files changed, 416 insertions(+) create mode 100644 keps/1618-optional-gc-of-workloads/README.md create mode 100644 keps/1618-optional-gc-of-workloads/kep.yaml diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md new file mode 100644 index 0000000000..9fd9c48e84 --- /dev/null +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -0,0 +1,391 @@ +# KEP-NNNN: Your short, descriptive title + + + + + + +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit Tests](#unit-tests) + - [Integration tests](#integration-tests) + - [Graduation Criteria](#graduation-criteria) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) + + +## Summary + + + +This KEP proposes a mechanism of optional garbage collection of finished Workloads. + +## Motivation + + + +Currently, kueue doesn't delete its own K8s objects leaving it to K8s' GC to manage retention of +objects created by kueue such as Pods, Jobs, Workloads etc. We don't set any expiration on them so +by default they are kept forever. Based on the usage pattern including +amount of jobs created, their complexity and frequency over time those objects can accumulate which leads +to unnecessary usage of etcd storage as well as gradual increase of kueue's memory footprint. Some of these objects, +like finished Workloads don't contain any useful information which could be used for additional +purposes like debugging. That's why based on user's feedback in [#1618](https://github.com/kubernetes-sigs/kueue/issues/1618) +or [#1789](https://github.com/kubernetes-sigs/kueue/issues/1789) a mechanism of garabge collection +of finished Workloads is being proposed. + +With the mechanism in place deleting finished Workloads has several benefits: +- **deletion of superfluous information** - finished Workloads don't store any useful runtime related +information (that's being stored by Jobs), they aren't useful for debugging, possibly could be used +for auditing purposes +- **decreasing kueue memory footprint** - finished Workloads are being loaded during kueue's +initialization and kept in memory. +- **freeing up etcd's storage** - finished Workloads are stored in etcd's memory until K8s builtin +GC mechanism collects it but by default Custom Resources are being stored indefinitely. 
+ +### Goals + + + +- support deletion of finished Workload objects +- introduction of configuration of global retention policies for kueue authored +K8s objects starting with finished Workloads +- maintain backward compatibility (the feature needs to be explicitly enabled and configured) + +### Non-Goals + + + +- support deletion/expiration of any kueue authored K8s object +- configuring retention policies (like `.spec.ttlSecondsAfterFinished`) attached per +K8s object + +## Proposal + + + +Add a new section to the configuration called `objectRetentionPolicies` +which will include a retention policy for finished Workloads under an option +called `FinishedWorkloadRetention`. The section could be extended with +options for other kueue authored objects. + + +### User Stories + + + +Based on how the retention policy is configured for finished Workloads there are +multiple user stories being proposed below. + +#### Story 0 - Retention policy for finished Workloads is misconfigured + +If the finished Workloads policy is not configured correctly (provided value is not a valid +time.Duration) kueue will crash during config parsing. + +#### Story 1 - Retention policy for finished Workloads is not configured (default) + +**This story ensures backward compatibility.** +If the object retention policy section is not configured or finished Workloads retention +is either not configured or set to null Workload objects won't be deleted by kueue. + +#### Story 2 - Retention policy for finished Workloads is configured and set to a value > 0s + +If the object retention policy is configured to a value > 0s finished Workload retention policy +will be evaluated against every completed Workload. The flow is follow: +1. During kueue's initial run of the reconciliation loop all previously finished Workloads +will be evaluated against the Workload retention policy and they'll be either deleted immediately if they +are already expired or they'll be requeued for reconciliation after their +retention period has expired. +2. During following runs of the reconciliation loop each Workload will be evaluated using +the same approach as in 1. but during the subsequent run of the reconciliation loop +after they were declared finished and based on the evaluation result against the retention policy +they'll be either requeued for later reconciliation or deleted. + +#### Story 3 - Retention policy for finished Workloads is configured and set to 0s + +When the object retention policy is set to 0s finished workloads will be deleted +during the first run of the reconciliation loop after they were declared finished. + +### Notes/Constraints/Caveats (Optional) + + + +- Initially other naming was considered for config keys namely instead of Objects +the word Resources was used but based on the feedback from @alculquicondor it was +changed because of Resources bearing too many meanings in the context of kueue +(K8s resources, physical resources, etc.). +- A different implementation was proposed initially. The deletion logic wasn't supposed +to be part of the Reconcile function but rather it was supposed to have its own +handler/watcher pair. During implementation, it was challenging to implement it this way +without substantial changes to the code. Also creating a watcher for completed Workloads +proved to be very difficult and complex. +- The deletion mechanism assumes that finished Workload objects don't have +any finalizers attached to them. 
After an investigation from @mimowo the [exact place in +code](https://github.com/kubernetes-sigs/kueue/blob/a128ae3354a56670ffa58695fef1eca270fe3a47/pkg/controller/jobframework/reconciler.go#L332-L340) +was found which removes the resource in use finalizer when the Workload state +transitions to finished. If that behavior ever changes or if there's an external +source of a finalizer being attached to the Workload it will be marked for deletion +by the K8s client but it won't actually be deleted until the finalizer is removed. + +### Risks and Mitigations + + + +- In clusters with a big number of existing finished Workloads (1000s, 10000s, ...) +it may take some time to delete all previously finished Workloads that qualify +for deletion. Since the reconciliation loop is effectively synchronous it may +impact reconciling new jobs. From the user's perspective it may seem like the kueue's +initialization time is really long for that initial run until all expired objects +are deleted. Unfortunately there's no good way of mitigating this +without complicating the code. A potential improvement would be to have a special +internal queue that deals with deletion outside of the reconciliation loop or +deleting expired objects in batches so that after a batch of deletions is +completed all expired objects that are left would be skipped over until the next run +of the loop but it'd increase its against the synchronous nature of the reconciliation loop. +The author proposes to include it as part of the documentation + +## Design Details + + + +### Test Plan + + + +[X] I understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +[//]: # (##### Prerequisite testing updates) + +[//]: # () +[//]: # () + +#### Unit Tests + + + + + +The following new unit tests or modifications to the existing ones were added +to test new functionality: + +- `apis/config/v1beta1/defaults_test.go` + - updated existing tests with `defaultObjectRetentionPolicies` which sets + `FinishedWorkloadRetention` to nil which ensures that the default + scenario maintains backward compatibility and no retention settings are set. + - new unit test: `set object retention policy for finished workloads` which + checks if provided can configuration overrides default values. +- `pkg/config/config_test.go` + - updated existing tests with `defaultObjectRetentionPolicies` set to nil + which tests if for existing tests the default behavior maintains backward + compatibility and results in no retention settings. + - new unit test: `objectRetentionPolicies config` which sets some example + retention policy and then checks if it can be retrieved. 
+- `pkg/controller/core/workload_controller_test.go` + - added following new tests: + - `shouldn't delete the workload because, object retention not configured` + - `shouldn't try to delete the workload (no event emitted) because it is already being deleted by kubernetes, object retention configured` + - `shouldn't try to delete the workload because the retention period hasn't elapsed yet, object retention configured` + - `should delete the workload because the retention period has elapsed, object retention configured` + - `should delete the workload immediately after it completed because the retention period is set to 0s` + +#### Integration tests + + + +- TBA + + +[//]: # (### Graduation Criteria) + +[//]: # () +[//]: # () + +## Implementation History + + + +- Initial PR created for the feature pre-KEP [#2686](https://github.com/kubernetes-sigs/kueue/pull/2686). + +## Drawbacks + + + +- Deleting things is tricky and GC behavior can often be surprising for users. +Even though the mechanism proposed in this KEP is really simple and needs to be +explicitly enabled, it is a global mechanism applied at the controller level. +The author made sure to introduce some observability in the controller itself +(logging and emitting a deletion event) but K8s users and developers used to +tracking K8s objects' lifecycle using per object attached policies might be +surprised with this behavior. + + +## Alternatives + + + +- Leave the current state of things as is, +- Implement attaching retention policies attached per object like (like `.spec.ttlSecondsAfterFinished`) +instead of having global retention policies configured at kueue level +but it'd complicate the implementation and add complexity to the feature itself +e.g. during reconfiguration of an existing retention policy all existing K8s objects +would need to be updated with the new policy. diff --git a/keps/1618-optional-gc-of-workloads/kep.yaml b/keps/1618-optional-gc-of-workloads/kep.yaml new file mode 100644 index 0000000000..b6bfc06120 --- /dev/null +++ b/keps/1618-optional-gc-of-workloads/kep.yaml @@ -0,0 +1,25 @@ +title: Optional garbage collection of finished Workloads +kep-number: 1618 +authors: + - "@mwysokin" +status: implementable +creation-date: 2024-07-31 +reviewers: + - "@mimowo" + - "@tenzen-y" + - "@trasc" +approvers: + - "@mimowo" + - "@tenzen-y" + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v0.9" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v0.9" From 334b02070e0ec34ca06767ad1613f8c650fd68dc Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Thu, 1 Aug 2024 13:32:12 +0200 Subject: [PATCH 02/15] KEP ready to be reviewed --- keps/1618-optional-gc-of-workloads/README.md | 235 +++++++++++-------- 1 file changed, 142 insertions(+), 93 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index 9fd9c48e84..b57e3e06e9 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -1,4 +1,4 @@ -# KEP-NNNN: Your short, descriptive title +# KEP-1618: Optional garbage collection of finished Workloads -This KEP proposes a mechanism of optional garbage collection of finished Workloads. 
+This KEP proposes a mechanism for garbage collection of finished Workloads in kueue. +Currently, kueue does not delete its own Kubernetes objects, leading to potential accumulation +and unnecessary resource consumption. The proposed solution involves adding a new +API section called `objectRetentionPolicies` to allow users to specify retention +policies for finished Workloads. This can be set to a specific duration or +disabled entirely to maintain backward compatibility. + +Benefits of implementing this KEP include the deletion of superfluous information, +decreased memory footprint for kueue, and freeing up etcd storage. The proposed changes +are designed to be explicitly enabled and configured by users, ensuring backward compatibility. +While the primary focus is on finished Workloads, the API section can be extended to include +other kueue-authored objects in the future. The implementation involves incorporating the +deletion logic into the reconciliation loop and includes considerations for +potential risks, such as handling a large number of existing finished Workloads. +Overall, this KEP aims to enhance kueue's resource management and provide users with +more control over object retention. ## Motivation @@ -72,24 +87,24 @@ demonstrate the interest in a KEP within the wider Kubernetes community. [experience reports]: https://github.com/golang/go/wiki/ExperienceReports --> -Currently, kueue doesn't delete its own K8s objects leaving it to K8s' GC to manage retention of -objects created by kueue such as Pods, Jobs, Workloads etc. We don't set any expiration on them so -by default they are kept forever. Based on the usage pattern including -amount of jobs created, their complexity and frequency over time those objects can accumulate which leads -to unnecessary usage of etcd storage as well as gradual increase of kueue's memory footprint. Some of these objects, -like finished Workloads don't contain any useful information which could be used for additional -purposes like debugging. That's why based on user's feedback in [#1618](https://github.com/kubernetes-sigs/kueue/issues/1618) -or [#1789](https://github.com/kubernetes-sigs/kueue/issues/1789) a mechanism of garabge collection -of finished Workloads is being proposed. - -With the mechanism in place deleting finished Workloads has several benefits: -- **deletion of superfluous information** - finished Workloads don't store any useful runtime related -information (that's being stored by Jobs), they aren't useful for debugging, possibly could be used -for auditing purposes -- **decreasing kueue memory footprint** - finished Workloads are being loaded during kueue's -initialization and kept in memory. -- **freeing up etcd's storage** - finished Workloads are stored in etcd's memory until K8s builtin -GC mechanism collects it but by default Custom Resources are being stored indefinitely. +Currently, kueue does not delete its own Kubernetes objects, leaving it to Kubernetes' +garbage collector (GC) to manage the retention of objects created by kueue, such as Pods, Jobs, and Workloads. +We do not set any expiration on these objects, so by default, they are kept indefinitely. Based on usage patterns, +including the amount of jobs created, their complexity, and frequency over time, these objects can accumulate, +leading to unnecessary use of etcd storage and a gradual increase in kueue's memory footprint. 
Some of these objects, +like finished Workloads, do not contain any useful information that could be used for additional purposes, such as debugging. +That's why, based on user feedback in [#1618](https://github.com/kubernetes-sigs/kueue/issues/1618) +or [#1789](https://github.com/kubernetes-sigs/kueue/issues/1789), a mechanism for garbage collection of finished Workloads +is being proposed. + +With this mechanism in place, deleting finished Workloads has several benefits: +- **Deletion of superfluous information**: Finished Workloads do not store any useful runtime-related information +(which is stored by Jobs), nor are they useful for debugging. They could possibly be useful for auditing purposes, +but this is not a primary concern. +- **Decreasing kueue memory footprint**: Finished Workloads are loaded during kueue's initialization and kept in memory, +consuming resources unnecessarily. +- **Freeing up etcd storage**: Finished Workloads are stored in etcd memory until Kubernetes' built-in GC mechanism +collects them, but by default, Custom Resources are stored indefinitely. ### Goals @@ -98,10 +113,10 @@ List the specific goals of the KEP. What is it trying to achieve? How will we know that this has succeeded? --> -- support deletion of finished Workload objects -- introduction of configuration of global retention policies for kueue authored -K8s objects starting with finished Workloads -- maintain backward compatibility (the feature needs to be explicitly enabled and configured) +- Support the deletion of finished Workload objects. +- Introduce configuration of global retention policies for kueue-authored Kubernetes objects, +starting with finished Workloads. +- Maintain backward compatibility (the feature must be explicitly enabled and configured). ### Non-Goals @@ -110,9 +125,9 @@ What is out of scope for this KEP? Listing non-goals helps to focus discussion and make progress. --> -- support deletion/expiration of any kueue authored K8s object -- configuring retention policies (like `.spec.ttlSecondsAfterFinished`) attached per -K8s object +- Support the deletion/expiration of any kueue-authored Kubernetes object other than finished Workloads. +- Configure retention policies (like `.spec.ttlSecondsAfterFinished`) that are attached to +individual Kubernetes objects. ## Proposal @@ -125,10 +140,10 @@ The "Design Details" section below is for the real nitty-gritty. --> -Add a new section to the configuration called `objectRetentionPolicies` -which will include a retention policy for finished Workloads under an option -called `FinishedWorkloadRetention`. The section could be extended with -options for other kueue authored objects. +Add a new API called `objectRetentionPolicies`, +which will enable specifying a retention policy for finished Workloads under an option +called `FinishedWorkloadRetention`. This API section could be extended in the future +with options for other kueue-authored objects. ### User Stories @@ -140,39 +155,46 @@ the system. The goal here is to make this feel real for users without getting bogged down. --> -Based on how the retention policy is configured for finished Workloads there are -multiple user stories being proposed below. +Based on how the retention policy is configured for finished Workloads, +several user stories are proposed below. #### Story 0 - Retention policy for finished Workloads is misconfigured -If the finished Workloads policy is not configured correctly (provided value is not a valid -time.Duration) kueue will crash during config parsing. 
+If the finished Workloads policy is not configured correctly +(the provided value is not a valid time.Duration), kueue will fail to start +during configuration parsing. #### Story 1 - Retention policy for finished Workloads is not configured (default) **This story ensures backward compatibility.** -If the object retention policy section is not configured or finished Workloads retention -is either not configured or set to null Workload objects won't be deleted by kueue. + +If the object retention policy section is not configured, +or finished Workloads retention is either not configured or set to `null`, +Workload objects will not be deleted by kueue. #### Story 2 - Retention policy for finished Workloads is configured and set to a value > 0s -If the object retention policy is configured to a value > 0s finished Workload retention policy -will be evaluated against every completed Workload. The flow is follow: -1. During kueue's initial run of the reconciliation loop all previously finished Workloads -will be evaluated against the Workload retention policy and they'll be either deleted immediately if they -are already expired or they'll be requeued for reconciliation after their -retention period has expired. -2. During following runs of the reconciliation loop each Workload will be evaluated using -the same approach as in 1. but during the subsequent run of the reconciliation loop -after they were declared finished and based on the evaluation result against the retention policy -they'll be either requeued for later reconciliation or deleted. +If the object retention policy is configured to a value greater than 0 seconds, +the finished Workload retention policy will be evaluated against every +completed Workload. The flow is as follows: + +1. During kueue's initial reconciliation loop, all previously finished Workloads +will be evaluated against the Workload retention policy. They will either be +deleted immediately if they are already expired or requeued for reconciliation +after their retention period has expired. + +2. During subsequent reconciliation loops, each Workload will be evaluated +using the same approach as in step 1. However, the evaluation will occur +during the next reconciliation loop after the Workload was declared finished. +Based on the evaluation result against the retention policy, the Workload will +either be requeued for later reconciliation or deleted. #### Story 3 - Retention policy for finished Workloads is configured and set to 0s -When the object retention policy is set to 0s finished workloads will be deleted -during the first run of the reconciliation loop after they were declared finished. +When the object retention policy is set to 0 seconds, finished Workloads will be +deleted immediately during the first reconciliation loop after they are declared finished. -### Notes/Constraints/Caveats (Optional) +### Notes/Constraints/Caveats -- Initially other naming was considered for config keys namely instead of Objects -the word Resources was used but based on the feedback from @alculquicondor it was -changed because of Resources bearing too many meanings in the context of kueue -(K8s resources, physical resources, etc.). -- A different implementation was proposed initially. The deletion logic wasn't supposed -to be part of the Reconcile function but rather it was supposed to have its own -handler/watcher pair. During implementation, it was challenging to implement it this way -without substantial changes to the code. 
Also creating a watcher for completed Workloads -proved to be very difficult and complex. -- The deletion mechanism assumes that finished Workload objects don't have -any finalizers attached to them. After an investigation from @mimowo the [exact place in -code](https://github.com/kubernetes-sigs/kueue/blob/a128ae3354a56670ffa58695fef1eca270fe3a47/pkg/controller/jobframework/reconciler.go#L332-L340) -was found which removes the resource in use finalizer when the Workload state -transitions to finished. If that behavior ever changes or if there's an external -source of a finalizer being attached to the Workload it will be marked for deletion -by the K8s client but it won't actually be deleted until the finalizer is removed. +- Initially, other naming was considered for config keys. Namely, instead of "Objects," +the word "Resources" was used, but based on feedback from `@alculquicondor`, it was changed +because "Resources" bears too many meanings in the context of kueue +(Kubernetes resources, physical resources, etc.). +- A different implementation was proposed initially. The deletion logic was not +supposed to be part of the `Reconcile` function but rather was supposed to have +its own handler/watcher pair. During implementation, this proved challenging +without substantial changes to the code. Also, creating a watcher for +completed Workloads proved to be very complex. +- The deletion mechanism assumes that finished Workload objects do not have +any finalizers attached to them. After an investigation by `@mimowo`, +the [exact place in the code](https://github.com/kubernetes-sigs/kueue/blob/a128ae3354a56670ffa58695fef1eca270fe3a47/pkg/controller/jobframework/reconciler.go#L332-L340) +was found that removes the resource-in-use +finalizer when the Workload state transitions to finished. If that behavior ever +changes, or if there is an external source of a finalizer being attached +to the Workload, it will be marked for deletion by the Kubernetes client +but will not actually be deleted until the finalizer is removed. ### Risks and Mitigations @@ -212,18 +235,17 @@ How will UX be reviewed, and by whom? Consider including folks who also work outside the SIG or subproject. --> -- In clusters with a big number of existing finished Workloads (1000s, 10000s, ...) -it may take some time to delete all previously finished Workloads that qualify -for deletion. Since the reconciliation loop is effectively synchronous it may -impact reconciling new jobs. From the user's perspective it may seem like the kueue's -initialization time is really long for that initial run until all expired objects -are deleted. Unfortunately there's no good way of mitigating this -without complicating the code. A potential improvement would be to have a special -internal queue that deals with deletion outside of the reconciliation loop or -deleting expired objects in batches so that after a batch of deletions is -completed all expired objects that are left would be skipped over until the next run -of the loop but it'd increase its against the synchronous nature of the reconciliation loop. -The author proposes to include it as part of the documentation +- **R**: In clusters with a large number of existing finished Workloads +(thousands, tens of thousands, etc.), it may take a significant amount of time +to delete all previously finished Workloads that qualify for deletion. Since the +reconciliation loop is effectively synchronous, this may impact the reconciliation +of new jobs. 
From the user's perspective, it may seem like kueue's initialization +time is very long for that initial run until all expired objects are deleted.\ +**M**: Unfortunately, there is no easy way to mitigate this without complicating the code. +A potential improvement would be to have a dedicated internal queue that handles deletion +outside the reconciliation loop, or to delete expired objects in batches so that after +a batch of deletions is completed, any remaining expired objects would be skipped until +the next run of the loop. However, this would go against the synchronous nature of the reconciliation loop. ## Design Details @@ -234,6 +256,33 @@ required) or even code snippets. If there's any ambiguity about HOW your proposal will be implemented, this is the place to discuss them. --> + +### API +```go +// Configuration is the Schema for the kueueconfigurations API +type Configuration struct { + //... + + // ObjectRetentionPolicies provides configuration options for retention of kueue owned + // resources. + ObjectRetentionPolicies *ObjectRetentionPolicies `json:"objectRetentionPolicies,omitempty"` +} + +//... + +type ObjectRetentionPolicies struct { +// FinishedWorkloadRetention is the duration to retain finished Workloads. +// A duration of 0 will delete finished Workloads immediately. +// A nil value will disable automatic deletion. +// The value is represented using the metav1.Duration format, allowing for flexible +// specification of time units (e.g., "24h", "1h30m", "30s"). +// +// Defaults to null. +// +optional +FinishedWorkloadRetention *metav1.Duration `json:"finishedWorkloadRetention"` +} +``` + ### Test Plan -The following new unit tests or modifications to the existing ones were added -to test new functionality: +The following new unit tests or modifications to existing ones were added to test +the new functionality: - `apis/config/v1beta1/defaults_test.go` - updated existing tests with `defaultObjectRetentionPolicies` which sets @@ -366,13 +415,13 @@ Major milestones might include: Why should this KEP _not_ be implemented? --> -- Deleting things is tricky and GC behavior can often be surprising for users. -Even though the mechanism proposed in this KEP is really simple and needs to be -explicitly enabled, it is a global mechanism applied at the controller level. -The author made sure to introduce some observability in the controller itself -(logging and emitting a deletion event) but K8s users and developers used to -tracking K8s objects' lifecycle using per object attached policies might be -surprised with this behavior. +- Deleting objects is inherently complex, and garbage collection (GC) behavior can often be +surprising for end-users who did not configure the service themselves. Even though the +mechanism proposed in this KEP is relatively simple and must be explicitly enabled, +it is a global mechanism applied at the controller level. The author has added some +observability to the controller itself (logging and emitting deletion events), +but Kubernetes users and developers accustomed to tracking object lifecycles +using per-object policies might be surprised by this behavior. ## Alternatives @@ -383,9 +432,9 @@ not need to be as detailed as the proposal, but should include enough information to express the idea and why it was not acceptable. 
--> -- Leave the current state of things as is, -- Implement attaching retention policies attached per object like (like `.spec.ttlSecondsAfterFinished`) -instead of having global retention policies configured at kueue level -but it'd complicate the implementation and add complexity to the feature itself -e.g. during reconfiguration of an existing retention policy all existing K8s objects -would need to be updated with the new policy. +- Leave the current state of things as is, not implementing any garbage +collection mechanism for finished Workloads. +- Implement retention policies attached per object (like `.spec.ttlSecondsAfterFinished`) instead of having +global retention policies configured at the kueue level. However, this would complicate the +implementation and add complexity to the feature itself. For example, during the reconfiguration +of an existing retention policy, all existing Kubernetes objects would need to be updated with the new policy. From dfa54217157927f2d9ca8e5a74c5e36d39c36fdb Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Thu, 1 Aug 2024 11:44:07 +0000 Subject: [PATCH 03/15] Fix TOC --- keps/1618-optional-gc-of-workloads/README.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index b57e3e06e9..035b562d25 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -22,17 +22,18 @@ tags, and then generate with `hack/update-toc.sh`. - [Goals](#goals) - [Non-Goals](#non-goals) - [Proposal](#proposal) - - [User Stories (Optional)](#user-stories-optional) - - [Story 1](#story-1) - - [Story 2](#story-2) - - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [User Stories](#user-stories) + - [Story 0 - Retention policy for finished Workloads is misconfigured](#story-0---retention-policy-for-finished-workloads-is-misconfigured) + - [Story 1 - Retention policy for finished Workloads is not configured (default)](#story-1---retention-policy-for-finished-workloads-is-not-configured-default) + - [Story 2 - Retention policy for finished Workloads is configured and set to a value > 0s](#story-2---retention-policy-for-finished-workloads-is-configured-and-set-to-a-value--0s) + - [Story 3 - Retention policy for finished Workloads is configured and set to 0s](#story-3---retention-policy-for-finished-workloads-is-configured-and-set-to-0s) + - [Notes/Constraints/Caveats](#notesconstraintscaveats) - [Risks and Mitigations](#risks-and-mitigations) - [Design Details](#design-details) + - [API](#api) - [Test Plan](#test-plan) - - [Prerequisite testing updates](#prerequisite-testing-updates) - [Unit Tests](#unit-tests) - [Integration tests](#integration-tests) - - [Graduation Criteria](#graduation-criteria) - [Implementation History](#implementation-history) - [Drawbacks](#drawbacks) - [Alternatives](#alternatives) From 1ddf7f516091aec352ec2c0fd1e9a8be4ede34fa Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Thu, 1 Aug 2024 13:46:06 +0200 Subject: [PATCH 04/15] Update reviewers --- keps/1618-optional-gc-of-workloads/kep.yaml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/keps/1618-optional-gc-of-workloads/kep.yaml b/keps/1618-optional-gc-of-workloads/kep.yaml index b6bfc06120..032b7be11c 100644 --- a/keps/1618-optional-gc-of-workloads/kep.yaml +++ b/keps/1618-optional-gc-of-workloads/kep.yaml @@ -5,9 +5,10 @@ authors: status: implementable creation-date: 
2024-07-31 reviewers: + - "@denkensk" - "@mimowo" + - "@PBundyra" - "@tenzen-y" - - "@trasc" approvers: - "@mimowo" - "@tenzen-y" From 482652ef8edac9a8f72418ce1e24390db410a87f Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Thu, 1 Aug 2024 13:54:05 +0200 Subject: [PATCH 05/15] Fix API formatting --- keps/1618-optional-gc-of-workloads/README.md | 29 ++++++++++---------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index 035b562d25..3a96b1bf46 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -162,7 +162,7 @@ several user stories are proposed below. #### Story 0 - Retention policy for finished Workloads is misconfigured If the finished Workloads policy is not configured correctly -(the provided value is not a valid time.Duration), kueue will fail to start +(the provided value is not a valid `time.Duration`), kueue will fail to start during configuration parsing. #### Story 1 - Retention policy for finished Workloads is not configured (default) @@ -262,25 +262,24 @@ proposal will be implemented, this is the place to discuss them. ```go // Configuration is the Schema for the kueueconfigurations API type Configuration struct { - //... - - // ObjectRetentionPolicies provides configuration options for retention of kueue owned - // resources. - ObjectRetentionPolicies *ObjectRetentionPolicies `json:"objectRetentionPolicies,omitempty"` + //... + // ObjectRetentionPolicies provides configuration options for retention of kueue owned + // resources. + ObjectRetentionPolicies *ObjectRetentionPolicies `json:"objectRetentionPolicies,omitempty"` } //... type ObjectRetentionPolicies struct { -// FinishedWorkloadRetention is the duration to retain finished Workloads. -// A duration of 0 will delete finished Workloads immediately. -// A nil value will disable automatic deletion. -// The value is represented using the metav1.Duration format, allowing for flexible -// specification of time units (e.g., "24h", "1h30m", "30s"). -// -// Defaults to null. -// +optional -FinishedWorkloadRetention *metav1.Duration `json:"finishedWorkloadRetention"` + // FinishedWorkloadRetention is the duration to retain finished Workloads. + // A duration of 0 will delete finished Workloads immediately. + // A nil value will disable automatic deletion. + // The value is represented using the metav1.Duration format, allowing for flexible + // specification of time units (e.g., "24h", "1h30m", "30s"). + // + // Defaults to null. + // +optional + FinishedWorkloadRetention *metav1.Duration `json:"finishedWorkloadRetention"` } ``` From 344a512fe86872047f6037a7a6d9776de9665315 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 11:51:19 +0200 Subject: [PATCH 06/15] REVIEW: Move misconfiguration and default scenario to notes. Merge stories into 1 and move the description of the behavior to a new section under Design Details. --- keps/1618-optional-gc-of-workloads/README.md | 65 ++++++++++---------- 1 file changed, 31 insertions(+), 34 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index 3a96b1bf46..a02a445afa 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -159,41 +159,15 @@ bogged down. Based on how the retention policy is configured for finished Workloads, several user stories are proposed below. 
-#### Story 0 - Retention policy for finished Workloads is misconfigured +#### Story 1 - Configurable retention for finished Workloads -If the finished Workloads policy is not configured correctly -(the provided value is not a valid `time.Duration`), kueue will fail to start -during configuration parsing. - -#### Story 1 - Retention policy for finished Workloads is not configured (default) - -**This story ensures backward compatibility.** - -If the object retention policy section is not configured, -or finished Workloads retention is either not configured or set to `null`, -Workload objects will not be deleted by kueue. - -#### Story 2 - Retention policy for finished Workloads is configured and set to a value > 0s - -If the object retention policy is configured to a value greater than 0 seconds, -the finished Workload retention policy will be evaluated against every -completed Workload. The flow is as follows: - -1. During kueue's initial reconciliation loop, all previously finished Workloads -will be evaluated against the Workload retention policy. They will either be -deleted immediately if they are already expired or requeued for reconciliation -after their retention period has expired. +As a kueue administrator, I want to control the retention of +finished Workloads to minimize the memory footprint and optimize +storage usage in my Kubernetes cluster. I want the flexibility to +configure a retention period to automatically delete Workloads after a +specified duration or to immediately delete them upon completion. -2. During subsequent reconciliation loops, each Workload will be evaluated -using the same approach as in step 1. However, the evaluation will occur -during the next reconciliation loop after the Workload was declared finished. -Based on the evaluation result against the retention policy, the Workload will -either be requeued for later reconciliation or deleted. - -#### Story 3 - Retention policy for finished Workloads is configured and set to 0s - -When the object retention policy is set to 0 seconds, finished Workloads will be -deleted immediately during the first reconciliation loop after they are declared finished. +**Note:** Immediate deletion can be configured by setting the retention period to 0 seconds. ### Notes/Constraints/Caveats @@ -221,6 +195,14 @@ finalizer when the Workload state transitions to finished. If that behavior ever changes, or if there is an external source of a finalizer being attached to the Workload, it will be marked for deletion by the Kubernetes client but will not actually be deleted until the finalizer is removed. +- If the retention policy for finished Workloads is misconfigured +(the provided value is not a valid time.Duration), kueue will fail to start +during configuration parsing. +- In the default scenario, where the object retention policy section +is not configured or finished Workloads retention is either not configured +or set to null, the existing behavior is maintained for backward compatibility. +This means that Workload objects will not be deleted by kueue, +aligning with the behavior before the introduction of this feature. ### Risks and Mitigations @@ -283,6 +265,20 @@ type ObjectRetentionPolicies struct { } ``` +### Behavior +The new behavior is as follows: + +1. During kueue's initial reconciliation loop, all previously finished Workloads + will be evaluated against the Workload retention policy. 
They will either be + deleted immediately if they are already expired or requeued for reconciliation + after their retention period has expired. + +2. During subsequent reconciliation loops, each Workload will be evaluated + using the same approach as in step 1. However, the evaluation will occur + during the next reconciliation loop after the Workload was declared finished. + Based on the evaluation result against the retention policy, the Workload will + either be requeued for later reconciliation or deleted. + ### Test Plan -- TBA +- Added behavior `Workload controller with resource retention` +in `test/integration/controller/core/workload_controller_test.go` [//]: # (### Graduation Criteria) From c6f91838e1d7f011a733d127a0623f1de4011dd9 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 11:54:21 +0200 Subject: [PATCH 07/15] REVIEW: Remove references to .spec.ttlSecondsAfterFinished since it doesn't make sense in this context --- keps/1618-optional-gc-of-workloads/README.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index a02a445afa..252a0429f0 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -127,8 +127,6 @@ and make progress. --> - Support the deletion/expiration of any kueue-authored Kubernetes object other than finished Workloads. -- Configure retention policies (like `.spec.ttlSecondsAfterFinished`) that are attached to -individual Kubernetes objects. ## Proposal @@ -431,7 +429,3 @@ information to express the idea and why it was not acceptable. - Leave the current state of things as is, not implementing any garbage collection mechanism for finished Workloads. -- Implement retention policies attached per object (like `.spec.ttlSecondsAfterFinished`) instead of having -global retention policies configured at the kueue level. However, this would complicate the -implementation and add complexity to the feature itself. For example, during the reconfiguration -of an existing retention policy, all existing Kubernetes objects would need to be updated with the new policy. From 59c0e42858fcbe752075d2131c997c3b6f984f20 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 11:57:57 +0200 Subject: [PATCH 08/15] REVIEW: Replace kueue with Kueue for consistency --- keps/1618-optional-gc-of-workloads/README.md | 38 ++++++++++---------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index 252a0429f0..39626365a2 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -60,21 +60,21 @@ updates. [documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md --> -This KEP proposes a mechanism for garbage collection of finished Workloads in kueue. -Currently, kueue does not delete its own Kubernetes objects, leading to potential accumulation +This KEP proposes a mechanism for garbage collection of finished Workloads in Kueue. +Currently, Kueue does not delete its own Kubernetes objects, leading to potential accumulation and unnecessary resource consumption. The proposed solution involves adding a new API section called `objectRetentionPolicies` to allow users to specify retention policies for finished Workloads. 
This can be set to a specific duration or disabled entirely to maintain backward compatibility. Benefits of implementing this KEP include the deletion of superfluous information, -decreased memory footprint for kueue, and freeing up etcd storage. The proposed changes +decreased memory footprint for Kueue, and freeing up etcd storage. The proposed changes are designed to be explicitly enabled and configured by users, ensuring backward compatibility. While the primary focus is on finished Workloads, the API section can be extended to include -other kueue-authored objects in the future. The implementation involves incorporating the +other Kueue-authored objects in the future. The implementation involves incorporating the deletion logic into the reconciliation loop and includes considerations for potential risks, such as handling a large number of existing finished Workloads. -Overall, this KEP aims to enhance kueue's resource management and provide users with +Overall, this KEP aims to enhance Kueue's resource management and provide users with more control over object retention. ## Motivation @@ -88,11 +88,11 @@ demonstrate the interest in a KEP within the wider Kubernetes community. [experience reports]: https://github.com/golang/go/wiki/ExperienceReports --> -Currently, kueue does not delete its own Kubernetes objects, leaving it to Kubernetes' -garbage collector (GC) to manage the retention of objects created by kueue, such as Pods, Jobs, and Workloads. +Currently, Kueue does not delete its own Kubernetes objects, leaving it to Kubernetes' +garbage collector (GC) to manage the retention of objects created by Kueue, such as Pods, Jobs, and Workloads. We do not set any expiration on these objects, so by default, they are kept indefinitely. Based on usage patterns, including the amount of jobs created, their complexity, and frequency over time, these objects can accumulate, -leading to unnecessary use of etcd storage and a gradual increase in kueue's memory footprint. Some of these objects, +leading to unnecessary use of etcd storage and a gradual increase in Kueue's memory footprint. Some of these objects, like finished Workloads, do not contain any useful information that could be used for additional purposes, such as debugging. That's why, based on user feedback in [#1618](https://github.com/kubernetes-sigs/kueue/issues/1618) or [#1789](https://github.com/kubernetes-sigs/kueue/issues/1789), a mechanism for garbage collection of finished Workloads @@ -102,7 +102,7 @@ With this mechanism in place, deleting finished Workloads has several benefits: - **Deletion of superfluous information**: Finished Workloads do not store any useful runtime-related information (which is stored by Jobs), nor are they useful for debugging. They could possibly be useful for auditing purposes, but this is not a primary concern. -- **Decreasing kueue memory footprint**: Finished Workloads are loaded during kueue's initialization and kept in memory, +- **Decreasing Kueue memory footprint**: Finished Workloads are loaded during Kueue's initialization and kept in memory, consuming resources unnecessarily. - **Freeing up etcd storage**: Finished Workloads are stored in etcd memory until Kubernetes' built-in GC mechanism collects them, but by default, Custom Resources are stored indefinitely. @@ -115,7 +115,7 @@ know that this has succeeded? --> - Support the deletion of finished Workload objects. 
-- Introduce configuration of global retention policies for kueue-authored Kubernetes objects, +- Introduce configuration of global retention policies for Kueue-authored Kubernetes objects, starting with finished Workloads. - Maintain backward compatibility (the feature must be explicitly enabled and configured). @@ -126,7 +126,7 @@ What is out of scope for this KEP? Listing non-goals helps to focus discussion and make progress. --> -- Support the deletion/expiration of any kueue-authored Kubernetes object other than finished Workloads. +- Support the deletion/expiration of any Kueue-authored Kubernetes object other than finished Workloads. ## Proposal @@ -142,7 +142,7 @@ nitty-gritty. Add a new API called `objectRetentionPolicies`, which will enable specifying a retention policy for finished Workloads under an option called `FinishedWorkloadRetention`. This API section could be extended in the future -with options for other kueue-authored objects. +with options for other Kueue-authored objects. ### User Stories @@ -159,7 +159,7 @@ several user stories are proposed below. #### Story 1 - Configurable retention for finished Workloads -As a kueue administrator, I want to control the retention of +As a Kueue administrator, I want to control the retention of finished Workloads to minimize the memory footprint and optimize storage usage in my Kubernetes cluster. I want the flexibility to configure a retention period to automatically delete Workloads after a @@ -178,7 +178,7 @@ This might be a good place to talk about core concepts and how they relate. - Initially, other naming was considered for config keys. Namely, instead of "Objects," the word "Resources" was used, but based on feedback from `@alculquicondor`, it was changed -because "Resources" bears too many meanings in the context of kueue +because "Resources" bears too many meanings in the context of Kueue (Kubernetes resources, physical resources, etc.). - A different implementation was proposed initially. The deletion logic was not supposed to be part of the `Reconcile` function but rather was supposed to have @@ -194,12 +194,12 @@ changes, or if there is an external source of a finalizer being attached to the Workload, it will be marked for deletion by the Kubernetes client but will not actually be deleted until the finalizer is removed. - If the retention policy for finished Workloads is misconfigured -(the provided value is not a valid time.Duration), kueue will fail to start +(the provided value is not a valid time.Duration), Kueue will fail to start during configuration parsing. - In the default scenario, where the object retention policy section is not configured or finished Workloads retention is either not configured or set to null, the existing behavior is maintained for backward compatibility. -This means that Workload objects will not be deleted by kueue, +This means that Workload objects will not be deleted by Kueue, aligning with the behavior before the introduction of this feature. ### Risks and Mitigations @@ -220,7 +220,7 @@ Consider including folks who also work outside the SIG or subproject. (thousands, tens of thousands, etc.), it may take a significant amount of time to delete all previously finished Workloads that qualify for deletion. Since the reconciliation loop is effectively synchronous, this may impact the reconciliation -of new jobs. From the user's perspective, it may seem like kueue's initialization +of new jobs. 
From the user's perspective, it may seem like Kueue's initialization time is very long for that initial run until all expired objects are deleted.\ **M**: Unfortunately, there is no easy way to mitigate this without complicating the code. A potential improvement would be to have a dedicated internal queue that handles deletion @@ -243,7 +243,7 @@ proposal will be implemented, this is the place to discuss them. // Configuration is the Schema for the kueueconfigurations API type Configuration struct { //... - // ObjectRetentionPolicies provides configuration options for retention of kueue owned + // ObjectRetentionPolicies provides configuration options for retention of Kueue owned // resources. ObjectRetentionPolicies *ObjectRetentionPolicies `json:"objectRetentionPolicies,omitempty"` } @@ -266,7 +266,7 @@ type ObjectRetentionPolicies struct { ### Behavior The new behavior is as follows: -1. During kueue's initial reconciliation loop, all previously finished Workloads +1. During Kueue's initial reconciliation loop, all previously finished Workloads will be evaluated against the Workload retention policy. They will either be deleted immediately if they are already expired or requeued for reconciliation after their retention period has expired. From b28f6992eda70b922877eb055645b67d25c65801 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 12:07:13 +0200 Subject: [PATCH 09/15] REVIEW: Added info about the behavior of a nil value for all policies. --- keps/1618-optional-gc-of-workloads/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index 39626365a2..73d1debd09 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -244,7 +244,7 @@ proposal will be implemented, this is the place to discuss them. type Configuration struct { //... // ObjectRetentionPolicies provides configuration options for retention of Kueue owned - // resources. + // objects. A nil value will disable automatic deletion for all objects. ObjectRetentionPolicies *ObjectRetentionPolicies `json:"objectRetentionPolicies,omitempty"` } From 762ac09ec9ed45225dc9408890238a8678a3173c Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 10:16:26 +0000 Subject: [PATCH 10/15] Update TOC --- keps/1618-optional-gc-of-workloads/README.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index 73d1debd09..bc73e8f4a0 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -23,14 +23,12 @@ tags, and then generate with `hack/update-toc.sh`. 
- [Non-Goals](#non-goals) - [Proposal](#proposal) - [User Stories](#user-stories) - - [Story 0 - Retention policy for finished Workloads is misconfigured](#story-0---retention-policy-for-finished-workloads-is-misconfigured) - - [Story 1 - Retention policy for finished Workloads is not configured (default)](#story-1---retention-policy-for-finished-workloads-is-not-configured-default) - - [Story 2 - Retention policy for finished Workloads is configured and set to a value > 0s](#story-2---retention-policy-for-finished-workloads-is-configured-and-set-to-a-value--0s) - - [Story 3 - Retention policy for finished Workloads is configured and set to 0s](#story-3---retention-policy-for-finished-workloads-is-configured-and-set-to-0s) + - [Story 1 - Configurable retention for finished Workloads](#story-1---configurable-retention-for-finished-workloads) - [Notes/Constraints/Caveats](#notesconstraintscaveats) - [Risks and Mitigations](#risks-and-mitigations) - [Design Details](#design-details) - [API](#api) + - [Behavior](#behavior) - [Test Plan](#test-plan) - [Unit Tests](#unit-tests) - [Integration tests](#integration-tests) From 43a36f306e16a08bb5ac8ffb1b73f02ee8a35ff1 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 15:39:53 +0000 Subject: [PATCH 11/15] Drop redundant comment --- keps/1618-optional-gc-of-workloads/README.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index bc73e8f4a0..b598fd81e2 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -152,9 +152,6 @@ the system. The goal here is to make this feel real for users without getting bogged down. --> -Based on how the retention policy is configured for finished Workloads, -several user stories are proposed below. - #### Story 1 - Configurable retention for finished Workloads As a Kueue administrator, I want to control the retention of From 2d59922d3a0a32aaea36e70c14d8a2d9f55fd0bc Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Wed, 7 Aug 2024 16:24:30 +0000 Subject: [PATCH 12/15] REVIEW: Move an alternative approach from Notes to Alternatives --- keps/1618-optional-gc-of-workloads/README.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index b598fd81e2..f99e116b78 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -175,11 +175,6 @@ This might be a good place to talk about core concepts and how they relate. the word "Resources" was used, but based on feedback from `@alculquicondor`, it was changed because "Resources" bears too many meanings in the context of Kueue (Kubernetes resources, physical resources, etc.). -- A different implementation was proposed initially. The deletion logic was not -supposed to be part of the `Reconcile` function but rather was supposed to have -its own handler/watcher pair. During implementation, this proved challenging -without substantial changes to the code. Also, creating a watcher for -completed Workloads proved to be very complex. - The deletion mechanism assumes that finished Workload objects do not have any finalizers attached to them. 
After an investigation by `@mimowo`, the [exact place in the code](https://github.com/kubernetes-sigs/kueue/blob/a128ae3354a56670ffa58695fef1eca270fe3a47/pkg/controller/jobframework/reconciler.go#L332-L340) @@ -424,3 +419,11 @@ information to express the idea and why it was not acceptable. - Leave the current state of things as is, not implementing any garbage collection mechanism for finished Workloads. +- Initially, a different approach was hypothesized where the deletion logic + would not be integrated into the `Reconcile` function. Instead, a + dedicated handler and watcher pair would manage the cleanup of finished Workloads. This approach had the following advantages and disadvantages:: + - Cons: + - Kueue's internal state management for Workloads happens within the `WorkloadReconciler`. Moving the deletion logic elsewhere would split up code meant to handle a single resource's lifecycle. + - Implementing a watcher specifically for expired Workloads turned out to be unexpectedly complex, leading to this approach being abandoned. + - Pros: + - A separate handler/watcher could offload the deletion logic, making the already large and intricate Reconcile function more focused and potentially easier to maintain. From 2f97fdfe54576c88471d88bb77994ee0f75dcc06 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Thu, 8 Aug 2024 13:20:26 +0200 Subject: [PATCH 13/15] REVIEW: Move orphanes workloads mention to the notes --- keps/1618-optional-gc-of-workloads/README.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index f99e116b78..cf7b3d9360 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -92,8 +92,7 @@ We do not set any expiration on these objects, so by default, they are kept inde including the amount of jobs created, their complexity, and frequency over time, these objects can accumulate, leading to unnecessary use of etcd storage and a gradual increase in Kueue's memory footprint. Some of these objects, like finished Workloads, do not contain any useful information that could be used for additional purposes, such as debugging. -That's why, based on user feedback in [#1618](https://github.com/kubernetes-sigs/kueue/issues/1618) -or [#1789](https://github.com/kubernetes-sigs/kueue/issues/1789), a mechanism for garbage collection of finished Workloads +That's why, based on user feedback in [#1618](https://github.com/kubernetes-sigs/kueue/issues/1618), a mechanism for garbage collection of finished Workloads is being proposed. With this mechanism in place, deleting finished Workloads has several benefits: @@ -191,6 +190,12 @@ is not configured or finished Workloads retention is either not configured or set to null, the existing behavior is maintained for backward compatibility. This means that Workload objects will not be deleted by Kueue, aligning with the behavior before the introduction of this feature. +- Finished Workloads are not the only objects that should be cleaned up. +Another type is orphaned Workloads, as described in +[#1789](https://github.com/kubernetes-sigs/kueue/issues/1789). +Although dealing with these workloads could be an extension to this KEP, +it is not the primary concern at the moment. The initial implementation of cleaning up +finished workloads mentioned in this KEP does not address orphaned Workloads. 
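As a concrete illustration of the configuration notes above, a Kueue manager configuration enabling this feature might look like the snippet below. The `apiVersion` and `kind` follow Kueue's existing `Configuration` API; the 24-hour value and its placement among other fields are placeholders chosen for this example, not defaults proposed by this KEP.

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
# ... other manager configuration fields ...
objectRetentionPolicies:
  # Any value parseable as metav1.Duration is accepted (e.g. "30s", "1h30m", "24h").
  # An invalid value makes Kueue fail during configuration parsing; omitting the
  # field or setting it to null keeps the current behavior of never deleting
  # finished Workloads.
  finishedWorkloadRetention: 24h
```

Setting `finishedWorkloadRetention: 0s` requests immediate deletion of Workloads in the first reconciliation after they finish.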
### Risks and Mitigations From a43df30d2367ba1c8fb09e9fb11c0233e9f7b0e5 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Thu, 8 Aug 2024 13:24:39 +0200 Subject: [PATCH 14/15] REVIEW: Remove mention of orphaned workloads --- keps/1618-optional-gc-of-workloads/README.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index cf7b3d9360..e6cc06ccba 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -190,12 +190,6 @@ is not configured or finished Workloads retention is either not configured or set to null, the existing behavior is maintained for backward compatibility. This means that Workload objects will not be deleted by Kueue, aligning with the behavior before the introduction of this feature. -- Finished Workloads are not the only objects that should be cleaned up. -Another type is orphaned Workloads, as described in -[#1789](https://github.com/kubernetes-sigs/kueue/issues/1789). -Although dealing with these workloads could be an extension to this KEP, -it is not the primary concern at the moment. The initial implementation of cleaning up -finished workloads mentioned in this KEP does not address orphaned Workloads. ### Risks and Mitigations From 2efe73431e88dbcd625b17bbdc3185014c4e0691 Mon Sep 17 00:00:00 2001 From: Michal Wysokinski Date: Fri, 9 Aug 2024 14:42:45 +0200 Subject: [PATCH 15/15] REVIEW: Bring back Graduation Criteria section --- keps/1618-optional-gc-of-workloads/README.md | 38 +++++++------------- 1 file changed, 13 insertions(+), 25 deletions(-) diff --git a/keps/1618-optional-gc-of-workloads/README.md b/keps/1618-optional-gc-of-workloads/README.md index e6cc06ccba..9650c3d931 100644 --- a/keps/1618-optional-gc-of-workloads/README.md +++ b/keps/1618-optional-gc-of-workloads/README.md @@ -348,35 +348,23 @@ After the implementation PR is merged, add the names of the tests here. in `test/integration/controller/core/workload_controller_test.go` -[//]: # (### Graduation Criteria) +### Graduation Criteria -[//]: # () -[//]: # () +[feature gate]: https://git.k8s.io/community/contributors/devel/sig-architecture/feature-gates.md +[maturity-levels]: https://git.k8s.io/community/contributors/devel/sig-architecture/api_changes.md#alpha-beta-and-stable-versions +[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/ +--> ## Implementation History
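To make the Behavior section above concrete, the retention decision inside the Workload reconciler could look roughly like the sketch below. This is only a sketch under assumed names: the function signature, the `finishedAt` parameter, and the omission of logging and event emission (which the proposal itself includes) are simplifications for illustration, not Kueue's actual implementation.

```go
package workload

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	kueue "sigs.k8s.io/kueue/apis/kueue/v1beta1"
)

// reconcileFinishedWorkload sketches how a finished Workload would be handled.
// finishedAt is the transition time of the Finished condition, and retention
// mirrors ObjectRetentionPolicies.FinishedWorkloadRetention from the proposed API.
func reconcileFinishedWorkload(ctx context.Context, c client.Client, wl *kueue.Workload,
	finishedAt metav1.Time, retention *metav1.Duration) (ctrl.Result, error) {
	if retention == nil {
		// Retention not configured: preserve the existing behavior and never delete.
		return ctrl.Result{}, nil
	}
	expiration := finishedAt.Add(retention.Duration)
	if remaining := time.Until(expiration); remaining > 0 {
		// Not expired yet: requeue so the Workload is re-evaluated once its
		// retention period has elapsed.
		return ctrl.Result{RequeueAfter: remaining}, nil
	}
	// Expired, or retention is 0s: delete the Workload. If an external finalizer
	// is attached, the object is only marked for deletion until it is removed.
	if err := c.Delete(ctx, wl); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	return ctrl.Result{}, nil
}
```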