Commit

Add Serving Workload section on KEP-976.
mbobrovskyi committed Nov 29, 2024
1 parent cac26f0 commit fce64c5
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions keps/976-plain-pods/README.md
@@ -551,6 +551,24 @@ spec:
cpu: 1m
```

#### Serving Workload

1. The Pod Group integration adds successfully completed pods to the `ReclaimablePods` list. This is
problematic for serving workloads, such as a `StatefulSet`, because it prevents the replacement
pod from being ungated. This behavior is incorrect: recreated pods of a serving workload should
continue to run regardless of whether the original pod failed or succeeded.
To resolve this issue, the `kueue.x-k8s.io/pod-group-serving` annotation can be used. When this
annotation is set to `true`, the `ReclaimablePods` mechanism no longer tracks the number of
succeeded pods, which allows the replacement pod to be ungated.
2. The Pod Group integration waits until all pods of the group are created. However, for serving
workloads such as a `StatefulSet` whose `podManagementPolicy` is `OrderedReady`, pods are created
sequentially: each pod is created only after the previous pod is running and ready. This can result
in a deadlock, because the group waits for all pods to exist while the StatefulSet controller does
not create the next pod until the previous, still gated, pod is running.
To resolve this issue, the `kueue.x-k8s.io/pod-group-fast-admission` annotation can be used.
When this annotation is set to `true`, the PodGroup can proceed with admission without requiring
all of its pods to reach the ungated state.
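
As an illustration only, a pod that belongs to a serving pod group might carry both annotations
as sketched below. This assumes the annotations are set on each pod of the group, next to the
existing `kueue.x-k8s.io/pod-group-name` label and `kueue.x-k8s.io/pod-group-total-count`
annotation. The pod name, queue name, group name, and image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: serving-group-0                            # hypothetical pod name
  labels:
    kueue.x-k8s.io/queue-name: user-queue          # hypothetical LocalQueue name
    kueue.x-k8s.io/pod-group-name: serving-group   # hypothetical group name
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    # Do not track succeeded pods in ReclaimablePods, so replacements can be ungated.
    kueue.x-k8s.io/pod-group-serving: "true"
    # Allow admission before all pods of the group reach the ungated state.
    kueue.x-k8s.io/pod-group-fast-admission: "true"
spec:
  containers:
  - name: server
    image: registry.example.com/server:latest      # placeholder image
    resources:
      requests:
        cpu: 1m
```

Annotation values are quoted because Kubernetes annotation values must be strings.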


### Tracking admitted and finished Pods

Pods need to have finalizers so that we can reliably track how many of them run to completion and be
