Rolling update does not work for StatefulSet integration #3690

Open
mimowo opened this issue Nov 29, 2024 · 4 comments · May be fixed by #3684
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature.

Comments

@mimowo
Contributor

mimowo commented Nov 29, 2024

/kind bug
/kind feature

What happened:

During a rolling update of a StatefulSet, the update gets stuck after the first pod is stopped.

What you expected to happen:

Rolling updates of a StatefulSet (STS) should be supported, since this is a common operation for serving workloads.

How to reproduce it (as minimally and precisely as possible):

  1. Create the STS from the following manifest:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-statefulset
  labels:
    app: nginx
    kueue.x-k8s.io/queue-name: user-queue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.26
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
  serviceName: "nginx"
  2. Trigger the STS rolling update with:
kubectl set image statefulset/nginx-statefulset nginx=registry.k8s.io/nginx-slim:0.27
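
To observe the rollout without polling the pod list, something like the following could be used. This is only a sketch: the `watch_rollout` helper name and the 60s default timeout are illustrative assumptions and are not part of the original report.

```shell
# Hypothetical helper (not from the report): wait for a StatefulSet rollout
# and return non-zero if it does not complete within the timeout.
watch_rollout() {
  sts="$1"
  timeout="${2:-60s}"   # assumed default; adjust for your cluster
  kubectl rollout status "statefulset/${sts}" --timeout="${timeout}"
}
```

With the manifest above, `watch_rollout nginx-statefulset` would be expected to time out and return a non-zero status when the update is stuck.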

Issue: the update gets stuck, and the pods are left in this state:

> k get pods      
NAME                  READY   STATUS      RESTARTS   AGE
nginx-statefulset-0   1/1     Running     0          3m33s
nginx-statefulset-1   1/1     Running     0          3m30s
nginx-statefulset-2   0/1     Completed   0          3m29s
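
To see what is holding the `Completed` pod in place, its phase, finalizers, and scheduling gates can be printed in one call. A minimal sketch, assuming the default namespace; the `pod_blockers` helper name is hypothetical.

```shell
# Hypothetical helper: print a pod's phase, finalizers, and scheduling gates,
# one per line, using kubectl's jsonpath output.
pod_blockers() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.phase}{"\n"}{.metadata.finalizers}{"\n"}{.spec.schedulingGates}{"\n"}'
}
```

For the state shown above, `pod_blockers nginx-statefulset-2` should list any finalizer that is still set on the pod; an empty line means the field is unset.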

Anything else we need to know?:

After manually deleting the finalizer from the nginx-statefulset-2 pod, the STS update is still stuck. The pods are now:

> k get pods                            
NAME                  READY   STATUS            RESTARTS   AGE
nginx-statefulset-0   1/1     Running           0          5m51s
nginx-statefulset-1   1/1     Running           0          5m48s
nginx-statefulset-2   0/1     SchedulingGated   0          3s
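
For reference, one common way to clear finalizers by hand is a merge patch. This is a sketch only: the report does not say which command was actually used, the `remove_pod_finalizers` helper name is hypothetical, and the patch removes all finalizers on the pod rather than a single one.

```shell
# Hypothetical helper: remove ALL finalizers from a pod with a merge patch.
# Use with care, since this bypasses whatever controller owns the finalizer.
remove_pod_finalizers() {
  kubectl patch pod "$1" --type=merge -p '{"metadata":{"finalizers":null}}'
}
```

As the output above shows, doing this for nginx-statefulset-2 only lets the pod be recreated; the replacement pod then stays `SchedulingGated`.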
@mimowo added the kind/bug label Nov 29, 2024
@mimowo
Contributor Author

mimowo commented Nov 29, 2024

/assign @mbobrovskyi

@mbobrovskyi
Contributor

mbobrovskyi commented Nov 29, 2024

@mimowo thanks!

@mimowo
Contributor Author

mimowo commented Nov 29, 2024

cc @gabesaba, who has recently been getting familiar with the inference support

@mimowo
Contributor Author

mimowo commented Nov 29, 2024

/kind feature
I am marking this as both a bug and a feature. From the end-user perspective it is a bug, but since the fix requires a new API (the serving annotation), I would also call it a feature.

@k8s-ci-robot added the kind/feature label Nov 29, 2024