
Removing fields from template doesn't remove it in the resulting resource #11773

Open
cwrau opened this issue Jan 30, 2025 · 15 comments
Assignees
Labels
area/clusterclass Issues or PRs related to clusterclass help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@cwrau
Contributor

cwrau commented Jan 30, 2025

What steps did you take and what happened?

We have a (kubeadmcontrolplane) template:

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlaneTemplate
spec:
  template:
    initConfiguration:
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          cloud-provider: external
        name: '{{ local_hostname }}'
      patches:
        directory: /etc/kubernetes/patches
    joinConfiguration:
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          cloud-provider: external
        name: '{{ local_hostname }}'

Before the change it looked like this:

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlaneTemplate
spec:
  template:
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          cloud-provider: external
          event-qps: "0"
          feature-gates: SeccompDefault=true
          protect-kernel-defaults: "true"
          seccomp-default: "true"
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
        name: '{{ local_hostname }}'
      patches:
        directory: /etc/kubernetes/patches
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          cloud-provider: external
          event-qps: "0"
          feature-gates: SeccompDefault=true
          protect-kernel-defaults: "true"
          seccomp-default: "true"
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
        name: '{{ local_hostname }}'

As you can see, we removed all kubeletExtraArgs aside from cloud-provider.

But the resulting resource, in this case the KubeadmControlPlane, still has these fields set; I assume this is because of a merge-style apply instead of the resource being completely overridden.

This can result in a broken cluster that needs manual intervention if the removed fields are no longer valid, which is the case for a few of these args.
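
A quick way to check which field manager still owns the leftover args on the live object is to look at its managedFields (a minimal sketch; the object and namespace names are placeholders):

```sh
# Sketch: inspect server-side-apply ownership on the generated KubeadmControlPlane.
# <kcp-name> and <namespace> are placeholders for the object created from the template.
kubectl get kubeadmcontrolplane <kcp-name> -n <namespace> \
  --show-managed-fields -o yaml | yq '.metadata.managedFields'
# Each entry lists a "manager" and the fields it owns; the stale kubeletExtraArgs
# keys will show up under whichever field manager last set them.
```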

What did you expect to happen?

That the resource (the KubeadmControlPlane) would exactly match the template instead of being a merged mess.

Cluster API version

1.8.5

Kubernetes version

1.28.15

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug
/area clusterclass

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. area/clusterclass Issues or PRs related to clusterclass needs-priority Indicates an issue lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 30, 2025
@chrischdi
Member

How did you apply it? Can you post it in a way we could reproduce it?

Does the behaviour change if you use kubectl apply --server-side=true (on create and on update)?
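
Whether server-side apply prunes removed map keys can also be checked in isolation with a plain ConfigMap (a minimal sketch; the name and keys are arbitrary and stand in for the kubeletExtraArgs map):

```sh
# Apply a ConfigMap with two keys using SSA and a fixed field manager.
cat <<EOF | kubectl apply --server-side=true --field-manager=demo -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: ssa-demo
data:
  cloud-provider: external
  event-qps: "0"
EOF

# Re-apply with the same field manager but without event-qps; with SSA the
# removed key should disappear from the live object.
cat <<EOF | kubectl apply --server-side=true --field-manager=demo -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: ssa-demo
data:
  cloud-provider: external
EOF

kubectl get configmap ssa-demo -o yaml
```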

@sbueringer
Member

sbueringer commented Jan 30, 2025

We have a (kubeadmcontrolplane) template:

The first YAML shows a KubeadmControlPlane, not a KubeadmControlPlaneTemplate.

But the resource, in this case kubeadmcontrolplane, still has these fields set, I assume because of a merge apply instead of completely overriding the resource.

We use SSA in the Cluster topology controller. I would have assumed it works

@cwrau
Contributor Author

cwrau commented Jan 30, 2025

How did you apply it? Can you post it in a way we could reproduce it?

I apply the Cluster API templates with Helm via a Flux HelmRelease. Source is https://github.com/teutonet/teutonet-helm-charts/tree/main/charts/t8s-cluster. This is a bit "tricky" to reproduce, as CAPO needs credentials which aren't managed by the Cluster API ecosystem. We have our own operator that fills this gap.

First values:

bastion:
  enabled: false
cloud: ffm3-prod
controlPlane:
  flavor: standard.4.1905
metadata:
  customerID: 1111
  customerName: teuto-net
  serviceLevelAgreement: 24x7
nodePools:
  s2-default:
    availabilityZone: Zone2
    flavor: standard.2.1905
    replicas: 4
version:
  major: 1
  minor: 28
  patch: 15

second values:

bastion:
  enabled: false
cloud: ffm3-prod
controlPlane:
  flavor: standard.4.1905
metadata:
  customerID: 1111
  customerName: teuto-net
nodePools:
  s2-default:
    availabilityZone: Zone2
    flavor: standard.2.1905
    replicas: 4
version:
  major: 1
  minor: 29
  patch: 13

But in principle, creating these resources in a cluster with Cluster API installed should result in the same problem.

first cluster.yaml:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: mgmt-prod-cluster
spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: mgmt-prod-control-plane
    namespace: cluster-mgmt-prod
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OpenStackCluster
    name: mgmt-prod-cluster
    namespace: cluster-mgmt-prod
  topology:
    class: mgmt-prod-cluster
    controlPlane:
      metadata:
        labels:
          t8s.teuto.net/cluster: mgmt-prod
          t8s.teuto.net/role: management
      replicas: 3
    variables:
    - name: dnsNameservers
      value:
      - *replace-me
      - *replace-me
    - name: controlPlaneServerGroupID
      value: *replace-me
    - name: machineDeploymentFlavor
      value: compute-plane-placeholder
    version: v1.28.15
    workers:
      machineDeployments:
      - class: compute-plane
        failureDomain: Zone2
        metadata: {}
        name: s2-default
        replicas: 4
        variables:
          overrides:
          - name: machineDeploymentServerGroupID
            value: *replace-me
          - name: machineDeploymentFlavor
            value: standard.2.1905

second cluster.yaml

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: mgmt-prod-cluster
spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: mgmt-prod-control-plane
    namespace: cluster-mgmt-prod
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OpenStackCluster
    name: mgmt-prod-cluster
    namespace: cluster-mgmt-prod
  topology:
    class: mgmt-prod-cluster
    controlPlane:
      metadata:
        labels:
          t8s.teuto.net/cluster: mgmt-prod
          t8s.teuto.net/role: management
      replicas: 3
    variables:
    - name: dnsNameservers
      value:
      - *replace-me
      - *replace-me
    - name: controlPlaneServerGroupID
      value: *replace-me
    - name: machineDeploymentFlavor
      value: compute-plane-placeholder
    version: v1.29.13
    workers:
      machineDeployments:
      - class: compute-plane
        failureDomain: Zone2
        metadata: {}
        name: s2-default
        replicas: 4
        variables:
          overrides:
          - name: machineDeploymentServerGroupID
            value: *replace-me
          - name: machineDeploymentFlavor
            value: standard.2.1905

Does the behaviour change if you use kubectl apply --server-side=true (on create and on update)

The resulting templates are correct; I don't see what a different way of applying them (which would result in the same resource on the API server) would change.

We have a (kubeadmcontrolplane) template:

The first YAML shows a KubeadmControlPlane, not a KubeadmControlPlaneTemplate.

Oh yeah, I fixed the yamls 👍

@sbueringer
Member

Just to ensure I understand this correctly.

So you are able to make the changes you want on the KubeadmControlPlaneTemplate and the problem is that the changes are not applied the same way on the KubeadmControlPlane of a Cluster?

@cwrau
Contributor Author

cwrau commented Jan 31, 2025

Just to ensure I understand this correctly.

So you are able to make the changes you want on the KubeadmControlPlaneTemplate and the problem is that the changes are not applied the same way on the KubeadmControlPlane of a Cluster?

Yes, or rather (because of forbidden changes 🙄) I delete the template, create another one, and change the template ref in the ClusterClass.
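
That rotation roughly corresponds to the following commands (a minimal sketch; the template and ClusterClass names are placeholders):

```sh
# Clone the existing template under a new name, dropping the unwanted args,
# then point the ClusterClass control-plane ref at the new template.
kubectl get kubeadmcontrolplanetemplates.controlplane.cluster.x-k8s.io <old-template> -o yaml \
  | yq '.metadata |= {} | .metadata.name = "<new-template>"
        | del(.spec.template.spec.kubeadmConfigSpec.initConfiguration.nodeRegistration.kubeletExtraArgs)
        | del(.spec.template.spec.kubeadmConfigSpec.joinConfiguration.nodeRegistration.kubeletExtraArgs)' \
  | kubectl apply -f -

kubectl get clusterclass <class-name> -o yaml \
  | yq '.spec.controlPlane.ref.name = "<new-template>"' \
  | kubectl apply -f -
```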

@cwrau
Contributor Author

cwrau commented Feb 5, 2025

Just to ensure I understand this correctly.
So you are able to make the changes you want on the KubeadmControlPlaneTemplate and the problem is that the changes are not applied the same way on the KubeadmControlPlane of a Cluster?

Yes, or rather (because of forbidden changes 🙄) I delete the template, create another one, and change the template ref in the ClusterClass.

I have this problem in general with Cluster API resources. We use OpenStack as our infrastructure provider; the OpenStackClusterTemplate just has .spec.template.spec.bastion.enabled=false, but the existing OpenStackCluster still has .spec.bastion.image.[...] set. CAPI doesn't remove it, and instead I get the following error on the CAPI Cluster:

error reconciling the Cluster topology: failed to create patch helper
for OpenStackCluster/6446a8a8-87de-410d-9c88-bcb4b25fc9b9-x5xpx: server side
apply dry-run failed for modified object: OpenStackCluster.infrastructure.cluster.x-k8s.io
"6446a8a8-87de-410d-9c88-bcb4b25fc9b9-x5xpx" is invalid: [spec.bastion.spec.image:
Invalid value: 0: spec.bastion.spec.image in body should have at least 1 properties,
spec.bastion.spec: Invalid value: "object": at least one of flavor or flavorID
must be set]

If I manually remove these fields on the OpenStackCluster, then CAPI can progress (although it runs into another problem I can't understand right now, but that's either our or CAPO's problem).

@chrischdi chrischdi added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Feb 5, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates an issue lacks a `priority/foo` label and requires one. label Feb 5, 2025
@chrischdi
Member

/triage accepted

/help

We'd like to reproduce this in KCP + CAPD first to get some confirmation on the core CAPI side.

@k8s-ci-robot
Contributor

@chrischdi:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/triage accepted

/help

We'd like to reproduce this in KCP + CAPD first to get some confirmation on the core CAPI side.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 5, 2025
@chrischdi
Member

/assign

@cwrau
Contributor Author

cwrau commented Feb 10, 2025

Is there something I can do to help?

@sbueringer
Member

Find out what exactly is going on :)

Basically we are just (at least trying to) use SSA. So the question is whether something in our implementation or in SSA in the kube-apiserver is broken.
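
One way to see what the topology controller actually sends is to raise the log verbosity of the core controller manager and follow the topology logs (a sketch; it assumes the default clusterctl install with a capi-system/capi-controller-manager deployment whose first container is the manager):

```sh
# Append --v=5 to the manager container args (container index 0 is assumed to be the manager).
kubectl -n capi-system patch deployment capi-controller-manager --type json \
  -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--v=5"}]'

# Follow the topology/cluster controller logs while the template change rolls out.
kubectl -n capi-system logs deployment/capi-controller-manager -f | grep -i "topology/cluster"
```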

@chrischdi
Member

chrischdi commented Feb 10, 2025

I tried to reproduce this with CAPD without any success.

I tried 3 different variants:

  • Change the Cluster to a different ClusterClass without kubelet args
  • Modify the existing ClusterClass to point to a new KubeadmControlPlaneTemplate without kubelet args
  • Deleting and re-creating the existing KubeadmControlPlaneTemplate but without kubelet args

Variant 1: switch to a new ClusterClass which does not have any kubelet args set

  1. Bring up management cluster
./hack/kind-install-for-capd.sh
export CLUSTER_TOPOLOGY=true
clusterctl init --infrastructure docker:v1.8.5 -b kubeadm:v1.8.5 -c kubeadm:v1.8.5 --core cluster-api:v1.8.5
  2. Create CRS for CNI
CLUSTER_NAME=test CNI_RESOURCES="" clusterctl generate yaml --from test/e2e/data/infrastructure-docker/main/bases/crs.yaml | kubectl apply -f -
kubectl create cm cni-test-crs-0 --from-file data=test/e2e/data/cni/kindnet/kindnet.yaml --dry-run=client -o yaml | kubectl apply -f -
  3. Create a ClusterClass without extra args
clusterctl generate yaml --from test/infrastructure/docker/templates/clusterclass-quick-start.yaml | kubectl apply -f -
  4. Create a ClusterClass with extra args
clusterctl generate yaml --from test/infrastructure/docker/templates/clusterclass-quick-start.yaml | sed 's/          nodeRegistration: {}/          nodeRegistration:\n            kubeletExtraArgs:\n              protect-kernel-defaults: "true"/g' | sed 's/quick-start/quick-start-with-args/g' | kubectl apply -f -
  5. Create Cluster with extra-args ClusterClass and add cni label
KUBERNETES_VERSION=v1.32.0 clusterctl generate cluster --worker-machine-count=0 --from test/infrastructure/docker/templates/cluster-template-development.yaml test | sed 's/quick-start/quick-start-with-args/g' | kubectl apply -f -
kubectl label cluster test cni=test-crs-0
  6. Print KubeadmControlPlane to see that args are set:

Shortened yaml:

$ kubectl get kubeadmcontrolplane test-9x9x7 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  ...
  name: test-9x9x7
  namespace: default
  ...
spec:
  kubeadmConfigSpec:
    ...
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
...

Full yaml:

$ kubectl get kubeadmcontrolplane test-9x9x7 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  annotations:
    cluster.x-k8s.io/cloned-from-groupkind: KubeadmControlPlaneTemplate.controlplane.cluster.x-k8s.io
    cluster.x-k8s.io/cloned-from-name: quick-start-with-args-control-plane
  creationTimestamp: "2025-02-10T16:47:46Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  generation: 1
  labels:
    cluster.x-k8s.io/cluster-name: test
    topology.cluster.x-k8s.io/owned: ""
  name: test-9x9x7
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: test
    uid: 0672f11a-3d4c-4acd-a4c3-e0db4c736ead
  resourceVersion: "5633"
  uid: 30eb71db-ff83-4d83-b47a-829dc9ff3e37
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - localhost
        - 127.0.0.1
        - 0.0.0.0
        - host.docker.internal
        extraArgs:
          admission-control-config-file: /etc/kubernetes/kube-apiserver-admission-pss.yaml
        extraVolumes:
        - hostPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          mountPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          name: admission-pss
          pathType: File
          readOnly: true
      controllerManager: {}
      dns: {}
      etcd:
        local: {}
      networking: {}
      scheduler: {}
    files:
    - content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1
            kind: PodSecurityConfiguration
            defaults:
              enforce: "baseline"
              enforce-version: "latest"
              audit: "restricted"
              audit-version: "latest"
              warn: "restricted"
              warn-version: "latest"
            exemptions:
              usernames: []
              runtimeClasses: []
              namespaces: [kube-system]
      path: /etc/kubernetes/kube-apiserver-admission-pss.yaml
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerMachineTemplate
      name: test-pt4rc
      namespace: default
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: test
        topology.cluster.x-k8s.io/owned: ""
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.32.0
status:
  conditions:
  - lastTransitionTime: "2025-02-10T16:48:10Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2025-02-10T16:48:09Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2025-02-10T16:47:51Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2025-02-10T16:48:26Z"
    status: "True"
    type: ControlPlaneComponentsHealthy
  - lastTransitionTime: "2025-02-10T16:50:44Z"
    status: "True"
    type: EtcdClusterHealthy
  - lastTransitionTime: "2025-02-10T16:48:01Z"
    status: "True"
    type: MachinesCreated
  - lastTransitionTime: "2025-02-10T16:48:10Z"
    status: "True"
    type: MachinesReady
  - lastTransitionTime: "2025-02-10T16:48:10Z"
    status: "True"
    type: Resized
  initialized: true
  observedGeneration: 1
  ready: true
  readyReplicas: 1
  replicas: 1
  selector: cluster.x-k8s.io/cluster-name=test,cluster.x-k8s.io/control-plane
  unavailableReplicas: 0
  updatedReplicas: 1
  version: v1.32.0
  7. Switch the Cluster to the ClusterClass without the kubelet args (quick-start)
KUBERNETES_VERSION=v1.32.0 clusterctl generate cluster --worker-machine-count=0 --from test/infrastructure/docker/templates/cluster-template-development.yaml test | kubectl apply -f -
  8. Dump again; kubeletExtraArgs is gone:

Shortened yaml:

$ kubectl get kubeadmcontrolplane test-9x9x7 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  ...
  name: test-9x9x7
  namespace: default
  ...
spec:
  kubeadmConfigSpec:
    ...
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
...

Full yaml:

$ kubectl get kubeadmcontrolplane test-9x9x7 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  annotations:
    cluster.x-k8s.io/cloned-from-groupkind: KubeadmControlPlaneTemplate.controlplane.cluster.x-k8s.io
    cluster.x-k8s.io/cloned-from-name: quick-start-control-plane
  creationTimestamp: "2025-02-10T16:47:46Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  generation: 2
  labels:
    cluster.x-k8s.io/cluster-name: test
    topology.cluster.x-k8s.io/owned: ""
  name: test-9x9x7
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: test
    uid: 0672f11a-3d4c-4acd-a4c3-e0db4c736ead
  resourceVersion: "6349"
  uid: 30eb71db-ff83-4d83-b47a-829dc9ff3e37
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - localhost
        - 127.0.0.1
        - 0.0.0.0
        - host.docker.internal
        extraArgs:
          admission-control-config-file: /etc/kubernetes/kube-apiserver-admission-pss.yaml
        extraVolumes:
        - hostPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          mountPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          name: admission-pss
          pathType: File
          readOnly: true
      controllerManager: {}
      dns: {}
      etcd:
        local: {}
      networking: {}
      scheduler: {}
    files:
    - content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1
            kind: PodSecurityConfiguration
            defaults:
              enforce: "baseline"
              enforce-version: "latest"
              audit: "restricted"
              audit-version: "latest"
              warn: "restricted"
              warn-version: "latest"
            exemptions:
              usernames: []
              runtimeClasses: []
              namespaces: [kube-system]
      path: /etc/kubernetes/kube-apiserver-admission-pss.yaml
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerMachineTemplate
      name: test-pt4rc
      namespace: default
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: test
        topology.cluster.x-k8s.io/owned: ""
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.32.0
status:
  conditions:
  - lastTransitionTime: "2025-02-10T16:53:26Z"
    message: Rolling 1 replicas with outdated spec (1 replicas up to date)
    reason: RollingUpdateInProgress
    severity: Warning
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-02-10T16:48:09Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2025-02-10T16:47:51Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2025-02-10T16:53:56Z"
    message: 'Following machines are reporting control plane info: test-9x9x7-rt2ln'
    reason: ControlPlaneComponentsUnhealthy
    severity: Info
    status: "False"
    type: ControlPlaneComponentsHealthy
  - lastTransitionTime: "2025-02-10T16:53:56Z"
    message: 'Following machines are reporting etcd member info: test-9x9x7-rt2ln'
    reason: EtcdClusterUnhealthy
    severity: Info
    status: "False"
    type: EtcdClusterHealthy
  - lastTransitionTime: "2025-02-10T16:48:01Z"
    status: "True"
    type: MachinesCreated
  - lastTransitionTime: "2025-02-10T16:53:56Z"
    message: 1 of 2 completed
    reason: Draining @ Machine/test-9x9x7-rt2ln
    severity: Info
    status: "False"
    type: MachinesReady
  - lastTransitionTime: "2025-02-10T16:53:26Z"
    message: Rolling 1 replicas with outdated spec (1 replicas up to date)
    reason: RollingUpdateInProgress
    severity: Warning
    status: "False"
    type: MachinesSpecUpToDate
  - lastTransitionTime: "2025-02-10T16:53:26Z"
    message: Scaling down control plane to 1 replicas (actual 2)
    reason: ScalingDown
    severity: Warning
    status: "False"
    type: Resized
  initialized: true
  observedGeneration: 2
  ready: true
  readyReplicas: 2
  replicas: 2
  selector: cluster.x-k8s.io/cluster-name=test,cluster.x-k8s.io/control-plane
  unavailableReplicas: 0
  updatedReplicas: 1
  version: v1.32.0

Variant 2: Modify the existing ClusterClass

  1. Bring up management cluster
./hack/kind-install-for-capd.sh
export CLUSTER_TOPOLOGY=true
clusterctl init --infrastructure docker:v1.8.5 -b kubeadm:v1.8.5 -c kubeadm:v1.8.5 --core cluster-api:v1.8.5
  2. Create CRS for CNI
CLUSTER_NAME=test CNI_RESOURCES="" clusterctl generate yaml --from test/e2e/data/infrastructure-docker/main/bases/crs.yaml | kubectl apply -f -
kubectl create cm cni-test-crs-0 --from-file data=test/e2e/data/cni/kindnet/kindnet.yaml --dry-run=client -o yaml | kubectl apply -f -
  3. Create a ClusterClass with extra args
clusterctl generate yaml --from test/infrastructure/docker/templates/clusterclass-quick-start.yaml | sed 's/          nodeRegistration: {}/          nodeRegistration:\n            kubeletExtraArgs:\n              protect-kernel-defaults: "true"/g' | kubectl apply -f -
  4. Create Cluster with extra-args ClusterClass and add cni label
KUBERNETES_VERSION=v1.32.0 clusterctl generate cluster --worker-machine-count=0 --from test/infrastructure/docker/templates/cluster-template-development.yaml test | kubectl apply -f -
kubectl label cluster test cni=test-crs-0
  5. Print KubeadmControlPlane to see that args are set:

Shortened yaml:

$ kubectl get kubeadmcontrolplane test-tbgj5 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  ...
  name: test-tbgj5
  namespace: default
  ...
spec:
  kubeadmConfigSpec:
    ...
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
...

Full yaml

$ kubectl get kubeadmcontrolplane test-tbgj5 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  annotations:
    cluster.x-k8s.io/cloned-from-groupkind: KubeadmControlPlaneTemplate.controlplane.cluster.x-k8s.io
    cluster.x-k8s.io/cloned-from-name: quick-start-control-plane
  creationTimestamp: "2025-02-10T16:57:58Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  generation: 1
  labels:
    cluster.x-k8s.io/cluster-name: test
    topology.cluster.x-k8s.io/owned: ""
  name: test-tbgj5
  namespace: default
  ownerReferences:
  - apiVersion: v1
    kind: Secret
    name: test-shim
    uid: 203c8ff9-88c6-4cdb-9902-93675075ee0f
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: test
    uid: 7c065d29-6f29-4c61-9f51-7da11fe291e6
  resourceVersion: "7409"
  uid: 8a877b34-36ab-47d4-b997-5598728e9c3c
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - localhost
        - 127.0.0.1
        - 0.0.0.0
        - host.docker.internal
        extraArgs:
          admission-control-config-file: /etc/kubernetes/kube-apiserver-admission-pss.yaml
        extraVolumes:
        - hostPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          mountPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          name: admission-pss
          pathType: File
          readOnly: true
      controllerManager: {}
      dns: {}
      etcd:
        local: {}
      networking: {}
      scheduler: {}
    files:
    - content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1
            kind: PodSecurityConfiguration
            defaults:
              enforce: "baseline"
              enforce-version: "latest"
              audit: "restricted"
              audit-version: "latest"
              warn: "restricted"
              warn-version: "latest"
            exemptions:
              usernames: []
              runtimeClasses: []
              namespaces: [kube-system]
      path: /etc/kubernetes/kube-apiserver-admission-pss.yaml
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
        kubeletExtraArgs:
          protect-kernel-defaults: "true"
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerMachineTemplate
      name: test-47krr
      namespace: default
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: test
        topology.cluster.x-k8s.io/owned: ""
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.32.0
status:
  conditions:
  - lastTransitionTime: "2025-02-10T16:57:59Z"
    message: Scaling up control plane to 1 replicas (actual 0)
    reason: ScalingUp
    severity: Warning
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-02-10T16:58:00Z"
    reason: WaitingForKubeadmInit
    severity: Info
    status: "False"
    type: Available
  - lastTransitionTime: "2025-02-10T16:57:59Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2025-02-10T16:57:59Z"
    message: Scaling up control plane to 1 replicas (actual 0)
    reason: ScalingUp
    severity: Warning
    status: "False"
    type: Resized
  observedGeneration: 1
  selector: cluster.x-k8s.io/cluster-name=test,cluster.x-k8s.io/control-plane
  6. Create new KubeadmControlPlaneTemplate without args

Diff:

kubectl get kubeadmcontrolplanetemplates.controlplane.cluster.x-k8s.io quick-start-control-plane -o yaml | yq '.metadata |= {} | .metadata.name = "quick-start-control-plane-without-args" | del(.spec.template.spec.kubeadmConfigSpec.initConfiguration.nodeRegistration.kubeletExtraArgs,.spec.template.spec.kubeadmConfigSpec.joinConfiguration.nodeRegistration.kubeletExtraArgs)' | kubectl apply -f -
  7. Adjust ClusterClass to point to the new KubeadmControlPlaneTemplate
kubectl get clusterclass quick-start -o yaml | yq '.spec.controlPlane.ref.name = "quick-start-control-plane-without-args"' | kubectl apply -f -

Shortened yaml:

$ kubectl get kubeadmcontrolplane test-tbgj5 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  ...
  name: test-tbgj5
  namespace: default
  ...
spec:
  kubeadmConfigSpec:
    ...
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
...

Full yaml

$ kubectl get kubeadmcontrolplane test-tbgj5 -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  annotations:
    cluster.x-k8s.io/cloned-from-groupkind: KubeadmControlPlaneTemplate.controlplane.cluster.x-k8s.io
    cluster.x-k8s.io/cloned-from-name: quick-start-control-plane-without-args
  creationTimestamp: "2025-02-10T16:57:58Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  generation: 2
  labels:
    cluster.x-k8s.io/cluster-name: test
    topology.cluster.x-k8s.io/owned: ""
  name: test-tbgj5
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: test
    uid: 7c065d29-6f29-4c61-9f51-7da11fe291e6
  resourceVersion: "9395"
  uid: 8a877b34-36ab-47d4-b997-5598728e9c3c
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - localhost
        - 127.0.0.1
        - 0.0.0.0
        - host.docker.internal
        extraArgs:
          admission-control-config-file: /etc/kubernetes/kube-apiserver-admission-pss.yaml
        extraVolumes:
        - hostPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          mountPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          name: admission-pss
          pathType: File
          readOnly: true
      controllerManager: {}
      dns: {}
      etcd:
        local: {}
      networking: {}
      scheduler: {}
    files:
    - content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1
            kind: PodSecurityConfiguration
            defaults:
              enforce: "baseline"
              enforce-version: "latest"
              audit: "restricted"
              audit-version: "latest"
              warn: "restricted"
              warn-version: "latest"
            exemptions:
              usernames: []
              runtimeClasses: []
              namespaces: [kube-system]
      path: /etc/kubernetes/kube-apiserver-admission-pss.yaml
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerMachineTemplate
      name: test-47krr
      namespace: default
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: test
        topology.cluster.x-k8s.io/owned: ""
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.32.0
status:
  conditions:
  - lastTransitionTime: "2025-02-10T17:06:45Z"
    message: Rolling 1 replicas with outdated spec (1 replicas up to date)
    reason: RollingUpdateInProgress
    severity: Warning
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-02-10T16:58:18Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2025-02-10T16:57:59Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2025-02-10T17:07:25Z"
    message: 'Following machines are reporting control plane info: test-tbgj5-b49fv'
    reason: ControlPlaneComponentsUnhealthy
    severity: Info
    status: "False"
    type: ControlPlaneComponentsHealthy
  - lastTransitionTime: "2025-02-10T17:07:25Z"
    message: 'Following machines are reporting etcd member info: test-tbgj5-b49fv'
    reason: EtcdClusterUnhealthy
    severity: Info
    status: "False"
    type: EtcdClusterHealthy
  - lastTransitionTime: "2025-02-10T16:58:10Z"
    status: "True"
    type: MachinesCreated
  - lastTransitionTime: "2025-02-10T17:07:41Z"
    status: "True"
    type: MachinesReady
  - lastTransitionTime: "2025-02-10T17:06:45Z"
    message: Rolling 1 replicas with outdated spec (1 replicas up to date)
    reason: RollingUpdateInProgress
    severity: Warning
    status: "False"
    type: MachinesSpecUpToDate
  - lastTransitionTime: "2025-02-10T17:06:45Z"
    message: Scaling down control plane to 1 replicas (actual 2)
    reason: ScalingDown
    severity: Warning
    status: "False"
    type: Resized
  initialized: true
  observedGeneration: 2
  ready: true
  readyReplicas: 2
  replicas: 2
  selector: cluster.x-k8s.io/cluster-name=test,cluster.x-k8s.io/control-plane
  unavailableReplicas: 0
  updatedReplicas: 1
  version: v1.32.0

Variant 3: Delete and re-create the existing KubeadmControlPlaneTemplate without kubelet args

Follow steps 1-5 from above, then:

  6. Modify ClusterClass by replacing KubeadmControlPlaneTemplate
kubectl get kubeadmcontrolplanetemplates.controlplane.cluster.x-k8s.io quick-start-control-plane -o yaml | yq '.metadata |= {} | .metadata.name = "quick-start-control-plane" | del(.spec.template.spec.kubeadmConfigSpec.initConfiguration.nodeRegistration.kubeletExtraArgs,.spec.template.spec.kubeadmConfigSpec.joinConfiguration.nodeRegistration.kubeletExtraArgs)' > new-kcp-template.yaml
kubectl delete kubeadmcontrolplanetemplates.controlplane.cluster.x-k8s.io quick-start-control-plane
kubectl apply -f new-kcp-template.yaml
  7. Check that args have been removed.

Shortened yaml:

$ kubectl get kubeadmcontrolplane test-pkqjl -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  ...
  name: test-pkqjl
  namespace: default
  ...
spec:
  kubeadmConfigSpec:
    ... 
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
...
Full yaml:

$ kubectl get kubeadmcontrolplane test-pkqjl -o yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  annotations:
    cluster.x-k8s.io/cloned-from-groupkind: KubeadmControlPlaneTemplate.controlplane.cluster.x-k8s.io
    cluster.x-k8s.io/cloned-from-name: quick-start-control-plane
  creationTimestamp: "2025-02-10T17:18:33Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  generation: 2
  labels:
    cluster.x-k8s.io/cluster-name: test
    topology.cluster.x-k8s.io/owned: ""
  name: test-pkqjl
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: test
    uid: b7ac78f9-3b79-4b05-a0ad-52fa3281ddf9
  resourceVersion: "12584"
  uid: 4e20cacc-4d23-4081-81b8-ce9e6ba1c3d7
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - localhost
        - 127.0.0.1
        - 0.0.0.0
        - host.docker.internal
        extraArgs:
          admission-control-config-file: /etc/kubernetes/kube-apiserver-admission-pss.yaml
        extraVolumes:
        - hostPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          mountPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          name: admission-pss
          pathType: File
          readOnly: true
      controllerManager: {}
      dns: {}
      etcd:
        local: {}
      networking: {}
      scheduler: {}
    files:
    - content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1
            kind: PodSecurityConfiguration
            defaults:
              enforce: "baseline"
              enforce-version: "latest"
              audit: "restricted"
              audit-version: "latest"
              warn: "restricted"
              warn-version: "latest"
            exemptions:
              usernames: []
              runtimeClasses: []
              namespaces: [kube-system]
      path: /etc/kubernetes/kube-apiserver-admission-pss.yaml
    format: cloud-config
    initConfiguration:
      localAPIEndpoint: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        imagePullPolicy: IfNotPresent
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerMachineTemplate
      name: test-s2r94
      namespace: default
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: test
        topology.cluster.x-k8s.io/owned: ""
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.32.0
status:
  conditions:
  - lastTransitionTime: "2025-02-10T17:19:39Z"
    message: Rolling 1 replicas with outdated spec (1 replicas up to date)
    reason: RollingUpdateInProgress
    severity: Warning
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-02-10T17:18:52Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2025-02-10T17:18:34Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2025-02-10T17:20:01Z"
    message: 'Following machines are reporting control plane info: test-pkqjl-tjd5g'
    reason: ControlPlaneComponentsUnhealthy
    severity: Info
    status: "False"
    type: ControlPlaneComponentsHealthy
  - lastTransitionTime: "2025-02-10T17:20:01Z"
    status: "True"
    type: EtcdClusterHealthy
  - lastTransitionTime: "2025-02-10T17:18:44Z"
    status: "True"
    type: MachinesCreated
  - lastTransitionTime: "2025-02-10T17:19:56Z"
    status: "True"
    type: MachinesReady
  - lastTransitionTime: "2025-02-10T17:19:39Z"
    message: Rolling 1 replicas with outdated spec (1 replicas up to date)
    reason: RollingUpdateInProgress
    severity: Warning
    status: "False"
    type: MachinesSpecUpToDate
  - lastTransitionTime: "2025-02-10T17:19:39Z"
    message: Scaling down control plane to 1 replicas (actual 2)
    reason: ScalingDown
    severity: Warning
    status: "False"
    type: Resized
  initialized: true
  observedGeneration: 2
  ready: true
  readyReplicas: 1
  replicas: 2
  selector: cluster.x-k8s.io/cluster-name=test,cluster.x-k8s.io/control-plane
  unavailableReplicas: 1
  updatedReplicas: 1
  version: v1.32.0

@cwrau it would be awesome if you could take a look at what is different in your setup.

@chrischdi chrischdi added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Feb 10, 2025
@cwrau
Contributor Author

cwrau commented Feb 11, 2025

At first I had this OpenStackClusterTemplate:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackClusterTemplate
metadata:
  annotations:
    meta.helm.sh/release-name: 11773-test
    meta.helm.sh/release-namespace: test
  creationTimestamp: "2025-02-11T10:31:32Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: 11773-test
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: t8s-cluster
    helm.sh/chart: t8s-cluster-8.0.0
  name: 11773-test-468a98de
  namespace: test
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    kind: ClusterClass
    name: 11773-test
    uid: c61d0451-4c1e-45f6-9f8d-652cd0220aee
  resourceVersion: "305100636"
  uid: 2db835e0-b19d-456f-b083-6f522b3c2eb8
spec:
  template:
    spec:
      apiServerLoadBalancer:
        enabled: true
      bastion:
        enabled: false
        spec:
          flavor: standard.1.1905
          identityRef:
            cloudName: openstack
            name: 11773-test-cloud-config
          image:
            filter:
              name: Ubuntu 20.04
      identityRef:
        cloudName: openstack
        name: 11773-test-cloud-config
      managedSecurityGroups:
        allNodesSecurityGroupRules:
        - description: Created by cluster-api-provider-openstack API conversion -
            BGP (calico)
          direction: ingress
          etherType: IPv4
          name: BGP (calico)
          portRangeMax: 179
          portRangeMin: 179
          protocol: tcp
          remoteManagedGroups:
          - controlplane
          - worker
        - description: Created by cluster-api-provider-openstack API conversion -
            IP-in-IP (calico)
          direction: ingress
          etherType: IPv4
          name: IP-in-IP (calico)
          protocol: "4"
          remoteManagedGroups:
          - controlplane
          - worker
        allowAllInClusterTraffic: false
      managedSubnets:
      - cidr: 10.6.0.0/24

Which resulted in the following OpenStackCluster:


(note the following block under bastion:

enabled: false
spec:
  flavor: standard.1.1905
  identityRef:
    cloudName: openstack
    name: 11773-test-cloud-config
  image:
    filter:
      name: Ubuntu 20.04

)

Then I swapped that template out for the following one and changed the ref in the ClusterClass:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackClusterTemplate
metadata:
  annotations:
    meta.helm.sh/release-name: 11773-test
    meta.helm.sh/release-namespace: test
  creationTimestamp: "2025-02-11T10:58:26Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: 11773-test
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: t8s-cluster
    helm.sh/chart: t8s-cluster-9.0.4
  name: 11773-test-db494531
  namespace: test
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    kind: ClusterClass
    name: 11773-test
    uid: c61d0451-4c1e-45f6-9f8d-652cd0220aee
  resourceVersion: "305121291"
  uid: 42034f06-6f33-4a49-9aea-99a7e6af3b71
spec:
  template:
    spec:
      apiServerLoadBalancer:
        enabled: true
      bastion:
        enabled: false
      disableAPIServerFloatingIP: false
      disablePortSecurity: false
      identityRef:
        cloudName: openstack
        name: 11773-test-cloud-config
      managedSecurityGroups:
        allNodesSecurityGroupRules:
        - direction: ingress
          etherType: IPv4
          name: cilium VXLAN
          portRangeMax: 8472
          portRangeMin: 8472
          protocol: udp
          remoteManagedGroups:
          - worker
          - controlplane
        - direction: ingress
          etherType: IPv4
          name: cilium health (http)
          portRangeMax: 4240
          portRangeMin: 4240
          protocol: tcp
          remoteManagedGroups:
          - worker
          - controlplane
        - direction: ingress
          etherType: IPv4
          name: cilium health (ping)
          protocol: icmp
          remoteManagedGroups:
          - worker
          - controlplane
        allowAllInClusterTraffic: false
      managedSubnets:
      - cidr: 10.6.0.0/24

(note the now different bastion section:

enabled: false

)

But instead of this working, I get the following error in the CAPI logs:

E0211 10:59:39.554253       1 controller.go:324] "Reconciler error" err="error reconciling the Cluster topology: failed to create patch helper for OpenStackCluster/11773-test-8gvqs: server side apply dry-run failed for modified object: OpenStackCluster.infrastructure.cluster.x-k8s.io \"11773-test-8gvqs\" is invalid: [spec.bastion.spec.image: Invalid value: 0: spec.bastion.spec.image in body should have at least 1 properties, spec.bastion.spec: Invalid value: \"object\": at least one of flavor or flavorID must be set]" controller="topology/cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="test/11773-test" namespace="test" name="11773-test" reconcileID="f6fb2e26-48f3-4c7d-86a1-4da6b7e50442"

I then manually ran kubectl -n test patch osc 11773-test-8gvqs --type json --patch "$(jq -n '[{op:"remove",path:"/spec/bastion/spec"}]')" to remove the field.

@sbueringer
Member

Above you mentioned that you had this problem with KubeadmControlPlaneTemplate. Can you give us something with which we can reproduce it using a KubeadmControlPlaneTemplate? It's just so much easier for us with CAPD vs. CAPO.

@cwrau
Contributor Author

cwrau commented Feb 11, 2025

Above you mentioned that you had this problem with KubeadmControlPlaneTemplate. Can you give us something with which we can reproduce it using a KubeadmControlPlaneTemplate? It's just so much easier for us with CAPD vs. CAPO.

While I was testing this with the KubeadmControlPlaneTemplate it "sadly" just worked; should I update the issue and switch it from the KubeadmControlPlaneTemplate to the OpenStackClusterTemplate?


Ah, I think I understand: the error is not that the KubeadmControlPlane isn't getting overridden correctly, but that the OpenStackCluster can't be, which results in CAPI erroring and not doing anything with the KubeadmControlPlane. If I apply the aforementioned kubectl patch to remove the invalid config, the KubeadmControlPlane gets updated successfully.
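
In that situation the blocking error is also surfaced on the Cluster itself, so checking the TopologyReconciled condition is a quick way to see which object is holding up the topology reconcile (a sketch; the cluster name is a placeholder):

```sh
# The failed server-side-apply dry-run shows up in the condition message.
kubectl get cluster <cluster-name> -o yaml \
  | yq '.status.conditions[] | select(.type == "TopologyReconciled")'
```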
