Kubelet TLS Handshake Failures After Certificate Rotation #16850
Comments

(The k8s-triage-robot applied its standard triage boilerplate: /lifecycle stale, then /lifecycle rotten, and finally /close not-planned, each noting that the Kubernetes project lacks enough active contributors to adequately respond to all issues and directing feedback to sig-contributor-experience at kubernetes/community.)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
What happened?
We deploy to several kops clusters via pipelines. Since kops 1.23, some pipelines fail with the error below, so we implemented a temporary retry mechanism that retries the failed request (a Go sketch of the detection logic is at the end of this issue). We are currently on kops 1.29 and the issue still persists. It is not causing any outage, but I would like to remove our workaround and fix the underlying problem. I also checked the PRs for kops 1.23 but found nothing that looked related; on kops 1.22 we never encountered this error:
/usr/bin/helm Error: unable to get pod logs for <APPLICATION>: Get "https://<WORKER NODE>:10250/containerLogs/default/<APPLICATION>/test-service": write tcp <CONTROL-PLANE NODE>:44194-><WORKER NODE>:10250: use of closed network connection
At the exact same time, the api-server logs:
kube-apiserver I0920 16:30:39.542320 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
kube-apiserver E0920 16:30:39.543643 11 status.go:71] apiserver received an error that is not an metav1.Status: &url.Error{Op:"Get", URL:"https://<WORKER NODE>:10250/containerLogs/default/application/filebeat?sinceSeconds=300", Err:(*net.OpError)(0xc071c91090)}: Get "https://<WORKER NODE>:10250/containerLogs/default/application/filebeat?sinceSeconds=300": write tcp <CONTROL-PLANE NODE>:44194-><WORKER NODE>:10250: use of closed network connection
Every time this error happens, the same log line appears in the kubelet:
kubelet[5111]: I0920 16:30:39.542666 5111 log.go:245] http: TLS handshake error from <CONTROL-PLANE NODE>:44194: EOF
I checked the validity of the certificates; they are all valid.
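For reference, this is roughly how I spot-checked the kubelet serving certificate (a minimal Go sketch; the node address is a placeholder — an equivalent openssl s_client check against port 10250 shows the same thing):

```go
package main

import (
	"crypto/tls"
	"fmt"
	"log"
)

func main() {
	// Placeholder address: the kubelet serves TLS on port 10250.
	addr := "worker-node.example.internal:10250"

	// InsecureSkipVerify is acceptable here because we only want to
	// inspect the presented serving certificate, not trust the connection.
	conn, err := tls.Dial("tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		log.Fatalf("TLS dial failed: %v", err)
	}
	defer conn.Close()

	for _, cert := range conn.ConnectionState().PeerCertificates {
		fmt.Printf("subject=%s notBefore=%s notAfter=%s\n",
			cert.Subject, cert.NotBefore, cert.NotAfter)
	}
}
```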
apiserver logs:
I0920 16:10:03.110741 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 16:20:03.515003 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 16:30:39.542320 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 16:43:43.688572 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 16:53:43.688628 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 17:03:43.689273 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 17:14:04.170499 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 17:28:43.689063 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 17:38:43.688520 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
I0920 17:48:43.688942 11 cert_rotation.go:88] certificate rotation detected, shutting down client connections to start using new credentials
Is this normal behaviour? Certificate rotation roughly every 10 minutes?
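From skimming client-go's transport/cert_rotation.go (where that log line comes from), my understanding is that the client tracks every connection it dials and, as soon as the certificate content changes, closes all of them so the next request handshakes with the new credentials — and anything in flight on those connections fails. Below is a simplified sketch of that pattern as I understand it, not the real implementation; the file path and poll interval are made up:

```go
package main

import (
	"bytes"
	"log"
	"net"
	"os"
	"sync"
	"time"
)

// connTracker remembers every connection the client dials so that all
// of them can be closed when the client certificate changes.
type connTracker struct {
	mu    sync.Mutex
	conns map[net.Conn]struct{}
}

func (t *connTracker) Dial(network, addr string) (net.Conn, error) {
	c, err := net.Dial(network, addr)
	if err != nil {
		return nil, err
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.conns == nil {
		t.conns = map[net.Conn]struct{}{}
	}
	t.conns[c] = struct{}{}
	return c, nil
}

// closeAll abruptly closes every tracked connection; a request in
// flight on one of them fails with "use of closed network connection".
func (t *connTracker) closeAll() {
	t.mu.Lock()
	defer t.mu.Unlock()
	for c := range t.conns {
		c.Close()
	}
	t.conns = map[net.Conn]struct{}{}
}

func main() {
	certFile := "/path/to/client.crt" // made-up path for the sketch
	tracker := &connTracker{}

	var last []byte
	for range time.Tick(time.Minute) { // made-up poll interval
		cur, err := os.ReadFile(certFile)
		if err != nil {
			continue
		}
		if last != nil && !bytes.Equal(last, cur) {
			log.Print("certificate rotation detected, shutting down client connections to start using new credentials")
			tracker.closeAll()
		}
		last = cur
	}
}
```

If that reading is right, a rotation log every ~10 minutes would mean the underlying credential source is being refreshed that often, which still seems surprisingly frequent to me.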
What cloud provider are you using?
AWS
What did you expect to happen?
I expected that certificate rotation would not cause intermittent network issues.
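To make the expectation concrete: I would have expected rotation to stop reusing old connections without killing requests already in flight, along the lines of the standard-library sketch below. This is an illustration of the behaviour I expected, using a hypothetical onCertRotated hook — not actual kops or Kubernetes code — and I understand the hard close may be deliberate so the old credentials stop being used immediately.

```go
package main

import (
	"log"
	"net/http"
)

func main() {
	// A plain transport stands in for the apiserver's kubelet client.
	transport := &http.Transport{}
	client := &http.Client{Transport: transport}
	_ = client

	// Hypothetical rotation hook: dropping only *idle* connections lets
	// in-flight requests finish on the old credentials instead of dying
	// with "use of closed network connection"; new requests dial fresh.
	onCertRotated := func() {
		log.Print("certificate rotated; closing idle connections only")
		transport.CloseIdleConnections()
	}
	onCertRotated()
}
```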
Kubelet config
kubelet:
  containerLogMaxSize: "20Mi"
  containerLogMaxFiles: 5
  anonymousAuth: false
  authenticationTokenWebhook: true
  authorizationMode: Webhook
  readOnlyPort: 0
  protectKernelDefaults: true
  streamingConnectionIdleTimeout: "30m"
  eventQps: "0"
  featureGates:
    RotateKubeletServerCertificate: "true"
    HPAContainerMetrics: "true"
  kubeReserved:
    cpu: "100m"
    memory: "100Mi"
  kubeReservedCgroup: "/kube-reserved"
  systemReserved:
    cpu: "100m"
    memory: "100Mi"
  systemReservedCgroup: "/system-reserved"
  tlsCipherSuites:
    - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
    - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
    - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
    - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    - TLS_RSA_WITH_AES_256_GCM_SHA384
    - TLS_RSA_WITH_AES_128_GCM_SHA256
Possible relation
Is there a chance that this issue is related to golang/go#50984?
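For completeness, our temporary workaround is just a retry loop in the pipeline around the failing helm call; the sketch below shows the equivalent detection logic in Go (the function name, attempt count, and backoff are illustrative, not our production code). It checks both the typed net.ErrClosed sentinel and the raw error string, since errors wrapped by HTTP round trips do not always unwrap to the sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
	"time"
)

// retryOnClosedConn retries fn only when it fails with the
// closed-connection error we see during apiserver cert rotation.
func retryOnClosedConn(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		// net.ErrClosed is the typed "use of closed network connection";
		// wrapped HTTP errors often only expose it as a string.
		if !errors.Is(err, net.ErrClosed) &&
			!strings.Contains(err.Error(), "use of closed network connection") {
			return err // a different failure: don't mask it with retries
		}
		time.Sleep(time.Duration(i+1) * time.Second) // simple linear backoff
	}
	return fmt.Errorf("still failing after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retryOnClosedConn(3, func() error {
		calls++
		if calls < 2 {
			// Simulate the error from the issue on the first attempt.
			return fmt.Errorf("write tcp 10.0.0.1:44194->10.0.0.2:10250: %w", net.ErrClosed)
		}
		return nil // e.g. the pod-logs request succeeded on retry
	})
	fmt.Println("calls:", calls, "err:", err)
}
```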