OCPBUGS-43777: certrotationcontroller: run tests which runs deployment and creates projects #1759

Open
wants to merge 1 commit into
base: master

Conversation

vrutkovs
Member

@vrutkovs vrutkovs commented Oct 24, 2024

The tests we run after cert rotation should ensure that a pod gets created from a deployment and scheduled on a node, and that openshift-apiserver can create projects, to validate that all component certificates have been regenerated. The test names are included in the
certificates.openshift.io/auto-regenerate-after-offline-expiry annotation.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Oct 24, 2024
@openshift-ci-robot

@vrutkovs: This pull request references Jira Issue OCPBUGS-43777, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.18.0) matches configured target version for branch (4.18.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @wangke19

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Tests we run after cert rotation should ensure that pod gets created from deployment, scheduled on the node and openshift-apiserver can use imagestreams to validate that all component certificates have been regenerated. The test names are included in
certificates.openshift.io/auto-regenerate-after-offline-expiry annotation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from wangke19 and sanchezl October 24, 2024 07:59
Contributor

openshift-ci bot commented Oct 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vrutkovs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 24, 2024
@vrutkovs vrutkovs force-pushed the cert-rotation-additional-tests branch from 896b6ba to 1eb6081 Compare October 24, 2024 11:45
@vrutkovs vrutkovs changed the title OCPBUGS-43777: certrotationcontroller: run tests which runs deployment and use imagestreams OCPBUGS-43777: certrotationcontroller: run tests which runs deployment and creates projects Oct 24, 2024
@vrutkovs
Member Author

/retest

@@ -324,7 +324,7 @@ func newCertRotationController(
Name: "service-network-serving-certkey",
AdditionalAnnotations: certrotation.AdditionalAnnotations{
JiraComponent: "kube-apiserver",
- AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'operator conditions kube-apiserver'",
+ AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'[sig-apps] Deployment RollingUpdateDeployment should delete old pods and create new ones [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]'",
Contributor

@deads2k deads2k Oct 25, 2024

I don't understand how this one tests the service network.

  1. test uses external LB to create deployment
  2. kcm uses internal LB to create replicaset and pod
  3. scheduler uses internal LB to schedule pod
  4. kubelet uses internal LB to get pods and report available status
  5. test uses external LB to confirm available replicas -- and FYI, this is the bit that requires the scheduler. Interestingly, the deployment logic itself does not.

which hop uses the service network?
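
The question can be made concrete by writing the five hops down as data and filtering on the endpoint each one uses. This is only a sketch of the reasoning in the comment above; the type and function names (Endpoint, Hop, rolloutHops, hopsUsing) are hypothetical and do not exist in the operator or in the e2e suite.

```go
package main

import "fmt"

// Endpoint names the path a client takes to reach kube-apiserver.
type Endpoint string

const (
	ExternalLB     Endpoint = "external LB"
	InternalLB     Endpoint = "internal LB"
	ServiceNetwork Endpoint = "service network"
)

// Hop is one request leg of the Deployment rollout test (hypothetical model).
type Hop struct {
	Client   string
	Endpoint Endpoint
	Action   string
}

// rolloutHops mirrors the five steps listed in the review comment.
var rolloutHops = []Hop{
	{"test", ExternalLB, "create deployment"},
	{"kcm", InternalLB, "create replicaset and pod"},
	{"scheduler", InternalLB, "schedule pod"},
	{"kubelet", InternalLB, "get pods and report available status"},
	{"test", ExternalLB, "confirm available replicas"},
}

// hopsUsing returns the hops that reach kube-apiserver through endpoint e.
func hopsUsing(hops []Hop, e Endpoint) []Hop {
	var out []Hop
	for _, h := range hops {
		if h.Endpoint == e {
			out = append(out, h)
		}
	}
	return out
}

func main() {
	n := len(hopsUsing(rolloutHops, ServiceNetwork))
	fmt.Printf("hops over the service network: %d\n", n)
}
```

With the hops as listed, the filter returns nothing for the service network, which is the reviewer's point: a test only exercises service-network-serving-certkey if at least one of its hops goes through that endpoint.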

Contributor

@vrutkovs, if you could provide an explanation, that would be very helpful.

I also think that adding a comment with an explanation for each certificate/test pair would be greatly beneficial for future us and for easier reviewing.

Member Author

Ah, sorry, previously this was using

[sig-apps] Deployment RollingUpdateDeployment should delete old pods and create new ones [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]

but a deployment rollout doesn't use the service network, so instead it's

[sig-network] Services should serve a basic endpoint from pods [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]

now.

I'm not sure it's worth commenting the tests (we should probably have an easy-to-understand alias), but there will be an update for certificate descriptions in #1763. Would that be sufficient? Or do you want me to add comments in the code on what these tests do (and not touch the annotations)?

@@ -646,7 +646,7 @@ func newCertRotationController(
Name: "control-plane-node-admin-client-cert-key",
AdditionalAnnotations: certrotation.AdditionalAnnotations{
JiraComponent: "kube-apiserver",
- AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'operator conditions kube-apiserver'",
+ AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'[sig-apps] Deployment RollingUpdateDeployment should delete old pods and create new ones [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]'",
Contributor

I don't think this test uses the kubeconfig that this certificate is embedded into. This certificate is for being able to use oc if you ssh to the node.

@@ -698,7 +698,7 @@ func newCertRotationController(
Name: "check-endpoints-client-cert-key",
AdditionalAnnotations: certrotation.AdditionalAnnotations{
JiraComponent: "kube-apiserver",
- AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'operator conditions kube-apiserver'",
+ AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'[sig-apps] Deployment RollingUpdateDeployment should delete old pods and create new ones [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]'",
Contributor

I don't think that this test confirms that the check endpoints controller (which may or may not still exist) can function.

@@ -723,7 +723,7 @@ func newCertRotationController(
Name: "node-system-admin-signer",
AdditionalAnnotations: certrotation.AdditionalAnnotations{
JiraComponent: "kube-apiserver",
- AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'operator conditions kube-apiserver'",
+ AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'[sig-apps] Deployment RollingUpdateDeployment should delete old pods and create new ones [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]'",
Contributor

same as client-cert comment above.

@vrutkovs vrutkovs force-pushed the cert-rotation-additional-tests branch from 1eb6081 to 12ec413 Compare October 25, 2024 15:16
@vrutkovs vrutkovs force-pushed the cert-rotation-additional-tests branch from 12ec413 to d00bec9 Compare November 28, 2024 10:35
@p0lyn0mial
Contributor

I can tag this PR to get it merged, as it simply updates some descriptions, but I'm not sure if that is the intended goal.

I think the goal is to ensure we actually have a list of tests that can validate whether certificates have been regenerated properly. To do that, I would have to carefully check each test listed here.

If you want me to confirm the tests, I would suggest changing just one item. That would make it much easier for me to check and make some progress.

@vrutkovs
Member Author

The plan here is:

Contributor

@p0lyn0mial p0lyn0mial left a comment

okay, let's start with the aggregation layer.

Could you remind me of the purpose of the certificates and explain how the [sig-cli] oc adm new-project [apigroup:project.openshift.io][apigroup:authorization.openshift.io] [Suite:openshift/conformance/parallel] test is going to validate each certificate?

@@ -135,7 +135,7 @@ func newCertRotationController(
Name: "aggregator-client-signer",
Contributor

is this the signer used to sign --proxy-client-cert ?

Member Author

Yes

@@ -150,7 +150,7 @@ func newCertRotationController(
Name: "kube-apiserver-aggregator-client-ca",
Contributor

is this the CA that is added to --requestheader-client-ca-file ?

Member Author

Yes (that's the entire CA btw)

@@ -162,7 +162,7 @@ func newCertRotationController(
Name: "aggregator-client",
Contributor

is this --proxy-client-cert ?

Member Author

Correct

@vrutkovs
Member Author

/retest

Tests we run after cert rotation should ensure that pod gets created
from deployment, scheduled on the node and openshift-apiserver can
create projects to validate that all component certificates have been
regenerated. The test names are included in
certificates.openshift.io/auto-regenerate-after-offline-expiry annotation
@vrutkovs vrutkovs force-pushed the cert-rotation-additional-tests branch from fd2867a to 8e7d25b Compare January 10, 2025 14:40
@vrutkovs
Member Author

/retest

Contributor

openshift-ci bot commented Jan 13, 2025

@vrutkovs: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-single-node | 8e7d25b | link | false | /test e2e-aws-ovn-single-node |
| ci/prow/e2e-aws-ovn-upgrade | 8e7d25b | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/e2e-gcp-operator | 8e7d25b | link | true | /test e2e-gcp-operator |
| ci/prow/okd-scos-e2e-aws-ovn | 8e7d25b | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/k8s-e2e-gcp-serial | 8e7d25b | link | false | /test k8s-e2e-gcp-serial |
| ci/prow/e2e-gcp-operator-single-node | 8e7d25b | link | false | /test e2e-gcp-operator-single-node |
| ci/prow/e2e-aws-ovn | 8e7d25b | link | true | /test e2e-aws-ovn |
| ci/prow/k8s-e2e-gcp | 8e7d25b | link | true | /test k8s-e2e-gcp |
| ci/prow/e2e-aws-operator-disruptive-single-node | 8e7d25b | link | false | /test e2e-aws-operator-disruptive-single-node |
| ci/prow/e2e-aws-ovn-serial | 8e7d25b | link | true | /test e2e-aws-ovn-serial |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

- AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'operator conditions openshift-apiserver'",
  JiraComponent: "kube-apiserver",
+ // This test would ensure kube-apiserver request would be passed to openshift-apiserver
+ AutoRegenerateAfterOfflineExpiry: "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631,'[sig-cli] oc adm new-project [apigroup:project.openshift.io][apigroup:authorization.openshift.io] [Suite:openshift/conformance/parallel]'",

// AutoRegenerateAfterOfflineExpiry contains a link to PR and an e2e test name which verifies
// that TLS artifact is correctly regenerated after it has expired

Would the PR not be openshift/release#58130 where we add and rehearse these tests to the cert rotation jobs?

Member Author

According to https://github.com/openshift/origin/blob/master/pkg/cmd/update-tls-artifacts/generate-owners/tlsmetadata/autoregenerate_after_expiry/requirement.go#L23 it needs to be the PR which added the annotation, not necessarily the one which runs these tests
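
For reference, every annotation value in this diff has the same shape: a PR link, a comma, and a single-quoted e2e test name. A minimal sketch of splitting such a value is below. The format is inferred from the values shown in this PR only; this is not the checker from openshift/origin, and regenInfo and parseAutoRegenerate are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// regenInfo holds the two parts of an AutoRegenerateAfterOfflineExpiry value.
type regenInfo struct {
	PRLink   string
	TestName string
}

// parseAutoRegenerate splits "<PR URL>,'<test name>'" at the first comma.
// The test name may itself contain commas; the URL is assumed not to.
func parseAutoRegenerate(value string) (regenInfo, error) {
	link, rest, found := strings.Cut(value, ",")
	if !found {
		return regenInfo{}, errors.New("expected \"<PR URL>,'<test name>'\"")
	}
	link = strings.TrimSpace(link)
	u, err := url.Parse(link)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return regenInfo{}, fmt.Errorf("not an absolute URL: %q", link)
	}
	rest = strings.TrimSpace(rest)
	if len(rest) < 3 || rest[0] != '\'' || rest[len(rest)-1] != '\'' {
		return regenInfo{}, fmt.Errorf("test name must be single-quoted and non-empty: %q", rest)
	}
	return regenInfo{PRLink: link, TestName: rest[1 : len(rest)-1]}, nil
}

func main() {
	value := "https://github.com/openshift/cluster-kube-apiserver-operator/pull/1631," +
		"'[sig-cli] oc adm new-project [apigroup:project.openshift.io][apigroup:authorization.openshift.io] [Suite:openshift/conformance/parallel]'"
	info, err := parseAutoRegenerate(value)
	if err != nil {
		fmt.Println("invalid annotation:", err)
		return
	}
	fmt.Println("PR:  ", info.PRLink)
	fmt.Println("test:", info.TestName)
}
```

Splitting at the first comma keeps the link and the test name independent, which matches the point made above: the link identifies the PR that added the annotation, while the test name can change later without the link changing.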

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
5 participants