Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.7 Release upgrades #82

Merged
merged 41 commits into from
Mar 10, 2023
Merged

1.7 Release upgrades #82

merged 41 commits into from
Mar 10, 2023

Conversation

DnPlas
Copy link
Contributor

@DnPlas DnPlas commented Feb 13, 2023

This PR introduces the following:

  • feat: knative-operator image version bump 1.1.0->1.8.1
  • fix: remove istio ns
  • feat: add webhook service and auth manifests
  • feat: bump operator manifests 1.1.x -> 1.8.x

@DnPlas DnPlas requested a review from a team as a code owner February 13, 2023 22:19
Previously, we listedon KNative Services to confirm that they were available for use.  This test might have been failing because we had not specified the namespace to list under, leading to us using a default namespace that may not exist.  This updated version refactors the check, instead looking directly for the expected CRD rather than just listing those CRDs.  This makes the check more directly test the thing we care about.
This is trying to fix a problem noticed in CI but not locally, where the KNative Serving and Operator charms go green, but the KNative Serving Service CRD is not (yet?) created.
Previously, integration tests deployed a hard-coded image for knative-operator.  This has been fixed to use the version called for in metadata.yaml
This adds missing aggregated ClusterRoleBindings to the RBAC manifests.  These are usually requirements of the workload, but due to --trust providing broad permissions it is hard to test if they are configured properly here.
…asier

This refactors the knative-operator charm in a way that does not cause any functional change, in order to make it easier to later add another layer to the charm.  This should pass existing tests and change no functionality.
Previously, we used the command of the image as the service name in pebble.  This refactors to use the name of the application
Previously, this configmap appears to have been optional, but the current knative-operator panics without it now.  We see this through operator's logs:

2023-03-07T20:40:55Z INFO Most recent service output:
    (...)
    {"severity":"INFO","timestamp":"2023-03-07T20:40:55.571628465Z","logger":"knative-operator","caller":"profiling/server.go:64","message":"Profiling enabled: false","commit":"e0ef2a2-dirty","knative.dev/pod":"knative-operator"}
    {"severity":"INFO","timestamp":"2023-03-07T20:40:55.576046426Z","logger":"knative-operator","caller":"leaderelection/context.go:47","message":"Running with Standard leader election","commit":"e0ef2a2-dirty","knative.dev/pod":"knative-operator"}
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x110 pc=0x15e5088]

    goroutine 1 [running]:
    knative.dev/pkg/metrics.NewObservabilityConfigFromConfigMap(0x0)
    	knative.dev/[email protected]/metrics/config_observability.go:114 +0x248
    knative.dev/pkg/injection/sharedmain.GetObservabilityConfig({0x1f7d178, 0xc0000c3170})
    	knative.dev/[email protected]/injection/sharedmain/main.go:118 +0x176
    knative.dev/pkg/injection/sharedmain.SetupObservabilityOrDie({0x1f7d178, 0xc0000c3170}, {0x1cc5a0f, 0x10}, 0x9502f9000?, 0x2540be400?)
    	knative.dev/[email protected]/injection/sharedmain/main.go:354 +0x5b
    knative.dev/pkg/injection/sharedmain.MainWithConfig({0x1f7dbf8, 0xc000012010}, {0x1cc5a0f, 0x10}, 0xc00042ed80, {0xc0008eff60, 0x2, 0x2})
    	knative.dev/[email protected]/injection/sharedmain/main.go:271 +0x5a6
    knative.dev/pkg/injection/sharedmain.MainWithContext({0x1f7dbf8, 0xc000012010}, {0x1cc5a0f, 0x10}, {0xc0008eff60, 0x2, 0x2})
    	knative.dev/[email protected]/injection/sharedmain/main.go:210 +0x1d7
    knative.dev/pkg/injection/sharedmain.Main({0x1cc5a0f, 0x10}, {0xc0008eff60, 0x2, 0x2})
    	knative.dev/[email protected]/injection/sharedmain/main.go:140 +0x8f
    main.main()
    	knative.dev/operator/cmd/operator/main.go:26 +0x50
    2023-03-07T20:40:55Z ERROR cannot start service: exited quickly with code 2

Adding an empty configmap fixes this issue.
This removes part of the standard demo config-observability configmap for knative because the example text was interpreted by jinja as part of a template.  This problem caused rendering errors for the configmap
This removes part of the standard demo config-observability configmap for knative because the example text was interpreted by jinja as part of a template.  This problem caused rendering errors for the configmap
@ca-scribner ca-scribner marked this pull request as draft March 7, 2023 23:08
This adds the knative-operator webhook as a second pebble layer, as well as tests supporting this change.  This "webhook" is really an operator that operates a webhook needed by knative-operator.  Also included here is a new kubernetes Secret, which is used by the webhook operator to pass certs to the webhook itself.

This commit includes a hack to avoid a race condition between kubernetes resource creation and starting the knative workloads, where we add a small sleep between k8s resource creation and pebble interactions.  The knative workloads require certain configmaps (config-logging, config-observability) and secrets to exist before they start.  If these don't exist, they fail too quickly for Pebble's auto-restart.  This is discussed [here](https://chat.charmhub.io/charmhub/pl/48icc7grtfng8dqs8mdo5ojxga).
Copy link
Contributor Author

@DnPlas DnPlas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ca-scribner, I think this looks good already. Some comments.

charms/knative-operator/src/charm.py Outdated Show resolved Hide resolved
charms/knative-operator/tests/unit/test_charm.py Outdated Show resolved Hide resolved
charms/knative-operator/tests/unit/test_charm.py Outdated Show resolved Hide resolved
charms/knative-serving/config.yaml Show resolved Hide resolved
tests/test_bundle.py Outdated Show resolved Hide resolved
@DnPlas
Copy link
Contributor Author

DnPlas commented Mar 9, 2023

Please also note the publish job is failing because the webhook image resource has not been uploaded to charmhub yet. Please do it manually in your local environment to unblock this job.

Updates based on review feedback
This removes individual handlers for each of the pebble-ready events, instead handling them through the central _main() handler.  This is due to the race condition between kubernetes resource creation and executing our workloads.  Our workloads require kubernetes resources to be available before they start, so it makes sense to always deploy these resources before any pebble planning for these workloads.
…rces

Adds a check to ensure the kubernetes resources required to deploy the knative operator/webhook exist.

Closes #90
@ca-scribner
Copy link
Contributor

fyi the obs integration tests are failing because of the same metadata/resource issue that affected the integration tests. I'm fixing them now

Copy link
Contributor Author

@DnPlas DnPlas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one last comment regarding _main(), other than that LGTM.

@ca-scribner ca-scribner marked this pull request as ready for review March 10, 2023 17:23
Copy link
Contributor Author

@DnPlas DnPlas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LTGM, let's work on #90 later. Thanks @ca-scribner

Copy link
Contributor

@ca-scribner ca-scribner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit odd as @DnPlas started this PR, I took it over and finished it, and now she's reviewed and approved it. But since she's the original author, I'm giving the final 👍

@ca-scribner ca-scribner merged commit dfb09f3 into main Mar 10, 2023
@ca-scribner ca-scribner deleted the KF-964-1.7-upgrade branch March 10, 2023 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants