-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: run e2e tests for libvirt with CRI-O #2068
base: main
Are you sure you want to change the base?
Conversation
Putting on hold because some tests are failing and I didn't run the attestation ones yet. But I appreciate any feedback on these changes. The preliminary tests execution:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks okay to me so far. Thanks Wainer!
src/cloud-api-adaptor/test/provisioner/libvirt/provision_common.go
Outdated
Show resolved
Hide resolved
Cc: @littlejawa Ok, I'm going over the tests that failed, starting with:
That test fails consistently with CRI-O, but passes with containerd. The test (implemented here) checks that from within a podvm client it can access a service deployed in another podvm, accessing by the service's name. It sets up nginx on the podvm server, configure a Below is how I can get the error message of
From test-client it can access the service as long as I pass the cluster IP:
|
Hey @wainersm, Maybe something to do with the way we install/configure crio or the network plugins. I guess we're using kata-deploy here? |
src/cloud-api-adaptor/test/provisioner/libvirt/provision_common.go
Outdated
Show resolved
Hide resolved
It's okay not same environment; indeed I was worried about it be a problem on OCP + OSC, I'm glad it's not the case :)
I talked with @littlejawa in pvt; it turns out that the same client/server test fails with cri-o + runc. So definitively it seems like a problem on the configuration of the kcli cluster but we still don't know what...
I need to figure out how to skip Hey, thanks @littlejawa for promptly checking it! |
I was looking at The error is like:
|
b8dd9f3
to
61ef281
Compare
Updates:
|
Tested with KBS=true, and |
I've tried this out on a test VM - it doesn't seem to be working on
That might not be a fair test though, so I can try out x86 |
on x86 the |
hmmm... might it be a bug on kcli for s390x? |
As for TestLibvirtKbsKeyRelease, it passed here. First I ran with containerd to ensure I had everything proper set; deleted the cluster then re-ran the test with cri-o. The new job won't fail the containerd counterpart, so maybe we should merge to see who is right by CI? :) @littlejawa has helped with debugging Side-note: I was looking for a way to skip these tests when not running on CI (i.e. locally) but I didn't find (yet) a good solution for the fact that from the tests, it cannot access the provision object so it cannot get access to the libvirt.properties, thus, not able to check whether is cri-o or containerd. I thought in query the k8s directly, but ran into another problem which is to get an k8s CLI object from the tests code too. I might be missing something so I plan to have a look at this again later. |
I finally made it working as well, my problem was a bad podvm image :-) I'm getting failures only for:
the TestLibvirtRestrictivePolicyBlocksExec is passing on my laptop (tested without KBS). I'd suggest creating issues for those and add skip there. |
Pradipta added an |
Hi @ldoktor !
Thanks for testing it! I created the issue #2100 as you suggested. |
hmm... I overlooked that function. Ok, I will use it to disable the tests on local runs. Thanks @stevenhorsman |
61ef281
to
711ee01
Compare
Rebased the code and sent a new commit that disable those failing tests when CRI-O. With that the poor remaining tests are all passing. It's ready to be merged, if not more suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think this is ready to try out now. Thanks @wainersm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @wainersm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good and tracks the failing jobs so let's get this in and see how stable it is.
@wainersm - do you think we can rebase this and merge it now, or are you waiting for anything else? |
pinging @wainersm again ;) |
In order to use kcli to create a k8s cluster with configured with cri-o, it will be needed to use a version newer than 07/02/2024 which containers the karmab/kcli@77cf2cb fix. So picking the latest version available at the time of this commit. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Hi @stevenhorsman @bpradipt ! I rebased it and resolved the conflicts. One issue arose since last time I tried it: webhook's cert-manager installation has failed. In https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12011685522/job/33484069022 :
I suspect it's some misconfiguration on kcli with cri-o. I will be reporting to kcli and let you posted. |
711ee01
to
ab66e65
Compare
By exporting the CONTAINER_RUNTIME=crio variable, kcli will create a k8s cluster configured CRI-O: $ export CONTAINER_RUNTIME=crio $ ./src/cloud-api-adaptor/libvirt/kcli_cluster.sh create Fixes confidential-containers#1981 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Commit a0247ae introduced a new parameter (CONTAINER_RUNTIME) for docker provider, allowing users to specify the container runtime used. Some tests will take decisions based on that property, for example, whether nydus snapshotter messages should be inspected or not. Likewise, this added the handler for that property for libvirt, so allowing to test with cri-o too. Fixes confidential-containers#1981 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Added a new container_runtime matrix column to generate one job for each runtime: containerd and crio. Fixes confidential-containers#1981 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Let's keep it running for a while on CI, once it's stable we can remove the continue-on-error. Fixes confidential-containers#1981 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
The DoTestRestrictivePolicyBlocksExec test for CRI-O will have the "error executing command in container" error message instead of "failed to exec in container". So adjusted the expected strings on the error message to consider the output of CRI-O too. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
ab66e65
to
4716432
Compare
Some tests are already skipped on CI, also disabled them when running locally because they fail as well. The TestLibvirtImageDecryption test has failed on CI because it is not supported with CRI-O. Related-to: confidential-containers#2100 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
4716432
to
4ce0ac0
Compare
ok, problem was reported in karmab/kcli#744 I figured flannel+crio seems to lack some configuration for networking services, I don't know exactly what's the problem. I tried calico then cilium; the later worked out (i.e. cert-manager installed successfully). Hence, I added a conditional to kcli_cluster.sh to instruct kcli to configure the cluster with cilium if cri-o. I'v run a couple of times before end up with a green execution. For the records:
|
I think it's now ready to be merged if the approvals stand. But I will keep it on 'hold' until the 0.11 release is out. |
I will also wait on #2019 be merged first. Then I can resolve any conflict. |
This is the minimum required to run e2e tests for libvirt on k8s configured with CRI-O.
CONTAINER_RUNTIME=crio
container_runtime=crio
property in libvirt.propertiescontinue-on-error
enabled for a while as we will be to fully check its stability by running on CI env.Fixes #1981