Migrate CRI-O jobs away from kubernetes_e2e.py #32567

Open
saschagrunert opened this issue May 6, 2024 · 18 comments
Assignees
Labels
priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@saschagrunert
Member

saschagrunert commented May 6, 2024

The kubernetes_e2e.py script is deprecated and we should use kubetest2 instead.

All affected tests are listed in https://testgrid.k8s.io/sig-node-cri-o

cc @kubernetes/sig-node-cri-o-test-maintainers

Ref: https://github.com/kubernetes/test-infra/tree/master/scenarios, #20760

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label May 6, 2024
@haircommander
Contributor

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 6, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 4, 2024
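The triage rules quoted above line up with the dates in this thread: the last activity before the bot comment was on May 6, 2024, and the stale label arrived 90 days later. A minimal sketch of that arithmetic follows. It assumes inactivity is counted from May 6, 2024 and that no further activity occurs; `lifecycle_dates` is a hypothetical helper for illustration, not the bot's actual implementation.

```python
from datetime import date, timedelta

# Intervals taken from the triage rules quoted in the bot comment.
STALE_AFTER = timedelta(days=90)   # inactivity before lifecycle/stale
ROTTEN_AFTER = timedelta(days=30)  # further inactivity before lifecycle/rotten
CLOSE_AFTER = timedelta(days=30)   # further inactivity before the issue is closed


def lifecycle_dates(last_activity: date) -> dict[str, date]:
    """Project when each lifecycle step would occur with no further activity."""
    stale = last_activity + STALE_AFTER
    rotten = stale + ROTTEN_AFTER
    closed = rotten + CLOSE_AFTER
    return {"stale": stale, "rotten": rotten, "closed": closed}


# Assumption: the last activity was the issue's opening day, May 6, 2024.
timeline = lifecycle_dates(date(2024, 5, 6))
print(timeline["stale"])  # 2024-08-04
```

The projected stale date matches the Aug 4, 2024 label event; the later steps never applied here because the stale label was removed the next day.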
@saschagrunert
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 5, 2024
@kannon92
Contributor

/triage accepted
/priority important-longterm

@kannon92 kannon92 moved this from Triage to Issues - To do in SIG Node CI/Test Board Aug 21, 2024
@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Aug 21, 2024
@elieser1101
Contributor

Does this still need help? Can I start looking at it?

@saschagrunert
Member Author

@elieser1101 I'd appreciate your eyes on that. 🙏

@elieser1101
Contributor

/assign

@elieser1101
Contributor

I opened several PRs to replicate the presubmit jobs. Once they merge, I would like to create a no-op PR to test all the changes I made and fix anything that is broken.

After that I can start working on the periodics.
Reviews needed

@elieser1101
Contributor

Any feedback or suggestions would be appreciated.

/cc saschagrunert kannon92 krzyzacy

@saschagrunert
Member Author

The kubetest2 DRA jobs seem to have a syntax error:

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/127985/pull-kubernetes-node-e2e-crio-cgrpv1-dra-kubetest2/1844649030915723264
https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/127985/pull-kubernetes-node-e2e-crio-cgrpv2-dra-kubetest2/1844649032593444864

Error: unknown flag: --label-filter

Should we fix that up here or is it another issue?

@elieser1101
Contributor

Those two are part of the batch I migrated to kubetest2; I can look into it.

@kannon92
Contributor

Ah, sorry, I missed this. #33647

@elieser1101 There are quite a few failing.

@pacoxu
Member

pacoxu commented Oct 15, 2024

With #33658, pull-kubernetes-node-e2e-crio-cgrpv1-dra-kubetest2 is now passing. pull-kubernetes-node-e2e-crio-cgrpv2-dra-kubetest2 is similar and should be fixed as well.

@elieser1101
Contributor

For the tests that don't pass, I can see the following (on kubernetes/kubernetes#128092):
pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e-kubetest2 fails, and the non-kubetest2 job has also been failing for some time now:
https://testgrid.k8s.io/sig-node-presubmits#pr-crio-cgrpv1-evented-pleg-gce-e2e-kubetest2
https://testgrid.k8s.io/sig-node-presubmits#pr-crio-cgrpv1-evented-pleg-gce-e2e

And for these two:
pull-kubernetes-node-crio-cgrpv2-imagefs-e2e-kubetest2 https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/pr-logs/directory/pull-kubernetes-node-crio-cgrpv2-imagefs-e2e-kubetest2
pull-kubernetes-node-crio-cgrpv2-splitfs-e2e-kubetest2 https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/pr-logs/directory/pull-kubernetes-node-crio-cgrpv2-splitfs-e2e-kubetest2

Looking at the job history, both have passed at some point, but not consistently, so I'm not sure whether there is something left to complete on my side. Any pointers on how to proceed with these jobs would be helpful.
@kannon92

@kannon92
Contributor

kannon92 commented Nov 8, 2024

The non kubetest jobs for imagefs seem pretty green. It sounds like there is a kubetest migration issue.

@kannon92
Contributor

@elieser1101

Where are we with this for presubmits?

@elieser1101
Contributor

elieser1101 commented Nov 25, 2024

Still no luck with the jobs mentioned here.

I think it could be related to a kubetest2 issue, for which I opened a PR. At the moment it is not possible to set container-runtime-endpoint, which always defaults to containerd.

We can see that the command the jobs are running includes the flag twice:

--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-endpoint=unix:///var/run/crio/crio.sock
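A minimal sketch, not part of kubetest2, for spotting this kind of duplication in a logged command line. `find_repeated_flags` is a hypothetical helper written for illustration, and it only handles the `--name=value` form shown above. Which of the two values actually takes effect depends on how the consuming binary parses repeated flags; this sketch does not model that.

```python
import shlex
from collections import defaultdict


def find_repeated_flags(command: str) -> dict[str, list[str]]:
    """Return every --name=value flag that appears more than once, with all its values."""
    values: dict[str, list[str]] = defaultdict(list)
    for token in shlex.split(command):
        # Only the --name=value form is considered in this sketch.
        if token.startswith("--") and "=" in token:
            name, value = token[2:].split("=", 1)
            values[name].append(value)
    return {name: vals for name, vals in values.items() if len(vals) > 1}


# The flag pair copied from the job's command line above.
cmd = (
    "--container-runtime-endpoint=unix:///run/containerd/containerd.sock "
    "--container-runtime-endpoint=unix:///var/run/crio/crio.sock"
)
repeated = find_repeated_flags(cmd)
for name, vals in repeated.items():
    print(f"--{name} is set {len(vals)} times: {vals}")
```

Running a check like this against the full command from a job log would show whether other flags are duplicated in the same way.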

@kannon92
Contributor

kannon92 commented Dec 2, 2024

I see an issue with the DRA tests:

I think there is an issue with the label-filter and it is not finding the right jobs for DRA.

Projects
Status: Issues - To do

7 participants