We noticed that completed jobs are not being cleaned up. We currently have the `job-ttl` argument set to `5m` in our configuration. I believe this setting translates into the `.spec.ttlSecondsAfterFinished` value on the generated Job, and that field only became generally available in Kubernetes 1.23. Unfortunately, we are not able to update K8s that quickly. As a result, pods continue to pile up in our cluster, requiring us to either create a cron job to clean them up or delete them manually.
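For reference, my understanding is that `job-ttl: 5m` is meant to end up as something like the manifest below on each generated Job (the 300-second value is just 5m converted to seconds; the Job name and image are placeholders for illustration, not what the controller actually generates):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: agent-job-example        # placeholder name, for illustration only
spec:
  ttlSecondsAfterFinished: 300   # what job-ttl: 5m presumably maps to
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: agent
          image: example/agent:latest   # placeholder image
```

On clusters older than 1.23 the `ttlSecondsAfterFinished` field is ignored (or rejected), so finished Jobs and their pods just stay around.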
An approach I've seen in the GitHub Actions Kubernetes runners is to have the controller watch for completed Jobs and clean them up itself.
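Something along these lines is what I have in mind. It's a minimal sketch using client-go, assuming the controller already has a Kubernetes client and that its Jobs can be found via a label selector (the `app=agent` label here is hypothetical): list Jobs that have reached a `Complete` or `Failed` condition and delete any that finished more than `ttl` ago.

```go
package cleanup

import (
	"context"
	"time"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// CleanupFinishedJobs deletes Jobs that reached a Complete or Failed
// condition more than ttl ago. Background propagation also removes the
// pods the Job created.
func CleanupFinishedJobs(ctx context.Context, client kubernetes.Interface, namespace string, ttl time.Duration) error {
	jobs, err := client.BatchV1().Jobs(namespace).List(ctx, metav1.ListOptions{
		LabelSelector: "app=agent", // hypothetical selector for the controller's Jobs
	})
	if err != nil {
		return err
	}
	propagation := metav1.DeletePropagationBackground
	for _, job := range jobs.Items {
		finished, at := jobFinishedAt(&job)
		if !finished || time.Since(at) < ttl {
			continue
		}
		err := client.BatchV1().Jobs(namespace).Delete(ctx, job.Name, metav1.DeleteOptions{
			PropagationPolicy: &propagation,
		})
		if err != nil {
			return err
		}
	}
	return nil
}

// jobFinishedAt reports whether the Job has a Complete or Failed condition
// set to True and, if so, when that condition was recorded.
func jobFinishedAt(job *batchv1.Job) (bool, time.Time) {
	for _, c := range job.Status.Conditions {
		if (c.Type == batchv1.JobComplete || c.Type == batchv1.JobFailed) && c.Status == corev1.ConditionTrue {
			return true, c.LastTransitionTime.Time
		}
	}
	return false, time.Time{}
}
```

Running this on a timer (or driving it from an informer on Job events) would give the same effect as `ttlSecondsAfterFinished` on clusters that don't support the field.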
I've attached an image from one of the namespaces running the agents, showing the pods continuing to exist past 5 minutes.
I think we will need to build a job cleanup function. Aside from older k8s versions, there are other ways Jobs can accumulate (e.g. a Job is created successfully but fails to start a pod for some reason and sits around retrying forever).
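For that second case, one possible heuristic (a sketch, not a committed design): treat a Job as stuck if it has existed longer than some deadline but has never produced an active, succeeded, or failed pod. The `stuckAfter` deadline and the `app=agent` selector below are assumptions for illustration.

```go
package cleanup

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// CleanupStuckJobs deletes Jobs that are older than stuckAfter but have
// never produced an active, succeeded, or failed pod, i.e. Jobs that were
// created but never managed to start anything.
func CleanupStuckJobs(ctx context.Context, client kubernetes.Interface, namespace string, stuckAfter time.Duration) error {
	jobs, err := client.BatchV1().Jobs(namespace).List(ctx, metav1.ListOptions{
		LabelSelector: "app=agent", // hypothetical selector for the controller's Jobs
	})
	if err != nil {
		return err
	}
	propagation := metav1.DeletePropagationBackground // also clean up any pods owned by the Job
	for _, job := range jobs.Items {
		neverStarted := job.Status.Active == 0 && job.Status.Succeeded == 0 && job.Status.Failed == 0
		if !neverStarted || time.Since(job.CreationTimestamp.Time) < stuckAfter {
			continue
		}
		err := client.BatchV1().Jobs(namespace).Delete(ctx, job.Name, metav1.DeleteOptions{
			PropagationPolicy: &propagation,
		})
		if err != nil {
			return err
		}
	}
	return nil
}
```

Whatever we end up with, it should probably cover both finished Jobs (for pre-1.23 clusters) and these never-started Jobs in one pass.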