feat: allow pod spec to be defined and patched in agent and plugin #262
Conversation
Reviewers: let me know if you have questions. I need help with the tests part 🙏
Another great contribution @42atomys!
I talked a bit about some tweaks I have made to #248 in that PR, and I think I want to do some similar tweaks to this too.
To keep things simple, I wonder whether we need the patch at the plugin (aka step) level? Prima facie, almost everything you define in the patch could have been defined in the podSpec. Is it there because the "system containers" may need patching? I'd imagine that whatever patching they need would most likely be the same for all steps run by the controller, so would it not be better handled by the podSpecPatch in the controller config?
So what I suggest is:
- remove the step level podSpecPatch
- rebase over main once feat: allow ssh-credentials to be set in agent and plugin (#248) is merged, taking some combination of your commits and my tweaks.
As for the testing, you should be able to start a test controller with a podSpecPatch specified in the config. If we eliminate the step-level patch, then all you need to do is test that a config-level patch is propagated to all the jobs in a pipeline.
Tags            stringSlice            `mapstructure:"tags" validate:"min=1"`
ProfilerAddress string                 `mapstructure:"profiler-address" validate:"omitempty,hostname_port"`
ClusterUUID     string                 `mapstructure:"cluster-uuid" validate:"omitempty"`
PodSpecPatch    map[string]interface{} `mapstructure:"pod-spec-patch" validate:"omitempty"`
What happens if this is specified as something k8s would not recognise like:
llamas: true
Is it a no-op, or is there a failure of the patch to apply which causes the Job to fail to start? If it's the latter, then I suggest we should try to detect this as early as possible.
It's most likely too onerous to validate it in the JSON Schema for the Helm chart, but if some Kubernetes library exports a type for this, we should try to unmarshal into that type, so that the Helm chart fails to deploy if podSpecPatch would cause an error.
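A minimal sketch of that idea, assuming the patch stays in the map[string]interface{} field above (the function name is hypothetical, not from this PR): re-encode the raw patch and strictly decode it into corev1.PodSpec, so unknown keys are rejected at config-load time.

import (
	"bytes"
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
)

// validatePodSpecPatch is a hypothetical early check: it round-trips the raw
// patch through a strict JSON decode into corev1.PodSpec, so unknown fields
// such as "llamas" fail when the config is loaded instead of when the Job starts.
func validatePodSpecPatch(patch map[string]interface{}) error {
	raw, err := json.Marshal(patch)
	if err != nil {
		return err
	}
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	var spec corev1.PodSpec
	return dec.Decode(&spec)
}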
Oh yeah, both this and #248 will need updates to the Helm chart's JSON Schema.
Hi @triarius, thank you for your time. I have an example of usage that is implemented internally at a large scale, across thousands of jobs per day. We deploy one agent per cluster using a Helm chart with this configuration (we've been using the fork for 3 weeks now) to bind the service account and define resources at the agent level with our default values.

Note: not defining resources in a Kubernetes pod results in a low-priority assignment and in all jobs being scheduled on the same node, causing latency and job crashes when many jobs run concurrently.

Agent configuration (Helm chart):

image: ghcr.io/42atomys/helm/agent-stack-k8s/controller:0.8.0
config:
  tags: [ queue="default" ]
  ssh-credentials-secret: github-credentials
  pod-spec-patch:
    serviceAccountName: buildkite-service-account
    automountServiceAccountToken: true
    initContainers:
      - name: copy-agent
        resources:
          requests:
            cpu: 100m
            memory: 96Mi
          limits:
            memory: 96Mi
    containers:
      - name: container-0
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            memory: 256Mi
      - name: dind
        resources:
          requests:
            cpu: 100m
            memory: 336Mi
          limits:
            memory: 336Mi
      - name: checkout
        resources:
          requests:
            cpu: 100m
            memory: 480Mi
          limits:
            memory: 480Mi
      - name: agent
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            memory: 64Mi

In the step definition, we sometimes need "more power" to execute tasks, such as integration tests with a browserless engine. With pod-spec-patch at the step level, we can configure our pipeline granularly without having to redefine the entire pod/container definition. See the example below.

Simple CI pipeline:

steps:
  # Execute a step on a Kubernetes pod to run rspec with the BUILDKITE_ANALYTICS_TOKEN
  # secret injected into container-0.
  # NOTE: We define the commands in the `commands` field and don't declare
  # any container in the `podSpecPatch` field, so the plugin will use the default
  # container (container-0) to run the commands.
  - agents:
      queue: default
    commands:
      - bundle exec rspec
    plugins:
      - kubernetes:
          podSpecPatch:
            containers:
              - name: container-0
                env:
                  - name: BUILDKITE_ANALYTICS_TOKEN
                    valueFrom:
                      secretKeyRef:
                        key: BUILDKITE_ANALYTICS_TOKEN
                        name: buildkite
  # Execute a Playwright test on a Kubernetes pod with more
  # resources due to the high memory and CPU usage.
  - agents:
      queue: default
    commands:
      - bundle exec playwright test
    plugins:
      - kubernetes:
          podSpecPatch:
            containers:
              - name: container-0
                resources:
                  limits:
                    cpu: 2
                    memory: 4Gi
                  requests:
                    cpu: 2
                    memory: 4Gi

This is a simple example of how podSpecPatch can be used at the step level to adjust resources, but any field of the pod can be changed in the same way, without needing to define the entire pod spec each time. We are currently scaling Buildkite to an industrial level in my organisation.
Thanks for the context @42atomys. I've got the tests working on another branch. Are you happy for me to push to this branch and merge this PR? I'll be making it independent of #248 though. We can discuss the merits of a step-level podSpecPatch separately.
@triarius Thanks for your reply. I think deprecating PodSpec must be done in the future, along with a rework of the job-creation logic! FYI, in case you didn't notice: I create a default container-0 when none is declared. You can merge your branch; it's all good for me 💯
Description
This pull request introduces a new feature where the PodSpec can be defined and patched dynamically, allowing for more customized configurations of the Kubernetes pods that run the Buildkite agents.
This includes the ability to specify any pod specification directly through the Buildkite agent's configuration or inside the plugins, using the same strategic merge logic that kubectl uses.
Execution Logic
1. Read pod-spec-patch in the agent configuration file and apply it to all jobs executed by that agent.
2. Read the pod-spec-patch field defined in the pipeline step and apply it on top (see the sketch below).
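A minimal sketch of that two-step merge, assuming the patch is applied with the strategic-merge-patch helper from k8s.io/apimachinery (the function name is hypothetical, not from this PR):

import (
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

// applyPodSpecPatch merges a pod-spec-patch onto a base PodSpec with the same
// strategic merge logic kubectl uses, so list entries such as containers are
// merged by their "name" key rather than replaced wholesale.
func applyPodSpecPatch(base corev1.PodSpec, patch map[string]interface{}) (corev1.PodSpec, error) {
	baseJSON, err := json.Marshal(base)
	if err != nil {
		return base, err
	}
	patchJSON, err := json.Marshal(patch)
	if err != nil {
		return base, err
	}
	merged, err := strategicpatch.StrategicMergePatch(baseJSON, patchJSON, corev1.PodSpec{})
	if err != nil {
		return base, err
	}
	var out corev1.PodSpec
	err = json.Unmarshal(merged, &out)
	return out, err
}

Under this reading, the controller would call such a function once with the agent-level patch, then again with the step-level patch on the result.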
Example

This agent configuration changes the service account name, automounts the service account token, and adds an env var without erasing the env vars already present.
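A minimal sketch of such an agent configuration (the service-account name and the env var are illustrative, not taken from this PR):

pod-spec-patch:
  serviceAccountName: buildkite-service-account  # illustrative name
  automountServiceAccountToken: true
  containers:
    - name: container-0
      env:
        - name: EXAMPLE_VAR  # illustrative; merged with the existing env vars
          value: "some-value"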
Related Issues
Resolves #247