WIP: KEP-5343: Updates to kube-proxy-backends #5344
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: danwinship. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Another possibility would be to deprecate the existing multi-mode kube-proxy binary in favor of having separate `kube-proxy-iptables`, `kube-proxy-ipvs`, and `kube-proxy-nftables` binaries (and perhaps, eventually, separate images). That would also work well with the plan to deprecate `ipvs` mode (and would allow us to completely remove the existing deprecated CLI options)...
This is painful for developers and users; more binaries mean more images to maintain and to release.
Multiple binaries in the same image (built from nearly the same sources) would not really be much more work for anyone.

We could maybe even do the `argv[0]` hack and just have a single binary, but have it behave differently depending on whether you invoke it as `kube-proxy` or `kube-proxy-nftables`...
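As a rough illustration of the `argv[0]` idea (a hypothetical sketch, not anything kube-proxy currently does), the dispatch could look something like this in Go:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	// Select the backend from the name the binary was invoked under.
	// "kube-proxy-nftables" and "kube-proxy-iptables" pin the mode;
	// the plain "kube-proxy" name keeps today's behavior of reading
	// the mode from flags/configuration.
	var mode string
	switch filepath.Base(os.Args[0]) {
	case "kube-proxy-nftables":
		mode = "nftables"
	case "kube-proxy-iptables":
		mode = "iptables"
	default:
		mode = "" // fall back to the configured/default mode
	}
	fmt.Printf("selected proxy mode: %q\n", mode)
	// A real binary would now construct and run the chosen proxier.
}
```

The same binary could then be shipped once and symlinked (or hard-linked) under the backend-specific names inside a single image.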
I see, I was assuming independent artifacts
Well, both options (multiple binaries, one image / multiple binaries, multiple images) are things to consider. Clearly the single image version is simpler though.
- Moving `kube-proxy-ipvs` to a staged repository.
- Moving `kube-proxy-ipvs` to a non-staged repository.
I will just fork the entire code into its own repo and open it up for new maintainers; basically I'm advocating for this option and what you conclude in the next paragraph.

A good exercise for people willing to help would be to create a standalone repo with the ipvs proxy and windows proxy from the existing code in k/k to show feasibility... I think that if that works, we just start the deprecation period and point the people that want to use them to this new repo...
/cc
## Proposal
### Determine a timeline for declaring `nftables` to be the "preferred" backend |
Spitballing an idea here:

As a GKE user, we just use whatever GKE gives us. Last I checked on v1.31 clusters, we get iptables. I'd very much like us to move to nftables, exposing me and our team to nftables, and also helping shake out any bugs that we may hit.

In other KEPs I've seen graduation criteria include conditions saying that enough clouds need to use a feature before it can be declared GA... I wonder if we need to do something similar here? Something along the lines of "Go convince X providers to set nftables to default".

I assume a cloud provider will need some sort of motivation to do so, so I guess we may need to give them a reason to do that too?
Graduation criteria like that are generally for features that are implemented outside of k/k. You can't declare a NetworkPolicy or LoadBalancer feature GA if network plugins / cloud providers haven't implemented it.
The motivation for clouds/distros/etc to move to nftables is that it's faster and more efficient. If nobody was moving to nftables by default, that would probably be a good signal that there's something wrong and we need to deal with it before declaring nftables default. But I don't think it quite works in the other direction: the fact that Amazon and Google have decided to ship new-enough kernels that they can support nftables doesn't imply that everyone else is running new-enough kernels that they can support nftables...
- Figure out the situation around "the default kube-proxy backend".
- Figure out a plan for deprecating the `ipvs` backend.
On the sig-network call, I asked if it was possible to move this item into its own KEP, so we could move on it faster.
Why wasn't that a good idea?
It's not necessarily not a good idea. I was saying the ipvs plan may want to be informed by what we decide about making nftables the default, but I guess we're already telling people to stop using it either way, so maybe it does make sense to split it out.
> I was saying the ipvs plan may want to be informed by what we decide about making nftables the default, but I guess we're already telling people to stop using it either way, so maybe it does make sense to split it out.
Since ipvs is opt-in (i.e., it's not the default), I assume people have a reason they want to use ipvs. My assumption is that the reason is performance. So in theory nftables fixes that for them, and no matter what we do with the default, they can choose to use nftables too.

I also assume that iptables is here to stay, for now.
Right, but nftables only fixes it for them if they can run nftables, and currently the kernel requirement for being able to run nftables is much more recent than for kubernetes as a whole (and we don't currently have good information about what percentage of kubernetes users have a new-enough kernel to support nftables kube-proxy).
Also FWIW, the most recent "maybe you shouldn't use ipvs" issue was filed against 1.27, which is before even alpha nftables...
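For a concrete sense of what "can run nftables" means in practice, here is a rough, hypothetical probe (not kube-proxy's actual startup check) that asks the `nft` CLI to dry-run a tiny nat base chain of the kind an nftables proxy backend relies on; on kernels or nft userspace that are too old, this fails:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// nftablesUsable dry-runs a minimal inet-family nat base chain via
// "nft --check", which validates the ruleset without committing it.
// An error suggests the kernel/nft on this node is too old for an
// nftables-based proxy.
func nftablesUsable() error {
	ruleset := `
table inet kube_proxy_probe {
	chain probe {
		type nat hook prerouting priority dstnat;
	}
}
`
	cmd := exec.Command("nft", "--check", "-f", "-") // "-f -" reads from stdin
	cmd.Stdin = strings.NewReader(ruleset)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("nftables probe failed: %v: %s", err, out)
	}
	return nil
}

func main() {
	if err := nftablesUsable(); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("kernel/nft appear to support an nftables proxy mode")
}
```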
(This is WIP, but ready for review; the WIP-iness is about figuring out the scope/details of what we want to do.)