Container networking failures on hosts with net.ipv4.conf.all.arp_ignore=2 #3880

Closed
shaneutt opened this issue Mar 1, 2025 · 11 comments · Fixed by #3881
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@shaneutt
Member

shaneutt commented Mar 1, 2025

What happened:

I use kind across a variety of modern Linux distributions. I deployed it today on a recent release of Ubuntu and noticed that the local-path-provisioner was failing right after cluster creation. The logs showed dial tcp 10.96.0.1:443: connect: no route to host. Digging in deeper, I created some pods and noticed that when I exec into them they are unable to send traffic to each other.

What you expected to happen:

The local-path-provisioner should be able to start successfully, and pod networking should be functional.

Environment:

  • kind version: kind v0.27.0 go1.22.2 linux/amd64
  • Runtime info: podman version 4.9.3
  • OS: Ubuntu 24.04.2 LTS amd64
  • Kubernetes version: Client v1.32.2, Server v1.32.2
  • Any proxies or other special environment settings?: All default settings

Note: I was able to reproduce the problem with docker as well as podman.

How to reproduce it (as minimally and precisely as possible):

Note: Everything I did here was very vanilla. kindnet and all default settings.

  1. Create a VM with the desktop version of Ubuntu 24.04.2 LTS
  2. apt-get install podman -y
  3. go install sigs.k8s.io/kind@v0.27.0
  4. kind create cluster

Note: There are likely multiple distributions that could trigger the issue, but in this case I deployed a desktop version of Ubuntu in a VM (I could not reproduce this on the server version).

At this point you will find the local-path-provisioner is failing, and any pods you start can't access the network.

Anything else we need to know?:

Stepping through some standard diagnostics, I found pretty quickly that I was unable to arping the default route IP from these containers, even though tcpdump showed those ARP requests arriving on the node's veth. This led me to discover that net.ipv4.conf.all.arp_ignore=2 was set. In this mode the kernel only answers an ARP request when the sender's address and the requested address are on the same subnet on the receiving interface, and since kindnet configures a /32 they never are, so the requests go unanswered. It appears that kind has historically relied on net.ipv4.conf.all.arp_ignore=0 being set. That is a common default across a variety of distributions, but some set it higher.
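
To make the failure mode concrete, here is a minimal sketch of how arp_ignore modes 0, 1 and 2 decide whether to answer an ARP request. It is a simplified model in Python, not kernel code: the function name would_reply, its parameters, and the 10.244.0.x addresses are assumptions chosen for illustration, and modes 3 and 8 are not modeled.

```python
import ipaddress


def would_reply(arp_ignore, target_ip, sender_ip, incoming_addrs, other_local_addrs=()):
    """Simplified model of arp_ignore modes 0, 1 and 2.

    incoming_addrs: addresses (with prefix) on the interface that received
    the ARP request, e.g. ["10.244.0.1/32"].
    other_local_addrs: addresses configured on any other local interface.
    """
    target = ipaddress.ip_address(target_ip)
    sender = ipaddress.ip_address(sender_ip)
    incoming = [ipaddress.ip_interface(a) for a in incoming_addrs]
    others = [ipaddress.ip_interface(a) for a in other_local_addrs]

    if arp_ignore == 0:
        # Reply for any local target address, whichever interface holds it.
        return any(i.ip == target for i in incoming + others)
    if arp_ignore == 1:
        # Reply only if the target address is on the incoming interface.
        return any(i.ip == target for i in incoming)
    if arp_ignore == 2:
        # As mode 1, and the sender must share that address's subnet.
        return any(i.ip == target and sender in i.network for i in incoming)
    raise ValueError("only modes 0, 1 and 2 are modeled in this sketch")


# Hypothetical addresses: the gateway is held as a /32 on the host-side veth.
for mode in (0, 1, 2):
    print(mode, would_reply(mode, "10.244.0.1", "10.244.0.5", ["10.244.0.1/32"]))
```

Under these assumptions, a host-side veth holding the gateway address as a /32 answers in modes 0 and 1 but stays silent in mode 2, which matches the symptoms described here. With a /24 on the same interface, mode 2 would answer as well.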

I ran the following:

sysctl -w net.ipv4.conf.all.arp_ignore=1

and the local-path-provisioner came up and pod networking started working.
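
A host can be checked for this condition before creating a cluster. The sketch below is one hedged way to do that in Python: it reads the `all` value from procfs and classifies it using only what this report observed (0 and 1 worked, 2 failed). The function names and the "unknown" bucket for other values are assumptions, not part of kind.

```python
from pathlib import Path


def read_all_arp_ignore(proc_sys="/proc/sys"):
    """Return net.ipv4.conf.all.arp_ignore as an int, read from procfs."""
    path = Path(proc_sys) / "net" / "ipv4" / "conf" / "all" / "arp_ignore"
    return int(path.read_text().strip())


def classify(value):
    """Classify a value using only the observations in this report."""
    if value in (0, 1):
        return "ok"        # both values were reported to work
    if value == 2:
        return "affected"  # the value reported to break pod networking
    return "unknown"       # assumption: other modes were not tested here
```

On a Linux host, `classify(read_all_arp_ignore())` would return "affected" when the setting described in this issue is active. The `proc_sys` parameter exists only so the reader can point the function at a different root when experimenting.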

Note: I tested other systems like Fedora 41 and the issue was not present as the host system sets net.ipv4.conf.all.arp_ignore=0. This happens on systems where the host sets net.ipv4.conf.all.arp_ignore=2 by default.

As such I have created a patch which has worked for me in my testing, for your considerations:

#3881

However I'm a little perplexed: I would be kinda surprised if I was the first person to find this ? Did I do something weird here? I promise I searched around, if I missed another report staring me right in the face I'm sorry! 😂

@shaneutt shaneutt added the kind/bug Categorizes issue or PR as related to a bug. label Mar 1, 2025
@aojea
Contributor

aojea commented Mar 1, 2025

I don't know if this is better solved in the CNI plugin itself, so that it does not depend on kernel parameters.

Over Christmas I created my own CNI plugin, and it uses the "onlink" flag to avoid ARP:

https://github.com/aojea/kindnet/blob/f56c64b2d5613c16a78c36472374e101fa5c63e7/cmd/cni-kindnet/netdev.go#L179-L185

@BenTheElder maybe we should switch to the cni-kindnet plugin, that already supports portmap and is a single binary instead of having to chain multiple like we do today https://kindnet.es/docs/design/cni/
This has been running since January on the kubernetes jobs that use kindnet directly

@aojea
Contributor

aojea commented Mar 2, 2025

I checked that the onlink flag does not work with arp_ignore=2 #3882

@shaneutt this is a per-interface option, so this should be fixed in the containernetworking ptp CNI plugin cc: @squeed

sysctl -a | grep arp_ignore
net.ipv4.conf.all.arp_ignore = 2
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.knet224f398d.arp_ignore = 2
net.ipv4.conf.knet37a08c0b.arp_ignore = 0
net.ipv4.conf.knetb1ce758b.arp_ignore = 0
net.ipv4.conf.knetf5b10fa1.arp_ignore = 0
net.ipv4.conf.lo.arp_ignore = 0

the fix sounds simple, just set arp_ignore to 0 in the interface, it is already done for other things
https://github.com/search?q=repo%3Acontainernetworking%2Fplugins%20sysctl&type=code

@aojea
Contributor

aojea commented Mar 2, 2025

hmm, no, it seems the "all" setting overrides the per-interface setting :/
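
The kernel's ip-sysctl documentation describes this behavior: the value used for an ARP request received on an interface is the maximum of conf/all/arp_ignore and that interface's own arp_ignore. As a rough illustration, the Python sketch below applies that rule to the sysctl output pasted above. The function name and the parsing approach are assumptions made for the example.

```python
import re

# Output copied from the `sysctl -a | grep arp_ignore` run above.
SYSCTL_OUTPUT = """\
net.ipv4.conf.all.arp_ignore = 2
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.knet224f398d.arp_ignore = 2
net.ipv4.conf.knet37a08c0b.arp_ignore = 0
net.ipv4.conf.knetb1ce758b.arp_ignore = 0
net.ipv4.conf.knetf5b10fa1.arp_ignore = 0
net.ipv4.conf.lo.arp_ignore = 0
"""

LINE = re.compile(r"^net\.ipv4\.conf\.(\S+)\.arp_ignore\s*=\s*(\d+)$")


def effective_arp_ignore(sysctl_output):
    """Map each real interface to max(all, per-interface value)."""
    values = {}
    for line in sysctl_output.splitlines():
        match = LINE.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    all_value = values.get("all", 0)
    # "all" and "default" are pseudo-entries, not interfaces.
    return {
        name: max(all_value, value)
        for name, value in values.items()
        if name not in ("all", "default")
    }


for name, value in effective_arp_ignore(SYSCTL_OUTPUT).items():
    print(f"{name}: effective arp_ignore = {value}")
```

Every interface in that output ends up with an effective value of 2, which is why writing 0 to a single interface does not help while `all` stays at 2.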

@aojea
Contributor

aojea commented Mar 2, 2025

I honestly do not know what is the best place to handle this

@aojea
Contributor

aojea commented Mar 3, 2025

@shaneutt I didn't find any place that indicates where arp_ignore is defaulted to 2, is that set on the distro or is there any other software setting it ?

The only places I found are some guides to configure the host with that option https://discourse.ubuntu.com/t/ubuntu-24-04-server-diy-router-project-with-ipv6-and-wireguard/52102 , but that is a user action

@BenTheElder
Member

> The only place I found are some guides to configure the host with that option https://discourse.ubuntu.com/t/ubuntu-24-04-server-diy-router-project-with-ipv6-and-wireguard/52102 , but that is an user action

We should reasonably attempt to defend against user actions anyhow.

node wide, we already set some sysctl for networking in https://github.com/kubernetes-sigs/kind/tree/main/images/base/files/etc/sysctl.d

we should only set it here if we think it will be desirable without kindnetd in use.

> @BenTheElder maybe we should switch to the cni-kindnet plugin, that already supports portmap and is a single binary instead of having to chain multiple like we do today https://kindnet.es/docs/design/cni/

Interesting. It does add a bit more to maintain and patch (deps) here though.

@shaneutt
Member Author

shaneutt commented Mar 10, 2025

> @shaneutt I didn't find any place that indicates where arp_ignore is defaulted to 2, is that set on the distro or is there any other software setting it ?
>
> The only place I found are some guides to configure the host with that option https://discourse.ubuntu.com/t/ubuntu-24-04-server-diy-router-project-with-ipv6-and-wireguard/52102 , but that is an user action

The distribution sets this and (as you came to realize above) the global setting overrides individual interface settings. Whatever the host has set gets passed down to containers in the container runtimes I have tested for this (podman and docker).

The most recent Ubuntu desktop is the only distribution I've run into so far that defaults this to 2 (as noted in my original description, the server edition at the same version does not seem to).

> maybe we should switch to the cni-kindnet plugin, that already supports portmap and is a single binary instead of having to chain multiple like we do today https://kindnet.es/docs/design/cni/

Switching seems fine, but also a large undertaking. So to me at least it seems reasonable as a stop-gap solution to set this in an opinionated manner until that is completed. My thinking is that I expect Ubuntu desktop is a fairly popular place to use kind, so I anticipate more reports of this issue soon as people update to the latest? 🤔

@kundan2707
Contributor

/sig network

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Mar 10, 2025
@BenTheElder
Member

> node wide, we already set some sysctl for networking in https://github.com/kubernetes-sigs/kind/tree/main/images/base/files/etc/sysctl.d

Something like this may be the best solution, if we expect it to be reasonable with other CNI installations. (...yes?)

We generally want to respect host settings (e.g. MTU) but this seems like one we might want to just always override at the node level.

@aojea
Contributor

aojea commented Mar 10, 2025

> We generally want to respect host settings (e.g. MTU) but this seems like one we might want to just always override at the node level.

agree, @shaneutt do you mind changing your PR to define this value in https://github.com/kubernetes-sigs/kind/tree/main/images/base/files/etc/sysctl.d and add a comment with the rationale?

@shaneutt
Member Author

Sure no problem! 🖖

5 participants