Nexus ICMP paranoia breaks Path MTU Discovery #7998

Open
jclulow opened this issue Apr 17, 2025 · 4 comments · May be fixed by #8194
Labels
networking Related to the networking. nexus Related to nexus

@jclulow
Collaborator

jclulow commented Apr 17, 2025

It would appear that we wholesale drop any ICMP messages in or out of the Nexus zone. I generally see this as counter-productive because it inhibits diagnostics, but it is also actively harmful: some ICMP messages are load-bearing; e.g., Fragmentation Needed messages (ICMPv4 type 3, code 4) are required for Path MTU Discovery to work.

This caused problems in the lab recently, where some of our traffic traverses a tunnel with a current MTU of 1402 bytes (down from the 1500 that one expects from Ethernet). When remote hosts correctly support PMTUD, this is not a problem, but Nexus instances specifically were effectively uncontactable through the VPN.
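To put numbers on the MTU drop described above, here is a small sketch of the arithmetic. It assumes IP and TCP headers without options (20 bytes each for IPv4 and TCP, 40 bytes for IPv6); the function name is made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_tcp_payload(mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


ethernet = max_tcp_payload(1500)
tunnel = max_tcp_payload(1402)
print(ethernet, tunnel, ethernet - tunnel)  # prints: 1460 1362 98
```

A sender that still believes the path MTU is 1500 will emit full-size segments that are up to 98 bytes too large for the tunnel. With the Don't Fragment bit set, those packets can only be dropped, and the sender learns the smaller MTU only if the resulting Fragmentation Needed message reaches it.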

We should confirm that PMTUD works by default on services exposed by the rack.

@jclulow jclulow added networking Related to the networking. nexus Related to nexus labels Apr 17, 2025
@rcgoodfellow
Contributor

I think the answer to this is going to be operator control over the OPTE firewall rules for the system-level VPC that Nexus and other services are in. We could provide an interface for operators to manage those rules, similar to how owners of project VPCs manage firewall rules in that context.

@jclulow
Collaborator Author

jclulow commented Apr 17, 2025

That sounds like a good generic solution, but presumably we can begin to allow send and receive of specifically the Fragmentation Needed messages required for PMTUD in the meantime, in the existing ruleset Nexus configures for itself? I don't believe you need the Echo Request and Reply messages, for example, or just about anything else.
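As a rough sketch of the narrow allowance suggested here, the predicate below accepts only the messages PMTUD depends on: ICMPv4 Destination Unreachable with code 4 (Fragmentation Needed) and, as the IPv6 counterpart, ICMPv6 Packet Too Big. The function and constant names are hypothetical; this is not the OPTE or Nexus rule format.

```python
ICMPV4_DEST_UNREACHABLE = 3  # ICMPv4 type
ICMPV4_FRAG_NEEDED = 4       # ICMPv4 code within Destination Unreachable
ICMPV6_PACKET_TOO_BIG = 2    # ICMPv6 type


def is_pmtud_message(ip_version: int, icmp_type: int, icmp_code: int) -> bool:
    """Return True only for the ICMP messages that PMTUD relies on."""
    if ip_version == 4:
        return (icmp_type == ICMPV4_DEST_UNREACHABLE
                and icmp_code == ICMPV4_FRAG_NEEDED)
    if ip_version == 6:
        return icmp_type == ICMPV6_PACKET_TOO_BIG
    return False


print(is_pmtud_message(4, 3, 4))  # Fragmentation Needed: True
print(is_pmtud_message(4, 8, 0))  # Echo Request: False
print(is_pmtud_message(4, 0, 0))  # Echo Reply: False
```

Note that the decision needs both the type and the code for IPv4; matching on the ICMP protocol number alone would let Echo and everything else through as well.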

@rcgoodfellow
Contributor

rcgoodfellow commented Apr 17, 2025

It's not totally clear to me what the more expedient path here is. Filtering on particular ICMP types and codes would require additions to the OPTE filtering machinery (i.e., it's not just filtering on a protocol type), and we'll bump into oxidecomputer/opte#369 as well along either solution path, since Destination Unreachable (DU) packets carry the IP header and leading 64 bits of the original datagram.
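For reference, here is a minimal sketch of what a Fragmentation Needed message looks like on the wire, following the RFC 792 / RFC 1191 layout: an 8-byte ICMP header whose last two bytes hold the next-hop MTU, followed by the original IP header and the first 8 bytes of its payload. The parser below is illustrative only (the function name and returned keys are made up, it does not verify checksums, and the port fields are meaningful only when the embedded protocol is TCP or UDP).

```python
import ipaddress
import struct


def parse_frag_needed(icmp: bytes) -> dict:
    """Parse an ICMPv4 Fragmentation Needed message (type 3, code 4)."""
    if len(icmp) < 8:
        raise ValueError("truncated ICMP header")
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp[:8])
    if (icmp_type, code) != (3, 4):
        raise ValueError("not a Fragmentation Needed message")

    inner = icmp[8:]  # original IP header + leading 64 bits of its payload
    if len(inner) < 20:
        raise ValueError("truncated embedded IP header")
    ihl = (inner[0] & 0x0F) * 4  # embedded header length in bytes
    if ihl < 20 or len(inner) < ihl + 8:
        raise ValueError("truncated embedded datagram")

    # For TCP and UDP the first four payload bytes are the two ports.
    src_port, dst_port = struct.unpack("!HH", inner[ihl:ihl + 4])
    return {
        "next_hop_mtu": next_hop_mtu,
        "protocol": inner[9],
        "src": str(ipaddress.IPv4Address(inner[12:16])),
        "dst": str(ipaddress.IPv4Address(inner[16:20])),
        "src_port": src_port,
        "dst_port": dst_port,
    }


# Hand-built sample: a 1500-byte TCP packet with DF set hit a 1402-byte hop.
inner_ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 0, 0x4000, 64, 6, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
inner_tcp = struct.pack("!HHI", 443, 50000, 1)  # ports + sequence number
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1402) + inner_ip + inner_tcp
print(parse_frag_needed(sample))
```

The embedded header is what lets a receiver tie the message back to the flow that triggered it, so any filter that wants to admit only relevant DU messages has to look past the outer ICMP header into that inner datagram.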

@FelixMcFelix
Contributor

I've written up some testing on #8194 (comment), but it looks as though the fix for oxidecomputer/opte#748 has fixed the underlying bad MSS behaviour.

Now that we're passing the MSS through OPTE untouched, illumos seems able to converge on the correct MSS for the path without any ICMP DU messages (case in point -- we're observing a narrowed MSS on dogfood now when I access it remotely). I think allowing through these ICMP families is still useful (and is a useful capability for customers), so we'll look into getting it in but we can probably be more confident that the system works without issue over tunnelled connections again.
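One way to check the MSS seen on a connection is to read the Maximum Segment Size option (kind 2, length 4) out of the TCP options of a captured SYN. The sketch below is a generic TCP option walker, not code from OPTE or illumos; the function name and the sample bytes are made up for illustration.

```python
import struct

TCP_OPT_END = 0  # end of option list
TCP_OPT_NOP = 1  # single-byte padding
TCP_OPT_MSS = 2  # maximum segment size, total length 4


def find_mss_option(options: bytes):
    """Return the MSS advertised in raw TCP options, or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCP_OPT_END:
            break
        if kind == TCP_OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options):
            break  # option kind with no length byte
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed or truncated option
        if kind == TCP_OPT_MSS and length == 4:
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


# MSS = 0x0552 = 1362, then NOP, then window scale (kind 3) with shift 7.
syn_options = bytes([2, 4, 0x05, 0x52, 1, 3, 3, 7])
print(find_mss_option(syn_options))  # prints: 1362
```

An advertised MSS of 1362 would correspond to a 1402-byte MTU under the usual assumption of 20-byte IPv4 and TCP headers.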
