[Feature Request] MTU discovery aka iterate on packet size #873
Comments
@jmanteau could this be done during normal tracing or would this need to be a separate mode of operation? I've only pondered this briefly but I think it would need to be a separate mode, as nodes which respond with an MTU error would not return the response used for normal tracing. I'm keen to keep the feature set of Trippy narrowly focused and so would like to implement this if it could be an addition to the basic trace data, much as ICMP extension data is for example.
From an Internet point of view, the MTU part can be useful but less interesting. Where adding this info can be interesting is in the Enterprise world, with diverse WAN links and encap done over the network.
I think being able to see the "MTU Path" could be valuable, especially in ECMP contexts where the MTU may differ between paths. Another idea would be to send extra probes, outside of those used for normal tracing, with varying MTUs such that path MTU data could be obtained alongside, but not instead of, the trace data. I think we'd have to PoC this and see how feasible it is. Is this something you'd be able to attempt?
From a networking point of view and general programming, yes I could. However I am a complete novice in Rust and I don't know the Trippy codebase.
@jmanteau Thanks for offering. I think the concept and mechanism for MTU discovery is well understood; the "trick" here is figuring out how to integrate it with the existing codebase, so I don't think a PoC outside of Trippy would add much.
@fujiapple852, I echo the use-case of this feature request and would point out the following that may help as part of the implementation...
@mpenning (cc @jmanteau) I'm keen on the idea of adding the ability to determine and display path MTU in Trippy, but I'm not yet clear on the right implementation approach, and there are a number of complexities in Trippy not present in simpler tools. I've written up some notes on the options and challenges below.

Probe Options

The two broad options I can see are:

Intrusive

The first option, which I'll name "intrusive", would involve varying the MTU of legitimate tracing probes, much as Trippy varies the TTL of probes today. Trippy would interpret ICMP DestinationUnreachable messages with code 4 (i.e. "Fragmentation needed, DF bit set") and use these to determine the lowest path MTU. The advantage of this approach is that it works "in-band" with the existing tracing strategy, i.e. we don't have to send any additional probe packets for path MTU discovery. However, if a probe fails along the path due to the MTU, it will prevent that host (and any subsequent hosts on the path) from returning the response it otherwise would have (i.e. ICMP TimeExceeded), which will distort the tracing statistics. For example, a probe with an initial TTL of 4 may generate an error at the 3rd hop on the path due to the MTU, and thus the TimeExceeded error that would otherwise have been returned by the 4th hop on the path will never occur; instead the tracer will see the MTU error from hop 3. This also begs the question: during MTU discovery, would Trippy vary the probe size per hop within a round or between rounds? My gut feeling is that Trippy should use a consistent packet size for all probes within a single round and only vary the size (when needed) between rounds. This would lead to some rounds being "truncated" (no probe will get past the first host which rejects the probe due to the MTU), which seems weird.

Dedicated

The second option, which I'll name "dedicated", would involve the tracer sending dedicated probes, over and above those used for regular tracing, for the sole purpose of determining the path MTU. This has the advantage of avoiding any negative interaction with the normal statistics used for tracing. It is not immediately clear how this would work as, assuming only UDP here, Trippy needs a way to distinguish probes by sequence number and uses various techniques (classic, Paris, Dublin) to stuff this into an outgoing probe packet; it would need to do this for both regular probes and dedicated MTU probes. This would need to be tightly integrated into the tracing sliding window algorithm, which is already complex.

Other Issues

Either way, Trippy would also have to have logic to decide when to vary the packet length and how to deal with legacy systems which do not set the "Next-Hop MTU" in the ICMP DestinationUnreachable message. The guidance in rfc1191 section 7.1 (stepping down through a table of common MTU "plateaus") seems sensible here; see the sketch after this section.

Flows / Paris / Dublin

There is also the issue of "flows", which Trippy records for UDP/paris and UDP/dublin. The path MTU is, clearly, a function of the path, so each path will potentially have a separate MTU, and each such MTU can change over the lifetime of the trace. It isn't immediately clear to me how Trippy could associate a given DestinationUnreachable with code 4 to an existing flow, given a flow is identified by the list of hosts traversed, and this list will be truncated when probes fail due to the MTU. Maybe path MTU discovery would only be supported for UDP/classic mode as a simplification.
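To make the "intrusive" response handling and the rfc1191 fallback concrete, here is a minimal sketch in Rust. The function and constant names are hypothetical and this is not Trippy's actual internals; the byte offsets follow the standard ICMPv4 wire format and the plateau values are taken from rfc1191 section 7.1:

```rust
/// MTU "plateaus" from rfc1191 section 7.1, used when a legacy router
/// returns a Fragmentation Needed error without a Next-Hop MTU.
const PLATEAUS: [u16; 11] = [
    65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68,
];

const ICMP_DEST_UNREACHABLE: u8 = 3;
const ICMP_FRAG_NEEDED: u8 = 4;

/// Given the raw bytes of an ICMPv4 message, return the next probe size
/// to try, or `None` if the message is not a Fragmentation Needed error.
/// `current_size` is the size of the probe which elicited the error.
fn next_probe_size(icmp: &[u8], current_size: u16) -> Option<u16> {
    if icmp.len() < 8 || icmp[0] != ICMP_DEST_UNREACHABLE || icmp[1] != ICMP_FRAG_NEEDED {
        return None;
    }
    // rfc1191: the Next-Hop MTU is carried in the low-order 16 bits of
    // the otherwise unused field (octets 6 and 7) of the ICMP header.
    let next_hop_mtu = u16::from_be_bytes([icmp[6], icmp[7]]);
    if next_hop_mtu > 0 {
        Some(next_hop_mtu)
    } else {
        // Legacy system: step down to the next plateau below the size
        // of the failed probe, per rfc1191 section 7.1.
        PLATEAUS.iter().copied().find(|&p| p < current_size)
    }
}
```

The same fallback logic would apply in the "dedicated" mode; only the source of the ICMP bytes would differ.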
Packet Size

I don't recall why 1024 was chosen as the maximum packet size initially; it was perhaps an arbitrary choice or inspired by other tools. Of course, for tracing purposes, this limit on packet size is not a big issue; it only becomes an issue for path MTU discovery on links which support large frames. One practical issue with increasing this limit is that Trippy attempts to avoid heap allocations in the tracer hot path and so allocates the probe buffer on the stack. Increasing this substantially above 1024 would increase the required maximum stack size of the application, though I don't think even 64k would be a deal breaker. Changing the maximum packet size also introduces complexities for the implementation of IPv6/dublin (see the IPv6 section below), but these should be possible to overcome, perhaps by having separate maximum packet size limits for different modes. This would need some more thought.

Don't Fragment Bit

Trippy sets the DF bit for IPv4 unconditionally; this was done mainly to allow for UDP/dublin, which uses the IPv4 identification field to encode the sequence number and must therefore avoid actual fragmentation.

IPv6

IPv6 has no in-network fragmentation and hence no DF bit, so another mechanism is needed to support path MTU discovery, as set out in the various RFCs. I haven't explored this yet. IPv6/UDP/dublin would be problematic as the sequence number is encoded as the size of the payload, and so the two features would be in direct conflict; MTU discovery would need to be disallowed for IPv6/UDP/dublin tracing.

Payload Patterns

Trippy allows users to specify an octet to use as the repeated payload. This should work as expected when the payload is varied in length as the packet size is adjusted.
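As a minimal sketch of the stack-allocated buffer and repeated payload octet described above (the names and the 1024 limit here are illustrative, not Trippy's actual code):

```rust
/// Hypothetical upper bound on probe size; raising it grows the stack
/// frame of the tracer hot path rather than forcing a heap allocation.
const MAX_PACKET_SIZE: usize = 1024;

/// Fill the stack-allocated probe buffer with the user's repeated
/// payload octet and return the slice for this round's packet size.
fn fill_payload(buf: &mut [u8; MAX_PACKET_SIZE], packet_size: usize, pattern: u8) -> &[u8] {
    assert!(packet_size <= MAX_PACKET_SIZE);
    buf[..packet_size].fill(pattern);
    &buf[..packet_size]
}

fn main() {
    // The buffer lives on the stack; no allocation whatever size is
    // chosen for a given round of MTU discovery.
    let mut buf = [0u8; MAX_PACKET_SIZE];
    let payload = fill_payload(&mut buf, 512, 0xAB);
    assert_eq!(payload.len(), 512);
}
```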
I don't feel very strongly about MTU detection as it is not a need that I have, but I understand why it is relevant to others. I just wanted to jump in with my opinion on the first option. That said, I'm wondering if MTU is something that changes often? Maybe a separate mode or feature could be used for its detection, maybe even option one.
Hello @fujiapple852, I've considered your comment above and thought I'd share the following...

PMTU Detection CLI option

I think the thing that makes the most sense is that PMTU detection should NOT be on by default and should be triggered from a CLI option. Additionally, I think the user should be required to specify the max MTU size to be detected in the CLI option. For the case of nodes that do not share what the MTU should be in the ICMP response, this avoids the issue of having to send an absurd number of probes for MTU values beyond what the user knows is configured in their network (and sometimes people will know what the max MTU should be). In the case that they do not know what the max should be, they can set the max MTU to some large value (beyond what they believe the equipment is capable of).

PMTU unit of measure

In all cases, I think the MTU should be specified as the maximum IP packet size in bytes (including IPv4 / IPv6 headers and associated options).

PMTU implementation ideas

Based on a comparison of intrusive and dedicated MTU detection, I think a dedicated MTU detection mode should be used; however, I think it's helpful to keep tracing the path as usual while detecting PMTU, because you want to know whether a hop is dropping packets. If a specific combination of IP address / hop TTL value is dropping all packets, it's pointless to try detecting the MTU at that node and you should just bypass MTU detection while that node drops all packets. Furthermore, once you find that a node fails to return an MTU probe, the question is open whether the node dropped the probe due to MTU size, packet loss, or control-plane ICMP rate-limiting. As such, I think the user should be given the option of specifying how many MTU probes should be sent per packet size, per hop. This places the burden on the user to set how reliable they want the MTU detection to be. A sketch of such a per-hop search loop follows below.

Flows / Paris / Dublin

I personally think that MTU detection for ICMP, UDP, and TCP are all valuable; I have witnessed different results from ICMP vs UDP vs TCP traces.

Large Packet size use-case

The reason I mention fragmented IP packet sizes up to 64k is because real networks may have problems with:

DF-bit

You are correct that there is no DF bit in IPv6, but there is an IPv6 more-fragments bit. IPv6 fragmentation is still permitted by the sending station, so I believe that testing it is useful in IPv6 corner cases.
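To illustrate the per-hop search loop suggested above, here is a rough sketch assuming a dedicated mode, a user-supplied maximum MTU from the proposed CLI option, and a user-supplied probe count per packet size. All names are hypothetical, the probe send/receive machinery is elided, and I've used a binary search rather than rfc1191 plateau stepping, though either would work:

```rust
/// Possible outcomes of sending a single MTU probe of a given size to a
/// given hop (the actual packet send/receive is elided in this sketch).
enum ProbeOutcome {
    /// The expected TimeExceeded (or destination) response came back.
    Passed,
    /// Fragmentation Needed came back, optionally with a Next-Hop MTU.
    TooBig { next_hop_mtu: Option<u16> },
    /// Silence: MTU drop, packet loss, or ICMP rate-limiting; can't tell.
    NoResponse,
}

/// Binary search for the largest probe size <= `max_mtu` (taken from the
/// proposed CLI option) that a hop passes, sending up to `probes_per_size`
/// probes at each size so the user controls detection reliability.
fn discover_hop_mtu(
    max_mtu: u16,
    probes_per_size: u16,
    mut send_probe: impl FnMut(u16) -> ProbeOutcome,
) -> Option<u16> {
    // 68 is the minimum MTU every IPv4 link must support (rfc791).
    let (mut lo, mut hi) = (68u32, u32::from(max_mtu));
    let mut best = None;
    while lo <= hi {
        let size = ((lo + hi) / 2) as u16;
        let mut passed = false;
        for _ in 0..probes_per_size {
            match send_probe(size) {
                ProbeOutcome::Passed => {
                    passed = true;
                    break;
                }
                // A router which reports its Next-Hop MTU lets us skip
                // straight down rather than halving blindly.
                ProbeOutcome::TooBig { next_hop_mtu: Some(mtu) } if mtu < size => {
                    hi = hi.min(u32::from(mtu));
                }
                // TooBig without an MTU, or no response at all: retry up
                // to probes_per_size times before stepping the size down.
                _ => {}
            }
        }
        if passed {
            best = Some(size);
            lo = u32::from(size) + 1;
        } else {
            hi = hi.min(u32::from(size) - 1);
        }
    }
    best
}
```

The `probes_per_size` retries are what distinguish "this hop rejects this size" from ordinary loss or rate-limiting, per the reliability trade-off described above.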
Trippy is an incredible tool that I have started to integrate into my tooling.
One interesting point could be to iterate on the packet size to discover the maximum size that passes without fragmentation (on each hop), hence discovering the allowable MTU on the path.
In a way, this would reimplement PMTUD and show it graphically in the output.