[dns-operations] For darpa.mil, EDNS buffer == 1232 is *too small*. :-(

Petr Špaček petr.spacek at nic.cz
Tue Apr 21 11:16:44 UTC 2020

Beware, wall of text ahead.

On 21. 04. 20 10:52, Brian Dickson wrote:
> On Tue, Apr 21, 2020 at 1:04 AM Petr Špaček <petr.spacek at nic.cz> wrote:
>     On 21. 04. 20 9:00, Paul Vixie wrote:
>     > On Tuesday, 21 April 2020 06:20:04 UTC Petr Špaček wrote:
>     Unfortunately I can't; we never got to the root cause.
>     It is the same story again and again:
>     Probably one of the ISPs in the chain on the affected link was doing weird stuff with big packets. We as Knot Resolver developers were not "their customer" but merely a "supplier of their customer", so they refused to talk to us, and their actual customer lost interest as soon as it started to work reliably for them. That's all we have, i.e. nothing.
> I'm confused by the use of the definite and indefinite, which are in disagreement.
> ..."the root cause" suggests a single instance that was being investigated.
> ..."again and again" suggests multiple occurrences.
> If it was multiples, was it all involving a single network, perhaps?
> Or can you clarify if this was only a single instance of this happening?

Sorry, a non-native English speaker here.

We have encountered this on various networks all over the place, mainly because Turris Omnia comes with a DNSSEC-validating resolver by default.

> Can you share any other diagnostic information or observations? 
> Was there fragmentation, or just packet loss above a certain size?
> (Fragmentation would suggest smaller-than-expected MTU or perhaps tunnels, while packet loss would suggest possible MTU mismatch on a single link.)
> Understanding whether this was operator error by an ISP, versus some other non-error situation with real MTU below 1400, is important.

Except for rare cases, users just tell us "this magic number works, bye", so there is really nothing to share. If I had data, I would share it.
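
For reference on the MTU question above: the largest DNS payload that fits into one unfragmented UDP datagram is the path MTU minus the IP and UDP headers. A minimal Python sketch of that arithmetic, assuming fixed header sizes (no IPv4 options, no IPv6 extension headers, no tunnel overhead); the function names are made up for illustration:

```python
# Fixed header sizes in bytes, assuming no IPv4 options and no IPv6 extension headers.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8


def max_udp_payload(path_mtu: int, ipv6: bool = True) -> int:
    """Largest UDP payload that fits in a single unfragmented packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_header - UDP_HEADER


def required_mtu(payload: int, ipv6: bool = True) -> int:
    """Smallest path MTU that carries the given UDP payload unfragmented."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return payload + ip_header + UDP_HEADER


print(max_udp_payload(1280))           # 1232 (1280 is the IPv6 minimum MTU)
print(required_mtu(1410))              # 1458 over IPv6
print(required_mtu(1410, ipv6=False))  # 1438 over IPv4
```

Under these assumptions a 1232-byte answer fits on any link that meets the IPv6 minimum MTU of 1280, while a 1410-byte answer needs a path MTU of at least 1458 (IPv6) or 1438 (IPv4).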

Hopefully my last presentation about server selection algorithms @ DNS-OARC 31 [1] is a strong enough indicator that my team publishes everything of value which comes from our tests and experience, even if it does not show our products or decisions in the best light.

[1] https://indico.dns-oarc.net/event/32/contributions/711/

> This all has very real and very serious consequences to the entirety of the DNS ecosystem.
Is this a generic statement, or does it relate specifically to the difference between 1410 and 1232? If so, do you have data to support such a strong statement?

>     No, I did mean "would":
>     - OpenDNS's experience says that in data centers 1410 works.
>     - Our experience says that outside of data centers 1410 does not always work.
> Are there additional instances where 1410 did not work, or are you using that single instance to support the "does not always work" position?
>     Let's be precise here. The proposal on the table is to change _default values in configuration_.
>     Nobody is proposing to impose "arbitrary maximum response buffer size" and weld it onto DNS software. Vendors are simply looking for defaults which work for them and their customer/user base.
> Actually, it is both.
> Just as a reminder: UDP responses will be sent with a not-to-exceed size of MIN(configured authority max, requestor's max from EDNS0 UDP_BUFSIZE).
> If the client has a smaller value than the server, that is the maximum size of a response that the server will send.
> If that value is too small, it has consequences.
> If that value is used for that client, to all servers, it has consequences for all servers' traffic to that client.
> If that value comes from default, and is used by the vast majority of that package's operators, that affects traffic from all servers to all of those operators.
> If that package represents a large portion of the traffic from resolvers, that's a big deal.
> A significant shift in the amount of TCP traffic could occur due to TC=1 responses, with a non-linear relationship between the apparent decrease in MTU, and the amount of TCP traffic.
> Particularly with large RSA key sizes and large signatures, and a large proportion of DNSSEC traffic, the impact could be severe.
> If a DNS authority operator were to begin providing DNSSEC for their customer base, DNSSEC deployment could jump from 1%-2% to 40% overnight.
> (Hint: at least one major DNS hosting provider has strongly suggested this is likely to occur quite soon.)

Well, in that case I strongly suggest this unspecified large operator should go with an ECC algorithm instead of RSA ;-)
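
The size rule quoted above, MIN(configured authority max, requestor's EDNS0 buffer size), can be sketched roughly as follows. This is a simplified model in Python, not any particular server's code: the function names are hypothetical, 512 bytes is the classic limit for queries without EDNS, and advertised values below 512 are treated as 512 as RFC 6891 requires.

```python
from typing import Optional

CLASSIC_UDP_LIMIT = 512  # applies when the query carries no EDNS0 OPT record


def udp_response_limit(server_max: int, client_bufsize: Optional[int]) -> int:
    """Not-to-exceed size of a UDP response for one query."""
    if client_bufsize is None:
        # No EDNS0 in the query: fall back to the classic DNS limit.
        return CLASSIC_UDP_LIMIT
    # Advertised sizes below 512 are treated as 512.
    advertised = max(client_bufsize, CLASSIC_UDP_LIMIT)
    return min(server_max, advertised)


def needs_truncation(response_size: int, server_max: int,
                     client_bufsize: Optional[int]) -> bool:
    """True if the server would answer with TC=1, pushing the client to TCP."""
    return response_size > udp_response_limit(server_max, client_bufsize)


# A hypothetical 1300-byte response from a server configured with max 1410:
print(needs_truncation(1300, 1410, 1232))  # True: client default 1232
print(needs_truncation(1300, 1410, 1410))  # False: client default 1410
```

In this model the client's default only matters for responses whose size falls between the two candidate values.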

> And with a 5% to 10% decrease in actual MTU (offered by clients in EDNS), the proportion of TCP traffic could triple or worse.

Sure. Do you have data to share? What does "triple or worse" mean in practice?

If TCP before the change is at 0.01 % of total volume, increasing the traffic tenfold is most likely not going to make any difference at all.
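
To put that in absolute numbers, here is a small Python sketch. The sample of one million queries is hypothetical; only the 0.01 % share and the tenfold factor come from the paragraph above.

```python
def tcp_queries(total_queries: int, tcp_share_percent: float,
                growth: float = 1.0) -> int:
    """Number of TCP queries among total_queries for a given TCP share."""
    return round(total_queries * tcp_share_percent / 100 * growth)


TOTAL = 1_000_000  # hypothetical sample size

before = tcp_queries(TOTAL, 0.01)            # TCP queries before the change
after = tcp_queries(TOTAL, 0.01, growth=10)  # TCP queries after a tenfold increase

print(before, after, TOTAL - after)  # 100 1000 999000
```

Even after the tenfold increase, TCP would account for 0.1 % of the sample, with the remaining 999000 queries still on UDP.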

For one real data point, please see the attached graph cztls-tcp-ratio.png with fresh data from the CZ TLD authoritative servers. You practically cannot see the TCP in it.

> (This adversely affects both clients and servers. Plus, anything that adversely affects authority servers affects all resolvers, not just those small MTU resolvers, i.e. can/will cause collateral damage.)

I think the theory is clear to people on this mailing list.

Do you have any measurements which support the hypothesis that 1410 is much better in terms of TCP traffic than 1232 (or any other number)?

I just checked response sizes for the CZ TLD auths, and it turns out that the maximum possible response size for the CZ TLD is ~780 bytes, so in our case the difference between 1410 and 1232 is exactly 0 extra TCP connections. See the attachment cztld-response-sizes.png. The CZ TLD uses NSEC3 and a KSK/ZSK pair with algorithm 13 (ECDSA P-256 with SHA-256).
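
The comparison behind that "exactly 0" can be sketched in Python as below. The list of sizes is made up for illustration; only the ~780-byte maximum comes from the measurement above, and the helper name is hypothetical.

```python
from typing import Iterable


def truncated_count(response_sizes: Iterable[int], bufsize: int) -> int:
    """Count responses too large for the given EDNS buffer size."""
    return sum(1 for size in response_sizes if size > bufsize)


# Hypothetical sample of response sizes in bytes, capped at the ~780-byte maximum.
sizes = [120, 310, 455, 640, 780]

extra_tcp = truncated_count(sizes, 1232) - truncated_count(sizes, 1410)
print(extra_tcp)  # 0: nothing exceeds either limit
```

The two defaults would only behave differently for a zone that produces responses larger than 1232 bytes but no larger than 1410 bytes.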

> If 1410 is actually reasonable, we should not shy away from using that number.
> The question is whether the oddball networks showing up are statistically significant, and whether those should dictate the global DNS consensus value for a minimum of "maximum MTU" for default.
> So, pushback on asking for more details, particularly with regard to whether the observed situation(s) are ongoing, as well as some sense of the prevalence or frequency of this, is entirely appropriate.
> Please share as much as you know, particularly if this is a solved/resolved problem that affected one customer only (even if root cause was never discovered.)

I think this was answered above so let's not repeat ourselves.

Let's take a step back:
The world seems to be heading towards TLS and/or QUIC transports, so I think everyone in the DNS ecosystem needs to move the discussion from "oh my, a 0.1 % increase in TCP traffic might be too much" to something more constructive and forward-looking.

Can we focus on TCP/DoT/DoH/DoQ benchmarking projects? Focus on the scalability of all these transports, instead of attempting to keep everything as it has been since 1986?

If you are concerned about scalability, talk to your favorite vendor and ask "what happens if 0.1 %/1 %/10 %/all of my clients switch to DNS-over-TCP/DoT/any other transport?".

If your vendor does not have an answer, talk to me about DNS Shotgun [2]; it can be used on anything which speaks DNS. Cooperation and sponsors are more than welcome!

[2] https://ripe79.ripe.net/presentations/45-benchmarking.pdf

Thank you for your time if you managed to read this far!
Petr Špaček  @  CZ.NIC
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cztld-response-sizes.png
Type: image/png
Size: 3521 bytes
Desc: not available
URL: <http://lists.dns-oarc.net/pipermail/dns-operations/attachments/20200421/ec2ae421/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cztls-tcp-ratio.png
Type: image/png
Size: 2507 bytes
Desc: not available
URL: <http://lists.dns-oarc.net/pipermail/dns-operations/attachments/20200421/ec2ae421/attachment-0003.png>
