[dns-operations] Verisign TLDs, some other servers may trim critical glue from very large referrals

Matt Nordhoff lists at mn0.us
Fri Jan 4 12:33:47 UTC 2019


Hi,

It turns out that, with some authoritative servers, if a client sends
a query specifying a small EDNS buffer size, and the server is
returning a large referral with in-bailiwick nameservers, if the sizes
align just right, some servers will trim *all* critical additional
records from the response, *without* setting the TC bit.

(Run-on sentence, wow.)

If all of a parent zone's servers return such unusable referrals,
Unbound will give up and return SERVFAIL. I don't know how other
recursive servers behave.
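
To make the failure condition concrete, here is a minimal Python sketch of a check for an unusable referral: every NS target is in-bailiwick, no address glue arrived, and TC is clear. The Referral class and the helper names are hypothetical, made up for illustration; this is not how Unbound or any other resolver actually implements the check.

```python
from dataclasses import dataclass, field

ADDRESS_TYPES = {"A", "AAAA"}


@dataclass
class Referral:
    """Hypothetical, already-parsed view of a referral response."""
    zone: str                 # delegated zone, e.g. "example.edu."
    ns_names: list            # NS targets from the authority section
    additional: list = field(default_factory=list)  # (owner, rrtype) pairs
    tc: bool = False          # TC bit from the header


def _norm(name):
    # Lower-case and force exactly one trailing dot.
    return name.lower().rstrip(".") + "."


def in_bailiwick(name, zone):
    name, zone = _norm(name), _norm(zone)
    return name == zone or name.endswith("." + zone)


def is_unusable(ref):
    """True when every NS needs glue, none was sent, and TC is clear."""
    if ref.tc:
        return False  # the client is expected to retry over TCP
    glued = {_norm(owner) for owner, rrtype in ref.additional
             if rrtype in ADDRESS_TYPES}
    for ns in ref.ns_names:
        if not in_bailiwick(ns, ref.zone):
            return False  # this server can be resolved without glue
        if _norm(ns) in glued:
            return False  # at least one usable address was included
    return bool(ref.ns_names)
```

An OPT pseudo-record in the additional section (the "ADDITIONAL: 1" in a response like the one below) does not count as glue here, since only A and AAAA owners are considered.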

This was found in the wild by an Unbound resolver configured with an
EDNS buffer size of 512 bytes, with the domain chattanoogastate.edu
(TLD backend operator: Verisign). For example:

$ dig +bufsize=512 +dnssec +norecurse @b.edu-servers.net chattanoogastate.edu

; <<>> DiG 9.13.5-1+ubuntu16.04.1+deb.sury.org+2-Ubuntu <<>> +bufsize +dnssec +norecurse @b.edu-servers.net chattanoogastate.edu
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53484
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 9, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;chattanoogastate.edu.          IN      A

;; AUTHORITY SECTION:
chattanoogastate.edu.   172800  IN      NS      ns2.chattanoogastate.edu.
chattanoogastate.edu.   172800  IN      NS      ns1.chattanoogastate.edu.
chattanoogastate.edu.   86400   IN      DS      10114 5 2 A22479C3577ABDDA48962F74EECCE16D3EFE14B5C95FD9463BA5A28F CD67CF3A
chattanoogastate.edu.   86400   IN      DS      10114 5 1 2CA51C740D54B8B3EBE5D58BD012196D5584A895
chattanoogastate.edu.   86400   IN      DS      10618 5 1 F8B1C75138745E23976CAD453E812D89E366E5A1
chattanoogastate.edu.   86400   IN      DS      10618 5 2 54592BC341F637A43C0D14F0704B58913A2B1B702266083AAEEF11B8 96E84400
chattanoogastate.edu.   86400   IN      DS      4483 5 1 5BC068184A5BEC46EC3C786AB8722C8E74559A3B
chattanoogastate.edu.   86400   IN      DS      4483 5 2 D19A8289B0EA70DF1F986138200EE3D5BDF5CA4FAEECD439C72847A6 965AF362
chattanoogastate.edu.   86400   IN      RRSIG   DS 8 2 86400 20190109062913 20190102051913 37217 edu. EncjLTtHy8zy5lHiqCe8vRayTf/8Miwik4LhUY9wkntu4uWmltruXD6J OiHAaOeYJeHNzlf0BYLoJCgfySL6uIlyf+3cmxobIvoGTy9bmseg+OWi ckkgTK5n3yU/b4w0J+j51oRGh1H2etBVxdy5a0QnUu10FIRAw6Mzi00+ tJo=

;; Query time: 1 msec
;; SERVER: 2001:503:231d::2:30#53(2001:503:231d::2:30)
;; WHEN: Fri Jan 04 10:57:54 UTC 2019
;; MSG SIZE  rcvd: 500

Gist without mangled line wrapping:

<https://gist.github.com/mnordhoff/ce9f4d1a8c496751ef6e8ab435f0c45f>
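
For anyone who wants to reproduce the query without dig, the following is a rough Python sketch that builds the same kind of request by hand: RD clear (+norecurse), an OPT record advertising a 512-byte UDP payload (+bufsize=512), and the DO bit set (+dnssec). It uses only the standard library; the function names and the fixed query ID are illustrative assumptions, and actually sending the bytes over a UDP socket is left out.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels, without compression."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(qname, qtype=1, bufsize=512, do=True, qid=0x1234):
    """Build a non-recursive query with an EDNS0 OPT record."""
    # Header: ID, flags (all clear, so RD=0), QDCOUNT=1, ANCOUNT=0,
    # NSCOUNT=0, ARCOUNT=1 (the OPT record).
    header = struct.pack("!HHHHHH", qid, 0x0000, 1, 0, 0, 1)
    question = encode_name(qname) + struct.pack("!HH", qtype, 1)  # class IN
    # OPT RR: root owner name, TYPE=41, CLASS carries the UDP payload size,
    # TTL carries ext-rcode/version/flags (DO is 0x8000), RDLENGTH=0.
    ttl = 0x8000 if do else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, ttl, 0)
    return header + question + opt


def parse_header(msg):
    """Pull the TC bit, RCODE and section counts out of a response."""
    _qid, flags, _qd, an, ns, ar = struct.unpack("!HHHHHH", msg[:12])
    return {
        "tc": bool(flags & 0x0200),
        "rcode": flags & 0x000F,
        "answer": an,
        "authority": ns,
        "additional": ar,
    }
```

A response like the one above would come back from parse_header with tc False, 9 authority records, and 1 additional record (the OPT), which is the combination that leaves a resolver with nothing to follow.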

(I'm not affiliated with Chattanooga State. They are aware of the issue.)

I experimented with some of my own (expendable) domains under the TLDs
.info (backend Afilias), .ooo and .xyz (both CentralNic). With them,
_some_ of the TLD servers set TC and some do not, and I haven't been
able to coax Unbound into failing.

I'm running edns-buffer-size.ooo as a demonstration, with a similar
configuration to chattanoogastate.edu.

PowerDNS Authoritative doesn't set TC. [0] I haven't tested other DNS servers.

As background context, with the renewed attention to fragmentation
attacks last year, Let's Encrypt [1] and at least one other CA [2]
have configured the resolvers in their domain validation
infrastructure with an EDNS buffer size of 512 bytes to reduce the
risk of fragmentation.

Let's Encrypt changed their production DNS servers November 15. Since
then, there have been about a half dozen public reports of problems.
The chattanoogastate.edu issue [3] is the only one that's out of the
ordinary -- the others were all caused by nameservers that didn't
support TCP.

I've chatted about this with some people; none of us had any immediate
conclusions. (Also it's the holidays...) I haven't emailed any TLDs.

It turns out RFC 4472 appendix B [4] discusses this problem, and it
even links to a short namedroppers thread from 2005 [5].

Do any of you have thoughts, experiences, quotes from standards track
RFCs, or friends at Verisign? ;-)

Obviously, authoritative and recursive servers _could_ both work
around this, and people probably shouldn't set six DS records anyway.
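
As one possible shape for the authoritative-side workaround, here is a small Python sketch: pack glue greedily into the space left under the client's limit, and set TC if the glue is critical and none of it fit. The function and its parameters are hypothetical and do not describe any particular server's behaviour. The sizes in the usage note are assumptions too: 500 bytes is the response size from the dig output above, and 16 bytes is what an A record costs when its owner name compresses to a 2-byte pointer.

```python
def pack_referral(base_size, glue_sizes, limit, glue_is_critical):
    """Greedily add glue records; return (records_included, tc).

    base_size        -- bytes used by header, question, authority and OPT
    glue_sizes       -- wire size of each candidate glue record, in order
    limit            -- the client's advertised EDNS buffer size
    glue_is_critical -- True when every NS target is in-bailiwick
    """
    if base_size > limit:
        return 0, True  # even the referral itself does not fit
    used = base_size
    included = 0
    for size in glue_sizes:
        if used + size > limit:
            break
        used += size
        included += 1
    # Signal truncation only when the client would otherwise be stuck:
    # glue was needed, some existed, and none of it could be included.
    tc = glue_is_critical and bool(glue_sizes) and included == 0
    return included, tc
```

With pack_referral(500, [16, 16], 512, True) nothing fits and the result is (0, True), so the client retries over TCP; with a limit of 4096 both records fit and TC stays clear. On the recursive side, the matching workaround would be to retry over TCP whenever a referral arrives with no usable glue, whether or not TC was set.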

Should anything be changed?

[0] <https://github.com/PowerDNS/pdns/issues/7315>
[1] <https://mn0.us/L-wy>
<https://community.letsencrypt.org/t/edns-buffer-size-changing-to-512-bytes/77945>
[2] <https://mn0.us/XTNm>
<https://groups.google.com/d/msg/mozilla.dev.security.policy/emREqhAZ3nM/ZNB35l_GBwAJ>
[3] <https://community.letsencrypt.org/t/servfail-while-renewing/80430>
[4] <https://tools.ietf.org/html/rfc4472#appendix-B>
[5] <https://mn0.us/Vn6R>
<https://web.archive.org/web/20060714005533/http://ops.ietf.org:80/lists/namedroppers/namedroppers.2005/msg01032.html>

Happy new year,
-- 
Matt Nordhoff


