[dns-operations] root? we don't need no stinkin' root!

Wed Nov 27 16:18:02 UTC 2019

On 26. 11. 19 12:46, David Conrad wrote:
> On Nov 26, 2019, at 11:33 AM, Jim Reid <jim at rfc1035.com> wrote:
>>> On 26 Nov 2019, at 09:16, Florian Weimer <fw at deneb.enyo.de> wrote:
>>>
>>> Up until recently, well-behaved recursive resolvers had to forward
>>> queries to the root if they were not already covered by a delegation.
>>> RFC 7816 and in particular RFC 8198 changed that, but before that, it
>>> was just how the protocol was expected to work.
>>
>> So what? These RFCs make very little difference to the volume of queries a resolving server will send to the root. QNAME minimisation has no impact at all: the root just sees a query for .com instead of foobar.com. A recursive resolver should already be supporting negative caching and will have a reasonably complete picture of what's in (or not in) the root. RFC 8198 will of course help that but not by much IMO.
> 
> It would appear a rather large percentage of queries to the root (like 50% in some samples) are random strings, between 7 and 15 characters long, sometimes longer. I believe this is Chrome-style probing to determine if there is NXDOMAIN redirection. A good example of the tragedy of the commons, like water pollution and climate change.
> 
> If resolvers would enable DNSSEC validation, there would, in theory, be a reduction in these queries due to aggressive NSEC caching.  Of course, practice may not match theory (https://indico.dns-oarc.net/event/32/contributions/717/attachments/713/1206/2019-10-31-oarc-nsec-caching.pdf). 

The discussion after the talk (including in the hallway :-) was also interesting, and with all due respect for Geoff's work, these slides should be read with some scepticism.

Main points:
1) A load balancer with N resolver nodes behind it decreases the effectiveness of the aggressive cache by a factor of N; it *does not* invalidate the concept.

In other words, a random subdomain attack that flows through a resolver farm with N nodes has to fill N caches with NSEC records, and that will simply take N times longer than in a non-load-balanced scenario.

The aggressive cache still provides an upper bound on the space that NXDOMAIN RRs take up in the cache, which is super useful under attack because it prevents individual resolvers from dropping all the useful content from their caches during the attack.

2) Two out of the five major DNS resolver implementations used by large ISPs have not implemented aggressive caching (yet?), so we should expect deployment to be limited. Also, the feature is pretty new, and large ISPs are super conservative and might not have deployed new versions yet ...
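
Point 1 can be illustrated with a toy simulation. This is only a rough sketch under loud assumptions, not a model of any real resolver: the zone is a set of made-up random names, plain string order stands in for DNSSEC canonical ordering, the load balancer picks a node uniformly at random, and every query is for a nonexistent random name of 7 to 15 characters. The names AggressiveCacheNode, random_label and simulate are invented for this example.

```python
import bisect
import random
import string


class AggressiveCacheNode:
    """Toy resolver node that caches NSEC gaps (simplified RFC 8198 behaviour)."""

    def __init__(self, zone_names):
        # Assumption: plain string order stands in for DNSSEC canonical ordering.
        self.zone = sorted(zone_names)
        self.cached_gaps = set()     # NSEC gaps this node has already learned
        self.upstream_queries = 0    # queries this node had to send upstream

    def resolve_nonexistent(self, qname):
        # Find the gap between two existing names that covers qname. Names that
        # sort before the first or after the last entry share the wrap-around gap.
        gap = bisect.bisect_left(self.zone, qname) % len(self.zone)
        if gap not in self.cached_gaps:
            self.upstream_queries += 1   # cache miss: ask upstream, learn the NSEC
            self.cached_gaps.add(gap)
        # Otherwise NXDOMAIN is synthesised locally from the cached NSEC.


def random_label(rng, lo, hi):
    """Random lowercase string with a length between lo and hi (inclusive)."""
    length = rng.randint(lo, hi)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def simulate(n_nodes, n_queries, zone_size=200, seed=1):
    """Return (total upstream queries, largest per-node cache, names in zone)."""
    rng = random.Random(seed)
    # Hypothetical zone content: short names, so the longer attack names never exist.
    zone = {random_label(rng, 2, 6) for _ in range(zone_size)}
    nodes = [AggressiveCacheNode(zone) for _ in range(n_nodes)]
    for _ in range(n_queries):
        node = rng.choice(nodes)     # load balancer: uniform random choice
        node.resolve_nonexistent(random_label(rng, 7, 15))
    total = sum(node.upstream_queries for node in nodes)
    largest = max(len(node.cached_gaps) for node in nodes)
    return total, largest, len(nodes[0].zone)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        total, largest, names = simulate(n, 20000)
        print(f"nodes={n} upstream={total} largest_cache={largest} zone_names={names}")
```

In this model each node can never hold more NSEC gaps than there are names in the zone, no matter how many random names are thrown at it, and the total number of upstream queries is bounded by N times that number. The farm as a whole leaks more queries upstream than a single node would, but only until each of the N caches is warm.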

I forgot the rest, so I will conclude with: Watch the video recording and think for yourself! :-)
Petr Špaček  @  CZ.NIC

> 
> Of course, steady state query load is largely irrelevant since root service has to be provisioned with massive DDoS in mind. In my personal view, the deployment of additional anycast instances by the root server operators is a useful stopgap, but ultimately, given the rate of growth of DoS attack capacity (and assuming that growth will continue due to the stunning security practices of IoT device manufacturers), stuff like what is discussed in that paper is the right long term strategy.