[dns-operations] Cloudflare DNS resolver (1.1.1.1): Weird DNSSEC race condition

Mon Aug 6 10:44:19 UTC 2018

Petr,

On 2018-08-06 11:42, Petr Špaček wrote:
> On 4.8.2018 02:53, Michael Sinatra wrote:
>>
>> On 08/03/18 16:12, Marek Vavruša wrote:
>>> This is due to the way Knot Resolver currently deals with insecure
>>> delegations. If the delegation doesn't have a DS, the resolver
>>> will remember the delegation as insecure and turn off DNSSEC until the
>>> record expires from cache. The current cache maximum TTL is 3 hours,
>>> which is when the NS+DS+DNSKEY was refetched. I've opened a ticket to
>>> track this here
>>> https://gitlab.labs.nic.cz/knot/knot-resolver/issues/391
>>
>> Yikes...well that explains the 3 hour delay on that one instance of 1.1.1.1.
>>
>> I knew you folks were using knot-resolver at the core of the 1.1.1.1
>> service, but wasn't sure if that or a Cloudflare optimization was the
>> cause of what I was seeing, so I didn't think to try to replicate it on
>> knot-resolver.  Thanks for opening the issue with CZ NIC.
> 
> Thank you Marek and Michael for reporting this!
> 
> For the curious:
> This is a side effect of not setting DO in queries to insecure zones.
> 
> The main motivation is that some insecure zones/name servers break
> horribly when they receive a query with DO=1, and we did not want to
> clutter the code with even more workarounds for this particular type of
> brokenness.
> 
> Let's see how we can improve this. Thanks once again!

I wasn't aware of this feature but it seems pretty cool to me.

It also should result in smaller packets and less CPU load on the 
authoritative side for insecure zones, so really it seems like it should 
have been the behavior from the beginning. (A large TLD operator 
mentioned this to me like 10 years ago...)
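
For anyone who wants to see what the DO flag looks like on the wire, here is a rough sketch that builds the same query twice, once with DO=1 and once with DO=0. It is only an illustration: the function names, the query ID and the 1232-byte advertised UDP size are my own arbitrary choices, and none of this is taken from the Knot Resolver code.

```python
import struct

DO_BIT = 0x8000  # "DNSSEC OK" flag in the EDNS flags field (RFC 3225)


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def build_query(name: str, qtype: int = 1, do: bool = False,
                query_id: int = 0x1234, udp_size: int = 1232) -> bytes:
    """Build a minimal EDNS0 query (hypothetical helper, for illustration only)."""
    # Header: ID, flags (all zero, i.e. RD=0 as a resolver would send to an
    # authoritative server), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (OPT).
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 1)
    # Question: QNAME, QTYPE, QCLASS=IN.
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    # OPT pseudo-RR: root name, TYPE=41, CLASS carries the UDP payload size,
    # and the 32-bit TTL field holds extended RCODE (8 bits), EDNS version
    # (8 bits) and the 16 flag bits, of which DO is the top one. RDLEN=0.
    ttl_field = DO_BIT if do else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, ttl_field, 0)
    return header + question + opt


with_do = build_query("example.com", do=True)
without_do = build_query("example.com", do=False)

# Positions at which the two encoded queries differ.
diff = [i for i, (a, b) in enumerate(zip(with_do, without_do)) if a != b]
```

The two queries are the same length and differ only in that one flag bit, so the saving is not in the query itself. It comes from the response: without DO the authoritative server has no reason to include DNSSEC records.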

Cheers,

--
Shane