[dns-operations] Cloudflare DNS resolver (1.1.1.1): Weird DNSSEC race condition

Petr Špaček petr.spacek at nic.cz
Wed Aug 8 07:44:54 UTC 2018


On 7.8.2018 01:07, Michael Sinatra wrote:
> On 08/06/18 03:44, Shane Kerr wrote:
> 
>> I wasn't aware of this feature but it seems pretty cool to me.
>>
>> It also should result in smaller packets and less CPU load on the
>> authoritative side for insecure zones, so really it seems like it should
>> have been the behavior from the beginning. (A large TLD operator
>> mentioned this to me like 10 years ago...)
> 
> If I am an authoritative server operator, and I want my zone to remain
> insecure, then I probably should not be signing it.  If I am signing it,
> and I am returning RRSIGs in response to clients that set the DO bit,
> then I am essentially signaling my intent that it be *possible* for a
> downstream client to validate the signatures I have intentionally placed
> in my zone, *or* that I will soon be introducing a DS record and want
> signatures to appear in caches before putting the DS record in place.
> The current behavior, unfortunately, violates the *spirit* of the timing
> recommendations in RFC 7583, section 3.3.5.  (That section ostensibly
> discusses DNSKEYs, but the effect is the same if the RRSIGs are
> effectively being stripped/ignored by a recursive resolver.)
> 
> IMO, a better behavior is the following:
> 
> - if there is no DS record or other trust anchor configured *and* the
> downstream client does NOT set the DO bit in the query to the recursive
> resolver, then the recursive resolver does NOT set the DO bit in the
> query to the authoritative server.
> 
> - if there is a DS record or configured trust anchor, then the recursive
> resolver sets the DO bit in the query to the authoritative server and
> attempts to validate the received signature.  The RRSIG is passed on to
> the downstream client based on the presence of the DO bit in the
> downstream client's query.
> 
> - if there is no DS record or other trust anchor configured, but the
> downstream client *does* set the DO bit in the query to the recursive
> resolver, then the recursive resolver SHALL set the DO bit in the query
> to the authoritative server and return the RRSIGs to the downstream
> client (but will not attempt its own validation).
> 
> The three above cases all assume that the RR being queried by the
> downstream client is not already cached by the recursive resolver.
> 
> HOWEVER, that still doesn't fix the following scenario:
> 
> - If the RR is first queried by a client that doesn't set the DO bit,
> and is then subsequently queried by a client that DOES set the DO bit,
> then the recursive resolver will return what's in-cache, i.e. the RR
> without the RRSIG.  This would continue to break the use-case described
> here:
> 
> https://gitlab.labs.nic.cz/knot/knot-resolver/issues/153
> 
> and it would greatly complicate timing of the introduction of the DS
> record for a busy zone that was moving from insecure to secure.  The
> only way to fix that is to set a flag on the cached entry that
> represents whether the DO bit was set when the recursive resolver
> queried the authoritative server and cached the result.  If the flag
> isn't set in the cache and the DO bit was set on a subsequent query by a
> downstream client, then the recursive resolver would have to re-fetch
> the RR with the DO bit set.
> 
> This is starting to get complicated as we attempt to get rid of the
> corner cases, and as it currently stands, this optimization makes it
> difficult to time the introduction of new trust anchors and use
> knot-resolver (and Cloudflare DNS) as a forwarder for a validating
> resolver.  (As cool as I think the optimization otherwise is...)
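
To make the proposal quoted above concrete, here is a minimal Python
sketch of the three DO-bit cases plus the per-entry cache flag. It is
only an illustration under stated assumptions: every name in it
(Decision, decide, CacheEntry, Cache, fake_authority) is hypothetical
and does not reflect how knot-resolver or any other resolver is
implemented, and TTL handling and the actual signature validation are
left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Decision:
    set_do_upstream: bool  # set DO in the query to the authoritative server
    validate: bool         # attempt validation of the received signatures
    return_rrsigs: bool    # pass RRSIGs on to the downstream client


def decide(has_trust_anchor: bool, client_do: bool) -> Decision:
    """Map the three cases from the proposal onto a decision."""
    if has_trust_anchor:
        # DS record or configured trust anchor: always ask with DO and validate.
        return Decision(True, True, client_do)
    # No DS record or trust anchor: mirror the client's DO bit, never validate.
    return Decision(client_do, False, client_do)


@dataclass
class CacheEntry:
    records: List[str]
    rrsigs: List[str]
    fetched_with_do: bool  # the flag proposed for the cached entry


# Hypothetical upstream query: (name, do_bit) -> (records, rrsigs)
Fetch = Callable[[str, bool], Tuple[List[str], List[str]]]


class Cache:
    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._entries: Dict[str, CacheEntry] = {}
        self.upstream_queries = 0

    def resolve(self, name: str, has_trust_anchor: bool, client_do: bool) -> List[str]:
        d = decide(has_trust_anchor, client_do)
        entry = self._entries.get(name)
        # Re-fetch when the entry was cached without DO but DO is needed now.
        if entry is None or (d.set_do_upstream and not entry.fetched_with_do):
            records, rrsigs = self._fetch(name, d.set_do_upstream)
            self.upstream_queries += 1
            entry = CacheEntry(records, rrsigs, d.set_do_upstream)
            self._entries[name] = entry
        return entry.records + (entry.rrsigs if d.return_rrsigs else [])


def fake_authority(name: str, do: bool) -> Tuple[List[str], List[str]]:
    """Stand-in for a signed zone: returns RRSIGs only when DO is set."""
    return ["A 192.0.2.1"], (["RRSIG(A)"] if do else [])
```

With Cache(fake_authority), a first query without DO caches the bare
RR; a later query with DO finds the flag unset and triggers one more
upstream fetch, after which both kinds of client are served from the
cache.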

Thank you, Michael, for the thoughtful reply; I really appreciate it.

We will reconsider this approach to see how we can improve it. Let me
add that your description assumes perfect compliance on all sides, which
unfortunately is not the case!

The "DO always set" approach is not feasible without additional
workarounds for broken authoritative servers, which fail in various ways
when confronted with a DO=1 query (which is why this optimization was
invented in the first place).

Hopefully these authoritative servers will get fixed for DNS flag day: https://dnsflagday.net/

Thank you for your time.

-- 
Petr Špaček  @  CZ.NIC


