[dns-operations] DNS .com/.net resolution problems in the Asia/Pacific region
Mark Andrews
marka at isc.org
Thu Jul 20 02:09:44 UTC 2023
> On 20 Jul 2023, at 01:23, Shumon Huque <shuque at gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 3:28 AM Shane Kerr <shane at time-travellers.org> wrote:
> Shumon and all,
>
> On 18/07/2023 21.41, Shumon Huque wrote:
> > On Tue, Jul 18, 2023 at 3:29 PM Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
> >
> > Yes, I agree. A resolver can't really tell that a response with an
> > expired signature wasn't an attacker trying to replay old data. For
> > robustness against attacks, it must re-query other available
> > servers if they exist.
>
> I kind of think that a resolver using UDP should just drop a response on
> the floor if it has an expired signature. Otherwise an attacker can
> induce behavior change by spoofing replies, which is itself a security
> problem (in this case, blocking with a response that would arrive later
> and work, effectively removing a name server from the set of name
> servers queried for a given lookup).
>
> The problem is that in the general case the resolver can't really tell if
> this was an attack or a misconfiguration. So, it's best to build in robust
> behavior to deal with the case more generally. Which in my opinion is
> "drop the response on the floor, maybe blacklist the server for a while,
> and retry the next server". If a later valid response does come, then be
> prepared to accept it (if you've still held on to the query).
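
For illustration only, the "drop, penalise for a while, retry the next server" policy described above could be sketched in Python roughly as follows. All names here (`ServerSet`, `PENALTY_SECONDS`, `send_query`, `signature_is_current`) are hypothetical and are not taken from BIND, Unbound, or any other resolver; the transport and validation steps are passed in as callables so the sketch stays self-contained.

```python
import time

PENALTY_SECONDS = 60.0  # assumed penalty window, not taken from any real resolver


class ServerSet:
    """Tracks which name servers are temporarily penalised."""

    def __init__(self, servers, clock=time.monotonic):
        self.servers = list(servers)
        self.clock = clock
        self.penalised_until = {}  # server -> time at which its penalty ends

    def usable(self):
        """Return the servers that are not currently penalised, in order."""
        now = self.clock()
        return [s for s in self.servers
                if self.penalised_until.get(s, float("-inf")) <= now]

    def penalise(self, server):
        self.penalised_until[server] = self.clock() + PENALTY_SECONDS


def resolve(question, servers, send_query, signature_is_current, max_tries=4):
    """Try usable servers in turn, discarding responses with expired signatures.

    send_query(server, question) returns a response, or None on timeout.
    signature_is_current(response) returns True if the signature is still valid.
    Returns the first acceptable response, or None if tries or servers run out.
    """
    tries = 0
    for server in servers.usable():
        if tries >= max_tries:
            break  # bound the work done for a single lookup
        tries += 1
        response = send_query(server, question)
        if response is None:
            continue  # timeout: move on to the next server
        if signature_is_current(response):
            return response
        # Expired signature: this may be a replay or a misconfigured server.
        # Either way, drop the response, penalise the server, and keep going.
        servers.penalise(server)
    return None
```

The sketch does not model the last point above (accepting a later valid response to a query that is still outstanding); that would need the query state to be kept until validation completes.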
>
> This idea mostly applies to UDP without DNS cookies since it is the only
> transport easily vulnerable to spoofing. With other transports you are
> much more sure that the answer actually came from the server you are
> querying, and so you can be confident that the server is giving out
> bogus answers. (TCP is vulnerable to BGP hijacking and the like, but in
> that case you would still expect to get bogus answers for subsequent
> queries to the same server.)
>
> Well, there are inline attackers as well and DNSSEC is designed to protect
> against those too. If this only applied to UDP, then other protocols that use
> connection oriented transport or session oriented frameworks on top of UDP
> would not bother with cryptographic authentication either. And yet, they all
> do.
>
> Unfortunately I don't think any resolvers hold onto a UDP query until
> after the DNSSEC validation. So there is not really much option other
> than to try again. 🤓
>
> Yeah, but I don't really see that as a problem. I see concerns have been
> raised in this thread about the NXNAME attack and such dissuading
> resolver implementers from more retries, but in my view the only thing
> that taught us is that resolver implementers need to go back to first
> principles and sensibly bound the amount of work they are willing to do,
> not eliminate retries.
Lookups take enormous numbers of queries these days. A support customer
was asking why a lookup wasn’t completing within 3 seconds. The resolution
process took 48 queries with a cold cache. It involved several CDNs and required
fetching nameserver addresses in several different TLDs. There were no retries
in that count.

CNAME chains are expensive, but we have a whole industry that has fallen in love
with them.

Yes, we do have query limits, but they need to be large to handle this sort of
stuff.
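
To make the trade-off concrete, here is a minimal sketch of a per-lookup work budget in Python. `WorkBudget`, `lookup_fits`, and the limit values are hypothetical and chosen only for illustration; they are not BIND's actual limits or code. The time bound is an assumption added alongside the query count.

```python
import time


class BudgetExceeded(Exception):
    """Raised when a lookup has used up its allowed work."""


class WorkBudget:
    """Bounds both the number of queries and the wall-clock time of one lookup."""

    def __init__(self, max_queries, max_seconds, clock=time.monotonic):
        self.max_queries = max_queries
        self.clock = clock
        self.deadline = clock() + max_seconds
        self.queries_sent = 0

    def charge_query(self):
        """Call before sending each outgoing query, retries included."""
        if self.queries_sent >= self.max_queries:
            raise BudgetExceeded("query limit reached")
        if self.clock() >= self.deadline:
            raise BudgetExceeded("time limit reached")
        self.queries_sent += 1


def lookup_fits(queries_needed, max_queries, max_seconds=3.0):
    """Return True if a lookup needing `queries_needed` queries fits the budget."""
    budget = WorkBudget(max_queries, max_seconds)
    try:
        for _ in range(queries_needed):
            budget.charge_query()
    except BudgetExceeded:
        return False
    return True
```

With these made-up limits, `lookup_fits(48, max_queries=32)` is False while `lookup_fits(48, max_queries=100)` is True: a limit that looks generous on paper can still fail a legitimate cold-cache lookup before any retry is counted.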
> To quote RFC 1034 (published in 1987):
>
> "The recommended priorities for the resolver designer are:
>
> 1. Bound the amount of work (packets sent, parallel processes
> started) so that a request can't get into an infinite loop or
> start off a chain reaction of requests or queries with other
> implementations EVEN IF SOMEONE HAS INCORRECTLY
> CONFIGURED SOME DATA."
>
> Note: the capitalized phrase for emphasis.
>
> (My addendum: they should also bound the time spent.)
>
> Transient configuration problems are _pervasive_ in deployed DNS
> infrastructure. If BIND for example did not have the robust retry
> behavior that Mark Andrews documented upthread, we could never
> have used them in our infrastructure. In my experience Unbound also
> had the same robustness, but I'm now a little concerned by the description
> of the Unbound failures reported by Gavin M during this Verisign incident.
> Maybe they have limited retries too aggressively? It would be good to
> get some NLnetLabs colleagues to chime in with a description of their
> behavior.
>
> Shumon.
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka at isc.org