[dns-operations] dynect.net outage
mnordhoff at gmail.com
Mon May 30 07:34:49 UTC 2022
On Mon, May 30, 2022 at 6:48 AM Ralf Weber <dns at fl1ger.de> wrote:
> On 30 May 2022, at 8:34, Robert Edmonds wrote:
> >> So how do you expect the domain to be resolved if all of your out
> >> of bailiwick name server names no longer point to an IP address?
> > By using the working nameservers with resolvable names specified in the
> > delegation from the parent zone, which never changed in this particular
> > case. This is what Unbound's resolution algorithm does if there are not
> > too many nonexisting nameserver target names in the child's NS RRset,
> > and what other resolver algorithms do.
> So you mean the parent provided additional records (A/AAAA) when issuing
> a referral? Otherwise I can not see how from an empty cache you can
> resolve this domain if all of the name server names supplied are NXDOMAIN.
> > There is more than one resolver implementation, and they differ in the
> > results of resolving a zone with this type of misconfiguration, and none
> > of them are the reference implementation of DNS. So just looking at a
> > particular resolver algorithm returning SERVFAIL when encountering a
> > particular data pattern starting from a cold cache cannot tell us
> > whether the algorithm or the data is at fault.
> I agree on this, however the difference in implementation are less
> when it comes to resolving from a cold cache and all the explanations
> given so far for me point to the domain being unresolvable for all
> implementations from an empty cache.
> So long
> Ralf Weber
The domain was always resolvable from a cold cache because the
delegation was always* correct.
The outage happened when the child-side NS records (the ones served by
the zone's own authoritative servers) were changed to
*completely different* names that do not exist.
A totally parent-centric resolver would never have noticed anything wrong.
The "bug" in Unbound is that, in this precise error situation, it
apparently returns SERVFAIL before trying to fall back on the
parent-side NS records.
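To make the difference concrete, here is a toy sketch of the three
behaviours being discussed. It is not Unbound's code or any real
resolver's algorithm; the names, addresses, and policy labels are all
made up for illustration, and it ignores caching, glue, and timeouts
entirely.

```python
# Toy model only: every name and address here is hypothetical.
# Parent-side NS names (from the delegation) still resolve.
PARENT_NS = ["ns1.old-provider.example", "ns2.old-provider.example"]
# Child-side NS names (from the zone itself) point at names that do not exist.
CHILD_NS = ["ns1.gone.example", "ns2.gone.example"]

ADDRESSES = {
    "ns1.old-provider.example": "192.0.2.1",
    "ns2.old-provider.example": "192.0.2.2",
}


def lookup(name):
    """Return the address of a nameserver name, or None to stand in for NXDOMAIN."""
    return ADDRESSES.get(name)


def usable_servers(ns_names):
    """Keep only the nameservers whose names resolve to an address."""
    return [addr for addr in map(lookup, ns_names) if addr is not None]


def pick_servers(policy):
    """Choose the servers to query under one of three simplified policies."""
    if policy == "parent-centric":
        # Only ever looks at the delegation, so it never sees the broken child set.
        return usable_servers(PARENT_NS)
    servers = usable_servers(CHILD_NS)
    if servers or policy == "child-only":
        # "child-only" gives up here when the child-side names are all NXDOMAIN.
        return servers
    # "child-with-fallback": retry with the parent-side NS names.
    return usable_servers(PARENT_NS)


for policy in ("parent-centric", "child-only", "child-with-fallback"):
    servers = pick_servers(policy)
    print(policy, servers if servers else "SERVFAIL")
```

In this model only the "child-only" policy ends in SERVFAIL; the other
two still reach the working servers named in the delegation.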
* As a separate issue, half of the nameservers have out-of-date glue
records with IPs that don't respond.