[dns-operations] slack.com bogus

Puneet Sood puneets at google.com
Fri Oct 1 16:05:39 UTC 2021


Missed this in the previous email:

* We cache most records needed for DNS resolution with a 6 hour TTL.
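
A rough sketch of why a cap like this shortens the exposure window, reading the 6 hour figure as an upper bound applied to the TTLs we receive (my reading, for illustration only). The function names are hypothetical and this is not a description of the GPDNS implementation; the 24 hour value is the .COM DS default mentioned further down the thread.

```python
# Illustrative only: the helper names and the "cap" interpretation are
# assumptions, not a description of any resolver's internals.
HOUR = 3600


def effective_ttl(record_ttl: int, ttl_cap: int) -> int:
    """TTL a caching resolver honors when it enforces an upper bound."""
    return min(record_ttl, ttl_cap)


def worst_case_stale_seconds(record_ttl: int, ttl_cap: int) -> int:
    """Longest time an RRSet cached just before its deletion at the
    parent can keep being served from cache."""
    return effective_ttl(record_ttl, ttl_cap)


COM_DS_TTL = 24 * HOUR  # .COM default DS TTL, per the quoted message

for cap_hours in (1, 6, 24):
    stale = worst_case_stale_seconds(COM_DS_TTL, cap_hours * HOUR)
    print(f"cap {cap_hours:>2}h: stale DS served for at most {stale // HOUR}h")
```

Under these assumptions a 6 hour cap bounds the stale-DS window at 6 hours instead of 24, before any manual cache flush is taken into account.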

On Fri, Oct 1, 2021 at 12:03 PM Puneet Sood <puneets at google.com> wrote:
>
> Some information on what happened during this incident with the Google
> Public DNS service.
>
> * GPDNS did not configure an NTA for slack.com
> * We observed a small percentage of SERVFAILs during 10:05-10:47 AM PT
> on 20210930, which was fixed by a number of user-initiated cache
> flush requests for slack.com records (dname, ds, a, soa) between
> 10:46 AM and 11:28 AM PT on 20210930.
>
> Our general policy on NTAs is to only add them after evaluating the
> specific scenario. We never add them by default.
>
> -Puneet
>
> On Thu, Sep 30, 2021 at 3:20 PM Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
> >
> > On Thu, Sep 30, 2021 at 02:31:32PM -0400, vom513 wrote:
> >
> > > I see it as broken at home on my validating BIND instance (I get SERVFAIL).
> > >
> > > Interesting (at least to me) - checking 8.8.8.8 and 1.1.1.1 I get A
> > > records back just fine.  However no ‘ad’ flag (using dig).  Other
> > > DNSSEC signed zones still give ad off quad-8/quad-1 though.
> > >
> > > So perhaps a dumb question - could Google and Cloudflare be hitting
> > > some kind of “manual override” to not validate a given zone - i.e.
> > > human intervention / look the other way ?
> > >
> > > Or is there a more technical explanation that I could be missing ?
> >
> > It suffices to cap DS RRSet TTLs well below the .COM TLD's 24-hour
> > default.  If they cap TTLs at say 1 hour, the issue would have been
> > resolved 1 hour after the DS RRSet was deleted from the .COM zone.
> >
> > Of course the DS RRSet might also have been manually flushed from the
> > cache, as special treatment for "slack.com", given their popularity and
> > the verifiable deletion of the DS records.
> >
> > A final possibility is automatic refresh of DS records for domains that
> > go bogus.
> >
> > So there are a number of ways to reduce the impact duration to below the
> > full ~1 day maximum.
> >
> > --
> >     Viktor.
> > _______________________________________________
> > dns-operations mailing list
> > dns-operations at lists.dns-oarc.net
> > https://lists.dns-oarc.net/mailman/listinfo/dns-operations


