[dns-operations] dnssec-failed.org and dns.google

Puneet Sood puneets at google.com
Thu Aug 15 21:11:34 UTC 2019


On Thu, Aug 15, 2019 at 9:38 AM Livingood, Jason
<Jason_Livingood at comcast.com> wrote:
>
> Hi Puneet - Given the extraordinary level of use of Google Public DNS as a recursive resolver - reliability, predictability, standards compliance, and transparency are especially important. So I wonder if you wouldn't mind more fully describing what the change was that was rolled out (seems something related to DNSSEC validation) - or at least what it was intended to do. Clearly in this case, a purposely broken DNSSEC domain should never properly resolve when validation is performed.

We have been making changes to improve our handling of unsupported
algorithms and other corner cases. The breakage here was due to a bug
in the case where the parent zone has a DS record but the child zone
has no matching DNSKEY record.
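
For readers following along, here is a minimal sketch of how a validator
might classify that corner case. It is not Google Public DNS code. The
function names, the supported-algorithm set, and the return labels are
all assumptions made for illustration, and signature verification is
left out entirely. Following RFC 4035 section 5.2, a DS set that uses a
supported algorithm but matches no DNSKEY is treated as bogus, while a
DS set with only unsupported algorithms is treated as insecure.

```python
import hashlib
import struct

# Assumption for illustration: algorithms 8 (RSASHA256) and 13 (ECDSAP256SHA256).
SUPPORTED_ALGORITHMS = {8, 13}
SHA256_DIGEST_TYPE = 2


def name_to_wire(name):
    """Encode a domain name in lowercase (canonical) wire format."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def dnskey_rdata(flags, algorithm, public_key):
    """Build DNSKEY RDATA: flags, protocol (always 3), algorithm, key bytes."""
    return struct.pack("!HBB", flags, 3, algorithm) + public_key


def key_tag(rdata):
    """Key tag computation from RFC 4034 Appendix B."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def make_ds(owner, rdata):
    """Return (key_tag, algorithm, digest_type, digest) using SHA-256."""
    digest = hashlib.sha256(name_to_wire(owner) + rdata).digest()
    return (key_tag(rdata), rdata[3], SHA256_DIGEST_TYPE, digest)


def classify_delegation(owner, ds_set, dnskey_rdatas):
    """Classify a delegation from the parent's DS set and the child's DNSKEYs."""
    if not ds_set:
        return "insecure"  # no DS in the parent: unsigned delegation
    usable = [
        ds for ds in ds_set
        if ds[1] in SUPPORTED_ALGORITHMS and ds[2] == SHA256_DIGEST_TYPE
    ]
    if not usable:
        return "insecure"  # only unsupported algorithms or digest types
    for rdata in dnskey_rdatas:
        if make_ds(owner, rdata) in usable:
            return "candidate-secure"  # signatures would still need checking
    return "bogus"  # DS present, but no DNSKEY matches it
```

In this sketch, a domain such as dnssec-failed.org, whose DS does not
match any published DNSKEY, would land in the "bogus" branch, which a
validating resolver reports as SERVFAIL.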

We have monitoring using public DNSSEC test domains (including
dnssec-failed.org) to catch such problems early. Due to an unfortunate
coincidence, the monitoring system had a misconfiguration at the same
time as the rollout of the DNSSEC validation changes.
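
As an illustration of what such a check could look like, the sketch
below inspects only the header of a raw DNS response and compares the
RCODE and AD bit against an expectation table. It is a hypothetical
example, not a description of the actual monitoring system: the
EXPECTATIONS table, the function names, and the message format of the
problems are assumptions. Sending the query is out of scope here.

```python
import struct

NOERROR = 0
SERVFAIL = 2
AD_BIT = 0x0020  # "authenticated data" flag in the DNS header

# Hypothetical expectation table. A deliberately broken DNSSEC test
# domain should fail validation and must never come back with AD set.
EXPECTATIONS = {
    "dnssec-failed.org.": {"rcode": SERVFAIL, "ad": False},
}


def parse_header(message):
    """Extract RCODE and the AD bit from the 12-byte DNS header."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than a header")
    _ident, flags = struct.unpack("!HH", message[:4])
    return {"rcode": flags & 0x000F, "ad": bool(flags & AD_BIT)}


def check(name, message):
    """Return a list of mismatches between observed and expected values."""
    expected = EXPECTATIONS[name]  # an unknown name raises KeyError loudly
    observed = parse_header(message)
    return [
        f"{name}: {field} expected {expected[field]!r}, got {observed[field]!r}"
        for field in ("rcode", "ad")
        if observed[field] != expected[field]
    ]
```

A response like the one quoted later in this thread (NOERROR with the
qr, rd, ra and ad flags) would produce two mismatches for
dnssec-failed.org, while a SERVFAIL response without AD would produce
none.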

>
> Also, I am not well acquainted with how your management systems work but hopefully you have a mechanism for more rapid rollback/deployment of new code when there are more critical security flaws on systems. Certainly in this case I think this is a low priority security issue but for a higher priority issue should probably be fixed more rapidly than 24-hours plus longer for some cached responses. I'd imagine you could both force a cache flush across all systems on a domain-level basis and push code more aggressively if needed (given the right risk/reward balance, which is not really met in this instance).

We do have the ability to roll back changes faster and to flush
specific bad domains from our caches. In this case, we decided the
situation did not warrant using the faster deployment options.

>
> On a related note, I have noticed that Google Public DNS and some other resolvers will in some DNSSEC conditions provide valid responses when strictly speaking a validation failure would be the standards-based response - perhaps by using automated Negative Trust Anchors (NTAs) in certain circumstances (e.g. recent key rollover detected). My guess is that this is likely considered friendly for users in some scenarios and probably helpful to furthering DNSSEC deployment in the short-term. IMO it may be useful for any resolvers doing that sort of thing to write it up in an informational I-D so that folks can understand what's happening and it can inform future standards development for how validation should work.

We do not use automated negative trust anchors; we configure them
manually, and only infrequently. If you notice real problems, do let
us know through one of the channels (mailing list or issue tracker)
described at https://developers.google.com/speed/public-dns/groups.

Thanks,
Puneet
for the Google Public DNS team

>
> Thanks!
> Jason
>
> On 8/14/19, 1:54 PM, "Puneet Sood" <puneets at google.com> wrote:
>
>     We are rolling back the breaking change. It should complete in less than a day.
>
>     Some of the incorrect results might persist a little longer due to
>     additional caching in our frontends.
>
>     On Wed, Aug 14, 2019 at 10:11 AM Warren Kumari <warren at kumari.net> wrote:
>     >
>     > [ Top-post ]
>     >
>     > Hi,
>     >
>     > Thanks for letting us know. Google Public DNS is aware of the issue --
>     > it's a bug related to a new feature / validation, and is being
>     > addressed now...
>     >
>     > W
>     >
>     > On Wed, Aug 14, 2019 at 9:26 AM Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
>     > >
>     > > On Wed, Aug 14, 2019 at 08:27:54AM +0200, A. Schulze wrote:
>     > >
>     > > > ; <<>> DiG 9.10.3-P4-Debian <<>> @8.8.8.8 dnssec-failed.org. +aaonly
>     > > > ; (1 server found)
>     > > > ;; global options: +cmd
>     > > > ;; Got answer:
>     > > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54820
>     > > > ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>     > > >
>     > > > ;; OPT PSEUDOSECTION:
>     > > > ; EDNS: version: 0, flags:; udp: 512
>     > > > ;; QUESTION SECTION:
>     > > > ;dnssec-failed.org.             IN      A
>     > > >
>     > > > ;; ANSWER SECTION:
>     > > > dnssec-failed.org.      7199    IN      A       69.252.80.75
>     > >
>     > > I also see answers coming back, with the AD-bit set, from 8.8.{8.8,4.4},
>     > > but not 1.0.0.1, 1.1.1.1, 64.6.64.6 or 64.6.65.6.
>     > >
>     > > If validation were disabled, I'd at least expect the AD bit to be
>     > > off.  Not clear what the reason might be.
>     > >
>     > > --
>     > >         Viktor.
>     > > _______________________________________________
>     > > dns-operations mailing list
>     > > dns-operations at lists.dns-oarc.net
>     > > https://lists.dns-oarc.net/mailman/listinfo/dns-operations
>     >
>     >
>     >
>     > --
>     > I don't think the execution is relevant when it was obviously a bad
>     > idea in the first place.
>     > This is like putting rabid weasels in your pants, and later expressing
>     > regret at having chosen those particular rabid weasels and that pair
>     > of pants.
>     >    ---maf
>
>



