Petr Špaček petr.spacek at nic.cz
Mon Mar 4 21:23:21 UTC 2019

On 04. 03. 19 21:34, James Stevens wrote:
> I'm working with a large client who is currently trying to change their
> DNSSEC signing operator. The client is *very* against going unsigned, if
> it can be avoided.
> Of course, option-1 would be to follow RFC-6781 - but looks like that's
> not going to be possible :(
> Currently neither operator appears to support adding each other's
> DNSKEYs in the zone. I have tickets open with both, but I'm not holding
> my breath.
> What I tried (with a test zone) was to put both operators' DS keys in
> the parent, wait >24 hrs then switch all NS (parent & zone), all the
> time polling Google, Cloudflare and ISC's test bind & unbound to check
> for outage - using queries that would give NXDOMAIN answers, to avoid
> cached answers.
> NS ttl is 3600, DNSKEY ttl is 7200 and parent is dot-COM, so DS ttl is
> 86400
> Cloudflare switched to validating with the new keys pretty much
> immediately, with very little outage.
> ISC's servers both behaved the same - they kept validating on the old
> keys until the NS expired, then they gave SERVFAIL until the DNSKEY
> expired, then they switched and worked fine on the new keys.
> Google was odd - they switch randomly and gradually - 7 hours later some
> of their servers still hadn't switched (i.e. were giving answers
> validated using the old keys) and they were often refreshing the TTL on
> the old DNSKEYs - I can't see how that could be correct behavior after 7
> hours?? 24 hrs later they had completely switched to the new keys.
> If I can just get the old provider to carry the new DNSKEYs, it seems to
> me this would alleviate most of the outage.
> Failing that, plan-b is to try and reduce the TTL on the NS & DNSKEY, to
> minimize the outage.
> Plan-Z is unsigned for 24 hrs :(
> So questions
> 1) Am I right that Google is behaving oddly?
> 2) Anybody got any better ideas for switching while avoiding going
> unsigned and avoiding outage ?

Shortening TTLs for NS + DNSKEYs, along with "union DS", sounds like a
workable approach to me.

May I suggest repeating your experiment with TTLs like 1 minute or so?
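
To get a rough feel for what shorter TTLs might buy, here is a minimal
sketch of the behaviour reported above for the ISC test resolvers: the
resolver keeps using the old NS set until it expires, then returns
SERVFAIL until its cached old DNSKEY set expires as well. This is a
simplified model and an assumption, not a description of how any
particular resolver is implemented; the function names are made up for
illustration, and it does not cover the Google or Cloudflare behaviour.

```python
def servfail_window(ns_remaining: int, dnskey_remaining: int) -> int:
    """Seconds of SERVFAIL for one resolver under the simplified model.

    ns_remaining / dnskey_remaining are the seconds left on the cached
    (old) NS and DNSKEY RRsets at the moment the delegation is switched.
    Assumed model: SERVFAIL starts when the old NS set expires and ends
    when the old DNSKEY set expires.
    """
    return max(0, dnskey_remaining - ns_remaining)


def worst_case_window(dnskey_ttl: int) -> int:
    """Worst case: NS is about to expire, DNSKEY was cached just now."""
    return servfail_window(0, dnskey_ttl)


if __name__ == "__main__":
    scenarios = [
        ("TTLs as tested", 3600, 7200),   # NS 3600, DNSKEY 7200
        ("1-minute TTLs", 60, 60),        # hypothetical shortened TTLs
    ]
    for label, ns_ttl, dnskey_ttl in scenarios:
        cached_together = servfail_window(ns_ttl, dnskey_ttl)
        worst = worst_case_window(dnskey_ttl)
        print(f"{label}: cached together {cached_together} s, "
              f"worst case {worst} s")
```

Under this model the window is bounded by the DNSKEY TTL, which is why
shortening the DNSKEY TTL (and not only the NS TTL) matters. The DS TTL
in the parent is outside your control and is not modelled here.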

Petr Špaček  @  CZ.NIC

