[dns-operations] Someone from Cloudflare here?
franklin at sentaidigital.com
Tue Oct 27 05:47:03 UTC 2020
On Tue, Oct 27, 2020 at 12:24 AM, Viktor Dukhovni
<ietf-dane at dukhovni.org> wrote:
> On Mon, Oct 26, 2020 at 10:08:31PM -0400, Viktor Dukhovni wrote:
>> qname         | flags | alg | stime      | etime      | kid
>> agrilinks.org |   257 |   8 | 2018-07-29 | 2020-07-22 |
>> agrilinks.org |   257 |  13 | 2020-07-22 |            |
>> agrilinks.org |   256 |   8 | 2020-07-11 | 2020-07-22 |
>> agrilinks.org |   256 |  13 | 2020-07-22 |            |
>> The rather low KSK (key row id) ("kid") of 486, is due to the fact
>> the P256 key in question has been in use as a KSK over the last ~3
>> by ~262,991 distinct domains and is still in use by ~206,413 of
> I might note that my "kid" 29460111 is also shared by many domains
> (presently at least 251,300) and has been in use for over two years.
I'd expect Cloudflare to have more than 206k customers, so clearly not
their sole signing key. 200k to 300k zones per signing server sounds
plausible, each (as you say below) with its own HSM.
> So it appears that domains using this KSK and ZSK (likely stored in
> hardened HSMs, ...) tend to keep using them long-term. It looks
> therefore like your domain was migrated to Cloudflare back in July,
> but lack of ZSK rollovers since is not immediately indicative of any
> latent issue.
Sort of. It was migrated from one CF account to another. That doesn't
"explain" why the DNSKEY RRSIG was allowed to expire, but it may narrow
down the possible code paths. While it was in the prior account, it was
likely on older hardware with an RSA(8) ZSK, and it was then moved as
part of the transfer. We had to update the name servers in WHOIS as part
of the migration from one set of CF name servers to another.
I'm less concerned about the lack of a ZSK rollover per se, and more
that the RRSIG covering the DNSKEY RRset wasn't regenerated in a timely
manner. All we can really say is that the expiry wasn't caused by a
rollover, but if RRSIG regeneration and the rollover are always done at
the same time, a missed rollover would explain the expired RRSIG. As you
point out, there were monthly ZSK rollovers on the prior account and none
in the three months on the current account. Is that a bug in their zone
management, or is it just a different policy for ECDSA(13) vs RSA(8) ZSKs?
Put another way, is it one bug or two: one that prevented a rollover
and one that prevented regenerating the RRSIG? And is it worthwhile to
keep the latter bug as a canary?
> What's more, it seems to now have a valid signature, so it looks like
> you reached the right folks via some channel or other.
> agrilinks.org. IN DNSKEY 257 3 13
> agrilinks.org. IN DNSKEY 256 3 13
> agrilinks.org. IN RRSIG DNSKEY 13 2 3600 20201126034731
> 20200927034731 2371 agrilinks.org.
Yes, toggling DNSSEC off and on in the web dashboard forced a
re-signing of the zone. I don't consider that to be more than a
quick fix for the issue, as I don't want to have to go into the
dashboard every month or so to click a button, wait, and click it again.
Thank you for your advice and the zone history. I have some more
pointed questions for them now and data to support the questions.
I'm going to recommend to our team that we add some DNS monitoring to
track the DNSKEY RRSIG expiration dates. Other RRs in the zone get
50-hour (just over 2-day) RRSIGs, with an inception date roughly one day
in the past. For those there simply aren't 3.14 days for a warning track,
making pre-failure monitoring harder and more likely to generate critical
alerts outside normal business hours.
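
A minimal sketch of the kind of check I have in mind, assuming the RRSIG
has already been fetched (for example with "dig +dnssec") and is passed
in as one line of presentation-format text. The function names and the
7-day threshold are hypothetical, and it only handles the YYYYMMDDHHmmSS
timestamp form, not the plain seconds-since-epoch form:

```python
from datetime import datetime, timedelta, timezone

# RRSIG quoted earlier in this thread, joined onto one line.
SAMPLE = ("agrilinks.org. IN RRSIG DNSKEY 13 2 3600 "
          "20201126034731 20200927034731 2371 agrilinks.org.")


def parse_rrsig_time(stamp):
    """Parse a YYYYMMDDHHmmSS RRSIG timestamp as UTC."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def rrsig_window(rrsig_text):
    """Return (inception, expiration) for one RRSIG in presentation format.

    After the RRSIG keyword the fields are: type covered, algorithm,
    labels, original TTL, expiration, inception, key tag, signer name.
    Note that expiration comes before inception.
    """
    fields = rrsig_text.split()
    start = fields.index("RRSIG")  # raises ValueError if this is not an RRSIG
    expiration = parse_rrsig_time(fields[start + 5])
    inception = parse_rrsig_time(fields[start + 6])
    return inception, expiration


def time_left(rrsig_text, now):
    """Time remaining until the signature expires (negative if expired)."""
    _, expiration = rrsig_window(rrsig_text)
    return expiration - now


def needs_alert(rrsig_text, now, threshold=timedelta(days=7)):
    """True when less than `threshold` of validity remains (hypothetical policy)."""
    return time_left(rrsig_text, now) < threshold


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("time left:", time_left(SAMPLE, now))
    print("alert:", needs_alert(SAMPLE, now))
```

A multi-day threshold like this only makes sense for the DNSKEY RRSIG.
A 50-hour signature whose inception is about a day back has only about
26 hours left when it is published, so the same check on the other RRs
would need a threshold measured in hours.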
franklin at sentaidigital.com