[dns-operations] Someone from Cloudflare here?

John Franklin franklin at sentaidigital.com
Tue Oct 27 05:47:03 UTC 2020



On Tue, Oct 27, 2020 at 12:24 AM, Viktor Dukhovni 
<ietf-dane at dukhovni.org> wrote:
> On Mon, Oct 26, 2020 at 10:08:31PM -0400, Viktor Dukhovni wrote:
> 
>>           qname     | flags | alg |   stime    |   etime    |    kid
>>      ---------------+-------+-----+------------+------------+------------
>>       agrilinks.org |   257 |   8 | 2018-07-29 | 2020-07-22 |   26428972
>>       agrilinks.org |   257 |  13 | 2020-07-22 |            |        486
>>       [...]
>>       agrilinks.org |   256 |   8 | 2020-07-11 | 2020-07-22 | 4024196652
>>       agrilinks.org |   256 |  13 | 2020-07-22 |            |   29460111
>> 
>>  The rather low KSK (key row id) ("kid") of 486, is due to the fact that
>>  the P256 key in question has been in use as a KSK over the last ~3 years
>>  by ~262,991 distinct domains and is still in use by ~206,413 of them.
> 
> I might note that my "kid" 29460111 is also shared by many domains
> (presently at least 251,300) and has been in use for over two years.

I'd expect Cloudflare to have more than 206k customers, so this is 
clearly not their sole signing key.  200k to 300k zones per signing 
server sounds plausible, with each server (as you say below) having 
its own HSM.

> So it appears that domains using this KSK and ZSK (likely stored in
> hardened HSMs, ...) tend to keep using them long-term.  It looks
> therefore like your domain was migrated to Cloudflare back in July,
> but lack of ZSK rollovers since is not immediately indicative of any
> latent issue.

Sort of.  It was migrated from one CF account to another.  Doesn't 
"explain" why the DNSKEY's RRSIG was allowed to expire, but may narrow 
down the possible code paths.  When it was in the prior account, it was 
likely on older hardware with an RSA(8) ZSK and then moved as part of 
the transfer.  We had to update the nameservers in WHOIS as part of the 
migration from one set of CF name servers to another.

I'm less concerned about the lack of a ZSK rollover per se, and more 
that the RRSIG covering it wasn't regenerated in a timely manner.  All 
we can really say is that it wasn't caused by a rollover, but if the 
re-signing and the rollover are always done at the same time, a missed 
rollover would explain the expired RRSIG.  As you point out, the prior 
account had monthly ZSK rollovers and the current account has had none 
in three months.  Is that a bug in their zone management, or is it just 
a different policy for ECDSA(13) vs RSA(8) ZSKs?

Put another way, is it one bug or two bugs, one that prevented a 
rollover and one that prevented regenerating the RRSIG?  And, is it 
worthwhile to keep the latter bug as a canary?

> What's more, it seems to now have a valid signature, so it looks like
> you reached the right folks via some channel or other.
> 
>     agrilinks.org. IN DNSKEY 257 3 13 mdsswUyr3DPW132mOi8V9xESWE8jTo0dxCjjnopKl+GqJxpVXckHAeF+KkxLbxILfDLUT0rAK9iUzy1L53eKGQ==
>     agrilinks.org. IN DNSKEY 256 3 13 oJMRESz5E4gYzS/q6XDrvU1qMPYIjCWzJaOau8XNEZeqCYKD5ar0IRd8KqXXFJkqmVfRvMGPmM1x8fGAa2XhSA==
>     agrilinks.org. IN RRSIG DNSKEY 13 2 3600 20201126034731 20200927034731 2371 agrilinks.org. yjY9OSOLtMViN8ZYL/J0uaUGzTtJcHoyzP5WhMXIXqqF99YONh4AkmL0D1kOBkWKFnwqseU8vFbME8BmigQRxA==

Yes, toggling DNSSEC off and on in the web dashboard forced a 
re-signing of the zone.  I don't consider that to be more than a 
quick-fix for the issue, as I don't want to have to go into the dash 
every month or so to click a button, wait, and click it again.

Thank you for your advice and the zone history.  I have some more 
pointed questions for them now and data to support the questions.

I'm going to recommend to our team that we add some DNS monitoring to 
track the DNSKEY RRSIG expiration dates.  Other RRs in the zone get 
2-day (50-hour) RRSIGs, with an inception date roughly one day in the 
past.  There simply aren't 3.14 days for a warning track, making 
pre-failure monitoring harder and more likely to generate critical 
alerts outside normal business hours.
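
A minimal sketch of the kind of check such monitoring might run, using 
the DNSKEY RRSIG quoted above as input.  It assumes the timestamps are 
in the YYYYMMDDHHmmSS presentation form that dig prints; the function 
names and the "warn" threshold are hypothetical, and fetching the RRSIG 
from the zone is left out entirely.

```python
from datetime import datetime, timedelta, timezone


def parse_rrsig_time(stamp: str) -> datetime:
    """Parse an RRSIG timestamp in YYYYMMDDHHmmSS presentation form (UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def rrsig_status(rdata: str, now: datetime, warn: timedelta) -> str:
    """Classify RRSIG RDATA text as NOT-YET-VALID, EXPIRED, WARNING or OK."""
    # RDATA fields: type covered, algorithm, labels, original TTL,
    # expiration, inception, key tag, signer name, signature.
    fields = rdata.split()
    expiration = parse_rrsig_time(fields[4])
    inception = parse_rrsig_time(fields[5])
    if now < inception:
        return "NOT-YET-VALID"
    if now >= expiration:
        return "EXPIRED"
    if expiration - now < warn:
        return "WARNING"  # hypothetical alert level, not a standard term
    return "OK"


# RDATA of the DNSKEY RRSIG for agrilinks.org quoted in this thread.
DNSKEY_RRSIG = (
    "DNSKEY 13 2 3600 20201126034731 20200927034731 2371 agrilinks.org. "
    "yjY9OSOLtMViN8ZYL/J0uaUGzTtJcHoyzP5WhMXIXqqF99YONh4AkmL0D1kOBkWKFnwqseU8"
    "vFbME8BmigQRxA=="
)

if __name__ == "__main__":
    # Assumed threshold: warn when less than 7 days of validity remain.
    print(rrsig_status(DNSKEY_RRSIG, datetime.now(timezone.utc), timedelta(days=7)))
```

A multi-day threshold only makes sense for the DNSKEY RRSIG.  For the 
50-hour signatures on the other RRs, which are already about a day old 
when served, only around 26 hours ever remain, so any threshold there 
would have to be measured in hours.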

jf
-- 
John Franklin
franklin at sentaidigital.com