[dns-operations] DNSSEC outage in e164.arpa

Wolfgang Nagele wnagele at ripe.net
Fri Mar 4 12:40:11 UTC 2011


Dear colleagues,

Together with our vendor, we have analysed the DNSSEC outage that
occurred in e164.arpa on 15 February 2011 and have come to the
conclusion that it was caused by an unknown bug in the signer system.

The publication of the new DS record for e164.arpa (key tag 33067)
occurred on the same day that we began rolling out our new DNS
provisioning system. During that migration we had to change our SOA serial
format from YYYYMMDDNN to Unix timestamps. Because the new values are
numerically lower than the old ones, we had to increment the serials in all
of our zones twice to roll to the new values (e.g. 2011021500 -> 1297728000).
This excessive re-transfer behaviour seems to have caused the system that
verifies the publication of new DS records to skip the signature for this key.
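
To illustrate why two increments are needed, here is a minimal sketch of
RFC 1982 serial number arithmetic as it applies to 32-bit SOA serials. The
function names are hypothetical, and the choice of the largest permitted
increment as the intermediate step is an assumption: the intermediate value
actually used during the migration is not stated above.

```python
SERIAL_BITS = 32
MODULUS = 2 ** SERIAL_BITS                  # SOA serials wrap at 2^32
HALF = 2 ** (SERIAL_BITS - 1)
MAX_INCREMENT = HALF - 1                    # largest single step RFC 1982 allows


def serial_add(serial, n):
    """Add n to a serial, wrapping modulo 2^32."""
    if not 0 <= n <= MAX_INCREMENT:
        raise ValueError("increment out of range for serial arithmetic")
    return (serial + n) % MODULUS


def serial_gt(a, b):
    """True if serial a is 'greater than' serial b under RFC 1982."""
    return (a < b and b - a > HALF) or (a > b and a - b < HALF)


def rollover_steps(old, new):
    """Serials to publish, in order, so secondaries accept the move to new."""
    if old == new:
        return []
    if serial_gt(new, old):
        return [new]                        # a single increment is enough
    # Assumed strategy: jump as far forward as allowed, then on to the target.
    intermediate = serial_add(old, MAX_INCREMENT)
    if not serial_gt(new, intermediate):
        raise ValueError("target not reachable in two steps from this serial")
    return [intermediate, new]


if __name__ == "__main__":
    # Example values from the message above.
    print(rollover_steps(2011021500, 1297728000))
```

Each step triggers a zone transfer to every secondary, which is why a
migration of this kind produces far more transfers than normal operation.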

As this re-transfer behaviour doesn't occur during normal operations, we
do not foresee this becoming a recurring problem.
However, we have decided that there is a need to increase the sanity
checks performed before a zone gets published. We are now in contact
with NLnet Labs and are investigating the possibility of having a zone
transfer proxy that can be configured to validate a zone that comes in
on one end and is only sent out on the other end if it validates against
a given set of trust anchors.
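
As a rough sketch of the kind of gate such a proxy could apply, the snippet
below checks that every configured trust anchor key tag is present in the
zone's DNSKEY set and has an RRSIG covering the DNSKEY RRset. Everything
here is an assumption for illustration: the Rrsig type, the function names
and the key tag 12345 are hypothetical, no cryptographic verification is
performed, and requiring a signature for every anchor is a stricter policy
than DNSSEC validation itself, which needs only one valid chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rrsig:
    """Hypothetical, simplified view of an RRSIG record at the zone apex."""
    type_covered: str
    key_tag: int


def missing_anchor_signatures(trust_anchor_tags, dnskey_tags, rrsigs):
    """Return trust anchor key tags that do not sign the DNSKEY RRset."""
    signing_tags = {s.key_tag for s in rrsigs if s.type_covered == "DNSKEY"}
    return sorted(
        tag for tag in trust_anchor_tags
        if tag not in dnskey_tags or tag not in signing_tags
    )


def may_publish(trust_anchor_tags, dnskey_tags, rrsigs):
    """Forward the zone only if no trust anchor lacks a signature."""
    return not missing_anchor_signatures(trust_anchor_tags, dnskey_tags, rrsigs)


if __name__ == "__main__":
    # Key tag 33067 is the one named above; 12345 is a made-up second key.
    anchors = {33067}
    dnskeys = {33067, 12345}
    broken_zone = [Rrsig("DNSKEY", 12345), Rrsig("SOA", 12345)]
    print(missing_anchor_signatures(anchors, dnskeys, broken_zone))
    print(may_publish(anchors, dnskeys, broken_zone))
```

A real implementation would also have to verify the signatures
cryptographically and check their validity periods before forwarding a zone.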

For more information:
https://lists.dns-oarc.net/pipermail/dns-operations/2011-March/006926.html

We believe this type of sanity check would benefit the community by
reducing the occurrence of DNSSEC-related incidents. We would appreciate
your input on what requirements should be included in a sanity checker.
You can share your input with me by email, or in person at the DNS-OARC
meeting in San Francisco (13-14 March).

Regards,

Wolfgang Nagele
RIPE NCC DNS Group Manager

On 2/15/11 17:30, Wolfgang Nagele wrote:
> Dear colleagues,
> 
> Our DNSSEC signer system produced an e164.arpa zone that was missing a
> signature for the current KSK at approximately 13:00 UTC today.
> 
> We resolved the error at approximately 16:30 UTC by producing a new zone
> (serial 1297787668).
> 
> We apologise for the outage and will provide more details after we've
> analysed the incident.
> 
> Regards,
> 
> Wolfgang Nagele
> RIPE NCC DNS Group Manager


