[dns-operations] Akamai's missing SOA bug fixed

David C Lawrence tale at akamai.com
Wed Oct 4 17:56:26 UTC 2017


On Saturday at the DNS-OARC meeting, Ralf Weber of Nominum gave a very
interesting presentation on cache performance. His slides are here:

https://indico.dns-oarc.net/event/27/session/5/contribution/5/material/slides/0.pdf

One of his observations was that he was getting a surprisingly high
number of cache misses for Akamai names.  He eventually tracked this
down to uncacheable negative answers where our authorities had not
included an SOA in the response.  This was certainly a surprising
finding to me, and so I got up to the mic and acknowledged it was
unintentional and that we would endeavour to fix it quickly.
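
For readers less familiar with why the missing SOA matters: under
RFC 2308, a resolver takes the negative-cache TTL from the SOA in the
authority section (the lesser of the SOA record's own TTL and its
MINIMUM field), and negative responses without an SOA should not be
cached at all. A minimal sketch of that rule follows; the Record type
and the negative_cache_ttl name are hypothetical, chosen only for
illustration, and this is not how any particular resolver or our
nameserver implements it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Record:
    """Hypothetical, simplified view of a resource record."""
    name: str
    rtype: str
    ttl: int
    soa_minimum: Optional[int] = None  # only meaningful for SOA records


def negative_cache_ttl(authority: Sequence[Record]) -> Optional[int]:
    """Return how long a negative answer may be cached, or None.

    Follows the RFC 2308 rule: min(SOA TTL, SOA MINIMUM).
    None means no SOA was present, so the answer should not be cached.
    """
    for rr in authority:
        if rr.rtype == "SOA" and rr.soa_minimum is not None:
            return min(rr.ttl, rr.soa_minimum)
    return None


# Illustrative values only; example.net is a placeholder zone.
with_soa = [Record("example.net.", "SOA", ttl=3600, soa_minimum=300)]
without_soa = []

print(negative_cache_ttl(with_soa))     # 300
print(negative_cache_ttl(without_soa))  # None
```

With no SOA, every repeat of the same negative query goes back to the
authority, which is consistent with the cache misses Ralf observed.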

The fix was deployed on Monday afternoon.

It turns out that this was a bug introduced with a software release
that completed on 30 August.  It had not been present in previous
releases.  A member of the nameserver team had noticed the problem
and had already fixed it for the next software release, due imminently,
but hadn't made the leap to the implications for resolver caches.

The reason for the bug is buried in the finer points of our
nameserver's architecture, but it can generally be described as an
interaction involving a particular configuration that was
unfortunately not well represented in developer testing but is very
common in production.  We're working to remedy that testing gap.

There is a dynamic configuration switch that can be flipped to bypass
the bug, and that change was sent out on Monday.  We now return you to
your regularly scheduled negative answers.



