[dns-operations] Forcing BIND to randomly expire records from cache ahead of time

Warren Kumari warren at kumari.net
Fri Jul 4 19:14:17 UTC 2014


On Thu, Jul 3, 2014 at 6:04 PM, Tim Wicinski <tjw.ietf at gmail.com> wrote:
>
> Mark
>
> Unbound has this feature, but it's a % of the TTL (oh, they may have
> changed this).
>
> You may also be interested in this idea, which was floated during IETF and
> not rejected; it just has a small sliver of useful customer base:
>
> http://tools.ietf.org/id/draft-wkumari-dnsop-hammer-00.txt

... and, almost exactly one year later, we are *finally* rev'ing that
document (I just edited it this afternoon, Suzanne is giving it the
once over as we speak, and we are planning on submitting in the next
hour (you know, just before the draft cutoff :-) - “I love deadlines.
I love the whooshing noise they make as they go by.” — Douglas Adams,
The Salmon of Doubt )

The new version mainly describes the general concept, and notes that
OpenDNS, Unbound and BIND 10 do something like this. In a somewhat
contrived bit of writing, it also manages to keep the "Stop! Hammer
time!" references...
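
For anyone who wants the general concept in concrete terms, here is a
rough sketch of the two ideas in this thread: refreshing a record on a
cache hit once only a small fraction of its TTL remains, and shaving a
random amount off a TTL so entries do not all expire together. This is
not how any particular resolver implements it. The function names and
the 10% fractions are assumptions made up for illustration.

```python
import random


def should_prefetch(original_ttl, remaining_ttl, fraction=0.10):
    """Return True if a cache hit should trigger a background refresh.

    `fraction` is a hypothetical knob: refresh once the remaining TTL
    drops to this share of the original TTL or below.
    """
    return 0 < remaining_ttl <= original_ttl * fraction


def jittered_ttl(original_ttl, max_early_fraction=0.10, rng=random):
    """Return a cache lifetime shortened by a random amount.

    The result is never longer than the original TTL, so a record is
    never served past the lifetime its zone asked for.
    """
    early = rng.uniform(0, original_ttl * max_early_fraction)
    return max(1, int(original_ttl - early))


if __name__ == "__main__":
    # A 3600-second record with 5 minutes left is due for a refresh;
    # one with 30 minutes left is not.
    print(should_prefetch(3600, 300))
    print(should_prefetch(3600, 1800))
    # Each cached copy gets a lifetime somewhere in 3240..3600 seconds.
    print(jittered_ttl(3600))
```

With a 3600 second TTL, the jitter spreads expirations over a window of
up to six minutes instead of a single instant, which is the effect Mark
was hoping to get without editing the zone files.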

W


>
>
>
> On 7/3/14 5:06 PM, Mark Pettit wrote:
>>
>> Hi, folks.
>>
>> I have an issue with BIND cache timeouts, and I was hoping someone else
>> might have some idea how to fix this.
>>
>> Here's the situation: we have a large number of servers that do a huge
>> number of DNS lookups at the top of every minute. The TTL for the
>> records they're looking up is 3600.
>>
>> What we've noticed is that on a host with a recently-restarted copy of
>> BIND, we see huge spikes in DNS latency every 61 minutes. This makes
>> logical sense, given the behavior of the DNS lookups.
>>
>> What is more interesting is that on hosts that have been running BIND
>> for a very long time (on the order of months), the spikiness is not
>> visible.
>>
>> Our speculation is that over time, due to the interaction between the
>> 3600 TTL and the "once every minute" lookup behavior, cache misses
>> become randomly distributed throughout the hour, and don't cause the
>> spiky behavior that is observed initially.
>>
>> One of our ideas to resolve this is to randomize the TTLs in the zone
>> files, causing them to expire out of cache at different times, thus
>> forcing more-rapid distribution of cache misses across the hour.
>>
>> However, this would involve some massive edits to our zone files, and
>> isn't really ideal.
>>
>> What *would* be ideal would be if we could tell BIND to randomly expire
>> some small percentage of cached entries ahead of the actual TTL
>> expiration. This would serve the same purpose as assigning "random" TTLs
>> to the actual records in the zone files.
>>
>> Does BIND have a config option like this? Has anyone else ever
>> encountered this issue, and if so, how did you address it?
>>
>> Thanks for any advice, and I hope everyone has a fantastic Fourth of
>> July weekend.
>>
>> Mark Pettit
>>
>>
>>
>> _______________________________________________
>> dns-operations mailing list
>> dns-operations at lists.dns-oarc.net
>> https://lists.dns-oarc.net/mailman/listinfo/dns-operations
>> dns-jobs mailing list
>> https://lists.dns-oarc.net/mailman/listinfo/dns-jobs
>>



