[dns-operations] co-operative cacheing
paul at vix.com
Wed May 7 21:29:21 UTC 2008
> My email servers each have a local DNS cache + recursive resolver, to
> simplify administration and spread the load. The disadvantage is that one
> server might spend ages doing a full recursive lookup when another one
> already has the answer. It would be nice if a server could reduce latency
> by asking the other members of a cluster if they already have an answer.
> Something like the co-operative cacheing that was implemented for HTTP.
> Has anyone done this?
most recursive nameservers, including BIND, have a similar-sounding
technology called "forwarding". this is where a single network-facing
server front-ends a set of recursive "forwarder" servers that are then
used by clients. this is non-ideal since it's not high-availability,
which is what it sounds like you want.
one could theoretically offer a "forward-first, recursion-undesired,
short-short-timeouts" config option, to opportunistically fetch an answer
from a "nearby" server, and then head out to authority-land otherwise.
this has terrible scaling properties for even small sets of servers,
since every member of this server-set would have to do this before every
upstream query they made. you could do it with 2, maybe 3, but not with 4.
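
to make the "forward-first, recursion-undesired, short-short-timeouts" idea concrete, here is a minimal python sketch. it is not an existing BIND option. the names build_query, ask_peer, lookup and worst_case_delay_ms, the 50ms timeout, and the recurse callback are all assumptions made for illustration. each peer is asked with the RD bit cleared, so it answers from cache only, and the caller heads out to authority-land only if no peer has the data.

```python
import socket
import struct

def build_query(qname, qtype=1, rd=False, txid=0x1234):
    """Build a minimal DNS query; rd=False clears the Recursion Desired bit."""
    flags = 0x0100 if rd else 0x0000
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN

def ask_peer(peer, packet, timeout=0.05):
    """Return the raw reply if the peer answered in time with data, else None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)  # hypothetical "short-short" timeout
        try:
            s.sendto(packet, (peer, 53))
            reply, _ = s.recvfrom(4096)
        except OSError:  # covers timeouts and unreachable peers
            return None
    if len(reply) < 12 or reply[:2] != packet[:2]:
        return None
    flags, _qdcount, ancount = struct.unpack("!HHH", reply[2:8])
    return reply if (flags & 0x000F) == 0 and ancount > 0 else None

def lookup(qname, peers, recurse):
    """Try each nearby peer's cache first, then fall back to full recursion."""
    packet = build_query(qname, rd=False)
    for peer in peers:
        reply = ask_peer(peer, packet)
        if reply is not None:
            return reply
    return recurse(qname)  # recurse is supplied by the caller (assumption)

def worst_case_delay_ms(n_servers, timeout_ms):
    """Extra latency on a miss when the other n-1 peers are asked one by one."""
    return (n_servers - 1) * timeout_ms
```

with peers asked serially, worst_case_delay_ms(4, 50) is 150ms of waiting before the upstream query even starts, which is the scaling problem described above. asking the peers in parallel trades that delay for n-1 extra packets per miss.
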
there is an ideology out there called "distributed hash tables" which i
know opendns.com uses to solve exactly this problem. there's no standard
protocol for it, and the F/L/OSS DNS DHT software i've seen is research
quality and is oriented toward authority rather than recursive server sets.
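
the core idea behind hashing a cache across a server set can be sketched in a few lines of python. this is rendezvous (highest-random-weight) hashing, a simpler relative of what a DHT does. it is not a DHT protocol and not a description of what opendns.com runs. the function name owner and the server names are hypothetical.

```python
import hashlib

def owner(qname, qtype, servers):
    """Pick the one server responsible for caching (qname, qtype)."""
    key = f"{qname.lower().rstrip('.')}/{qtype}"

    def weight(server):
        # every member computes the same weights, so all agree on the owner
        return hashlib.sha256(f"{server}|{key}".encode()).digest()

    return max(servers, key=weight)
```

every member would send a query for a given name to that name's owner, so each answer is fetched from upstream once per set rather than once per server. removing a server only moves the names that server owned.
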
what i've been thinking about for a couple years now is a way to send each
upstream answer received by a recursive nameserver, to other recursive
nameservers who are thought of as "administratively nearby", so that all
servers would have a chance to cache things that any server was asked. i
don't know how this would interact with an LRU replacement strategy, since
i don't know whether another server's need for an upstream datum counts as
a "use" in that it should cause an LRU replacement the way our own upstream
data would do.
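
the LRU question can be made concrete with a small python sketch. the class SharedLRU and its peer_counts_as_use switch are hypothetical, and this is one possible way to model the choice, not a settled design. when the switch is off, data pushed by a peer enters at the cold end of the cache, so it only fills free space and never displaces locally fetched data. when it is on, pushed data is treated exactly like our own upstream data.

```python
from collections import OrderedDict

class SharedLRU:
    def __init__(self, capacity, peer_counts_as_use=False):
        self.capacity = capacity
        self.peer_counts_as_use = peer_counts_as_use
        self.entries = OrderedDict()  # coldest entry (next to evict) first

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # a local hit always counts as a use
        return self.entries[key]

    def put_local(self, key, value):
        """Store an answer this server fetched from upstream itself."""
        self.entries[key] = value
        self.entries.move_to_end(key)
        self._evict()

    def put_from_peer(self, key, value):
        """Store an answer pushed by an administratively nearby server."""
        is_new = key not in self.entries
        self.entries[key] = value
        if self.peer_counts_as_use:
            self.entries.move_to_end(key)              # treat as our own use
        elif is_new:
            self.entries.move_to_end(key, last=False)  # park at the cold end
        self._evict()

    def _evict(self):
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
```

with a full cache and the switch off, a pushed entry is evicted immediately, so a busy peer cannot flush this server's working set. with the switch on, it can.
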
unless the cliff between "nearby" and "not nearby" is very tall+steep, you
are probably better off letting every recursive nameserver in your network
fend for itself.