[dns-operations] resolver cache question

Fri Nov 13 21:24:29 UTC 2020

* Mark Allman:

> I just finished reading a paper that basically tries to figure out
> if a hostname is worth caching or not [1].  This isn't the first
> paper like this I have read.  This sort of thing strikes me as a
> solution in search of a problem.  The basic idea is that there are
> lots of hostnames that are automatically generated---for various
> reasons---and only ever looked up one time.  Then there is an
> argument made that these obviously clog up resolver caches.
> Therefore, if we can train a fancy ML classifier well enough to
> predict these hostnames are ephemeral and will only be resolved the
> once---because they are automatically generated and so have some
> tells---then we can save cache space (and effort) by not caching
> these.

One could use a Bloom filter to avoid caching (most) lookup results
that are encountered just once.  Or start out with an artificially
lowered TTL combined with prefetching.
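
To make the Bloom filter idea concrete, here is a rough sketch in
Python.  Everything in it is an assumption for illustration only: the
class names (BloomFilter, OneHitFilteringCache), the filter size, the
hash count and the qname/qtype cache key are all made up, and it is
not how any particular resolver works.  An answer is stored only when
the filter says the name has been seen before, so a name that is
looked up exactly once never occupies a cache slot.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; the sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive the bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, key):
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(key)
        )


class OneHitFilteringCache:
    """Hypothetical cache front end: store an answer only on a repeat sighting."""

    def __init__(self):
        self.seen = BloomFilter()
        self.cache = {}

    @staticmethod
    def _key(qname, qtype):
        # DNS names compare case-insensitively, so normalise before hashing.
        return f"{qname.lower()}/{qtype}"

    def store(self, qname, qtype, answer):
        """Call after an upstream lookup; returns True if the answer was cached."""
        key = self._key(qname, qtype)
        if key in self.seen:
            self.cache[key] = answer
            return True
        self.seen.add(key)  # first sighting: remember the name, skip the cache
        return False

    def lookup(self, qname, qtype):
        return self.cache.get(self._key(qname, qtype))


cache = OneHitFilteringCache()
first = cache.store("host1.example.", "A", "192.0.2.1")   # not cached
second = cache.store("host1.example.", "A", "192.0.2.1")  # cached now
```

The sketch leaves out TTL expiry, and the filter never forgets, so a
real deployment would have to reset or age it periodically.  False
positives only mean that an occasional one-off name gets cached
anyway, which is why I wrote "(most)" above.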

The authoritative server operators know that their zones are used for
what is essentially an RPC.  Don't they set a zero TTL?  So maybe this
is about ignoring low TTLs to increase cache hit rates.  Unfortunately
this report is behind a paywall, and I'm not going to spend $35.95 to
find out if the authors have considered Bloom filters and the rest.