[dns-operations] resolvers considered harmful
mallman at icir.org
Thu Oct 23 03:03:11 UTC 2014
> Their model doesn't make it a "large" increase, and I think that's
> because of their unrealistic assumptions about actual use.
I am not sure I follow this one. We used a workload derived from actual
use to assess how much the load on .com would increase in a world where
each endpoint was doing its own name lookups.
> The problem is that the really popular domains on the Internet (Google
> is one example they do discuss) have completely different patterns
> than everyone else. The mitigation technique of increasing TTLs
> imposes a cost on the operator of the popular domain (a cost in terms
> of flexibility). The authors seem to think this is no large cost, and
> I disagree.
The paper quantifies this cost for .com. We find that something like 1%
of the records change each week. So, while increasing the TTL from the
current two days to one week certainly sacrifices some possible
flexibility, in practical terms the flexibility isn't being used.
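To make the tradeoff concrete, here is a back-of-the-envelope sketch of
the staleness cost of a longer TTL. It assumes record changes arrive
roughly uniformly at the rate the paper reports (~1% of .com records
changing per week); the numbers are illustrative, not from the paper:

```python
# Rough sketch: chance a cached .com record changes before its TTL
# expires, under an assumed uniform change rate of ~1% per week.
change_rate_per_week = 0.01

for ttl_days in (2, 7):  # current TTL vs. proposed TTL
    ttl_weeks = ttl_days / 7.0
    # Probability the authoritative data changes within one TTL of
    # the record being cached, i.e., the chance a cached copy ever
    # goes stale before it expires.
    p_stale = change_rate_per_week * ttl_weeks
    print(f"TTL {ttl_days}d: P(change within TTL) ~ {p_stale:.2%}")
```

So even at a one-week TTL, only about 1% of cached records would ever
see a change before expiring, which is the sense in which the
flexibility is largely unused.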
> It's true that the increase in your numbers doesn't seem to be as bad
> as the theoretical maximum, but 2.5 or 3 times several million is
> still a large number.
Right... But, let me make a few points:
- As noted in the paper, 93% of the zones see no increase in our
trace-driven simulations. That is, they are accessed by at most one
end host per TTL and therefore get no benefit from the shared cache;
hence they see the same load regardless of whether it is an end
host or a shared resolver asking the questions.
- The point is that the increase rates above don't seem to me to be
disqualifying. I.e., there is some increase rate (e.g., on the
order of the number of users) that I think would be a show stopper.
But that doesn't seem to be what we're looking at here.
- Or, put differently ... We are not pretending that there is no
additional cost at some auth servers. But, this additional cost
does buy us things. So, it is simply a different tradeoff than we
are making now.
- As for popular content providers, as you note, their TTLs are much
lower. So, while the NS for facebook.com is going to be used by
nearly all the clients behind some shared resolver over its two-day
TTL, the A for star.c10r.facebook.com lasts less than a minute and
so a cached version is only going to shield facebook from so much of
the load. I haven't run the numbers, but my hunch is that the
increase rates for facebook.com, google.com, etc. will be smaller
than for the .com results we have in the paper because the
cacheability of these records is lower.
- There is also a philosophical-yet-practical argument here. That is,
if I want to bypass all the shared resolver junk between my laptop
and the auth servers I can do that now. And, it seems to me that
even given all the arguments against bypassing a shared resolver
that should be viewed as at least a rational choice. So, in this
case the auth zones just have to cope with what shows up. So, do we
believe that it is incumbent upon (say) AT&T to provide shared
resolvers to shield (say) Google from a portion of the DNS load?
Or, put differently, the results in the paper suggest that there
really isn't much for AT&T to gain from providing those resolvers,
so why should it? One argument here could be that AT&T is trying to
provide its customers better performance. But, the paper shows this
is really not happening (which is largely a function of pervasive
DNS prefetching). So, if I am AT&T I'd be thinking "hey, what am I
or my customers actually gaining from this complexity I have in my
network?!". And, if the answer is little-to-nothing then it seems
rational to not provide this service. Or, so it seems to me.
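The trace-driven comparison behind the points above can be sketched
roughly as follows: replay a query log once against a single shared
cache and once against per-client caches, counting the queries that
reach the authoritative servers. The trace format, names, and TTL here
are illustrative assumptions, not the paper's actual methodology:

```python
# Sketch of a trace-driven cache simulation: count queries that miss
# the cache (and so hit the auth server) under two caching models.

def auth_queries(trace, ttl, shared):
    """Count cache misses for a trace of (time, client, name) queries.

    shared=True models one resolver cache for all clients;
    shared=False models each end host caching on its own.
    """
    expiry = {}   # cache key -> time the cached answer expires
    misses = 0
    for t, client, name in trace:
        key = name if shared else (client, name)
        if expiry.get(key, -1) < t:   # not cached, or TTL expired
            misses += 1
            expiry[key] = t + ttl
    return misses

# Toy trace: (seconds, client, zone queried) -- purely hypothetical.
trace = [
    (0, "a", "example.com"), (10, "b", "example.com"),
    (20, "a", "example.com"), (400, "a", "example.com"),
]
print(auth_queries(trace, 300, shared=True))    # shared resolver: 2
print(auth_queries(trace, 300, shared=False))   # per-endpoint: 3
```

Note that a zone queried by only one client per TTL produces identical
counts in both modes, which is exactly the 93%-of-zones observation
above; the shared cache only shields zones that multiple clients hit
within a TTL.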