[dns-operations] More Aggressive prefetch for popular names

Paul Hoffman phoffman at proper.com
Mon Apr 8 03:14:02 UTC 2019

On 7 Apr 2019, at 19:03, Davey Song wrote:

>> The "popular sites" you mention have all done this already. They also
>> tend to use services like Akamai, which use short TTLs, dynamic
>> records, and CDNs which limit the types of damage that you are
>> describing.
>
> I missed one case in the "outage of popular names during the TTL". It
> is that the short DNS TTL of a CDN, 5 minutes for example, will
> occasionally be ignored and raised by resolver operators to as much as
> 2-3 hours due to some policy conflicts.

Please describe these "policy conflicts", and how they appear for some 
names but not for others.

> It occurred once or twice a month, as observed at one large CDN
> operator I'm familiar with. I'm not sure how Akamai or Cloudflare
> handle this, but it happens every month and people are suffering.

Please describe who is suffering, and how they suffer. (It feels like 
this could be an exaggeration.)

> It is partially due to the different interests of recursive and
> authoritative operators and the loose coordination between them, as
> people mentioned. But I also observed that resolver operators have the
> motivation and tools to set a policy of a minimum TTL or a larger TTL.
> They care more about the rate of cache misses than the rate of serving
> stale data. Normally they are cooperative if they receive a call and
> notice the conflicts for specific names case by case, but there seems
> to be no automatic approach set up before the event between resolver
> and authoritative operators.

As the previous messages from others have said, the automatic approach 
is the setting of TTLs.
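
To make the disagreement concrete, here is a minimal sketch of the TTL
clamping described in the quoted text. It is an illustration only, not
any particular resolver's implementation: the function name, the
parameter names, the default cap and the 2-hour floor are assumptions
chosen to match the 5-minute and 2-3 hour figures mentioned earlier in
the thread.

```python
def effective_ttl(authoritative_ttl, min_ttl=0, max_ttl=86400):
    """Return the TTL (in seconds) a resolver would cache a record for.

    Hypothetical helper: min_ttl and max_ttl stand in for whatever
    local policy knobs a resolver operator has configured.
    """
    # Cap the TTL at the operator's maximum, then raise it to the floor.
    return max(min_ttl, min(authoritative_ttl, max_ttl))


cdn_ttl = 300  # the 5-minute TTL published by the authoritative side

# No local policy: the resolver honours the published TTL.
print(effective_ttl(cdn_ttl))                # 300

# Assumed 2-hour minimum-TTL policy: the published TTL is overridden,
# so a changed record can be served stale for up to 2 hours.
print(effective_ttl(cdn_ttl, min_ttl=7200))  # 7200
```

With no floor configured, the published TTL is the only signal the
resolver acts on, which is the sense in which setting TTLs is the
automatic approach.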

>> We have to get out of the mindset that it's our job to fix someone
>> else's mistakes.
>
> Mistakes of both resolver and authoritative servers are observed. I'm
> writing this not to add more straw to the camel. I just would like to
> know of any best practice on this issue from this mailing list. Or is
> it nothing but other people's problem?

It is the problem of the authoritative servers when they guess wrong 
about their TTLs, and then they learn to guess better in the future. 
That's not "other people": it is all of us. You are getting pushback
because you are trying to fix other people's problems in ways that make
the system more fragile.

--Paul Hoffman
