[dns-operations] Cache efficiency (was: Re: DNS .com/.net resolution problems in the Asia/Pacific region)

Fri Jul 21 04:21:16 UTC 2023

Robert Edmonds wrote on 2023-07-20 14:50:
> Mark Andrews wrote:
>> ...
> 
> Yes, there are lookups that can take a long time to perform with a cold
> cache. By putting lots of users behind large, centralized caches we can
> insulate users from a lot of cold cache lookups, but these centralized
> resolvers then become concentrated points of failure, convenient
> monitoring points, etc.

i have also found the cache hit performance to be notably worse when 
talking to an off-campus recursive than when talking on-campus. it is 
not only the cache miss performance (as driven by population size) which 
should concern us. 10ms vs 1ms doesn't matter for windowing workflows or 
for human perception but it adds up in ask-and-wait workflows not driven 
by human tentacles and optics.
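
A back-of-envelope sketch of how that adds up for a strictly serial, ask-and-wait job. The lookup count (10,000) is a made-up illustrative number; only the 10ms and 1ms figures come from the paragraph above, and the function name is hypothetical.

```python
def serial_lookup_wait_seconds(lookups, cache_hit_ms):
    """Total time spent waiting when every lookup blocks the next one."""
    return lookups * cache_hit_ms / 1000.0


# Hypothetical batch job issuing 10,000 cache-hit lookups, one after another.
off_campus = serial_lookup_wait_seconds(10_000, 10)  # 10 ms per cache hit
on_campus = serial_lookup_wait_seconds(10_000, 1)    # 1 ms per cache hit
print(off_campus, on_campus)
```

Under those assumptions the off-campus case spends 100 seconds waiting on cache hits against 10 seconds on-campus, none of which is due to cache misses.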

> Personally, I'd like to see the "full resolver"
> role be re-distributed and move out as close as possible to the
> endpoints, given that the original justification for the stub
> resolver/full resolver split was a lack of resources at the endpoints --
> in the 1980's. But if you have full resolvers running on individual
> endpoints, or on network elements that serve individual households, etc.
> you're much more likely to run into cold cache lookups and it would be
> nice to be able to accelerate or avoid those cold lookups.

those are unpleasant extremes. can we consider a middle ground, like 
campus-level and isp-level name servers as were common before the 
opendns era, as having population sizes that led to good cache reuse? 
notably, another advantage from that rdns granularity is that the cache 
miss shares connectivity with subsequent transactions, obviating some of 
the complexities like EDNS client subnet which only came into existence 
because of the global anycast trend (quad this, quad that, quad etc.)

> Here are some random ideas for improving the efficiency of cold or
> lukewarm caches.
> 
> 1. Cache occlusion rather than replacement of outranked data. The DNS
> protocol reuses the same record type (NS) for both the non-authoritative
> delegation nameserver record set served by the parent zone as well as
> the authoritative nameserver record set served by the child zone. RFC
> 2181 § 5.4.1 says that resolvers "should replace" the data from the
> parent zone when they receive authoritative data from the child zone,
> but the parent zone often has a much longer TTL on the records that it
> serves. (E.g., the twitter.com data from the .com zone has a 2-day TTL,
> while the twitter.com NS record set from the twitter.com zone has a <4
> hour TTL.)
> 
> If resolver caches were able to retain the longer-lived NS records from
> the parent and "occlude" them when a shorter-lived NS record from the
> child is cached, then utilize them again when they become unoccluded
> upon the expiration of the child NS record, it would avoid sending
> unnecessary queries to the parent. It would also be arguably more
> compatible with the lower case "should replace" text in RFC 2181 § 5.4.1
> than a "parent-centric" resolver implementation.

draft-ietf-dnsop-ns-revalidation could be tuned to make that happen.
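
A minimal sketch of the occlusion idea in item 1, not taken from any resolver implementation and not what the draft specifies. The class and method names are hypothetical, and the injectable clock exists only so the behaviour can be exercised without waiting. The parent's NS set and the child's NS set are kept side by side; the child's set wins while it is live, and the parent's set becomes visible again once the child's set expires.

```python
import time
from dataclasses import dataclass


@dataclass
class NSEntry:
    nameservers: tuple
    expires_at: float  # absolute expiry on the cache's clock, in seconds


class OccludingNSCache:
    """Hypothetical cache holding parent and child NS sets side by side."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._parent = {}  # zone -> NSEntry learned from the parent's referral
        self._child = {}   # zone -> NSEntry learned from the child's own answer

    def store_parent(self, zone, nameservers, ttl):
        self._parent[zone] = NSEntry(tuple(nameservers), self._clock() + ttl)

    def store_child(self, zone, nameservers, ttl):
        # The child data occludes the parent data; it does not replace it.
        self._child[zone] = NSEntry(tuple(nameservers), self._clock() + ttl)

    def lookup(self, zone):
        """Return (source, nameservers), preferring live child data, else None."""
        now = self._clock()
        for source, table in (("child", self._child), ("parent", self._parent)):
            entry = table.get(zone)
            if entry is None:
                continue
            if entry.expires_at > now:
                return source, entry.nameservers
            del table[zone]  # expired: drop it and fall through to the next table
        return None
```

With a 2-day parent TTL and a 4-hour child TTL, `lookup()` reports the child's set for the first four hours and the parent's set after that, with no new query to the parent until the 2-day TTL runs out.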

> 2. Persist some or all of the resolver's cached NS and nameserver
> address records to disk. These are typically long-lived records and I'd
> gladly trade a few tens of MB of disk space in exchange for better P99+
> resolution latency after a restart. Perhaps this could also include the
> RTTs, EDNS capabilities, etc. that is sometimes called the
> "infrastructure cache".

while i don't think i have any "disks" left, i agree with what you mean. 
we had cache dump/restore on shutdown/startup in bind4 but pulled it out 
in bind8 as a complexity vs. utility tradeoff.
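
A minimal sketch of dump/restore for cached NS data, assuming a hypothetical in-memory layout (zone name mapped to a dict with `nameservers` and `expires_at`) and a JSON file on whatever stands in for a disk. This is not how bind4 did it. Expiry is stored as absolute wall-clock time so that TTLs keep counting down across the restart, and anything that expired while the resolver was down is discarded on restore.

```python
import json
import time


def dump_ns_cache(cache, path, now=None):
    """Write unexpired entries to `path`; return how many were written."""
    now = time.time() if now is None else now
    live = {zone: rec for zone, rec in cache.items() if rec["expires_at"] > now}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(live, fh)
    return len(live)


def restore_ns_cache(path, now=None):
    """Load a dump, dropping entries that expired while the resolver was down."""
    now = time.time() if now is None else now
    try:
        with open(path, encoding="utf-8") as fh:
            saved = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}  # no usable dump: start with a cold cache
    return {zone: rec for zone, rec in saved.items() if rec["expires_at"] > now}
```

A missing or corrupt dump falls back to a cold start, which keeps the restore path from becoming a new way to fail at startup.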

> 3. You mention CNAME chains, but NS delegations are another source of
> indirection that may require additional upstream lookups, especially if
> the nameserver names are in several different TLDs (as a reliability
> hedge?). There are a couple of things that could be done here:
> 
> a) Delegations within the same organization often reflect internal
> organizational boundaries. One team may want to give control over part
> of the namespace to another team, without handing over write permissions
> for the whole zone, so the typical solution is to carve out a child zone
> for the other team, and host that zone on the same provider as the
> parent zone. If the cloud-based DNS providers that many organizations
> use offered a more granular, less than whole zone permissions model, it
> would cut down on the number of child zones that are created solely to
> reflect intra-organizational boundaries.

i'd hate to see us adopt a cloud-centric model. whatever we do to 
improve NS-chain performance -- and i think your first two suggestions 
would do this -- should also benefit the normal delegation, notify, and 
transfer system.

> b) Make nameserver address indirection *optional* without requiring a
> backwards-incompatible protocol change.

*cough*.

-- 
P Vixie