[dns-operations] new public DNS service: 9.9.9.9

Wed Dec 6 10:57:39 UTC 2017

On 04.12.2017 at 15:17, Paul Vixie wrote:
>
>
> abang wrote:
>> On 22.11.2017 at 09:14, P Vix wrote:
>>> Ok! Please show your measurements and metrics. Your assertion is
>>> most intriguing.
>>>
>> This [1] is a measurement of query/answer (qa) latency within a DNS
>> resolver. The traffic I sent to this resolver is a sample of a copy
>> of real customer traffic with an increasing sample rate.
>
> sampling an aggregate flow is a not-good way to approximate individual
> flows unless the individual flow is a large enough fraction of the total
> to be representative.
Yes, that's right. However, it confirms our observations: resolvers
serving a small customer base have high latency.

> are you able to run actual per-stub flows at
> varying densities?
No. But I'll do another experiment:  I'll measure the q/a latency on a 
resolver which is only used for my private DSL line. This is a normal 
household with 10-15 internet devices. I'm curious...

> your methodology here does not simulate small subscriber pools. since 
> your earlier claim rests on a higher shared-cache hit rate than what 
> would be seen from a non-shared cache, methodology is quite important. 
> we need to know the average stub's working set of visited DNS content 
> as well as its re-visit rate,
Yes.

> and also the rdns's cache size and purge policy.
The cache is large enough that only expired RRs are purged. The maximum
lifetime for an RR is one day, which is the default setting for my
resolver brand.

>> 10 clients (1 qps): ~60 ms
>> 100 clients (10 qps): ~35 ms
>> 1,000 clients (100 qps): ~25 ms
>> 10,000 clients (1 kqps): ~10 ms
>> 100,000 clients (10 kqps): ~6 ms
>> 300,000 clients (30 kqps): ~3 ms
>
> for an ISP shared-cache rdns, the speed-of-light delay from a stub to 
> its rdns might be on the order of 3ms. for wide-area anycast rdns, 
> even at the extraordinary density achieved by google or cisco/opendns, 
> it is not. therefore some of the distinctions in your table are 
> without a meaningful difference to the 9.9.9.9 thread we're in today.
Yes. But just to avoid misunderstandings: I measured the query/answer
latency within the resolver. More precisely, the resolver measured it
itself. From the stub's perspective you have to add the "speed-of-light
delay" on top.

> the reason i originally suggested math rather than measurements is 
> that your assertion rests on a theory, and that theory can be modeled. 
> my model says that for content i am revisiting often enough for its 
> DNS latency to be meaningful to me as an end user, i'll hit the cache, 
> unless the TTL's are artificially low (as in the CDN theory you gave.)
Yes. Therefore I'll measure my private DSL line. I'm a sysadmin and not a
mathematician ;-) Maybe you could run that measurement as well.

> your challenge in putting a foundation under your intriguing assertion
> is to show that there is DNS content i don't revisit often enough for it
> to be in a non-shared cache, but that i do revisit often enough for it 
> to be present in a shared cache of a certain user population (20000 or 
> more). that's a very small keyhole.
Yes, I know. But as I said, it matches our observations.
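
The "keyhole" can be explored with a toy model before any measurement is
done. The sketch below is not the methodology used in this thread; it
assumes Zipf-distributed name popularity, Poisson query arrivals, one
fixed TTL for every name, and a cache that purges only on TTL expiry. The
number of names, the TTL, and the hit/miss latencies are hypothetical
values chosen for illustration. The only figure taken from the thread is
the ratio of roughly 0.1 qps per client (10 clients at about 1 qps). Under
these assumptions a name queried at rate r with TTL T is answered from
cache with probability rT / (1 + rT).

```python
# Toy model of resolver latency versus client population.
# Assumptions (not from the thread): Zipf popularity, Poisson arrivals,
# a single fixed TTL, expiry-only purging, hypothetical latencies.

def zipf_weights(n_names, s=1.0):
    """Return normalized Zipf popularity weights for n_names names."""
    raw = [1.0 / (rank ** s) for rank in range(1, n_names + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def hit_ratio(qps, ttl_s, weights):
    """Expected cache hit ratio for an aggregate query rate of qps.

    For one name with arrival rate r and TTL T, each cache fill (one miss)
    is followed by r*T hits on average, so the per-name hit ratio is
    r*T / (1 + r*T). The result weights this by each name's popularity.
    """
    hit = 0.0
    for p in weights:
        x = qps * p * ttl_s
        hit += p * x / (1.0 + x)
    return hit


def mean_latency_ms(qps, ttl_s, weights, hit_ms=0.1, miss_ms=60.0):
    """Mean in-resolver latency, given hypothetical hit/miss latencies."""
    h = hit_ratio(qps, ttl_s, weights)
    return h * hit_ms + (1.0 - h) * miss_ms


if __name__ == "__main__":
    weights = zipf_weights(100_000)   # hypothetical name universe
    ttl_s = 300                       # hypothetical uniform TTL
    qps_per_client = 0.1              # 10 clients ~ 1 qps, as in the table
    for clients in (1, 10, 100, 1_000, 10_000, 100_000, 300_000):
        qps = clients * qps_per_client
        h = hit_ratio(qps, ttl_s, weights)
        lat = mean_latency_ms(qps, ttl_s, weights)
        print(f"{clients:>7} clients  hit ratio {h:.3f}  latency {lat:.1f} ms")
```

Running it with clients=1 approximates a non-shared cache, and larger
values approximate a shared one. The model ignores that a single stub
revisits its own small working set far more often than an average draw
from the global popularity curve, which is exactly the point in dispute,
so it can frame the question but not settle it.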

> finally there's the perception problem. most DNS traffic comes from 
> web browsers, which are quite parallel in their lookups and resulting 
> HTTP fetches. having the long pole in my DNS lookup tent be 60ms in
> some cases rather than 3ms will not be perceptible to me as a human 
> user. so if you can find a mode in a common-enough DNS flow where the 
> average DNS lookup time really is 60ms, it wouldn't be compelling.
Yes. Overall latency from the customer's point of view is a long chain of
latency chunks. Each of them should be considered separately, and each
of them must be optimized to get a good overall customer experience.
Internet devices simply work more smoothly when latency is low. The
influence of DNS must not be underestimated.

Winfried