[dns-operations] DNS over TLS: slowly happening
John Todd
jtodd at quad9.net
Tue Jun 26 16:18:18 UTC 2018
On 26 Jun 2018, at 6:02, bert hubert wrote:
> Hi Björn,
>
> Good to see someone from the resolver community post on this list!
> Usually most of us are from the (cc)TLD or hosting community.
I’m certain there are more lurkers from the resolver community here
watching with interest.
> On Tue, Jun 26, 2018 at 11:17:46AM +0000, Hellqvist, Björn wrote:
>> Has anyone done any real research with real-world numbers on the
>> server side when using DNS-over-TLS?
>
> Somewhat:
> https://ripe76.ripe.net/presentations/92-RIPE76_DNS_Privacy_measurements.pdf
> and
> https://ripe76.ripe.net/presentations/95-jonglez-dns-tcp-ripe76.pdf
>
> These numbers are not entirely 'real world' though.
>
>> And what happens during an attack, when each client opens a large
>> number of new unique connections? Or if a vendor introduces a bug
>> that does not reuse the TCP connection and opens a new one each time
>> without closing the unused one?
>
> I personally recommend having a proxy do the dnsdist termination, this
> means that at worst the proxy fails. This has also been measured in the
> presentations above.
Bert: Did you mean “dns” instead of “dnsdist” in the above
sentence?
>> Also how will this work in an ISP Anycast situation?
>
> DNS TCP is routinely anycast and this appears to work very well.
Agreed; we’ve not seen any complaints about broken sessions (we’ve
been running DNS over TLS since our public launch in November 2017),
though of course breakage could be occurring with some regularity and
simply going unnoticed. It seems that most DNS-over-TLS client
implementations reconnect aggressively enough to disguise any failures
due to path shifts. Most anycast paths are fairly stable, so this is
not as big a problem as it might seem. I think a more significant
problem arises with local distribution of sessions across a large
array of servers in a single POP, which is something that the anycast
provider needs to solve and which is typically invisible to the client.
ECMP seems to be one way to do this reliably, and I’m sure there is
quite a bit of testing that has been done in the 80/443 world to
validate this model. To date, this has not been an issue for TCP
sessions for us in testing or in production.
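
To illustrate why ECMP-style distribution keeps a TCP/TLS session on
one server inside a POP, here is a minimal sketch of flow hashing. It
is not any particular router's algorithm: the function name, the use
of SHA-256, and the backend names are assumptions chosen for
readability. The point is only that every packet of a flow carries the
same 5-tuple and therefore hashes to the same backend.

```python
import hashlib


def pick_backend(src_ip, src_port, dst_ip, dst_port, proto, backends):
    """Map a flow's 5-tuple to one backend, deterministically.

    Hypothetical helper: real ECMP implementations use hardware hash
    functions, but the principle (same tuple, same next hop) is the same.
    """
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Example backend names are placeholders, not real hosts.
backends = ["resolver-a", "resolver-b", "resolver-c", "resolver-d"]
chosen = pick_backend("192.0.2.10", 40000, "9.9.9.9", 853, "tcp", backends)
```

Note that with simple modulo hashing, adding or removing a backend
remaps many existing flows, which would reset established TLS
sessions; consistent hashing is one common way to limit that.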
>> Personally I think that such studies should be done before any vendor
>> introduces this functionality. The study should also take into
>> account global DNS providers, ISP DNS providers and maybe enterprise
>> DNS infrastructure.
>
> People from the DNS Privacy Project are doing such measurements. It
> may also be possible to replay existing DNS traffic over DNS over TLS.
> I agree lots of measuring "in the real world" is required.
I’ll speak for Quad9 and say that quite a bit more work is needed on
predicting the impact of various client behaviors on anycast TCP/TLS
resolvers, mostly because we don’t yet know what most clients will
look like in final form (either DNS over TLS or DoH). There will be a
significant amount of development of new TCP-based DNS clients in the
next few years, and I suspect many will not get things right on the
first try, or will exhibit unfortunate behaviors because developers
think they’re being clever for the client while in fact being
detrimental to the server side of the equation.
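
As an illustration of the server-friendly behavior in question, here
is a minimal sketch of a client that opens one TLS session on port 853
and reuses it for several queries instead of reconnecting per query.
It relies on the standard framing for DNS over TCP/TLS (a two-byte
length prefix before each message). The helper names are hypothetical,
and the TLS name dns.quad9.net is an assumption not stated in this
thread; the network function is defined but not called here.

```python
import socket
import ssl
import struct


def build_query(name, qtype=1, query_id=0):
    """Build a minimal DNS query (default type A, class IN, RD set)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
        if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def frame(message):
    """Prefix a DNS message with its two-byte length for TCP/TLS."""
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed by peer")
        buf += chunk
    return buf


def query_many(names, server="9.9.9.9", port=853,
               tls_name="dns.quad9.net", timeout=5.0):
    """Send all queries over ONE TLS session; return raw responses."""
    context = ssl.create_default_context()
    responses = []
    with socket.create_connection((server, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=tls_name) as tls:
            for i, name in enumerate(names):
                tls.sendall(frame(build_query(name, query_id=i)))
                (length,) = struct.unpack("!H", recv_exact(tls, 2))
                responses.append(recv_exact(tls, length))
    return responses
```

A client that instead called socket.create_connection() once per name
would pay a TCP and TLS handshake for every query, which is exactly
the kind of load pattern a resolver operator would rather not see.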
It might be useful to develop a baseline test regimen as a
best-practices guideline for client implementations, one that includes
both the DNS and TCP components of the stack. The DNS-OARC “Check My
DNS” tool, extended to act as a resolver and to exercise some TLS and
TCP timer behaviors, might be interesting (not that the OARC folks are
lacking anything to do…). As a recursive operator, it’s very difficult
for me to tell clients “You’re broken!” without some third-party site
to refer them to that clearly shows the problem. DNSViz and OARC’s
CMDNS have been great examples of how this is useful for other
protocol implementations; they let implementers and remote operators
work out for themselves how to fix things.
Our current load volumes on port 853 to 9.9.9.9 & 2620:fe::fe are still
quite small, even in European locations where DNS over TLS seems to be
most popular at the moment. There is still not enough data here to
build a comprehensive model that applies to gigabits per second of DNS
traffic, though certainly we’d be interested in working with
researchers who believe they could make extrapolations based on the
existing packet flows. Contact me off-list.
>> Although we should aim for privacy, we should not jump into a
>> solution that operators will actively disable due to resource and
>> cost limits.
>
> I'm afraid that if service providers will not make a move, the
> browsers of their subscribers will, and start preferring the DNS of
> their vendor or preferred partner, like CloudFlare.
>
> You mention disabling things, but DNS over HTTPS is specifically
> designed to be hard to disable.
>
> So the service provider community may not have a lot of choice, unless
> they are fine with third parties taking over their customers' DNS
> (this is a common choice in Africa for example).
>
>> To me this sounds more like a way to promote Google's DNS resolver
>> than thinking through all the other potentially problematic
>> scenarios that can happen when this is introduced.
>
> You may well be right. Whether this is an outcome we like or not is
> open to discussion...
The assumption of DNS being performed by the client’s transport
operator (or even visible to the transport operator) is not a safe bet
at this point under any circumstances, not today and certainly not in
the future, regardless of transport protocol or encapsulation. There are
many drivers for this disconnection, with some reasons more comforting
than others, but it is clearly happening.
JT