[dns-operations] some advice on configuring the dns resolver inside "systemd"

Thu Dec 28 17:24:57 UTC 2017

On Thu, Dec 28, 2017 at 7:29 AM, Florian Weimer <fw at deneb.enyo.de> wrote:
>
>
> The fact is that some users push very hard for state tracking in the
> stub resolver because that's what they are used to from other systems,
> despite its inherent limitations. It's unlikely that we're going to
> implement this part of the glibc stub resolver, though, because there
> are already many alternative solutions.

It's not simply a matter of what the other systems do; it's also the extent
to which things snowball when the first attempted server is unreachable.
Most sysadmins without a degree in DNS do not realize how badly search
domains interact with glibc's DNS module. Every search domain permutation
will be attempted before resolution moves on to the second server, plus an
additional attempt to resolve the absolute name if the *ndots* criterion is
met (default: 1 dot). Search domains are a force multiplier on the timeout,
which makes the lack of state tracking for remote server availability
extremely painful: the resulting delay is much greater than the 5 seconds
that people like to talk about being a problem for mission critical
applications.
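
To put rough numbers on that multiplication, here is a back-of-the-envelope
sketch in Python. It is a simplified model, not glibc's actual code: it
assumes every candidate name costs one full timeout against the unreachable
first server, and it ignores the "attempts" option and any caching. The
function names and the example.com search domains are made up for
illustration.

```python
def candidate_names(name, search_domains, ndots=1):
    """Names a stub resolver would try, in order (simplified model)."""
    if name.endswith("."):
        return [name]  # trailing dot: already absolute, search list skipped
    searched = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        return [name] + searched  # enough dots: absolute name tried first
    return searched + [name]      # too few dots: absolute name tried last


def worst_case_delay(name, search_domains, timeout=5, ndots=1):
    """Seconds lost if every candidate waits one timeout on a dead server."""
    return timeout * len(candidate_names(name, search_domains, ndots))


# Hypothetical two-domain setup: one AD domain, one run by the *nix admins.
domains = ["ad.example.com", "unix.example.com"]
print(candidate_names("db1", domains))
print(worst_case_delay("db1", domains))
```

With two search domains the model already gives three candidate names, so
the worst case is three timeouts rather than one, and each extra search
domain adds another.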

We can have the usual discussion over the evils of search domains, but the
reality is that 1) the math behind this snowball is not evident to most
sysadmins consuming the manpage of resolv.conf, and 2) generally they do
not get to choose the number of search domains that their employers require
them to maintain support for. Mixed server environments typically require
at least one because Active Directory exists, and a second because the *nix
admins will run their own DNS servers if anyone gives them a say in the
matter. That's two, and it only goes up from there.

To avoid a philosophical debate over whether glibc's behavior is the
"right" one, I'll simply say that most users of glibc's NSS module for DNS
(read: everyone who isn't on this list and a few more besides) don't
understand the extent to which these factors multiply with each other.
Without that data, it's hard for them to understand just how badly the
defaults are going to impact their production environments when the first
server goes down for any reason -- planned or otherwise. It's quite a lot
of avoidable, repetitive pain that companies the world over are forced to
learn and re-learn because the math is not readily apparent from the
tin/user docs.