[dns-operations] Should medium-sized companies run their own recursive resolver?

Thu Oct 17 14:51:31 UTC 2013

> From: Jared Mauch <jared at puck.nether.net>

> I think the difference is this is an -operations list, so I'm looking at/around things
> that can be done to operate the equipment.

Then object to the hypothetical DNS appliances proposed by others on
the grounds that Amazon doesn't sell them today instead of nonsense
about technical impossibilities and pointing-and-clicking IP TTLs.

> > GUI pointing and clicking to maintain a suitable stanza into a DNS
> > server text configuration file would be almost as trivial.
>
> This certainly can be true, but the average operator isn't going to understand
> that IP TTL != DNS TTL, and may not even be aware of that part of the packet.

That's irrelevant, because the user interface would not talk about "IP
TTL" any more than it now talks about IP fragmentation, wire format
decompression, or other arcana.  The UI might ask about the number of
routers or address blocks in the organization.  Better would be to not
ask the user at all, but sell the box for use in organizations of at
most 100 users and 5 IP address blocks.  The 2 IT professionals in the
scenario at issue would know their number of IP address blocks and
routers and so a good upper bound on TTLs even if they can't spell IGP,
because they'd be paying the ISP bill.

> The average user is going to use a document like this to configure their DNS:
>
> https://support.google.com/a/answer/48090

Yes, and so what?  Knowing how many IP address blocks you've rented
from Comcast is less obscure than messing with the MX and CNAME RRs
that document talks about.

To foreclose yet another nonsense objection, if you have 5 blocks,
then a TTL of 6 or 10 would close your resolver.  That a smaller TTL
that depends on topology would also work is irrelevant.
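
As a rough illustration of the mechanism (not taken from any shipping
product), a resolver could cap how far its answers travel by setting the
IP TTL on the socket it replies from.  The sketch below is Python; the
helper names and the "blocks plus one" rule of thumb are assumptions
picked only to match the 5 blocks, TTL 6 example above.

```python
import socket


def reply_ttl_for(address_blocks: int) -> int:
    """Hypothetical rule of thumb: one hop more than the number of blocks."""
    if address_blocks < 1:
        raise ValueError("need at least one address block")
    return address_blocks + 1


def make_reply_socket(address_blocks: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets carry a small IP TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_TTL limits the number of router hops an outgoing packet survives.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL,
                    reply_ttl_for(address_blocks))
    return sock
```

Answers sent from such a socket are discarded by routers after that many
hops, so a distant stranger can send queries but never sees the replies.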

> Most of these "advanced" DNS things like RRL, RPZ and others aren't for
> the faint of heart.  Most people don't watch/monitor logs like those here.

RPZ is easier to use in common cases than a classic DNSBL and RRL is
even easier.  Operators have trouble only because they insist on
fiddling with knobs that they don't understand and don't need to.
Instead of copying the 4 line configuration from the RRL web page,
they read all of the documentation and set all of the knobs to crazy
values because they understand less than they realize.  When the glamour
of RRL and RPZ has worn off, users will treat them as boring black
boxes like DNSBLs in SpamAssassin, and most of the complaints will stop.

> I can't even get my vendors to fix their software bugs after months of saying
> "it breaks when I do X", and I pay you $X mm/year to service these, including
> the software updates.  Even for those advanced in the space, these things
> are difficult, unclear and fragile at best.

From my perspective with decades on the vendor side, most of the
problems are caused by users who insist on changing and controlling
things that they haven't had and never will have the time or inclination
to really understand.  Much of that lack of understanding comes
from hard to understand documentation such as mine for RRL and RPZ,
but the operational problems are caused by administrators who are incapable
of saying or even thinking "I don't know (and so should keep my mitts
off)", and so can't imagine or admit that `named -c/dev/null` might
work fine.

Talk about pointing-and-clicking IP TTLs to close resolvers is an
example.  The low level documentation and controls would necessarily
talk about "IP TTL", but the high level interface would be "on" and
"off", perhaps with "off" disabled or hidden.  The troubles would come
from users who don't understand, and think "Some is good, more is better,
and so set the TTL to 1" or "too small is bad, so set the TTL to 200".
(I'll scream if I have to argue with another user ignoring the suggested
RRL limit in favor of 10X or 100X.)

Your talk about pointing and clicking TTLs exemplifies that problem
of harmful obsessive knob fiddling.
I didn't mention 3 or 5 at random, but because a TTL >3 and <10
would fit more than 80% of installations and safely close the resolver.

> > More important, why do you ignore my point about required minimum
> > competence?  Long ago, you could buy an airplane and go into business

> I think the challenge here is that there is no true certifying authority for
> the industry.  You speak of state or possible federal (or transnational licenses)
> regimes of inspection, licensing, authorities, etc.  They don't exist here.

It is a political problem, and political problems are addressed
when enough of the punters demand solutions.

> > "Economics" in this century have nothing to do with where and when
> > local DNS caches are good or bad, necessary or useless.
>
> I think that's the point of the discussion though, "Should medium-sized
> companies run their own recursive resolver".  100 subscribers, with
> possibly 500+ devices behind it (n*iPhone+n*iPad+n*Android+n*TV+n*Appliance)
> are common these days.  You can easily assume that each person has at
> least *one* device,with average household sizes, etc.. and you're in the
> right range for 4-5+ devices per home at least.  So 500 unique stub
> resolvers is a large population, but at the same time, it's not enough to
> justify spending $1k for a server *2 + UPS.  That cost on my friends WISP
> would wipe some or all profits.  Perhaps he should charge more, but he
> doesn't have the knowledge or skill to operate things, but does know it
> might help to have a faster box on-net

That's yet another nonsensical and offensive (because it insults the
reader's intelligence) red herring.  That organization doubtless has
a firewall, often part of a router.  That firewall could easily run a
recursive resolver for those 500 stubs, because DNS requires only a
fraction of the resources needed for firewalling or routing.  The
configuration of that recursive resolver could be entirely automatic,
because the firewall has all of the parameters needed to configure a
closed verifying recursive resolver.
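
To make "entirely automatic" concrete, here is a minimal Python sketch of
how a firewall that already knows its inside prefixes could derive the
access list for a closed resolver.  The function names and the example
prefixes (documentation ranges) are hypothetical, and this covers only
the "closed" part, not DNSSEC validation.

```python
import ipaddress


def build_allow_list(lan_prefixes):
    """Turn the firewall's known inside prefixes into network objects."""
    return [ipaddress.ip_network(prefix) for prefix in lan_prefixes]


def may_recurse(client_ip, allow_list):
    """Answer recursive queries only for clients on an inside prefix."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in allow_list)


# Prefixes the firewall already has in its own interface configuration.
inside = build_allow_list(["192.0.2.0/24", "198.51.100.0/24"])
```

With that list, `may_recurse("192.0.2.17", inside)` is true and a query
arriving from the WAN side, say from 203.0.113.9, is refused without
anyone pointing or clicking.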

Yes, your Open Resolver project might have found that many borken open
DNS forwarders in CPE routers and bridges answer WAN requests.  The
fault for that lies with your big ISPs with those big, "creative" DNS
resolvers, because they chose the incompetent, below cost CPE vendors,
and shipped and supposedly continue to support and maintain that junk
CPE.  A dispassionate observer might see that as a compelling argument
against using more products from your big ISPs, such as their resolvers.

> > I am offended on behalf of those hypothetical IT professionals by your
> > persistent infantilizing them.  Attitudes like yours in ISPs are why
> > there there is so little BCP38 compliance and so many open resolvers.
>
> Ha!  BCP-38 is actually *hard*.  Customers don't know their IP
> ranges or domains they own/operate.

That's yet another irritating and offensive red herring.  As you
know, BCP 38 is done in the vicinity of routers.  Customers who
don't know their IP address blocks aren't running their own routers
and aren't responsible for BCP 38.  Even when customers are configuring
their own BGP, the primary responsibility for BCP 38 is upstream at the ISP.
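
For readers who have not looked at it, the check BCP 38 asks for at a
customer-facing port amounts to something like the following Python
sketch.  The table layout, interface names, and prefixes are made up for
illustration; a real router does this in its forwarding path with packet
filters or reverse-path checks, not in Python.

```python
import ipaddress

# Hypothetical table: customer-facing interface -> prefixes the ISP assigned.
ASSIGNED = {
    "cust-a": [ipaddress.ip_network("192.0.2.0/25")],
    "cust-b": [ipaddress.ip_network("198.51.100.0/24")],
}


def permit_source(interface, src_ip):
    """Accept a packet only if its source lies in the port's own prefixes."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in ASSIGNED.get(interface, []))
```

The ISP assigned the prefixes, so the ISP already has the table; the
customer does not need to know anything for the filter to work.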

> > If ISPs would refuse to route packets to customers that can't comply
> > with BCP38 or that run unnecessary open resolvers or open resolvers
> > unprotected by rate limiting, then a lot of problems would go away.
>
> I encourage you to start an ISP and report back your results :)
>
> I can either forward packets or filter them in many cases.  The damaging
> packets, even when there are large attacks are still such a small
> percentage of the overall traffic that I can't sacrifice them for the
> sake of others.  You've seen other providers back away from BCP-38 over
> the years.  I continue to strive in this direction and I resent the fact
> that you think we're all not trying to do it.

So an ISP's expectations of riches justify taxing the rest of the
Internet to deal with the ISP's abusive customers?
For years, many ISPs and their employees said the same things about
their spamming customers.  Those that didn't respond to spam reports
with "Just hit delete and stop bothering me sonny" demanded that I
donate my time to them (ISPs) by making spam reports.
Both tactics were, and were intended as, substitutes for dealing with
spamming customers.  ISPs whined about losing customers should they
outlaw outgoing spam.  Some such as MCI/UUNET spun lies about the
technical impossibility of counting outgoing TCP SYNs to port 25 as
reasons why MCI/UUNET could not find spam friendly resellers.
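
Counting outgoing SYNs to port 25 is not hard.  A minimal Python sketch,
assuming flow records are available as (source address, destination port,
TCP flags) tuples, which is a hypothetical format and not any particular
router's export:

```python
from collections import Counter

SYN = 0x02
ACK = 0x10


def count_smtp_syns(flows):
    """Count connection attempts to TCP port 25 per source address."""
    counts = Counter()
    for src_ip, dst_port, flags in flows:
        # A bare SYN (SYN set, ACK clear) marks a new outgoing connection.
        if dst_port == 25 and flags & SYN and not flags & ACK:
            counts[src_ip] += 1
    return counts
```

Sources at the top of `count_smtp_syns(flows).most_common()` are the ones
worth a second look; what threshold counts as abusive is a policy choice
left out of the sketch.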

Today the spam problem is largely solved, partly by better filters,
but also by ISPs discovering that ignoring spam is unprofitable.

> If you think you can build a better router that can do it right,
> please do.  The market needs it, but that's OT for this list.

That's yet another irritating and offensive red herring, because I
said nothing about building better routers and because better routers
are not needed for BCP 38.  What's needed is the equivalent of the spam
solution, de-peering ISPs that can't be bothered.

Vernon Schryver    vjs at rhyolite.com