[dns-operations] FreeBSD and the slaving of the root zone

Doug Barton dougb at dougbarton.us
Wed Aug 1 02:45:58 UTC 2007



Stephane Bortzmeyer wrote:
> Pierre Beyssac noticed that FreeBSD default configuration is now to
> slave the root zone from the root name servers who accept it:
> 
> http://www.freebsd.org/cgi/cvsweb.cgi/src/etc/namedb/named.conf
> 
> /*	Slaving the following zones from the root name servers has some
> 	significant advantages:
> 	1. Faster local resolution for your users
> 	2. No spurious traffic will be sent from your network to the roots
> 	3. Greater resilience to any potential root server failure/DDoS
> 
> 	If you do not wish to slave these zones from the root servers
> 	use the entry below instead.
> 	zone "." { type hint; file "named.root"; };
> */

Howdy,

I see this change has touched off quite a discussion, so I'd like to
respond. Rather than go piecemeal I'll try to do my part to keep the
conversation focused on the high points, and respond individually as
needed.

First, yes I am the one responsible for this change in the default
named.conf for FreeBSD 7 (soon to be 7.0-RELEASE) and 6-STABLE (soon
to be 6.3-RELEASE). Second, I'd like to address David's point about it
being rude to do so without prior coordination. If I have offended the
operators of B, C, F, G or K (or any of the others for that matter) I
apologize. That was certainly not my intention. If the operators of
those servers ask to have them removed I will do so at once.

My motivations for the change are listed in order of importance above.
Tony Finch already posted the link to David Malone's paper
(http://www.caida.org/projects/oarc/reading/entries/malone04slaves.xml),
I think it's worth reading for anyone who has a deep interest in this
topic.

There is another element of this that is not mentioned in the paper
but was alluded to in this thread by Jason Fesler (and others). Namely
the overall benefit of having one server slave the root zone and then
in turn slave it out to a network of resolvers. Whether the benefits
to a local resolver outweigh the costs is an open question, but I can
say that it has a lot of benefits on a busy network like Yahoo!'s
since I was the one who set this up when I was the DNS Administrator
there. Nice to hear that they are still seeing some benefit from it.

It's probably also worthwhile to put what this change means in a
larger context. This change is in the default named.conf, but named is
off by default in FreeBSD. Users have to take an affirmative step to
enable it, and they are of course able to make changes to named.conf
as they see fit. I included the instructions in the comments on how to
change back to a hint zone if they would prefer. And while I'd like to
think that there are probably "millions" of FreeBSD users out there,
only those who enable named and use the default conf file (or at least
this part of it) will actually be slaving the zones.
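For concreteness, the slaved configuration looks roughly like the
following. This is a sketch, not a verbatim copy of the FreeBSD file:
the masters list here is illustrative (the real named.conf lists each
of the root servers that permit AXFR), and the file path may differ.

```
zone "." {
	type slave;
	file "slave/root.slave";
	masters {
		192.5.5.241;	// F.ROOT-SERVERS.NET (illustrative)
		193.0.14.129;	// K.ROOT-SERVERS.NET (illustrative)
	};
	notify no;
};
```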

Before addressing the details I'd also like to point out that just on
this list, where there are plenty of people who have more knowledge
and experience than I do, the opinions are pretty sharply divided. To
me that indicates that the "right" answer probably needs more
discussion and experience to illuminate itself. So let's take a shot
at it ...

Yes, it is theoretically possible that all 5 roots that are currently
offering AXFR would stop, but I don't see that as likely. I certainly
pay pretty close attention to these things, as do many others in the
FreeBSD community whose names y'all might not know. Also, if you
accept the premise that the poor innocent users who will be bamboozled
by this change will be so by virtue of the fact that they are blindly
accepting config file changes in the base, then it follows that for
those that upgrade, the problem is self correcting. For those that
don't, I particularly like David Conrad's quote about the problem
being, "self-immolating, thus self-remedying."

Paul mentioned some questions he asked David Malone that he didn't
provide answers for. If those questions haven't been addressed in this
thread yet I'd be curious as to what they are. And FWIW, yes, David
mentioned to me that his paper wasn't received with universal acclaim.

I won't presume to be the last word on the effects of DDoS attacks on
the root, others here are more qualified, and have already spoken. I
will say however that it is an inevitable fact of life that attacks
will become more sophisticated, more overwhelming, and more successful
as time goes on. It is, and always has been, an arms race. Therefore
solutions that take away a point of vulnerability deserve serious
discussion. It's also worth pointing out that while these attacks have
not had much impact on the 'net as a whole, they are devastating to
those who are actually affected by them.

Leaving aside the question of spurious traffic to the roots for now, all
other things being equal the issue of faster responses for the local
clients, for both "good" and "bad" queries, is a good thing, and
probably worth it on that basis alone. This is even more true in the
scenario above where all the resolvers in a busy data center have a
local copy of the zone slaved from a "local master." I don't think
it's telling tales out of school to say that the Yahoo! mail resolvers
are well within worst case scenario territory in terms of both sheer
quantity and diversity of queries. When I was there they saw a
substantial increase in responsiveness with this change, especially
right after a cold start (which given how hard they got pounded
happened more often than I would have liked). Of course the relative
benefits to individual hosts will scale down accordingly, but the
question in my mind (as others in this thread have also said) is
definitely "when does this become a good idea?," not "if."
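The data center arrangement described above can be sketched as two
pieces of named.conf. The addresses and network here are hypothetical;
the idea is simply that one distribution host slaves the zone from the
roots, and each busy resolver in turn slaves it from that host:

```
// On the distribution master (hypothetical address 10.0.0.53):
// slave "." from the roots as in the default config, and
// permit transfers to the internal resolvers.
options {
	allow-transfer { 10.0.0.0/8; };
};

// On each internal resolver:
zone "." {
	type slave;
	file "slave/root.slave";
	masters { 10.0.0.53; };	// the local distribution master
	notify no;
};
```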

Along this line I think Peter Dambier had excellent things to say
about authoritative zones not being subject to cache poisoning, and
local networks with erratic connectivity. If I can expand on the last
bit (and sadly introduce some layer 9 issues), one of the things that
you hear regularly in the WSIS context are leaders of emerging nations
insisting that they must have a "root server" in their country.
Misunderstandings about what that means aside, encouraging people who
might have bad connectivity to the "outside world" but good
connectivity locally to reduce their dependence on that outside world
is probably a good thing, and might even stave off some of the
political drama.

I personally don't want to deal much with the privacy question since I
think that's too far off topic for this list. However, thinking again
about citizens in those emerging nations, the root server operators
will not be the only ones who might be interested in those queries,
even if we all agree that having the root ops capture that traffic,
and having them and others analyze it, is a good idea.

I think it's also worth noting that part of the heat (as well as the
light) surrounding this topic involves a clash between "older" and
"newer" Internet cultures, where "older" means "more centralized" and
"newer" means "more locally autonomous." On the off chance that anyone
is concerned about my reaction to Paul's vehement criticism of this
idea, rest assured that it comes as no surprise. :) I've had this
conversation with Paul going at least as far back as early 2002. I
didn't make this change lightly, and I certainly respect Paul's wisdom
and experience enough to know that he has a solid basis for his
opinions. I'm glad to hear that others on this list whose opinions I
also value have other ideas, and I'm glad that this question is
getting explored by the right people.

One of the questions that I don't know how to answer is that of
whether the "crap" traffic is an overwhelming problem (and therefore
worth trying to mitigate) or not. My read of the root operator
community is that on a good day they are fairly schizophrenic, both
individually (in some cases) and collectively (in many cases). :) That
is to say, back when AS112 was created the crap traffic to the roots
was considered to be a problem overwhelming enough to warrant throwing
non-trivial resources at. Now we have an excellent draft from Mark
Andrews
(ftp://ftp.ietf.org/internet-drafts/draft-ietf-dnsop-default-local-zones-02.txt)
which talks about the need to hard code empty zones into the named
binary to avoid having to add even more resources to that project. On
the other hand you have some operators on this list saying (pardon the
paraphrase), "yum, crap!" If the rootops (RSSAC?) want to come up with
a statement on this, I'll listen, and act accordingly. Otherwise, I
think it's good to keep talking about this issue, and whether and how
it's worth mitigating.
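The sort of empty zone the draft describes can of course already be
configured by hand. A minimal sketch for one RFC 1918 reverse tree
(the file name is arbitrary; the zone file itself contains only an SOA
and an NS record, so queries get NXDOMAIN locally instead of leaking
out):

```
zone "10.in-addr.arpa" {
	type master;
	file "master/empty.db";	// contains only SOA and NS records
	notify no;
};
```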

And now, a final burst of actual operational content. David Malone's
paper presented some interesting thoughts at the end about how this
idea could be an even bigger win. The status quo under which his
measurements were made is a 30 minute refresh interval, and
twice-daily increments of the zone even if only the serial is changed
(which is by far the common case historically). I think it's
reasonable to say that the SOA queries for refresh are far in the
noise. That leaves the question of not rev'ing the zone unless there
is an actual content change. I realize that any change to "how things
are done" is drama-inducing (believe me, I realize it). However given
that the actual distribution mechanism of the current root zone from
VeriSign to the rootops is more or less independent of what's actually
in the SOA, it seems reasonable to me that if there is a basis for
improving the efficiency (or reducing the impact) of this process that
it is entirely in the hands of the ICANN-VRSN-RootOp triumvirate.

Another big potential win pointed out in David's paper is the
possibility of using IXFR instead of AXFR. Even if we decide that
rev'ing the serial twice a day is still a good thing, an IXFR of that
change would hardly be measurable.
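On the slave side that preference is a one-line knob in BIND 9,
expressed per-master via a server statement (address illustrative);
whether the roots would actually answer IXFR is up to their operators:

```
server 192.5.5.241 {	// illustrative root server address
	request-ixfr yes;	// ask for incremental rather than full transfers
};
```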

Finally at the risk of being somewhat self serving, I'd like to quote
David Conrad:

> I think this is orthogonal, but https://ns.iana.org/dnssec/ 
> status.html  (:-)).  More seriously, it may be possible that  
> separating the zone publication from zone serving could result in  
> getting a signed root zone out more quickly (the rationale being that  
> in theory at least, you have to be slightly more DNS cognizant to be  
> able to set up a root slave and thus, would be willing to participate  
> in a root zone DNSSEC experiment).  I don't know if anyone has  
> seriously proposed something like this, but there might be a remote  
> chance it could fly...

Well, yes David, someone has proposed it. I know I did back when I
tried to get a serious conversation started about ICANN getting the
root signed (where serious means, "How can we get moving on this?"
not, "boy, this is a serious problem"), and I'm sure others have too.
If we can use this idea to gain leverage for a signed root, fantastic!
If others want to propose a dedicated [AI]XFR network with TSIG,
DNSSEC, PGP, what have you, that's good too! I hope it's clear by now
that I don't think this is the final solution. But if we can take some
more steps down the path, so much the better.

To anyone who has hung in this far, thanks, and I owe you a $BEVERAGE.

Doug

--

	If you're never wrong, you're not trying hard enough


