[dns-operations] Delegation checking (was: Re: Some DNSSEC trivia)
Ed.Lewis at neustar.biz
Thu Jan 10 14:54:46 UTC 2008
At 3:51 +0100 1/10/08, Michael Monnerie wrote:
>I've read all this thread with big interest, and there seems to be the
>side of techies who like to fix broken and bad things, and the lawyers
>who are concerned about contracts.
I know my responses, on-list and off, put me into the lawyers
category. But I have been through this work as an engineer and know
well why it is not worth the cost. I have also worked in TLD
registries and understand the tradeoffs made in resource allocation.
Engineers are trained to fix a problem at all costs. That's the way
I was brought up. I had to learn the hard way when it came time to
get the funds to work on a task "just to fix something" and
discovered that what I was taught as an engineer was wrong. A
problem is to be fixed only if it's worth fixing.
>Has anybody got estimations or real numbers about how many problems
>would be solved, how many domains would not meet requirements, and what
>this would help to save the planet? I read about "bad
>domains", "hackers", "poisoning" etc. but how many are there really?
When I last measured (probably 3 or 4 years ago), in a portion of the
reverse map, the rate of lame delegations ranged from 30% at
registries that cut the delegation before informing the registrant of
the zone name (unlike the forward map, the registrant likely cannot
guess the name ahead of registration) to under 10% at registries that
cut the delegation only after it was running.
After a pass at lameness checking, the 30% number dropped a bit. It
didn't get tracked over time, but when it dropped, it seemed to rise
again because new zones were being cut constantly. (Keep in mind
that there are many differences between the reverse and forward maps.)
To me that indicated that a check before delegation did have a big
impact on the rate of lameness. The impact was bigger than repeated
testing and nagging.
The reason I was asked (by the "customers" of the registry) to chase
down lameness was an old resolver that didn't realize a lame
delegation message was that and not a referral. I.e., the resolver
didn't recognize it was being referred back to the root or towards
the root. Instead of stopping, the resolver followed the message.
What was overlooked at the time was that a non-response did not
trigger this, yet non-response was also a focus of the checking.
But the problem that
launched the work was "true" lameness, where a server is not
authoritative for a zone it is thought to be authoritative for, not
By the time I put down my pencil on the problem, the broken resolver
had largely disappeared from the network and the ill effects of lame
delegations had faded into the noise below things like spam, botnets
and other wasted use of transmission energy. The policy has remained
on the books but may be removed for other reasons.
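
The distinction drawn above between "true" lameness and a plain
non-response can be sketched in a few lines. This is only an
illustration, not the checker that was actually run: it assumes the
query was for the zone apex (e.g. its SOA) and was sent directly to a
server listed in the zone's NS set, and it looks only at the 12-byte
header defined in RFC 1035. The function name and the category labels
are hypothetical.

```python
import struct
from typing import Optional

QR_BIT = 0x8000      # set in responses
AA_BIT = 0x0400      # set when the server answers with authority
RCODE_MASK = 0x000F  # low four bits of the flags word


def classify_reply(message: Optional[bytes]) -> str:
    """Classify what a listed name server sent back for the zone apex.

    `message` is the raw DNS reply, or None if the query timed out.
    """
    if message is None:
        # Silence is a different failure from lameness.
        return "no-response"
    if len(message) < 12:
        return "malformed"
    _id, flags, _qd, an, ns, _ar = struct.unpack("!HHHHHH", message[:12])
    if not flags & QR_BIT:
        return "malformed"
    if flags & AA_BIT:
        return "authoritative"
    rcode = flags & RCODE_MASK
    if rcode == 0 and an == 0 and ns > 0:
        # Non-authoritative, no answer, NS records in the authority
        # section: the server is referring the querier elsewhere
        # (possibly back toward the root) for a zone it was supposed
        # to serve.
        return "lame-referral"
    return "lame-other"
```

Only the "lame-referral" case is the one an old resolver could
mistake for a referral worth following; "no-response" never puts a
message in front of the resolver at all.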
>It would be interesting to see some agreement on what checks should
>exactly be done, and then just run these (silently, without any
>shutdowns or informational e-mails or whatever) - looks like there are
>enough people here that are able to do it to a respectable number of
I couldn't even settle on my own set of checks that should be done,
much less get a consensus. You can always get agreement on a set of
checks if you pare back to a few obvious ones, but then the sentiment
is that you are not aggressive enough and you aren't going to catch
enough of the broken delegations to make it worth the effort.
E.g., is a zone broken if one of the server delegations still works?
If one worked, then any resolver, including the broken ones,
eventually got there and the looping problem was squelched. I used to
label the choices as "save the Internet" and "save the DNS". The
former limited reporting to zones that were completely unreachable
and the latter reported every server problem.
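
How much the choice of denominator matters can be seen in a small
sketch. Everything here is made up for illustration: the zone and
server names, the status labels, and the function name are
hypothetical, and the input is assumed to be one status per
(zone, listed server) pair from some earlier check.

```python
from typing import Dict

GOOD = "authoritative"  # hypothetical label for a working delegation


def lameness_stats(results: Dict[str, Dict[str, str]]) -> Dict[str, float]:
    """Summarise per-server check results under both reporting policies."""
    zones = len(results)
    servers = sum(len(s) for s in results.values())
    bad_servers = sum(
        1 for s in results.values() for status in s.values() if status != GOOD
    )
    # "Save the Internet": only zones where no listed server works.
    unreachable = sum(
        1 for s in results.values() if all(st != GOOD for st in s.values())
    )
    # Zones that still resolve but have at least one broken server.
    partial = sum(
        1
        for s in results.values()
        if any(st == GOOD for st in s.values())
        and any(st != GOOD for st in s.values())
    )
    return {
        "pct_zones_unreachable": 100.0 * unreachable / zones if zones else 0.0,
        "pct_zones_partial": 100.0 * partial / zones if zones else 0.0,
        # "Save the DNS": every broken server delegation counts.
        "pct_servers_bad": 100.0 * bad_servers / servers if servers else 0.0,
    }


sample = {
    "2.0.192.in-addr.arpa": {
        "ns1.example.net": "authoritative",
        "ns2.example.net": "lame-referral",
    },
    "100.51.198.in-addr.arpa": {
        "ns1.example.org": "lame-referral",
        "ns2.example.org": "no-response",
    },
    "113.0.203.in-addr.arpa": {
        "ns1.example.com": "authoritative",
        "ns2.example.com": "authoritative",
    },
}
stats = lameness_stats(sample)
print(stats)
```

In this sample one zone in three is unreachable, yet half of the
listed servers are broken, so the same data supports two quite
different headline numbers.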
E.g., in statistics, do you quote percentages by zone, by NS record,
by name server addresses, unreachable zones, partially reachable
I don't mean to say you can't define the problem, but there are many
angles to it and probably there's no definition that will reach a
Edward Lewis +1-571-434-5468
Think glocally. Act confused.