[dns-operations] Additional information about the RIPE NCC reverse DNS issue
Doug Barton
dougb at dougbarton.email
Sun Mar 19 19:36:33 UTC 2017
On 03/19/2017 08:09 AM, Shane Kerr wrote:
> Doug,
>
> At 2017-03-18 18:34:25 -0700
> Doug Barton <dougb at dougbarton.email> wrote:
>
>> On 03/18/2017 08:46 AM, Anand Buddhdev wrote:
>>> Dear colleagues,
>>>
>>> This is a follow-up to our message of Friday about issues with some
>>> reverse delegations.
>>>
>>> After doing a thorough analysis, and with the help of ARIN staff, we
>>> found more issues with our zonelet generation code
>>
>> Can you say more about the benefit of this "zonelet" system vs. ARIN
>> simply delegating the appropriate zones to you, and you managing them
>> like any other DNS zone?
>>
>> I do appreciate you keeping the community informed about the causes of
>> the outage, but it seems that at least part of the root cause is that
>> you're operating what sounds like a fairly fragile system in the first
>> place, with (de facto) insufficient validity checking.
>
> I was at the RIPE NCC when we adopted the zonelet approach, although I
> haven't worked for them for over a decade.
>
> The zonelet system was designed to allow reverse DNS for IPv4 space
> that was originally assigned to one RIR but was later partially migrated
> to other RIRs. This happened when LACNIC and AfriNIC were formed,
> although I think that an audit was done at the time and so space was
> moved around between all 5 RIRs.
>
> The problem is that we could have a delegation like 999.in-addr.arpa
> going to the RIPE NCC and then 888.999.in-addr.arpa being managed by
> ARIN... but want 888.999.in-addr.arpa to point to the **address
> holder's** name servers, not **ARIN's** name servers.
>
> So ARIN needs a way to get the information about the name servers to
> the RIPE NCC somehow (and RIPE NCC to LACNIC, and so on). Zonelets are
> used for this, which is basically just the NS records needed, probably
> picked up using SSH.
>
> I think that we discussed using dynamic DNS (DDNS) for this at the
> time, but decided that the simplest & best solution was zonelets.
>
> DNAME could be used, but it would involve an extra lookup for
> resolvers, right? (DNAME was pretty new when zonelets were adopted, and
> I don't know that BIND 8 supported them, which was still the most
> popular DNS server at that time.)
>
> My guess is that the bugs are probably more due to ancient Perl code
> than an overly-complicated system for exchanging this information. Heck,
> it's possible that the bugs are due to MY ancient Perl code, although I
> really don't remember who wrote or tested the code....

Thank you, Shane, for the explanation, which makes perfect sense.

RIPE folks, the operational answer to this problem would seem to be
having ARIN implement a sanity check such that if more than N% of the
information is changed in a given pass, humans need to get involved
to approve the change. I had a lovely chat with John Curran about that
on NANOG, which you can see starting here:

https://mailman.nanog.org/pipermail/nanog/2017-March/090626.html

Short version, they won't do anything differently unless you
specifically ask them to.
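
A minimal sketch of such a "more than N% changed" check, in Python. The
function names, the 5% default threshold, and the representation of a pass
as a mapping from zone name to its set of NS names are all illustrative
assumptions, not anything ARIN or the RIPE NCC is known to run.

```python
from typing import Dict, FrozenSet

# Hypothetical representation: zone name -> set of name server names.
Delegations = Dict[str, FrozenSet[str]]


def changed_fraction(old: Delegations, new: Delegations) -> float:
    """Return the fraction of zones that were added, removed, or had
    their NS set altered between the previous pass and the new one."""
    all_zones = set(old) | set(new)
    if not all_zones:
        return 0.0
    # dict.get() returns None for a missing zone, so additions and
    # removals count as changes alongside modified NS sets.
    changed = sum(1 for zone in all_zones if old.get(zone) != new.get(zone))
    return changed / len(all_zones)


def needs_human_approval(old: Delegations, new: Delegations,
                         max_changed_pct: float = 5.0) -> bool:
    """True if more than max_changed_pct percent of zones changed,
    meaning the new data should be held until a person approves it."""
    return changed_fraction(old, new) * 100.0 > max_changed_pct
```

A caller would keep publishing the previous data whenever
needs_human_approval() returns True, and only switch to the new data once
someone has looked at the difference. The right threshold depends on how
much the data normally changes between passes.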

We all make mistakes, and I have no doubts that y'all have done your
best to find/fix the bugs that created the most recent problem. But I've
used similar sanity check systems in the past with good success.
Everyone makes mistakes, and there is no shame in a "belt and braces"
approach to critical infrastructure like this. I hope that you'll
consider it.

Doug