[dns-operations] Problems in AERO?

Alexander Gall gall at switch.ch
Thu Nov 23 19:40:35 UTC 2006


James,

On Thu, 23 Nov 2006 16:35:57 +0000, James Raftery <james at now.ie> said:

> We got a customer report of their AERO domain intermittently failing to
> resolve (and so some of their outbound email being rejected by recipient
> mail servers verifying their sender address).

> These nameservers for AERO:

> 	dns7.denic.de.		81.91.161.68
> 	ns5.knipp.de.		195.253.6.62
> 	merapi.switch.ch.	130.59.211.10, 2001:620::5

> were returning NXDOMAIN for the customer's domain while
> tld(1|2).aerodns.aero were returning the expected records.

> Those three were also giving NXDOMAIN for `tld1.aerodns.aero', one of the
> two other .AERO nameservers. It's fixed itself in the past few minutes.

I'm running the name server on merapi.switch.ch.  I don't know what
was going on either, but the setup for zone propagation is a bit
peculiar.  The three servers above appear to use different (hidden)
masters than tld{1,2}.aerodns.aero.  These two sets of masters are
usually out of sync, as they are while I'm typing this:

$ host -C aero.
Nameserver merapi.switch.ch:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116854 3600 1800 6048000 3600
Nameserver dns7.denic.de:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116854 3600 1800 6048000 3600
Nameserver ns5.knipp.de:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116854 3600 1800 6048000 3600
Nameserver tld1.aerodns.aero:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116861 3600 1800 6048000 3600
Nameserver tld2.aerodns.aero:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116862 3600 1800 6048000 3600
 
The zones sometimes differ by up to a few tens of generations.  I
don't know why UltraDNS is doing this, but it explains why the
secondaries form two equivalence classes.
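
A minimal sketch, in Python, of how the comparison above could be
automated: it parses `host -C` output and groups the name servers by
the SOA serial they report.  The function name group_by_serial and the
HOST_OUTPUT constant are made up for this illustration; the sample text
is the output pasted above, and the parsing assumes that exact layout.

```python
import re
from collections import defaultdict

# Sample input: the `host -C aero.` output quoted in this message.
HOST_OUTPUT = """\
Nameserver merapi.switch.ch:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116854 3600 1800 6048000 3600
Nameserver dns7.denic.de:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116854 3600 1800 6048000 3600
Nameserver ns5.knipp.de:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116854 3600 1800 6048000 3600
Nameserver tld1.aerodns.aero:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116861 3600 1800 6048000 3600
Nameserver tld2.aerodns.aero:
        aero has SOA record tld1.aerodns.aero. domadmin.ultradns.net. 2006116862 3600 1800 6048000 3600
"""


def group_by_serial(text):
    """Map each SOA serial to the list of name servers reporting it."""
    groups = defaultdict(list)
    server = None
    for line in text.splitlines():
        line = line.strip()
        # A "Nameserver <name>:" line names the server for the next SOA line.
        m = re.match(r"Nameserver (\S+?):$", line)
        if m:
            server = m.group(1)
            continue
        # SOA line: skip MNAME and RNAME, capture the serial.
        m = re.search(r"has SOA record \S+ \S+ (\d+) ", line)
        if m and server is not None:
            groups[int(m.group(1))].append(server)
            server = None
    return dict(groups)


if __name__ == "__main__":
    groups = group_by_serial(HOST_OUTPUT)
    for serial in sorted(groups):
        print(serial, ", ".join(groups[serial]))
    # Plain subtraction; ignores serial-number wraparound.
    print("spread:", max(groups) - min(groups))
```

Grouping strictly by serial yields three groups for this sample, since
tld1 and tld2 differ by one generation from each other.  The two
"classes" are really about which set of masters a server follows, so a
real check would probably want to tolerate small differences.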

However, what you observed doesn't look like the effect of slightly
different versions of the zone.  It could indicate that the masters
serving these three secondaries temporarily handed out a bogus zone.

We did in fact see a failed transfer due to a corrupted zone (BIND
complained that the "zone has no NS records") at around 15:06 UTC+1,
but I'm pretty sure the zone wasn't loaded.  Still, it shows that at
least one of the master servers did have a problem of some sort around
this time.

--
Alex



