[dns-operations] A strange DNS problem (intermittent SERVFAILs)

dagon dagon at sudo.sh
Sat May 30 18:50:53 UTC 2020

On Sat, May 30, 2020 at 06:09:24PM +0200, Stephane Bortzmeyer wrote:

> The existing DNS configuration is clearly very questionable, such as a
> zone delegated to just one name server, and a broken one, replying
> REFUSED for NS and SOA queries.

> The question is "how did this incorrect setup can produce *sometimes*
> a resolution failure?"


  1) I speculate some cloud resolvers like Cloudflare enforce RFC 1035's
     requirement that a valid zone ('master file') contain at least an
     SOA and NS, and therefore fail lookups in the zone.  This is a
     guess.

  2) And as you know, qname minimization (strict or flexible) should
     work here.  But perhaps Cloudflare has another RFC 7816-like
     policy that fails this zone?  This is also a complete guess.

     In writing intentionally bad DNS authorities, I've found the
     cloud DNS providers have batteries of checks (e.g., for ECS
     support, for child NS, for SOA, etc.).  Some like OpenDNS even
     probe each label level of lengthy qnames for zone cuts.

     The cloud providers promise safety improvements, so perhaps the
     missing SOA causes a failure, or they implemented RFC 7816
     differently.

  3) Perhaps an appliance is not passing SOA queries?  Even if cloud
     providers don't enforce the RFC 1035 'minimal master file'
     requirements, the bank's authority should.  How can you even load
     such a zone in a modern authority server?  All modern auth
     servers would fail, I believe.
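
For concreteness, here is a rough Python sketch of the two checks
guessed at in (1) and (2).  It is only an illustration under stated
assumptions: the function names and the example.* names are made up,
nothing here queries the network, and it is not a claim about how
Cloudflare, OpenDNS, or any authority server actually behaves.

```python
def _labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return [label for label in name.rstrip(".").lower().split(".") if label]


def minimized_qnames(qname, zone_apex):
    """Names a label-by-label (RFC 7816 style) resolver would ask about
    below zone_apex, adding one label per step until it reaches the
    full qname.  Hypothetical helper, illustration only."""
    q, z = _labels(qname), _labels(zone_apex)
    if q[len(q) - len(z):] != z:
        raise ValueError("qname is not at or below zone_apex")
    first = len(q) - len(z) - 1  # index of the first label below the apex
    return [".".join(q[i:]) + "." for i in range(first, -1, -1)]


def missing_apex_types(records, zone_apex):
    """Return which of SOA and NS are absent at the zone apex.

    records is an iterable of (owner, rtype) pairs, standing in for
    whatever a resolver or zone loader has seen.  Assumption: SOA and NS
    are the 'minimal master file' contents described in point (1)."""
    apex = _labels(zone_apex)
    present = {rtype.upper() for owner, rtype in records
               if _labels(owner) == apex}
    return sorted({"SOA", "NS"} - present)


if __name__ == "__main__":
    # Each intermediate name is a place where a REFUSED or otherwise
    # broken answer could turn into a resolution failure.
    for name in minimized_qnames("www.online.bank.example.", "example."):
        print(name)
    # An apex that only answers for A records is missing both types.
    print(missing_apex_types([("bank.example.", "A")], "bank.example."))
```

A resolver applying either check strictly would fail the zone, while
one that skips both would resolve it, which is one way the same broken
setup could fail only for some clients.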

Most malware domains do a better job of provisioning and
configuration.  Can this bank be helped?

David Dagon
dagon at sudo.sh
