[dns-operations] Domains delegated to blackhole consuming all recursive slots
Ondrej.Caletka at cesnet.cz
Mon Mar 24 15:46:11 UTC 2014
For a few weeks we have been seeing our recursive nameservers return
SERVFAILs intermittently. Judging from the logs, the default BIND quota
of 1000 concurrent recursions (recursive-clients) is being hit.
Analysing the output of `rndc recursing` shows that there are a lot of
queries like this:
; client 126.96.36.199#50037: 'cxqhupitwhmb.biantai666.cbi1.net'
; client 188.8.131.52#39096: 'ilydadqjobypmz.biantai666.cbi1.net'
; client 184.108.40.206#35832: 'kxszsnwtufqbob.biantai666.cbi1.net'
; client 220.127.116.11#33908: 'cropapebglizol.biantai666.cbi1.net'
; client 18.104.22.168#34537: 'qfsxyhmzedwlsd.biantai666.cbi1.net'
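To see which zones account for the stuck recursions, the dump can be
grouped by parent domain. The following is only a minimal sketch: it
assumes the line shape shown above, and the helper names (`parent_zone`,
`count_by_zone`) and the choice of keeping the last three labels are
illustrative assumptions, not anything BIND provides.

```python
import re
from collections import Counter

# Matches the line shape shown in the post: "; client <ip>#<port>: '<qname>'".
# Anything between the colon and the quoted name is skipped, in case a real
# dump carries extra fields there (an assumption, not verified here).
CLIENT_LINE = re.compile(
    r"^; client (?P<ip>[^#\s]+)#(?P<port>\d+):.*?'(?P<qname>[^'/]+)"
)


def parent_zone(qname, labels=3):
    """Return the last `labels` labels of a query name, lower-cased."""
    parts = qname.rstrip(".").lower().split(".")
    return ".".join(parts[-labels:])


def count_by_zone(lines, labels=3):
    """Count pending recursions per parent zone; non-matching lines are ignored."""
    counts = Counter()
    for line in lines:
        match = CLIENT_LINE.match(line.strip())
        if match:
            counts[parent_zone(match.group("qname"), labels)] += 1
    return counts


# The five lines quoted above, used as sample input.
sample = [
    "; client 126.96.36.199#50037: 'cxqhupitwhmb.biantai666.cbi1.net'",
    "; client 188.8.131.52#39096: 'ilydadqjobypmz.biantai666.cbi1.net'",
    "; client 184.108.40.206#35832: 'kxszsnwtufqbob.biantai666.cbi1.net'",
    "; client 220.127.116.11#33908: 'cropapebglizol.biantai666.cbi1.net'",
    "; client 18.104.22.168#34537: 'qfsxyhmzedwlsd.biantai666.cbi1.net'",
]

for zone, pending in count_by_zone(sample).most_common(5):
    print(f"{pending:5d}  {zone}")
```

In practice the input would be the lines of the file that
`rndc recursing` writes, rather than a hard-coded list.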
According to the recent thread "Sporadic but noticable SERVFAILs in
specific dual stack nodes in an anycast resolving farm", I assume that
this is some kind of C&C communication of a botnet. The problem is that
the authoritative servers responsible for such C&C domains are now
somehow blackholed at the IP level (or just choked by the amount of
traffic).
Because no answer ever arrives, each such query stays in the recursing
state for a very long time, and together they eventually fill the limit
of 1000 concurrent recursions. Increasing the limit is possible, but
there is a risk of hitting limits at another level (such as the number
of open file descriptors).
I'm now working around this by defining the zones with the longest
recursion times as authoritative with no data. But this has to be
checked manually to be sure no legitimate domain (like in-addr.arpa)
would be affected.
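The workaround above could be partly scripted. The sketch below renders
a named.conf stanza for each offending zone and refuses names under
reverse-lookup trees. It is a hypothetical helper, not part of BIND: the
protected-suffix list, the function names and the zone file name
`empty.db` (assumed to contain only an SOA and an NS record) are all
assumptions, and the guard does not replace the manual review.

```python
# Suffixes that must never be overridden. This list is an illustrative
# assumption; a real deployment would maintain its own.
PROTECTED_SUFFIXES = ("in-addr.arpa", "ip6.arpa")


def normalize(zone):
    """Lower-case a zone name and drop any trailing dot."""
    return zone.rstrip(".").lower()


def is_protected(zone):
    """True if the zone is, or lies under, one of the protected suffixes."""
    z = normalize(zone)
    return any(z == s or z.endswith("." + s) for s in PROTECTED_SUFFIXES)


def empty_zone_stanza(zone, zone_file="empty.db"):
    """Render a named.conf stanza that serves `zone` from an empty zone file."""
    if is_protected(zone):
        raise ValueError(f"refusing to override protected zone: {zone}")
    return f'zone "{normalize(zone)}" {{ type master; file "{zone_file}"; }};'


# Example: the zone seen in the rndc recursing output.
print(empty_zone_stanza("biantai666.cbi1.net"))
```

The generated stanzas would still need to be read by a human before
being included in the resolver configuration.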
There should be a better solution: something like a cache for
unreachable nameservers, so that a non-responding nameserver would be
considered dead for a couple of minutes.
Am I missing something?
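The idea of a cache for unreachable nameservers can be sketched as
follows. This is only an illustration of the concept under stated
assumptions, not how BIND or any other resolver implements it: the class
name `DeadServerCache`, the 300-second hold time and the injectable
clock are all hypothetical.

```python
import time


class DeadServerCache:
    """Remember non-responding nameservers for a fixed hold time."""

    def __init__(self, hold_seconds=300.0, clock=time.monotonic):
        self._hold = hold_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._dead_until = {}        # server address -> expiry timestamp

    def mark_dead(self, server):
        """Record that `server` timed out; it is skipped until the hold expires."""
        self._dead_until[server] = self._clock() + self._hold

    def is_dead(self, server):
        """True while the hold time for `server` has not yet expired."""
        expiry = self._dead_until.get(server)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Hold time is over: forget the entry so the server is retried.
            del self._dead_until[server]
            return False
        return True


# Usage: a resolver loop would check the cache before sending a query
# and fail fast instead of occupying a recursion slot.
cache = DeadServerCache(hold_seconds=300.0)
cache.mark_dead("192.0.2.1")         # documentation address, not a real server
if cache.is_dead("192.0.2.1"):
    print("skip 192.0.2.1, answer SERVFAIL immediately")
```

With such a cache, a query for a zone whose nameservers are all marked
dead could be answered with SERVFAIL at once, freeing the recursion
slot instead of holding it until every timeout has run out.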