Trouble resolving .by TLD due to circular dependency?

Sascha E. Pollok sp at iphh.net
Tue Jan 11 11:44:36 UTC 2022


Hello nice people,

for a few days I have been working on an issue we see with our Bind resolvers of
different versions when resolving names under .by. I assume it is not Bind's fault at
all but the result of a circular dependency in .by after a change of the Auth NS at the
beginning of January, but let me explain what I see.

Starting with an empty cache, I query a Root NS for .by and get back these Auth NS
records (original TTL: 2 days):

by.                     130511  IN      NS      dns1.tld.becloudby.com.
by.                     130511  IN      NS      dns3.tld.becloudby.com.
by.                     130511  IN      NS      dns2.tld.becloudby.com.
by.                     130511  IN      NS      dns5.tld.becloudby.com.
by.                     130511  IN      NS      dns4.tld.becloudby.com.

They come with Glue A records:

dns5.tld.becloudby.com. 172800  IN      A       54.180.35.203
dns4.tld.becloudby.com. 172800  IN      A       184.72.17.94
dns2.tld.becloudby.com. 172800  IN      A       93.125.25.73
dns1.tld.becloudby.com. 172800  IN      A       93.125.25.72
dns3.tld.becloudby.com. 172800  IN      A       185.98.83.4

The becloudby.com zone is served by two nameservers:

;; QUESTION SECTION:
;dns1.tld.becloudby.com.                IN      A

;; AUTHORITY SECTION:
becloudby.com.          172800  IN      NS      u1.hoster.by.
becloudby.com.          172800  IN      NS      u2.hoster.by.

To find the IP addresses of u1.hoster.by and u2.hoster.by, we are first referred to the
hoster.by nameservers:

;; QUESTION SECTION:
;u1.hoster.by.                  IN      A

;; AUTHORITY SECTION:
hoster.by.              3600    IN      NS      dns2.hoster.by.
hoster.by.              3600    IN      NS      dns1.hoster.by.

Asking one of them for the IP of u1.hoster.by:

;; QUESTION SECTION:
;u1.hoster.by.                  IN      A

;; ANSWER SECTION:
u1.hoster.by.           3600    IN      A       93.125.30.201

These nameservers are again under the .by TLD. If we query them for the IP of the .by TLD 
servers, we get a TTL of 600 and an IP:

~$ dig +norec dns1.tld.becloudby.com a @93.125.30.201   # Asking u1.hoster.by for dns1.tld.becloudby.com

;; ANSWER SECTION:
dns1.tld.becloudby.com. 600     IN      A       93.125.25.72

This answer has a TTL of 600 as opposed to the TTL that the Root NS gave us (2 days).
It overwrites the previously received Glue A record for dns1.tld.becloudby.com,
whose TTL is now 600.
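
The overwrite can be illustrated with a minimal Python sketch of a cache that ranks
data by trustworthiness, in the spirit of RFC 2181 section 5.4.1 (an authoritative
answer outranks glue from a referral). The class, the rank constants and the
timestamps are made up for illustration; this is not how Bind is actually implemented.

```python
from dataclasses import dataclass

# Hypothetical trust ranks for this sketch: a higher rank wins.
RANK_GLUE = 1
RANK_AUTH_ANSWER = 2


@dataclass
class Entry:
    value: str
    expires: float
    rank: int


class Cache:
    def __init__(self):
        self.entries = {}

    def store(self, name, value, ttl, rank, now):
        old = self.entries.get(name)
        # Keep the old entry only if it is still valid and more trusted.
        if old and old.expires > now and old.rank > rank:
            return
        self.entries[name] = Entry(value, now + ttl, rank)

    def lookup(self, name, now):
        entry = self.entries.get(name)
        return entry.value if entry and entry.expires > now else None


cache = Cache()
name = "dns1.tld.becloudby.com."
# Glue from the root referral, 2 days TTL.
cache.store(name, "93.125.25.72", 172800, RANK_GLUE, now=0)
# Authoritative answer from u1.hoster.by, 600s TTL, replaces the glue.
cache.store(name, "93.125.25.72", 600, RANK_AUTH_ANSWER, now=10)

print(cache.lookup(name, now=500))  # still cached
print(cache.lookup(name, now=700))  # expired: the 600s TTL applies, not 2 days
```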

If this cache entry expires, the following steps happen:

- We need to ask u1.hoster.by again for this IP address. This works because its TTL is
longer (3600s) and we still know its IP.
- Once the cache entry for u1.hoster.by also expires, we need to go back to
dns1.hoster.by and ask for the IP of u1.hoster.by. This entry also has a TTL of 3600s and
will expire sooner or later.
- Once dns1.hoster.by has expired, we still have the 2-day TTL NS entry for .by in our
cache, but without the glue, as the 2d glue records have been overwritten by the 600s
responses for dns1.tld.becloudby.com (reminder: the Auth NS for .BY).
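
The three steps above can be replayed as cache snapshots in a short Python sketch. It
assumes, as a simplification, one server per zone and that the resolver does not go
back to the Root NS while the NS set for .by is still cached. The `needs` map and the
function name are illustrative only.

```python
# To (re)fetch the address of a name, the resolver must ask this server.
needs = {
    "dns1.tld.becloudby.com.": "u1.hoster.by.",     # becloudby.com is served by u1.hoster.by
    "u1.hoster.by.": "dns1.hoster.by.",             # hoster.by is served by dns1.hoster.by
    "dns1.hoster.by.": "dns1.tld.becloudby.com.",   # hoster.by glue lives on the .by servers
}


def can_resolve(name, cached, seen=()):
    """Return True if an address for `name` is cached or can be re-fetched."""
    if name in cached:
        return True
    if name in seen:
        return False  # we came back to where we started: nothing left to ask
    return can_resolve(needs[name], cached, seen + (name,))


target = "dns1.tld.becloudby.com."

# Step 1: the 600s entry expired, u1.hoster.by is still cached.
print(can_resolve(target, cached={"u1.hoster.by."}))
# Step 2: u1.hoster.by expired too, dns1.hoster.by is still cached.
print(can_resolve(target, cached={"dns1.hoster.by."}))
# Step 3: all address records expired, only the .by NS set is left.
print(can_resolve(target, cached=set()))
```

Under these assumptions the first two snapshots are still resolvable and the third is
not, which corresponds to the SERVFAIL case.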

So what we have left in cache are NS entries for the .BY nameservers but without IP
addresses, which causes a SERVFAIL in Bind. It seems to me that there is a circular
dependency between those servers, which is not obvious to a lot of users if, e.g., their
resolvers use the glue parent-centrically (like Google's 8.8.8.8).
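
The suspected loop can also be checked statically from the delegation data. The
following Python sketch runs a depth-first search over a hand-written "zone depends on
zone" map built from the dig output above; the map and the function name are
assumptions for illustration, and root glue is deliberately left out to model the
state after it has been overwritten.

```python
# For each zone: the zones that must be resolvable to reach its nameservers.
deps = {
    "by.": ["becloudby.com."],                 # by. NS dns1..5.tld.becloudby.com.
    "becloudby.com.": ["hoster.by.", "com."],  # NS u1/u2.hoster.by., delegated from com.
    "hoster.by.": ["by."],                     # delegation and glue come from by.
    "com.": [],
}


def find_cycle(graph, start):
    """Return a list of zones forming a dependency cycle, or None."""
    path = []

    def visit(node):
        if node in path:
            return path[path.index(node):] + [node]
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        return None

    return visit(start)


print(find_cycle(deps, "by."))
# → ['by.', 'becloudby.com.', 'hoster.by.', 'by.']
```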

Does this analysis seem correct and are there maybe any .BY ccTLD people on this list to 
take a look at this? I have worked on this together with Anand Buddhdev so I want to thank 
him for working with me. Always a pleasure.

Thank you!

Cheers
Sascha



More information about the dns-operations mailing list