[dns-operations] Observing DNS Propagation Inconsistencies Across Global Resolvers

John Todd jtodd at loligo.com
Tue Mar 10 16:52:15 UTC 2026


That's disappointing. Thanks for the heads-up.

That's the kind of thing that causes significant disengagement 
tendencies, speaking for myself - I just wasted 10m talking to a bot 
that looked like a well-meaning but inexperienced newcomer to the 
community. I have no immediate ideas on how to prevent it, though.

JT


On 9 Mar 2026, at 21:23, Ondřej Surý wrote:

> John, this is just a SEO spammer replying to emails with AI slop 
> (copy&paste), see the similar exchange here: 
> https://mailman.ripe.net/archives/list/dns-wg@ripe.net/message/E4SENMXPYAF77X4WY3QOR3SRFWPIQGZB/
> --
> Ondřej Surý (He/Him)
>
> A gentle nudge is always appreciated if I take a little longer to 
> reply.
>
>> On 10. 3. 2026, at 1:01, John Todd <jtodd at loligo.com> wrote:
>
>>
>> 
>> Hi Vahid -
>> Thanks for the tool; always good to see more people building things 
>> for measurement. I'm with Quad9. A few 
>> notes/questions/clarifications:
>>
>> Quad9's "primary" should probably be 9.9.9.9 & 149.112.112.112 & 
>> 2620:fe::fe - you have us as 149.112.112.13, which is not even listed 
>> in our production set of resolvers. See 
>> https://quad9.net/service/service-addresses-and-features/ for 
>> details.
>>
>> Are you using IPv6 to connect to resolvers in your tests? That would 
>> probably give faster results in some cases.
>>
>> Given the latency figures at the bottom of the page, I'm guessing 
>> you're testing public resolvers from within a very specific 
>> testpoint. If you're only testing to public resolvers from one test 
>> point, the data is going to be very skewed and may give people quite 
>> non-intuitive (false) impressions. For instance: I'm fairly confident 
>> that Quad9 and Cloudflare are not >21ms away for the vast majority of 
>> the world's population, which is what the page seems to show across 
>> my repeated tests.
>>
>> The question mark icon beside each testing point does not work. What 
>> was it supposed to show?
>>
>> Are you performing DNSSEC validation on results? If DNSSEC fails, is 
>> there a special error type shown? Hopefully "yes".
>>
>> Your test seems to be focusing on recursive resolvers, but that is 
>> only half the problem set when you are trying to determine why a zone 
>> is not sync'ed. The first half of the problem is not covered by your 
>> tool, and that is "does the authoritative anycast array have the 
>> correct answers to give?" If you're just measuring what the recursive 
>> resolver says, you can't actually identify where the problem is. The 
>> only way to identify the actual problem (or at least a closer view, 
>> which will still be imperfect) is to have both the recursive resolver 
>> AND a direct query to the authoritative servers from the same 
>> location. A significant problem with zone synchronization is 
>> propagation between anycast instances on the authoritative side of 
>> the equation. This will be more visible to you with more testing 
>> across zones that have anycast authoritative servers.
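
The recursive-plus-authoritative comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `locate_lag`, the instance labels, and the idea of a known `expected` record set are all hypothetical, and exact set comparison only makes sense for static (non-CDN) answers. The answers are assumed to have been collected already, from the same vantage point, by direct queries to each authoritative instance and by queries to each recursive resolver.

```python
def locate_lag(expected, authoritative_answers, recursive_answers):
    """Classify where an outdated answer originates.

    expected: set of record values in the new zone data (assumed known).
    authoritative_answers: dict of instance label -> set of values returned
        by a direct query to that authoritative instance.
    recursive_answers: dict of resolver label -> set of values returned
        by that recursive resolver.
    """
    stale_auth = sorted(
        name for name, answer in authoritative_answers.items() if answer != expected
    )
    stale_rec = sorted(
        name for name, answer in recursive_answers.items() if answer != expected
    )
    # If any authoritative instance is still behind, resolver staleness
    # cannot be blamed on the resolvers themselves.
    if stale_auth:
        return "authoritative", stale_auth
    if stale_rec:
        return "recursive", stale_rec
    return "in-sync", []


# Hypothetical data: one anycast instance has not picked up the new record.
expected = {"192.0.2.10"}
auth = {"ns1-fra": {"192.0.2.10"}, "ns1-sin": {"192.0.2.1"}}
rec = {"9.9.9.9": {"192.0.2.1"}}
print(locate_lag(expected, auth, rec))
```

With only the recursive answers available, the two failure cases in this sketch would be indistinguishable, which is the limitation the paragraph above describes.
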
>>
>> I would expect that clicking on a city name would bring up the list 
>> for ALL recursive resolvers, and their answers from that perspective. 
>> Quad9, Google, Cloudflare, OpenDNS, and many others have enough 
>> anycast nodes that you would need to perform a many-to-many test for 
>> all queries in order to determine if results are lagging. Right now, 
>> a single test in each geography is not very meaningful. The point of 
>> having servers in remote geographies is to test both the 
>> authoritative AND recursive behaviors of large arrays (though of 
>> course you should still test more local resolver caches as well.)
>>
>> I hope you have some math in your system that accounts for TTL when 
>> triggering warnings for "out of sync".
>>
>> (probably the most important question) How do you even validate 
>> correctness of answers? CDNs may have 1000 possible different answers 
>> on a specific query, and which one you get varies depending on 
>> geography/time/cost/phase of moon/routing/etc. You may only see an A 
>> record once across your entire fleet of queries. How do you know this 
>> particular IP address is correct for that query, at that time, from 
>> that geography? In other words: what is the baseline against which 
>> you are testing "correctness"? You can't compare resolvers to each 
>> other, and unless you know the ground truth state of the zone from 
>> the authoritative's perspective (which may be programmatically 
>> generated) then you can't derive that a particular answer is valid or 
>> not. So I'm not sure what the tool provides except in the most 
>> simplistic cases where there is a non-dynamic answer for a 
>> qname/qtype. This may still be enough for some people to find useful 
>> but I suspect the people most concerned with zone update latency may 
>> not be using static zone data.
>>
>> To answer or clarify some background on your questions:
>>
>> "serve-stale" is a behavior some recursive resolvers utilize (Quad9 
>> does.) If authoritative servers are unreachable, the stale record may 
>> be served in lieu of failing. This behavior should be coupled with 
>> other indicators that something is wrong with the authoritative 
>> server, though those problems may not be visible from your testing 
>> vantage point. I don't know if that is the root cause of what you 
>> describe, but it exists.
>>
>> Can you provide some examples of this DS behavior?
>>
>> Again, some examples of CNAME flattening would be useful. Who does 
>> this? Do we? Is the answer correct in the A/AAAA result at the end, 
>> or is this "breaking" in some way that prevents expected client 
>> connection behavior?
>>
>> I'd say that expectations of "fully propagated" would be having 
>> records updated within ${now}-${last-ttl-value+2s} in normal 
>> conditions. Each resolver may get a query that holds the record for 
>> the TTL just before you ask, so you have to assume that the TTL needs 
>> to expire before you can trust that an answer has been refreshed. I 
>> put the caveat of "in normal conditions" due to serve-stale behavior. 
>> You also may not know "last-ttl-value" unless you're testing prior to 
>> change events and are certain that the TTL is fixed and not generated 
>> per-query, so that is another edge case that may catch you.
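
The TTL window described above can be written out as simple arithmetic. This is a minimal sketch, assuming the time of the zone change and the previous TTL are both known and fixed; the function names and the example values are hypothetical, and serve-stale behavior is deliberately not modelled.

```python
from datetime import datetime, timedelta, timezone


def earliest_trustworthy_time(change_time, last_ttl_seconds, slack_seconds=2):
    """First moment at which a cache filled just before the change must have expired."""
    return change_time + timedelta(seconds=last_ttl_seconds + slack_seconds)


def stale_answer_is_expected(now, change_time, last_ttl_seconds, slack_seconds=2):
    """True while an old answer may still legitimately be cached."""
    return now < earliest_trustworthy_time(change_time, last_ttl_seconds, slack_seconds)


# Hypothetical change event with a previous TTL of 300 seconds.
change = datetime(2026, 3, 2, 9, 0, 0, tzinfo=timezone.utc)
print(earliest_trustworthy_time(change, 300))
print(stale_answer_is_expected(change + timedelta(seconds=120), change, 300))
```

An "out of sync" warning raised while `stale_answer_is_expected` is still true would be a false alarm under these assumptions, since the resolver may simply be holding a record it cached just before the change.
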
>>
>> [note: I wrote this when I had a few unexpected minutes of time; 
>> replies will be very very slow if at all]
>>
>> JT
>>
>> On 2 Mar 2026, at 9:12, Vahid Shaik wrote:
>>
>>> Hello,
>>>
>>> I have been building and maintaining an open DNS propagation 
>>> monitoring platform at DNS Robot 
>>> (https://dnsrobot.net/) that queries 30+ 
>>> globally distributed DNS resolvers simultaneously to track record 
>>> propagation in real time.
>>>
>>> During development and ongoing monitoring, I have observed some 
>>> interesting inconsistencies in how different resolver 
>>> implementations handle TTL expiry and cache refresh behavior. 
>>> Specifically:
>>>
>>> Some major public resolvers (particularly in Asia-Pacific regions) 
>>> appear to serve stale records well beyond the configured TTL, 
>>> sometimes by 2-3x the expected duration.
>>>
>>> DNSSEC-signed zones occasionally show inconsistent DS record 
>>> propagation between parent and child zones during key rollovers, 
>>> visible when checking multiple resolvers within a short window.
>>>
>>> CNAME flattening behavior varies significantly — some resolvers 
>>> return the flattened A record while others return the CNAME chain, 
>>> which can cause confusion when debugging propagation.
>>>
>>> These observations come from real-time queries across resolvers in 
>>> 20+ countries. I am curious whether others on this list have 
>>> documented similar patterns, or if there are known resolver-specific 
>>> behaviours that explain these discrepancies.
>>>
>>> I would also welcome any feedback on best practices for measuring 
>>> propagation completeness — specifically, what threshold of global 
>>> resolver agreement should be considered "fully propagated" for 
>>> operational purposes.
>>>
>>> Best regards,
>>>
>>> Shaik Vahid
>>>
>>> DNS Robot: https://dnsrobot.net/
>>>
>>> Free DNS Propagation Checker & Network Tools
>>>
>>> dns-operations mailing list
>>> dns-operations at lists.dns-oarc.net
>>> https://lists.dns-oarc.net/mailman/listinfo/dns-operations
>>