[dns-operations] Question for fellow users of RIPE ATLAS (broken/saturated probes or what?)

Moritz Muller moritz.muller at sidn.nl
Tue Dec 11 12:45:43 UTC 2018


Very interesting finding.

I mentioned your observation on Twitter and got referred to two papers talking about this issue:

https://dl.acm.org/citation.cfm?doid=2805789.2805796
https://clarinet.u-strasbg.fr/~pelsser/publications/Holterbach-ripe-atlas-sharing-imc2015.pdf

The second one also describes how latency increases when the load on a V2 probe is high.
This could explain why the size of the effect varies so much across your measurements.
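
To check whether an inflated average comes from the older probes rather than from the network, one option is to compare summary statistics with and without the low probe IDs, as Jake did in the quoted message. Below is a minimal Python sketch of that comparison. It assumes the results have already been reduced to (probe_id, rtt_ms) pairs; the function names, the sample values, and the exact cutoff of 6000 are illustrative assumptions, not anything defined by RIPE Atlas.

```python
from statistics import mean, median

# Assumption: probe IDs below roughly 6000 are the older probes discussed
# in this thread. The cutoff is approximate, not an official boundary.
OLD_PROBE_ID_CUTOFF = 6000


def split_rtts_by_probe_id(results, cutoff=OLD_PROBE_ID_CUTOFF):
    """Split (probe_id, rtt_ms) pairs into (older, newer) RTT lists."""
    older, newer = [], []
    for probe_id, rtt_ms in results:
        (older if probe_id < cutoff else newer).append(rtt_ms)
    return older, newer


def summarize(rtts):
    """Return count, mean and median of a list of RTTs, or None if empty."""
    if not rtts:
        return None
    return {"count": len(rtts), "mean": mean(rtts), "median": median(rtts)}


# Made-up sample data, for illustration only.
sample = [
    (1234, 180.0), (2345, 95.0), (4321, 30.0),
    (10001, 24.0), (20002, 26.0), (30003, 25.0),
]

older, newer = split_rtts_by_probe_id(sample)
all_probes = summarize(older + newer)
newer_only = summarize(newer)

print("all probes:      ", all_probes)
print("probe IDs >= 6000:", newer_only)
print("ratio of means:   ", all_probes["mean"] / newer_only["mean"])
```

Reporting the median next to the mean is a deliberate choice here: a few heavily loaded probes can pull the mean up a lot while leaving the median almost unchanged, which helps separate probe artifacts from a real routing problem.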

- Moritz


> On 11 Dec 2018, at 10:13, Giovane Moura <giovane.moura at sidn.nl> wrote:
> 
> Hi Jake,
> 
> Thanks for pointing this out. We also monitor our authoritative name servers
> but I was not aware of this issue. This is particularly important when
> deciding where to deploy new anycast sites.
> 
> Say a country X observes 200 ms latency to one of your NSes. You'd need
> to break it down per probe version, as you did, and consider only probe
> IDs above ~6000 to be sure that this is not an artifact.
> 
> Monitoring systems that rely on the variation of RTT should be OK, at
> least in the aggregate.
> 
> thanks,
> 
> /giovane
> 
> 
> On 12/7/18 10:13 PM, Jake Zack wrote:
>> Hey all,
>> 
>> 
>> 
>> I often do DNS tests via RIPE ATLAS to confirm that changes and/or new
>> additions to our anycast network haven’t created any collateral damage
>> to our clouds and others.
>> 
>> 
>> 
>> I’ve noticed that the first several thousand probes (probe IDs < ~6000)
>> consistently return inaccurate/terrible/useless results to DNS tests.
>> 
>> 
>> 
>> Has anyone else noticed this?  Has anyone else reported it to RIPE
>> ATLAS?  Any theories?
>> 
>> 
>> 
>> I’ll first use DNS queries coming out of Belgium as an example…
>> 
>> 
>> 
>> My average response time from Belgium to ANY.CA-SERVERS.CA is 25.103 ms
>> when you exclude probe IDs < ~6000.  With those probes, it’s 83.944 ms,
>> more than three times as high?!
>> 
>> 
>> 
>> The worst part is that the ‘traceroute’ functionality never seems to
>> show this latency…so I feel powerless in fixing this brokenness from my end.
>> 
>> 
>> 
>> And it’s not just traffic from Belgium, to be clear…
>> 
>> 
>> 
>> My average response time from Ireland to ANY.CA-SERVERS.CA is 21.296 ms
>> when you exclude probe IDs < ~6000.  With those probes, it’s 33.507 ms.
>> 
>> My average response time from Netherlands to ANY.CA-SERVERS.CA is
>> 22.187 ms when you exclude probe IDs < ~6000.  With those probes, it’s
>> 77.280 ms.
>> 
>> My average response time from Poland to ANY.CA-SERVERS.CA is 43.910 ms
>> when you exclude probe IDs < ~6000.  With those probes, it’s 89.235 ms.
>> 
>> 
>> 
>> And it’s not just CIRA…
>> 
>> 
>> 
>> The average response time from Belgium to SNS-PB.ISC.ORG is 22.883 ms
>> when you exclude probe IDs < ~6000.  With those probes, it’s 33.169 ms.
>> 
>> The average response time from Belgium to F.ROOT-SERVERS.NET is
>> 15.094 ms when you exclude probe IDs < ~6000.  With those probes, it’s
>> 23.076 ms.
>> 
>> The average response time from Belgium to AC1.NSTLD.COM is 36.333 ms
>> when you exclude probe IDs < ~6000.  With those probes, it’s 44.900 ms.
>> 
>> The average response time from Belgium to A0.INFO.AFILIAS-NST.INFO is
>> 146.38 ms when you exclude probe IDs < ~6000.  With those probes, it’s
>> 155.70 ms.
>> 
>> The average response time from Belgium to X.NS.DNS.BE is 67.214 ms
>> when you exclude probe IDs < ~6000.  With those probes, it’s 83.836 ms.
>> 
>> 
>> 
>> I’m attaching some photos to visually show just how ineffective these
>> probes are at measuring anything related to DNS…
>> 
>> 
>> 
>> Confirmations on others seeing this, or reporting this, or moving away
>> from RIPE ATLAS for these measurements because of this?  Recommendations
>> other than ThousandEyes (I’m not interested in paying $100K/year for
>> what costs $500/year in VMs and Perl scripts).
>> 
>> 
>> 
>> Ideas on how to get this rectified so that these tests can be useful again?
>> 
>> 
>> 
>> -Jacob Zack
>> 
>> DNS Architect – CIRA (.CA TLD)
>> 
>> 
>> _______________________________________________
>> dns-operations mailing list
>> dns-operations at lists.dns-oarc.net
>> https://lists.dns-oarc.net/mailman/listinfo/dns-operations
>> 
> 
