[dns-operations] Question for fellow users of RIPE ATLAS (broken/saturated probes or what?)

Tue Dec 18 12:25:01 UTC 2018

Hi,

I've been looking into what could cause the big difference in reported
response times between V1/V2 probes and more modern probes.

My preliminary conclusion is that it is caused by recording the start
time of the measurement, for the purpose of computing the response time,
before all initialisation is complete.

This initialisation code involves creating the DNS request to send,
opening the socket, etc. It is just a bit of C code. However, the CPUs
in those probes are slow enough that a bit of code quickly adds up to
quite a bit of time. There is also a bit of debug output left in this
path, which doesn't help.
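
To illustrate the effect described above, here is a minimal sketch in
Python (not the actual probe firmware, which is C). All names in it
(FakeClock, recorded_response_time, init, exchange) are hypothetical,
and a fake clock stands in for the slow CPU so the result does not
depend on real hardware.

```python
import time


class FakeClock:
    """Deterministic stand-in for a real clock (hypothetical helper)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def recorded_response_time(init, exchange, clock=time.monotonic,
                           start_before_init=True):
    """Return the response time as the measurement code would record it."""
    if start_before_init:
        start = clock()   # start time taken before initialisation
        request = init()  # build the request, open the socket, etc.
    else:
        request = init()
        start = clock()   # start time taken just before sending
    exchange(request)     # send the request and wait for the response
    return clock() - start
```

If init takes an assumed 50 ms on a slow CPU and the network exchange
takes 20 ms, the first variant records about 70 ms and the second about
20 ms. These numbers are made up for illustration only.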

Hard to measure, but quite plausible, is that oneoff measurements are
delayed more because they run concurrently with periodic measurements.
I.e., oneoff measurements experience interference from periodic
measurements of which there are many. On average, periodic measurements
would not experience interference from oneoff measurements because
oneoff measurements are very infrequent.

Other effects, such as a delay in processing response packets when
another measurement is using CPU time, will also increase the recorded
response time.

V1 and V2 probes can be excluded from measurements using probe tags. V1
and V2 probes have the tags system-v1 and system-v2, respectively. To
exclude those probes from a measurement, add the tags to the exclude
list in the probe specification of the measurement.
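
As a rough sketch of what such a probe specification could look like,
the Python snippet below builds one entry with both tags in the exclude
list. Only the tag names come from this mail. The field names
("requested", "type", "value", "tags", "include", "exclude") reflect my
understanding of the measurement creation API and should be checked
against the current API documentation. The helper name and its default
values are hypothetical.

```python
import json


def probe_spec_excluding(exclude_tags, requested=10, area="WW"):
    """Build one probe specification entry that excludes probes by tag.

    Field names are an assumption about the API; verify before use.
    """
    return {
        "requested": requested,   # number of probes asked for
        "type": "area",           # assumed selection type
        "value": area,            # assumed value for a worldwide selection
        "tags": {"include": [], "exclude": list(exclude_tags)},
    }


spec = probe_spec_excluding(["system-v1", "system-v2"])
print(json.dumps(spec, indent=2))
```

The resulting entry would go in the list of probe specifications of the
measurement request.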

A few years ago, the Atlas team decided that deployment of new firmware
for V1 and V2 probes would be limited to fixing security issues. We are
currently evaluating whether or not to issue a new firmware release to
fix this issue.

Philip