[dns-operations] DNS server benchmarking sanity check

bert hubert bert.hubert at netherlabs.nl
Mon Aug 15 21:24:20 UTC 2016

On Mon, Aug 15, 2016 at 03:38:52PM -0400, Jared Mauch wrote:
> Moving from using send() to sendmsg() can help performance as well as
> other tuning, but these tend to be less portable and avoided by developers
> in my experience.  Simply put, this is the one case where I would still use

So here is something someone might want to try if we don't find time for it.

I experimented a bit with using the interface for retrieving errors for UDP
sockets.  Normally we connect() outgoing datagram sockets so we get errors
via recv().  This allows us to more quickly discover a UDP port 53 that is
closed.  But I wanted to see if we could make this work without connect().
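
For reference, a minimal sketch of the connect()ed case described above,
assuming Linux and Python 3. The function name probe_connected and the
one-byte probe payload are made up for illustration and are not taken from
PowerDNS.

```python
import socket


def probe_connected(host, port, timeout=1.0):
    """Send one datagram on a connect()ed UDP socket and classify the result."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    try:
        # connect() on a datagram socket only fixes the peer address, but it
        # also lets the kernel deliver ICMP errors to this socket.
        s.connect((host, port))
        s.send(b"\x00")
        try:
            s.recv(512)
            return "answered"
        except ConnectionRefusedError:
            # An ICMP port unreachable came back for our datagram.
            return "refused"
        except socket.timeout:
            return "timeout"
    finally:
        s.close()
```

Against a closed port on a host that sends ICMP port unreachable, this is
expected to return "refused" well before the timeout. The question here is
whether the same information is available without the connect() call.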

This actually works, and I was stunned to read in man recvmsg:

       MSG_ERRQUEUE (since Linux 2.2)
              This flag specifies that queued errors should be received from
              the socket error queue.  The error is passed in an ancillary
              message with a type dependent on the protocol (for IPv4
              IP_RECVERR).  The user should supply a buffer of sufficient
              size.  See cmsg(3) and ip(7) for more information.  The
              payload of the original packet that caused the error is passed
              as normal data via msg_iovec.  The original destination
              address of the datagram that caused the error is supplied via

"The payload of the original packet that caused the error". In order to do
this, the kernel must keep a copy of all recently sent UDP packets around,
if only to verify the authenticity of the ICMP responses. 
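
A rough sketch of reading the error queue on an unconnected socket, assuming
Linux and Python 3. The helper names (read_udp_error, errqueue_probe) are
hypothetical, the numeric fallbacks are the values from the Linux headers,
and the unpack format assumes the struct sock_extended_err layout documented
in ip(7).

```python
import errno
import select
import socket
import struct

# Linux-specific constants; fall back to the header values if this Python
# build does not export them.
IP_RECVERR = getattr(socket, "IP_RECVERR", 11)
MSG_ERRQUEUE = getattr(socket, "MSG_ERRQUEUE", 0x2000)


def read_udp_error(sock, timeout=1.0):
    """Wait for one queued error on sock; return it as a dict, or None."""
    poller = select.poll()
    poller.register(sock, select.POLLERR)
    if not poller.poll(int(timeout * 1000)):
        return None
    try:
        # Reads from the error queue never block, so an empty queue shows up
        # as BlockingIOError (EAGAIN).
        payload, ancdata, _flags, dest = sock.recvmsg(2048, 512, MSG_ERRQUEUE)
    except BlockingIOError:
        return None
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_RECVERR:
            # struct sock_extended_err: errno, origin, type, code, pad,
            # info, data (16 bytes in total).
            ee_errno, origin, icmp_type, icmp_code, _pad, _info, _data = \
                struct.unpack_from("=IBBBBII", cdata)
            return {
                "errno": ee_errno,
                "name": errno.errorcode.get(ee_errno, "?"),
                "origin": origin,
                "icmp_type": icmp_type,
                "icmp_code": icmp_code,
                "payload": payload,  # payload of the datagram that failed
                "dest": dest,        # where that datagram was sent
            }
    return None


def errqueue_probe(host, port, timeout=1.0):
    """Send one datagram WITHOUT connect() and report any queued error."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.setsockopt(socket.IPPROTO_IP, IP_RECVERR, 1)
        s.sendto(b"hello", (host, port))
        return read_udp_error(s, timeout)
    finally:
        s.close()
```

Called against a closed UDP port, errqueue_probe() should return a dict whose
"name" is ECONNREFUSED, or None if no ICMP error arrives within the timeout.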

A cursory reading of the Linux kernel appears to show that sending behaviour
is not dependent on the IP_RECVERR setsockopt, but I might be wrong.

If we combine this hypothesis with the observation that PowerDNS reliably
spends 50% of its time in the kernel on Linux, and the anecdote that I once
saw that accidentally not sending out answers radically dropped the CPU
load, the Linux kernel packet sending overhead may in fact not be due to
the dreaded "userland/kernel communications", but due to potentially
unasked for keeping of state for millions of packets in flight.
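
To get a feel for the user/kernel split on one's own machine, something like
the following could serve as a starting point. It is only a sketch, assuming
Linux or another Unix with the resource module. The function name
cpu_split_for_sends and the packet count are arbitrary, and Python's
interpreter overhead inflates the user share, so only relative changes
between runs mean anything.

```python
import resource
import socket


def cpu_split_for_sends(count=20000, payload=b"x" * 64):
    """Return (user_seconds, system_seconds) spent sending count datagrams."""
    # A bound but never-read socket on loopback acts as the sink, so no ICMP
    # errors are generated; excess datagrams are simply dropped.
    sink = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sink.bind(("127.0.0.1", 0))
    dest = sink.getsockname()
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        before = resource.getrusage(resource.RUSAGE_SELF)
        for _ in range(count):
            tx.sendto(payload, dest)
        after = resource.getrusage(resource.RUSAGE_SELF)
    finally:
        tx.close()
        sink.close()
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)
```

Running it once as is, and once with the sendto() call replaced by a no-op,
gives a crude version of the "not sending out answers" comparison.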

If this is true, actually sending raw packets which (presumably) are not
tracked would lead to far higher performance while losing UDP/ICMP
responses, but probably nobody wants these when responding to client
requests and benchmarks.
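
A sketch of what the raw-packet variant of the experiment might look like,
assuming Linux, IPv4 and CAP_NET_RAW (typically root). The function names
are made up for illustration, the UDP checksum is left at zero (which IPv4
permits), and whether this path really skips any per-packet tracking is
exactly the presumption that would need testing.

```python
import socket
import struct


def build_udp_segment(src_port, dst_port, payload):
    """Return a UDP header plus payload, with the checksum field set to 0."""
    length = 8 + len(payload)  # the UDP header is 8 bytes
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + payload


def send_raw_udp(dst_ip, src_port, dst_port, payload):
    """Send one UDP datagram through a raw socket; needs CAP_NET_RAW."""
    segment = build_udp_segment(src_port, dst_port, payload)
    # Without IP_HDRINCL the kernel still builds the IP header for us; we
    # only supply the UDP header and payload.
    with socket.socket(socket.AF_INET, socket.SOCK_RAW,
                       socket.IPPROTO_UDP) as raw:
        # The port in the address tuple is ignored for raw sockets.
        return raw.sendto(segment, (dst_ip, 0))
```

Without the required privilege, creating the raw socket raises
PermissionError, so the sending half can only be tried as root.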

A nice experiment if someone feels like it. If this turns out to be the
case, a Linux kernel change to not track ICMP for UDP if IP_RECVERR is not
set might be wise. Or conversely, maybe a setsockopt IP_FIREANDFORGET.


> FreeBSD > Linux.  Other than that, my personal pains w/ FreeBSD caused me
> to leave their ecosystem.
> - Jared
> _______________________________________________
> dns-operations mailing list
> dns-operations at lists.dns-oarc.net
> https://lists.dns-oarc.net/mailman/listinfo/dns-operations
