[dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)

Paul Vixie paul at redbarn.org
Fri Jul 4 22:04:10 UTC 2014



Roland Dobbins wrote:
> I know that some DNS operators disable logging of queries/responses due to the overhead of doing so - are most folks on this list with large-scale DNS recursive and/or authoritative DNS infrastructure disabling logging, enabling it, and/or logging queries/responses out-of-band via packet-capture taps, databases, etc.?

we've been using PCAP to do passive dns collection since 2008 or so, and
we've determined that it is unsuitable. the overhead of reassembling
fragmented IP datagrams is fairly low, but the overhead and complexity
of reassembling TCP streams is extreme -- no post-processor i know of is
willing to pick through a TCP stream inside a PCAP file, processing each
transaction.
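
to illustrate why TCP is the harder case: on TCP each DNS message is
preceded by a two-byte length field (RFC 1035, section 4.2.2), and a
message may straddle segment boundaries, so a collector has to rebuild
the byte stream before it can find message boundaries at all. a minimal
sketch in python, assuming the stream has already been reassembled and
put in order (the function name split_dns_tcp_stream is hypothetical):

```python
import struct


def split_dns_tcp_stream(stream: bytes):
    """Split an already-reassembled TCP byte stream into DNS messages.

    Each message is preceded by a 2-byte big-endian length
    (RFC 1035, section 4.2.2). Returns (messages, leftover), where
    leftover holds the bytes of a message that is not yet complete.
    """
    messages = []
    offset = 0
    while len(stream) - offset >= 2:
        (length,) = struct.unpack_from("!H", stream, offset)
        if len(stream) - offset - 2 < length:
            break  # the rest of this message is in a segment not yet seen
        messages.append(stream[offset + 2 : offset + 2 + length])
        offset += 2 + length
    return messages, stream[offset:]
```

the leftover bytes have to be carried forward per connection until the
next segment arrives, which is the state a PCAP post-processor would
need to keep for every open TCP session.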

our first attempt was NCAP, which was better in that it represented DNS
messages rather than IP packets. however this approach quickly showed
its age and naivety when we tried to express "orphan query, no response
seen" or "orphan response, no query seen" or "this is a query and
response pair".
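
the three cases above can be sketched as a small pairing routine. this
is only an illustration of the data model problem, not NCAP or NMSG
code: the Event and Transaction types, the matching key (client
address, client port, DNS message id) and the five second timeout are
all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    kind: str    # "query" or "response"
    key: tuple   # assumed match key: (client_ip, client_port, dns_id)
    time: float  # capture timestamp in seconds


@dataclass
class Transaction:
    query: Optional[Event]
    response: Optional[Event]

    @property
    def label(self) -> str:
        if self.query and self.response:
            return "query/response pair"
        if self.query:
            return "orphan query, no response seen"
        return "orphan response, no query seen"


def pair_events(events, timeout=5.0):
    """Match responses to earlier queries; report anything unmatched."""
    pending = {}  # key -> query still waiting for its response
    out = []
    for ev in sorted(events, key=lambda e: e.time):
        # Queries that waited longer than the timeout become orphans.
        expired = [k for k, q in pending.items() if ev.time - q.time > timeout]
        for k in expired:
            out.append(Transaction(pending.pop(k), None))
        if ev.kind == "query":
            old = pending.pop(ev.key, None)
            if old is not None:
                out.append(Transaction(old, None))  # superseded by a retry
            pending[ev.key] = ev
        else:
            # pop() returns None when no matching query was ever seen.
            out.append(Transaction(pending.pop(ev.key, None), ev))
    out.extend(Transaction(q, None) for q in pending.values())
    return out
```

a format that can only carry single messages has no natural place for
the Transaction record, which is the limitation described above.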

our next attempt was NMSG, which was better in that it was based on
google protocol buffers, which are extensible without forklift upgrades
to your producers and consumers, and which are extremely performant. our
PCAP sensor today generates NMSG output, which is then post-processed
into many intermediate real time channels and ultimately informs the
DNSDB passive DNS database. however, the reliance on PCAP turns out to
be a problem for appliances, and the problem of collecting TCP remains
difficult, and there are many events we care about which don't appear on
the wire at all (cache expiry vs. cache purge, for example).

therefore, 'dnstap', as presented by robert edmonds of farsight during
the recent dns-oarc meeting in warsaw. 'dnstap' output can be created by
PCAP feeds, but it is optimized for in-nameserver data collection, and
it supports many different message types, and is extensible, and very,
very performant. 'dnstap' tries hard to offload its CPU and memory costs
to other processes, and, unlike syslog(), is completely asynchronous -- no
queries are ever dropped because a 'dnstap' consumer couldn't keep up,
and thus the primary purpose of "name service" is not threatened by the
secondary purpose of "gather telemetry".
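
the general idea (telemetry may be lost, queries may not) can be
sketched with a bounded queue and a non-blocking put. this is not the
'dnstap' implementation; the TelemetryTap class, its capacity, and the
answer_query stand-in are hypothetical names used only for
illustration.

```python
import queue


class TelemetryTap:
    """Bounded, non-blocking hand-off from the query path to a consumer."""

    def __init__(self, capacity=1024):
        self._q = queue.Queue(maxsize=capacity)
        self.dropped = 0  # telemetry records discarded (single producer assumed)

    def emit(self, record) -> bool:
        """Called on the query path. Never blocks; drops the record if full."""
        try:
            self._q.put_nowait(record)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, limit=None):
        """Called by the consumer side, e.g. from a separate thread."""
        out = []
        while limit is None or len(out) < limit:
            try:
                out.append(self._q.get_nowait())
            except queue.Empty:
                break
        return out


def answer_query(tap, qname):
    """Stand-in for name service: the answer never waits on telemetry."""
    response = f"answer for {qname}"
    tap.emit(("query", qname))
    return response
```

with a capacity of 2 and a consumer that never drains, five calls to
answer_query still return five answers; only the telemetry records
beyond the first two are counted in tap.dropped.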

see https://dnstap.info/ for more details. so far this is live in knot
(including kdig, which can now read and generate 'dnstap' format
telemetry streams), and has been implemented on a development branch of
unbound. we hope to get it into BIND and NSD in the next few months. we
received an indication of no interest from powerdns, but i'm hoping bert
will relent later on.

dnstap is completely open source, with a permissive license (Apache 2.0).
it is sponsored by farsight because we need a uniform DNS telemetry
format for our business purposes. we are giving it away in order to make
the world better, primarily for our own products and customers, but by
necessary extension, better for everybody including our competitors.

vixie



