[dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)

bert hubert bert.hubert at netherlabs.nl
Mon Jul 7 08:52:38 UTC 2014


On Fri, Jul 04, 2014 at 03:04:10PM -0700, Paul Vixie wrote:
> 
> 
> Roland Dobbins wrote:
> > I know that some DNS operators disable logging of queries/responses due to the overhead of doing so - are most folks on this list with large-scale DNS recursive and/or authoritative DNS infrastructure disabling logging, enabling it, and/or logging queries/responses out-of-band via packet-capture taps, databases, etc.?
> 
> we've been using PCAP to do passive dns collection since 2008 or so, and
> we've determined that it is unsuitable. the overhead of reassembling
> fragmented IP datagrams is fairly low, but the overhead and complexity
> of reassembling TCP streams is extreme -- no post-processor i know is
> willing to pick through a TCP stream inside a PCAP file processing each
> transaction.

Paul, I've written many, many TCP/IP reassemblers and in fact the overhead is
trivial. Your kernel does it all the time, for example. The trick is to have
a limited window in which you do the reassembly, and not to scan over the
entire file. A kernel doesn't do that either.
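
To make the "limited window" idea concrete, here is a minimal Python sketch.
It is not code from any PowerDNS tool; the class name WindowedReassembler and
the max_pending parameter are made up for illustration. It assumes one
direction of a single TCP flow, ignores sequence number wraparound and
partially overlapping retransmits, and relies on the two-byte length prefix
that DNS uses over TCP.

```python
import struct


class WindowedReassembler:
    """One direction of one TCP flow, with a bounded out-of-order buffer.

    Hypothetical illustration: names and limits are assumptions.
    """

    def __init__(self, initial_seq, max_pending=32):
        self.next_seq = initial_seq     # next in-order byte we expect
        self.pending = {}               # seq -> payload, waiting for a gap to fill
        self.max_pending = max_pending  # the "limited window"
        self.stream = bytearray()       # in-order bytes not yet split into messages
        self.broken = False             # set once we give up on this flow

    def add_segment(self, seq, payload):
        """Feed one segment, return any DNS messages that are now complete."""
        # Simplification: anything starting before next_seq is treated as a
        # retransmit and dropped; partial overlaps are not handled.
        if self.broken or not payload or seq < self.next_seq:
            return []
        self.pending[seq] = payload
        if len(self.pending) > self.max_pending:
            # The gap never filled within the window: give up on the flow
            # instead of buffering (or rescanning) without bound.
            self.broken = True
            self.pending.clear()
            return []
        # Move every segment that is now contiguous into the stream.
        while self.next_seq in self.pending:
            data = self.pending.pop(self.next_seq)
            self.stream += data
            self.next_seq += len(data)
        return self._extract_messages()

    def _extract_messages(self):
        """Split the in-order stream on the 2-byte DNS-over-TCP length prefix."""
        messages = []
        while len(self.stream) >= 2:
            (length,) = struct.unpack("!H", self.stream[:2])
            if len(self.stream) < 2 + length:
                break  # message not complete yet
            messages.append(bytes(self.stream[2:2 + length]))
            del self.stream[:2 + length]
        return messages
```

State per flow is a small dictionary plus a byte buffer, and each segment is
touched once, which is why the cost stays low as long as the window is bounded.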

> telemetry streams), and has been implemented on a development branch of
> unbound. we hope to get it into BIND and NSD in the next few months. we
> received an indication of no interest from powerdns, but i'm hoping bert
> will relent later on.

Having said all that, it doesn't mean we aren't big fans of logging. But
people I know are also big fans of logging being separate from their
production servers, and this implies packets & reassembly. This is why we
have ample tooling in powerdns-tools to analyze packets. 

Packets also have the wonderful advantage that they represent what actually
happened and not what the nameserver THOUGHT happened.  For example, we've
been able to solve & debug many issues caused by malformed packets.
Such malformed packets would probably not have retained their unique
malformation after serialization to dnstap.

As another example, we've had cases in the past where our own logging showed
we were serving answers with low latency, yet packet traces showed
significant time between query and response packets.  The ultimate issue
turned out to be queueing before our listening socket.  Once we *got* the
packet we answered it quickly enough.  But we did not (and could not easily)
account for when the packet hit the server.

Our tool 'dnsscope' shows such statistics wonderfully.
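
For illustration only, the on-the-wire gap between query and response can be
measured from capture timestamps roughly as follows. This is a Python sketch
and not how dnsscope is implemented; the function name wire_latencies and the
input tuple layout (timestamp, source, destination, DNS payload) are
assumptions. It uses only the DNS header layout: a 16-bit ID followed by a
flags word whose top bit (QR) is set on responses.

```python
import struct


def wire_latencies(packets):
    """Match responses to queries and return the time between them.

    packets: iterable of (timestamp, src, dst, payload), where src and dst
    are (ip, port) tuples and payload is the raw DNS message. This input
    layout is a hypothetical choice for the sketch.
    """
    outstanding = {}  # (client, server, dns_id) -> timestamp of first query
    results = []
    for ts, src, dst, payload in packets:
        if len(payload) < 12:
            continue  # shorter than a DNS header, skip
        dns_id, flags = struct.unpack("!HH", payload[:4])
        if flags & 0x8000:
            # QR bit set: a response, so the client is the destination.
            key = (dst, src, dns_id)
            sent = outstanding.pop(key, None)
            if sent is not None:
                results.append((key, ts - sent))
        else:
            # A query: remember when it first appeared on the wire.
            # setdefault keeps the earliest timestamp if the client retries.
            outstanding.setdefault((src, dst, dns_id), ts)
    return results
```

Because the timestamps come from the capture point rather than from the
nameserver, time spent queued before the listening socket is included in the
result. Unanswered queries stay in the dictionary in this sketch; a real tool
would expire them.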

> dnstap is completely open source, with a BSD-style license (Apache 2.0).
> it is sponsored by farsight because we need a uniform DNS telemetry
> format for our business purposes. we are giving it away in order to make
> the world better, primarily for our own products and customers, but by
> necessary extension, better for everybody including our competitors.

It so happens that we now have the infrastructure to plug in arbitrary
modules at packet entry & exit, so we could perhaps do a dnstap
implementation there. Will keep you posted.

	Bert



