[dns-operations] Open source release of the DNS-STATS Compactor

Paul Vixie paul at redbarn.org
Thu Jun 22 18:13:29 UTC 2017


jim, THANK YOU. dnscap has dropped packets mysteriously on some linux 
kernels but not others, since day 1. i did not dig near as deeply as 
you've done (recounted below), but the symptoms are those you describe. 
i very much look forward to seeing the pcap and/or linux teams get to 
the bottom of this -- because as you point out, the select timeout ought 
not matter. we don't want a workaround; we need it fixed! --paul

re:

Jim Hague wrote:
...
> Anyway, I did a bit of digging. And, thanks largely to the magic of
> strace (ploughing through your 3000+ line single source file library
> wasn't going to help much), I've found the difference between the two
> programs that causes my test to drop ping packets and yours not.
>
> Both programs have a central loop that is basically:
>
>      pcap_dispatch()
>      select(pcapfd, timeout)
>
> For the select() timeout, you happen to (usually) use the same timeout
> given to pcap_set_timeout(). I set an arbitrary 2s timeout for the
> select. Result: I drop, you don't.
>
> If, on the other hand, I set the select() timeout to 1ms and also set a
> pcap timeout of 1ms, I behave like your program. And if I change the
> timeout on your select to 2s and specify a pcap timeout of 1ms, you drop
> just like me.
>
> It also turns out that if we leave the select() timeout at 2s, and set
> the pcap timeout to 2s as well, dropping doesn't (seem to) happen.
>
> I can see no reason that the select() timeout should matter here. That
> it doesn't matter with later kernels strongly suggests, to my mind, that
> it shouldn't, and that something bad is happening at the kernel level.
> I certainly can't find any docs suggesting that if your loop includes a
> select(), its timeout should be the same as any pcap timeout.
>
> So, in summary, on old Trusty kernels and current Jessie, if you are
> using a pcap loop with select(), set a pcap timeout and set the select()
> timeout to the same period. It looks like it makes things better; I
> can't be sure it cures the problem.
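
The workaround summarised above (one timeout value shared by
pcap_set_timeout() and select()) might be sketched as follows. This is
only an illustration, not code from dnscap or the compactor: the names
CAPTURE_TIMEOUT_MS, ms_to_timeval() and wait_readable() are hypothetical,
and the libpcap calls appear only in comments so that the fragment
depends on nothing beyond POSIX.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical single source of truth for both timeouts, in milliseconds. */
enum { CAPTURE_TIMEOUT_MS = 1 };

/* Convert milliseconds to the struct timeval that select() expects. */
struct timeval ms_to_timeval(int ms)
{
    struct timeval tv;
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return tv;
}

/*
 * Wait until fd is readable or timeout_ms has elapsed.
 * Returns what select() returns: >0 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv = ms_to_timeval(timeout_ms);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/*
 * Assumed shape of the capture loop (libpcap calls shown as comments only):
 *
 *     pcap_set_timeout(p, CAPTURE_TIMEOUT_MS);    // before pcap_activate()
 *     fd = pcap_get_selectable_fd(p);
 *     for (;;) {
 *         pcap_dispatch(p, -1, callback, NULL);
 *         wait_readable(fd, CAPTURE_TIMEOUT_MS);  // same period as above
 *     }
 */
```

Deriving both timeouts from one constant only keeps them from drifting
apart; as noted above, it is not known to cure the underlying drops.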

-- 
P Vixie
