[Collisions] Datasets to use for older years

Roy Hooper roy at demandmedia.com
Thu Sep 26 19:49:33 UTC 2013


Hi Kevin,

I didn't dig into that.  When I did a first pass against CLEAN, the preliminary numbers were noticeably lower than those in the Interisle report, so I reran the analysis against RAW, which produced very similar numbers.  Due to time constraints, I was never able to go back and quantify the reasons for the differences.

-Roy

Sent from my iPhone

On Sep 26, 2013, at 3:19 PM, "Kevin White" <kwhite at jasadvisors.com> wrote:

Roy, what data did you find missing in CLEAN?

I’m really conflicted now on what the “right” way to look at this data is.

Kevin

From: collisions-bounces at lists.dns-oarc.net [mailto:collisions-bounces at lists.dns-oarc.net] On Behalf Of Roy Hooper
Sent: Thursday, September 26, 2013 3:16 PM
To: Wessels, Duane
Cc: collisions at lists.dns-oarc.net
Subject: Re: [Collisions] Datasets to use for older years


Thanks for the clarification on how the CLEAN data sets differ from RAW, Duane!

The CLEAN directories sound like a much better data set to work from for any analysis that is not comparing directly against the Interisle report.


-Roy

On 2013-09-26, at 3:02 PM, "Wessels, Duane" <dwessels at verisign.com> wrote:


You should be able to find CLEAN data for every year, if you prefer that.
FYI the clean data differs from the raw data in the following ways:

  - Files begin and end on consistent time boundaries.
  - All pcap files have the same datalink type.
  - Any non-root traffic is removed.
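
For anyone working from RAW, the datalink point can be checked before analysis. The following is a minimal sketch, not part of the DITL tooling: it assumes the files are uncompressed classic pcap (not pcapng), and the function name pcap_linktype is hypothetical. It reads the datalink type from the 24-byte pcap global header so that files with differing types can be spotted.

```python
import struct

# Classic pcap global header layout (24 bytes): magic, version major/minor,
# thiszone, sigfigs, snaplen, linktype. The linktype is the last 4 bytes.
PCAP_HEADER_LEN = 24

def pcap_linktype(header: bytes) -> int:
    """Return the datalink type stored in a classic pcap global header.

    Hypothetical helper: expects the first 24 bytes of an uncompressed
    pcap file. Compressed files would need to be decompressed first.
    """
    if len(header) < PCAP_HEADER_LEN:
        raise ValueError("truncated pcap header")
    magic = header[:4]
    # The magic number tells us the byte order used by the writer.
    # Both the microsecond and nanosecond timestamp variants are accepted.
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    return struct.unpack(endian + "I", header[20:24])[0]
```

Collecting the return value for every file in a directory into a set gives a quick test: more than one element means the datalink types are mixed.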

If you prefer to work with the RAW data, make sure you pay attention to the
server's IP address during the analysis.  In some cases non-root traffic
has been known to "leak" into the DITL data.  Also, there were times when
J-root data was uploaded as A-root, and perhaps vice versa.
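
The server-address check described above could be sketched as follows. This is only an illustration under stated assumptions: the table lists the well-known IPv4 addresses of A-root and J-root, which should be verified (and extended with IPv6 and the other letters) for the year being analysed, and the names ROOT_SERVER_ADDRS and classify_query are hypothetical. For queries the server is the destination address; for responses it would be the source.

```python
import ipaddress

# Assumed server addresses (IPv4 only). Verify these against the
# collection year before relying on them.
ROOT_SERVER_ADDRS = {
    "a": {ipaddress.ip_address("198.41.0.4")},
    "j": {ipaddress.ip_address("192.58.128.30")},
}

def classify_query(dst_ip: str, labelled_root: str) -> str:
    """Compare a query's destination address with the root letter
    the capture file was uploaded under (hypothetical helper).

    Returns "ok" if the address belongs to the labelled root,
    "mislabelled:<letter>" if it belongs to a different known root,
    and "non-root" if it matches no address in the table.
    """
    dst = ipaddress.ip_address(dst_ip)
    if dst in ROOT_SERVER_ADDRS.get(labelled_root, set()):
        return "ok"
    for letter, addrs in ROOT_SERVER_ADDRS.items():
        if dst in addrs:
            return "mislabelled:" + letter
    return "non-root"
```

Counting the three outcomes per file would show whether a RAW upload contains leaked non-root traffic or was filed under the wrong root letter.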

DW


________________________________
Please NOTE: This electronic message, including any attachments, may include privileged, confidential and/or inside information owned by Demand Media, Inc. Any distribution or use of this communication by anyone other than the intended recipient(s) is strictly prohibited and may be unlawful. If you are not the intended recipient, please notify the sender by replying to this message and then delete it from your system. Thank you.
