[Collisions] Datasets to use for older years

Kevin White kwhite at jasadvisors.com
Thu Sep 26 19:02:24 UTC 2013

OK, so the contention is that the older sets, which don't have the
?-root folders in their RAW directories, would need to be processed
somehow to extract the root-only queries.


This still leaves me wondering about the ripe/k-root problem.  I'm
concerned that the folders, since they are grouped by organization, may
contain more than just the root queries.


Do we know who we can ask about this?






From: Roy Hooper [mailto:roy at demandmedia.com] 
Sent: Thursday, September 26, 2013 3:01 PM
To: Kevin White
Cc: collisions at lists.dns-oarc.net
Subject: Re: [Collisions] Datasets to use for older years


I meant to say CLEAN didn't include them, but RAW did!  Sorry about the




On 2013-09-26, at 2:40 PM, Roy Hooper <roy at demandmedia.com> wrote:

I found that RAW didn't include many of the bad queries that we needed to


The only org-to-root-name mapping I recall off the top of my head is
ripe => k-root, which affected 2013 and 2012, as it was not renamed like
the rest.
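
A minimal sketch of how that mapping could be applied when walking the
folders. Only the ripe => k-root entry comes from this thread; the
%org_to_root hash and the root_for_folder helper are hypothetical names,
and the sketch assumes root folders are named "<letter>-root" with letters
a through m.

```perl
use strict;
use warnings;

# Hypothetical map from organization-named folders to root names.
# Only the ripe => k-root entry is taken from the thread.
my %org_to_root = (
    'ripe' => 'k-root',
);

# Return the root name for a folder: the mapped name if the folder is a
# known organization, the folder name itself if it already looks like
# "<letter>-root", and undef otherwise.
sub root_for_folder {
    my ($folder) = @_;
    return $org_to_root{$folder} if exists $org_to_root{$folder};
    return $folder if $folder =~ /^[a-m]-root$/;
    return;
}
```

Any other organization-named folders would need their own entries in the
map before they could be treated as root data.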




On 2013-09-26, at 2:30 PM, Kevin White <kwhite at jasadvisors.com> wrote:

I'm doing some work that may involve going back in time farther than we did
originally.  So, I'm trying to make a map of which files to use from the
start.  Here is what I have:


# Dataset directory to read for each DITL year.
my %years = (
    '2013' => '/mnt/oarc-pool2/DITL-20130528/RAW',
    '2012' => '/mnt/oarc-pool4/DITL-20120417/RAW',
    '2011' => '/mnt/oarc-pool4/DITL-20110412/RAW',
    '2010' => '/mnt/oarc-pool4/DITL-20100413/RAW',
    '2009' => '/mnt/oarc-pool4/DITL-200903/CLEAN-ROOTS',
    '2008' => '/mnt/oarc-pool4/DITL-200803/CLEAN',
    '2007' => '/mnt/oarc-pool3/DITL-200701/CLEAN',
    '2006' => '/mnt/oarc-pool3/DITL-200601/CLEAN',
);



The first 4 are good: each of those has, in the "RAW" folder, folders for
each root that participated, as well as folders for other participants.


The final 4 seem to have the RAW files grouped by organization.  I had to
move into the folders listed there to actually find folders grouped by root.
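
One way to check which of these directories actually contain per-root
folders is sketched below. It is only an illustration: root_folders is a
hypothetical helper, and it assumes the per-root folders are immediate
subfolders named "<letter>-root" with letters a through m.

```perl
use strict;
use warnings;

# List the immediate subfolders of $dir that are named like "<letter>-root".
# Organization-named folders (for example "ripe") are not matched here.
sub root_folders {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @roots = sort grep { /^[a-m]-root$/ && -d "$dir/$_" } readdir($dh);
    closedir($dh);
    return @roots;
}
```

Calling root_folders on each path in the map and printing the result per
year would show which years have folders grouped by root and which need a
different starting directory.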


Is that right?  Is it safe to use those CLEAN (and CLEAN-ROOTS) folders to
get the same kinds of data we used for 2013 and 2012, queries just to the










