[Collisions] Datasets to use for older years

Roy Hooper roy at demandmedia.com
Thu Sep 26 19:06:27 UTC 2013


Kevin,

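Judging from the contents of the directories, the 2011 ripe directory is not root data (it holds as112 and the f.in-addr-servers.arpa / f.ip6-servers.arpa hosts, not f-root), but the k-root directory is.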


[roy@an3 ~]$ ls /mnt/oarc-pool4/DITL-20110412/RAW/ripe/ditl/
as112 f.ip6-servers.arpa-ams1
f.in-addr-servers.arpa-ams1 f.ip6-servers.arpa-ams2
f.in-addr-servers.arpa-ams2 f.ip6-servers.arpa-ams3
f.in-addr-servers.arpa-ams3 f.ip6-servers.arpa-lon1
f.in-addr-servers.arpa-lon1 f.ip6-servers.arpa-lon2
f.in-addr-servers.arpa-lon2 f.ip6-servers.arpa-lon3
f.in-addr-servers.arpa-lon3

[roy@an3 ~]$ ls /mnt/oarc-pool4/DITL-20110412/RAW/k-root/ditl/
ams-ix k1.ficix k1.tix k2.ficix k2.qtel
denic k1.grnet k2.apnic k2.grnet k2.tix
k1.apnic k1.isnic k2.bix k2.isnic linx
k1.bix k1.mix k2.cern k2.mix nap
k1.cern k1.nskix k2.delhi k2.nskix tokyo
k1.delhi k1.qtel k2.emix k2.poznan


For 2013, we probably should have excluded server.as112.ripe.net from the total query counts; however, it was only 11 GiB of compressed data.  The queries going there were bad or for private IP spaces and thus were all PTR queries in in-addr.arpa.

[roy@an3 ~]$ ls /mnt/oarc-pool2/DITL-20130528/RAW/ripe/ditl/
k1.ams-ix.k.ripe.net k1.tokyo.k.ripe.net k2.nap.k.ripe.net
k1.cern.k.ripe.net k2.ams-ix.k.ripe.net k2.nskix.k.ripe.net
k1.denic.k.ripe.net k2.bix.k.ripe.net k2.poznan.k.ripe.net
k1.emix.k.ripe.net k2.cern.k.ripe.net k2.qtel.k.ripe.net
k1.ficix.k.ripe.net k2.delhi.k.ripe.net k2.tokyo.k.ripe.net
k1.grnet.k.ripe.net k2.denic.k.ripe.net k3.ams-ix.k.ripe.net
k1.isnic.k.ripe.net k2.emix.k.ripe.net k3.denic.k.ripe.net
k1.linx.k.ripe.net k2.ficix.k.ripe.net k3.linx.k.ripe.net
k1.mix.k.ripe.net k2.grnet.k.ripe.net k3.nap.k.ripe.net
k1.nap.k.ripe.net k2.isnic.k.ripe.net k3.tokyo.k.ripe.net
k1.nskix.k.ripe.net k2.linx.k.ripe.net server.as112.ripe.net
k1.qtel.k.ripe.net k2.mix.k.ripe.net
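
Excluding the AS112 node from the 2013 counts could be sketched roughly as below. This is only an illustration, written in Python rather than the Perl used elsewhere in this thread. The function names are hypothetical, and the filter assumes that k-root nodes are exactly the entries whose names end in .k.ripe.net, as in the listing above.

```python
# Hypothetical helper names; the only fact taken from the listing is that
# k-root node directories end in ".k.ripe.net" while the AS112 node is
# named "server.as112.ripe.net".
K_ROOT_SUFFIX = ".k.ripe.net"


def is_k_root_node(dirname):
    """Return True for names like 'k1.ams-ix.k.ripe.net'."""
    return dirname.endswith(K_ROOT_SUFFIX)


def k_root_nodes(dirnames):
    """Keep only k-root node directories, preserving input order."""
    return [d for d in dirnames if is_k_root_node(d)]


# A few entries copied from the 2013 ripe/ditl listing.
sample = [
    "k1.ams-ix.k.ripe.net",
    "k3.tokyo.k.ripe.net",
    "server.as112.ripe.net",
    "k2.mix.k.ripe.net",
]
kept = k_root_nodes(sample)
```

With the sample above, kept holds the three k*.k.ripe.net entries and drops server.as112.ripe.net.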

For 2012, ripe was only k-root:

[roy@an3 ~]$ ls /mnt/oarc-pool4/DITL-20120417/RAW/ripe/ditl/
k1.ams-ix.k.ripe.net k1.qtel.k.ripe.net k2.linx.k.ripe.net
k1.apnic.k.ripe.net k1.tokyo.k.ripe.net k2.mix.k.ripe.net
k1.cern.k.ripe.net k2.ams-ix.k.ripe.net k2.nskix.k.ripe.net
k1.denic.k.ripe.net k2.apnic.k.ripe.net k2.poznan.k.ripe.net
k1.emix.k.ripe.net k2.bix.k.ripe.net k2.qtel.k.ripe.net
k1.ficix.k.ripe.net k2.cern.k.ripe.net k2.tokyo.k.ripe.net
k1.grnet.k.ripe.net k2.delhi.k.ripe.net k3.ams-ix.k.ripe.net
k1.isnic.k.ripe.net k2.denic.k.ripe.net k3.denic.k.ripe.net
k1.linx.k.ripe.net k2.emix.k.ripe.net k3.linx.k.ripe.net
k1.mix.k.ripe.net k2.ficix.k.ripe.net k3.nap.k.ripe.net
k1.nap.k.ripe.net k2.grnet.k.ripe.net k3.tokyo.k.ripe.net
k1.nskix.k.ripe.net k2.isnic.k.ripe.net
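
Putting the three listings together, the per-year location of the k-root data might be captured as in the sketch below (Python again, not the Perl of the original map). The paths are the ones shown in the ls commands in this message. The names K_ROOT_DIRS, k_root_dir and list_nodes are made up for illustration, and years before 2011 are left out because their layout is not confirmed here.

```python
import os

# Where the k-root captures live for each year, per the listings in this
# message: under "ripe" for 2012 and 2013, under "k-root" for 2011.
K_ROOT_DIRS = {
    "2013": "/mnt/oarc-pool2/DITL-20130528/RAW/ripe/ditl",
    "2012": "/mnt/oarc-pool4/DITL-20120417/RAW/ripe/ditl",
    "2011": "/mnt/oarc-pool4/DITL-20110412/RAW/k-root/ditl",
}


def k_root_dir(year):
    """Return the k-root directory for a year; KeyError if not mapped."""
    return K_ROOT_DIRS[str(year)]


def list_nodes(year, listdir=os.listdir):
    """List node directories for a year, sorted by name.

    listdir is injectable so the function can be tried without the
    /mnt/oarc-pool* mounts being present.
    """
    return sorted(listdir(k_root_dir(year)))
```

Note that for 2013 the directory also contains server.as112.ripe.net, so a caller that wants root-only counts would still need to filter that entry out.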





-Roy

On 2013-09-26, at 2:53 PM, Kevin White <kwhite at jasadvisors.com> wrote:

What concerns me about that is that, if you look at 2011, there is both a k-root and a ripe directory.  So, is the “ripe” data in 2012 and 2013 a combination of ripe’s internal stuff and their root stuff?

Kevin

From: Roy Hooper [mailto:roy at demandmedia.com]
Sent: Thursday, September 26, 2013 2:40 PM
To: Kevin White
Cc: collisions at lists.dns-oarc.net
Subject: Re: [Collisions] Datasets to use for older years

I found that RAW didn't include many of the bad queries that we needed to analyze.

The only org-to-root-name mapping I recall off the top of my head is ripe => k-root, which affected 2013 and 2012, as it was not renamed like the rest.

-Roy

On 2013-09-26, at 2:30 PM, Kevin White <kwhite at jasadvisors.com> wrote:


I’m doing some work that may involve going back in time farther than we did originally.  So, I’m trying to make a map of which files to use from the start.  Here is what I have:

# Top-level data directory to use for each DITL year.
my %years = (
    # 2010-2013: RAW folders
    '2013' => '/mnt/oarc-pool2/DITL-20130528/RAW',
    '2012' => '/mnt/oarc-pool4/DITL-20120417/RAW',
    '2011' => '/mnt/oarc-pool4/DITL-20110412/RAW',
    '2010' => '/mnt/oarc-pool4/DITL-20100413/RAW',
    # 2006-2009: CLEAN / CLEAN-ROOTS folders
    '2009' => '/mnt/oarc-pool4/DITL-200903/CLEAN-ROOTS',
    '2008' => '/mnt/oarc-pool4/DITL-200803/CLEAN',
    '2007' => '/mnt/oarc-pool3/DITL-200701/CLEAN',
    '2006' => '/mnt/oarc-pool3/DITL-200601/CLEAN',
);

The first 4 are good: each of those has, in the “RAW” folder, folders for each root that participated, as well as folders for other participants.

The final 4 seem to have the RAW files grouped by organization.  I had to move into the folders listed there to actually find folders grouped by root.

Is that right?  Is it safe to use those CLEAN (and CLEAN-ROOTS) folders to get the same kinds of data we used for 2013 and 2012, queries just to the roots?

Thanks,

Kevin



________________________________
Please NOTE: This electronic message, including any attachments, may include privileged, confidential and/or inside information owned by Demand Media, Inc. Any distribution or use of this communication by anyone other than the intended recipient(s) is strictly prohibited and may be unlawful. If you are not the intended recipient, please notify the sender by replying to this message and then delete it from your system. Thank you.
