[dns-operations] MX record scanning

Tue May 17 00:20:41 UTC 2011

On 17/05/11 01:26, Jake Zack wrote:
> The "spambot killer" doesn't appear to be randomly generating domains in
> real-time, or if it does, it appears to be doing a fairly lousy job at
> randomness.
> 
> But if this was static content sitting on a webpage somewhere, shouldn't
> I be able to find it via Google (isn't that how the botnet runner
> would've found it?).

If it's the "spambot killer" script I used to run on my home server, or
anything similar, it does indeed generate completely random and
seemingly non-repeating domains without keeping track of any of the
domains generated. The one I used was written in PHP and used mt_rand to
pick letters out of a source string, plus lists of known gTLDs and first
and last names. It seems I've lost the original script, but there are a few
out there.[1]
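
As a rough illustration of that approach (not the original script, and in
Python rather than PHP), something like the following would do. The name
lists, the gTLD list, the label lengths and the function names are all
made up for the example. Note that nothing is stored between calls, so
the generator has no record of what it has handed out.

```python
import random
import string

# Hypothetical sample data; the original script's lists are not known.
FIRST_NAMES = ["alice", "bob", "carol", "dave"]
LAST_NAMES = ["smith", "jones", "brown", "taylor"]
GTLDS = ["com", "net", "org", "info", "biz"]
LETTERS = string.ascii_lowercase  # the "source string" letters are picked from


def random_label(rng, min_len=5, max_len=12):
    """Build a random domain label by picking letters out of LETTERS."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(LETTERS) for _ in range(length))


def random_address(rng=None):
    """Return one throwaway address: first.last@<random label>.<gtld>."""
    if rng is None:
        rng = random.Random()
    local = f"{rng.choice(FIRST_NAMES)}.{rng.choice(LAST_NAMES)}"
    domain = f"{random_label(rng)}.{rng.choice(GTLDS)}"
    return f"{local}@{domain}"


if __name__ == "__main__":
    for _ in range(5):
        print(random_address())
```

With 26 letters and labels of 5 to 12 characters, repeats are unlikely
enough that the output looks non-repeating even though no state is kept.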

You probably won't find its output in search indexes, since it used robots
meta tags and/or robots.txt to keep rule-abiding search engines from
filling up with what was meant to be reserved for the bad guys.

Oops, a quick search finds a few of the generator pages that do not have
robots protection on them. [2]

After my priorities changed, I wrote my own that generated a unique
"local part" based upon the source IP and time, at a (spamtrap) domain I
own. Theoretically this would allow me to track who was scraping the
addresses and perhaps take action on them, and build up some spam traps,
but it turns out that a unique address every time a page is loaded is
not particularly appealing to spammers. It has received perhaps 20
emails in 6 years.
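
A minimal sketch of how such a local part could be built and later
decoded, assuming IPv4 visitors and a plain hex encoding. The domain
spamtrap.example, the "t" prefix and both function names are
placeholders for illustration, not what was actually deployed.

```python
import ipaddress
from datetime import datetime, timezone

TRAP_DOMAIN = "spamtrap.example"  # hypothetical; stands in for the real trap domain


def make_trap_address(source_ip, when):
    """Encode an IPv4 source address and a timestamp into a local part."""
    ip_hex = format(int(ipaddress.IPv4Address(source_ip)), "08x")
    ts_hex = format(int(when.timestamp()), "08x")
    # "t" prefix keeps the local part from starting with a digit.
    return f"t{ip_hex}{ts_hex}@{TRAP_DOMAIN}"


def decode_trap_address(address):
    """Recover (source_ip, time) from an address built by make_trap_address."""
    local = address.split("@", 1)[0]
    ip_hex, ts_hex = local[1:9], local[9:17]
    ip = str(ipaddress.IPv4Address(int(ip_hex, 16)))
    when = datetime.fromtimestamp(int(ts_hex, 16), tz=timezone.utc)
    return ip, when


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    addr = make_trap_address("192.0.2.10", now)
    print(addr, decode_trap_address(addr))
```

When mail later arrives at one of these addresses, decoding the local
part gives the IP that loaded the page and when it did so. The encoding
here is trivially readable; a keyed hash would hide it, at the cost of
needing a lookup table to reverse it.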

My theory on why you are getting multiple IP addresses for the same
domain is based upon my very limited knowledge of how bot herders seem
to operate their bots, and how I would do it with my developer hat on.

A bot herder may split tasks between bots to prevent the entire herd
being blacklisted at once. Tasks might be: scrape the internet for
addresses, host drive-by installs, act as a fast-flux DNS host, DDoS
someone, send out spam, etc.

At any one time there will be a section of the botnet that is useless
for sending spam due to it being blacklisted, but there is little chance
that these hosts will also be blocked from crawling the internet.

A bot that is crawling will not do DNS lookups on every address it finds
(based on what I've seen in logs, they don't even sanity check them -
mailto3Arealaddress at example.com) but instead just send any string with
an @ in it back to the bot herder.

On the next spam run, these addresses are simply added to the list and
sent out to the bots that are sending spam today. I have observed a
remarkably strong correlation here: different bots send different
messages to the same email address at almost exactly the same time,
with perhaps 10 seconds of variation at most. It's these machines that
you are seeing, not the ones
finding the addresses in the first place.
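
One way to check for that pattern in your own logs is to group
deliveries by recipient and look for several distinct source IPs inside
a short window. The sketch below assumes you have already parsed the
log into (unix_time, source_ip, recipient) tuples; that input format and
the function name are assumptions for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # the "10 seconds at most" spread described above


def find_bursts(deliveries, window=WINDOW_SECONDS):
    """Map each recipient to the bursts in which two or more distinct
    source IPs delivered mail within `window` seconds of each other.

    `deliveries` is an iterable of (unix_time, source_ip, recipient).
    Each burst is returned as a sorted list of the IPs involved.
    """
    by_rcpt = defaultdict(list)
    for ts, ip, rcpt in deliveries:
        by_rcpt[rcpt].append((ts, ip))

    bursts = {}
    for rcpt, events in by_rcpt.items():
        events.sort()
        found = []
        start = 0
        while start < len(events):
            end = start
            # Extend the group while events stay within `window` of its first event.
            while (end + 1 < len(events)
                   and events[end + 1][0] - events[start][0] <= window):
                end += 1
            ips = sorted({ip for _, ip in events[start:end + 1]})
            if len(ips) >= 2:
                found.append(ips)
            start = end + 1
        if found:
            bursts[rcpt] = found
    return bursts
```

A recipient that shows up here with many different IPs per burst fits
the "harvested by one set of bots, mailed by another" picture; a single
IP retrying does not count, since only distinct addresses are collected.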

Of course, this is entirely theory, but it makes sense to me with my
software hat on. It might be a sign of a new group or programmer coming
onto the scene who wasn't around in the early 2000s and does not have
these poison pages excluded.

[1] http://www.liamtog.org/
[2] google spam killer script "randomly generated" addresses