[dns-operations] Source code to identify the fake DNS packets from China Re: Odd behaviour on one node in I root-server (facebook, youtube & twitter)
David Dagon
dagon at cc.gatech.edu
Mon Mar 29 19:21:01 UTC 2010
On Sat, Mar 27, 2010 at 10:07:53PM +0800, ?????? wrote:
> These wrong DNS replies are sent by the notorious Great Firewall of
> China. Checked by this program, the phenomenon is GFW's DNS poisoning
> without doubt.
I'd like to add a few comments to this topic.
Please note that in this discussion, I do not state an opinion about
any governmental policy re: modifying DNS answers. (Nobody asked my
opinion, and it's not relevant or on topic here). I think that's an
important caveat: Much of the discussion on this topic considers this
practice to be censorship. My comments do not require one to have an
opinion on that topic. I've attempted to state strictly neutral,
factual observations.
So, what are the implications of such a DNS policy? I can suggest a
few:
1) "Network Policy and Topology Leak"
As noted by others in this list, almost any IP in China will reply
to a query for a qname matching a substring for {twitter, facebook,
youtube}. For example, using prips(1) in the bsd sysutils/prips
ports, one can:
for ip in $(prips "$CHINA_CIDR"); do dig +time=1 +tries=1 @"$ip" twitter.com; done
Nearly 80% of all networks in China will answer, on average.
Clearly, the host does not even have to be a DNS resolver. In some
tests of several large networks over 3/4 of the IPs reply.
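The same survey can be sketched in Python using only the standard library. This is a rough illustration, not the tooling used for the numbers above: the function names, the fixed query ID, and the one-second timeout are all assumptions, and any UDP reply at all is counted as "answered".

```python
import ipaddress
import socket
import struct


def build_query(qname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with the RD bit set."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def probe(ip: str, qname: str, timeout: float = 1.0):
    """Send one UDP query to ip:53; return the raw reply, or None if none arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(qname), (ip, 53))
            reply, _ = sock.recvfrom(4096)
            return reply
        except OSError:  # covers timeouts and unreachable hosts
            return None


def answer_fraction(cidr: str, qname: str = "twitter.com") -> float:
    """Fraction of host addresses in cidr that send back any reply."""
    hosts = [str(h) for h in ipaddress.ip_network(cidr).hosts()]
    answered = sum(1 for h in hosts if probe(h, qname) is not None)
    return answered / len(hosts) if hosts else 0.0
```

Calling answer_fraction() on a prefix gives the per-network reply rate that the 80% and 3/4 figures describe.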
What are some implications of this?
a) Entire CIDRs (large ones: /16s, etc.) in China are open
recursive, at least for qnames matching those three strings.
Although open recursive DDoS amplification attacks might favor
large records in cache, network operators in China might
nonetheless share the Internet community's concern about the
increase in the number of open recursive hosts.
Of course, there are tens of millions of open recursives
already. But this behavior makes nearly every host in China
open recursive for a large set of qnames.
People far more clever in designing attacks than me might use
this. For example, this property might be used to congest low
bandwidth segments in China, and complicate filtering as a
remediation (since there are no longer just a few open
recursives to filter, but entire networks).
b) Note that about 20% of the IPs don't answer. Some of these
might be routers, some might be hosts that operate under a
different policy system. A common technique for some networks
is to blackhole traffic, to frustrate attacker probes.
But now outsiders can use queries for {facebook, twitter,
youtube} to trivially enumerate hosts in China's network that
are _not_ subject to this DNS policy. I am not aware of a
similar, trivial technique for mapping networks in any other
country or network. The DNS policy has made such a study easy.
c) A few of the hosts do answer, but provide the 'correct' answer
(meaning: an answer one might obtain from the zone authority).
These hosts are open recursives (and have a meaningful fpdns
signature). Further, these hosts can also be observed seeking
glue from authorities for zones with embedded substrings (e.g.,
A? facebook.please.visit.me.from.china.example.com).
There are well-known techniques for finding open recursives.
This DNS policy lets outsiders identify hosts in China that for
some reason are _not_ subject to the same rewriting policy.
This leaks information from the host country about which
networks are subject to the discussed policy, and which ones are
not. Thus, one could infer the extent and reach of the
governmental organization requiring this policy, simply by
querying a set of IPs in China.
Discovering these policy relationships by other means would seem
difficult; at least I'm not aware of where this information can
be found. But the DNS rewriting policy makes it trivial to
discover the cross-entity relationships in networks.
In short, modifying the resolution plane via the routing plane
reveals information about how the networks in China are managed.
Since finding this information about other networks in other
countries is hard (and requires complex probes, which are often
blocked or stopped via abuse@ email), this is an unusual property
of the DNS rewriting policy.
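The three host categories in (a) through (c) can be separated mechanically once each probe's answer section has been parsed. The sketch below assumes the caller supplies the A records as strings (or None when no reply arrived), and it assumes that any answer drawn from the nine IPs listed in section 3 marks a rewritten reply. The names REWRITE_IPS, classify, and tally are made up for illustration.

```python
from collections import Counter

# The nine IPs that section 3 of this post lists as appearing in rewritten answers.
REWRITE_IPS = frozenset({
    "243.185.187.39", "203.98.7.65", "159.106.121.75",
    "93.46.8.89", "78.16.49.15", "59.24.3.173",
    "46.82.174.68", "37.61.54.158", "8.7.198.45",
})


def classify(answer_ips):
    """Label one probe result.

    answer_ips is a list of A-record strings, or None if no reply arrived.
    """
    if answer_ips is None:
        return "silent"      # case (b): no answer at all
    if any(ip in REWRITE_IPS for ip in answer_ips):
        return "rewritten"   # case (a): subject to the DNS policy
    return "other"           # case (c): some different answer came back


def tally(results):
    """Count categories over a mapping of probed IP -> answer_ips."""
    return Counter(classify(answers) for answers in results.values())
```

Grouping the tally by prefix or by origin AS would then show which networks fall under the policy and which do not.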
2) "Mirror Loss"
I do not know what root operators will do, but some might wish to
disable anycast roots inside China (even if they already require
the anycast prefix to be announced with BGP NO-EXPORT, etc.)
Likewise, some DNSBLs will not locate their mirrors in China.
I'm of course speculating, but if we assume the tendency is for
operators to not locate mirrors in China, then the implications
are:
a) China's 400+ million users do not see the benefit of a local
anycast root. Although nobody asked my opinion, I personally
would like the users of China to have the same quality of
service I enjoy with localized DNS roots.
b) The resolution plane in China becomes potentially more brittle.
One motivation for anycasted roots is to resist DDoS (properly
understood as Distributed _Degradation_ of Service, since
universal *Denial* of service is not always a realistic attack
outcome). Once again, nobody asked my opinion, but I think the
Internet should be equally robust, anywhere, and should not have
areas where the resolution plane is more brittle and vulnerable.
c) Domain BL operators will not locate mirrors in China, since some
spam domains might have matching substrings. For operational
reasons, this means they might also move IP BLs outside of
China, simply for convenience. This means:
-- Unless extensive localized mirrors are created, those outside
of China can witness the rates of spam arrivals (and
filtration) in Chinese networks. Consider:
www.cc.gatech.edu/~feamster/publications/dnsbl.pdf
This analysis does not merely work with botnets; without
local mirrors, BL behavior is broadcast to other parts of the
Internet.
-- Users in China will experience slower resolution (and perhaps
more timeouts) when checking BL status.
3) "Mirror Utilization Leak"
The hosts that rewrite DNS packets appear to pick a non-random
ordering of 9 answer IPs. (As others have noted, the RRs always
include one of: 243.185.187.39, 203.98.7.65, 159.106.121.75,
93.46.8.89, 78.16.49.15, 59.24.3.173, 46.82.174.68, 37.61.54.158,
and 8.7.198.45, and the TTL is either 300 or 86400).
I only have a few months of testing data, but the pattern in which
these IPs appear looks non-random. (One can further differentiate
between answers with 300 or 86400 TTL, which appear to come from
different routers doing the rewriting. Specifically, there appear
to be *either* a ratio of 2 to 1 routers providing the 86400 TTL
answers, *or* an equal ratio, with congestion on the routers
providing the 300 TTL. This is a guess based on a basic sampling
of the TTLs and timeouts, and could be wrong.)
But since the pattern appears non-random, if one queries a random
network rapidly (say, once a second), the pattern of IPs will
repeat, in order---except if another user made a query. In such a
case, one of the IPs will be skipped, out of order. Measuring the
time between each repeated answer therefore lets one estimate how
many *other* third party queries were rewritten (minus one's own
measurement queries, of course).
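One way to turn the skipped answers into a count is sketched below. It assumes a strict round-robin over a fixed, known cycle, which is stronger than the "non-random" pattern actually observed. The function name is hypothetical, and the cycle order has to be learned from measurement first. Because a full lap of the cycle is invisible, the result is only a lower bound.

```python
def estimate_third_party(cycle, observed):
    """Lower-bound the number of third-party queries between one's own probes.

    cycle    -- the answer IPs in the order the rewriter is assumed to emit them
    observed -- the answer IPs seen by one's own consecutive probes
    """
    index = {ip: i for i, ip in enumerate(cycle)}
    n = len(cycle)
    others = 0
    for prev, cur in zip(observed, observed[1:]):
        step = (index[cur] - index[prev]) % n
        # A step of 1 means nobody else queried in between. A step of k means
        # at least k - 1 other queries did. A step of 0 (same IP twice in a
        # row) means at least n - 1 others.
        others += (step - 1) % n
    return others
```

For example, with a hypothetical cycle of a through i, observing a, b, d, e implies at least one intervening query (the one that received c). Dividing the count by the elapsed measurement time gives a rate.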
Potentially, this leaks information about how many queries are
being intercepted by this DNS policy. (Of course, many of these
queries could come from outside researchers, such as myself.)
If I'm correct about this (I could be wrong), one can estimate the
rate of blocking performed by this DNS policy. So far, the rate
appears diurnal, which further suggests this technique measures the
deterministic, non-random selection of RRs.
Assuming I'm correct, this reveals details about the rule's
application. This would be unusual. It is rare for a network
policy to provide information about how many times the rule has
been applied. (Normally, such statistics are known only to the
firewall operators or those with access to routers or snmp traps).
At least that is what I would expect from a policy that many see as
sensitive.
4) "Potential Channel Loss."
Some applications use DNS to tunnel information, e.g., nstx,
OzymanDNS, etc. It is unlikely, but possible, that offending
substrings could be generated naturally. This may disrupt the
operation of these tools.
Perhaps that's a good thing; DNS tunnels are not always welcomed by
network operators. But certainly tunnel operators outside of China
enjoy a slightly more robust channel. This varies with the size of
the substring against which qnames are matched, of course. A short
substring (e.g., 'cnn') would likely generate false positives for
numerous tunnel applications.
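The effect of substring length can be put in rough numbers. The sketch below assumes tunnel labels are uniformly random over a 32-symbol alphabet (as with base32 encoding) and that matching is case-insensitive. Both are assumptions, as are the function names. The expected number of occurrences is also an upper bound on the chance that a label matches at least once.

```python
BLOCKED = ("twitter", "facebook", "youtube")


def matches_policy(qname, substrings=BLOCKED):
    """True if any blocked substring occurs anywhere in the qname."""
    lowered = qname.lower()
    return any(s in lowered for s in substrings)


def expected_matches(label_len, substring_len, alphabet_size=32):
    """Expected occurrences of one fixed substring in a uniformly random label."""
    positions = max(label_len - substring_len + 1, 0)
    return positions / alphabet_size ** substring_len


# A full 63-character label against a 3-character versus a 7-character substring.
short = expected_matches(63, 3)   # 61 / 32**3
longer = expected_matches(63, 7)  # 57 / 32**7
```

Under these assumptions a three-character substring such as 'cnn' is hit by roughly two labels in a thousand, which adds up quickly over a tunnel session, while a seven-character one is hit by fewer than two in a billion.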
5) "Dictionary Reduction Via CNAME Manipulation."
If twitter, youtube, and facebook are all blocked by a DNS policy,
what would stop these zone owners from obtaining CNAMEs (perhaps
IDN domains) and pointing to the blocked IPs? (Likewise,
HTTP redirects could be used).
Certainly filter evasions via aliases are possible. If the DNS
policy were to keep pace with the ability of these zone owners to
make aliases for their blocked domains, then the DNS policy would
also have to block the aliases.
In one scenario, the zone owners of twitter, youtube and facebook
could select IDN names, or other crafted strings that collide with
domain assets owned in China. (Indeed, any person could acquire
and publicize an alias for a blocked service, using a name that
collides with another service.) A well-chosen string would be one
that results in many false positives for the DNS policy.
This means the DNS policy would either not keep pace with the
aliases for the blocked domains, or suffer false positives. I also
believe the undecidable dimensions of this problem prevent the use
of good automation tools to craft useful DNS policy updates.
(I.e., humans must be used to craft new block strings---ones which
are not accepted by systems vulnerable to false positives.)
Taken to an extreme, this effectively gives remote third parties,
generally not located in China, some small degree of control over
the qname dictionary that China might use.
I stress this is only a theory. I suspect it will not be tested,
given the trouble of actually publicizing a colliding alias.
In short, there are many negative (likely unintended) outcomes from
the DNS policy, beyond merely propagating the DNS policy to caches
outside of China.
* * *
I work as a university researcher, and am quite realistic about the
humble area of the world I occupy. I can't even influence the color
of chalk used in the lecture halls.
Perhaps some readers of this list believe themselves more influential,
and I believe many might be. But I do offer this: I suspect
characterizing this issue as a "censorship problem" engages a larger,
non-technical issue. At least from how I see things, this might
guarantee things don't get fixed quickly. I could be wrong.
I'd suggest the networking community observe the neutral, factual
implications of this DNS policy. I believe many of its side-effects
can be avoided.
I invite scrutiny of the ideas I've suggested above. This list has
already provided an excellent overview of the unwanted, external
side-effect that China's policy had on others. Have I correctly
characterized the effects on their own network?
--
David Dagon /"\ "When cryptography
dagon at cc.gatech.edu \ / ASCII RIBBON CAMPAIGN is outlawed, bayl
Postdoc Researcher X AGAINST HTML MAIL bhgynjf jvyy unir
Georgia Inst. of Tech. / \ cevinpl."