[dns-operations] To A or to AAAA - was Re: Signaling client

Mark Andrews marka at isc.org
Tue Jan 18 21:46:38 UTC 2011


In message <alpine.LSU.2.00.1101181019400.5244 at hermes-1.csi.cam.ac.uk>,
Tony Finch writes:
> On Mon, 17 Jan 2011, Edward Lewis wrote:
> >
> > I would restate Doug's "right answer" to be - fix applications (which
> > includes OS') to ask for all possible means to reach a resource and
> > then try all possible means until a working one is found.
> 
> The important thing is to try all of them at the same time. The usual IPv6
> failure mode is an unpleasant timeout. If it works but you get routed
> through a shitty 6to4 gateway, it isn't then obvious that tearing down the
> connection and retrying with IPv4 would be better. Hence, try them all at
> once and the first connection to be able to transfer data wins.
> 
> The standard protocol-agnostic getaddrinfo API isn't great because you
> can't be sure that it is implemented well, i.e. that it makes all the name
> service queries simultaneously. It also doesn't allow you to overlap name
> service queries with connection initiation. So perhaps we want a new API
> to attempt concurrent connections on all the results from getaddrinfo,
> plus a higher-level call that bundles the two together and which could
> reduce latency further.

getaddrinfo() isn't the problem.  The problems happen after getaddrinfo.
Clients making AAAA queries is not the problem for servers deploying
IPv6.  The clients are already making those AAAA queries and getting
back a NODATA response and it's not causing problems.  The only time
when making AAAA queries is a problem is when the server operator
deploys a DNS load balancer that is broken for AAAA queries either
through design or more usually mis-configuration.

The example of how to use the results of getaddrinfo in the man
page isn't great as it is sequential and waits for each connect()
call to fail in turn.
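
A minimal sketch of that sequential pattern, written in Python for
brevity rather than the C of the man page. The function name
connect_sequential and its timeout parameter are illustrative
assumptions, not taken from any man page or from the original message.
It takes the list returned by getaddrinfo() and shows why a dead first
address costs a full connect() timeout before the next one is tried.

```python
import socket


def connect_sequential(addrinfos, timeout=None):
    """Try each getaddrinfo() result in turn; return the first connected socket."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in addrinfos:
        try:
            s = socket.socket(family, socktype, proto)
        except OSError as exc:
            # Address family not supported on this host; move on.
            last_error = exc
            continue
        try:
            s.settimeout(timeout)
            # Blocks here until success, an error, or the timeout expires.
            # Nothing else is attempted while this call is waiting.
            s.connect(sockaddr)
            return s
        except OSError as exc:
            last_error = exc
            s.close()
    raise last_error or OSError("no addresses to try")
```

With N unreachable addresses ahead of a working one, the caller waits
for N connect() failures before any data can flow.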

If you use a non-blocking connect() with poll()/select() or threads
with cancellation points you can get rid of delays without adding
lots of embryonic connection attempts.  I have slow start example
code for all three of these methods which just replaces the loop
over the results of getaddrinfo().
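
The example code mentioned above is not reproduced in this message.
What follows is only a hedged sketch of the non-blocking connect() plus
select() variant, in Python, under these assumptions: the name
connect_staggered, the delay and timeout parameters, and their default
values are all hypothetical, and the failure detection relies on POSIX
select() semantics (a failed connect is reported as writable with
SO_ERROR set). It starts one attempt, waits up to `delay` seconds, and
only then starts the next, so few embryonic connections exist at once.

```python
import errno
import select
import socket
import time


def connect_staggered(addrinfos, delay=0.2, timeout=10.0):
    """Return the first socket to connect, starting attempts `delay` s apart."""
    pending = list(addrinfos)   # not yet tried, kept in getaddrinfo() order
    in_flight = []              # non-blocking sockets with connect() in progress
    deadline = time.monotonic() + timeout
    next_start = time.monotonic()
    try:
        while pending or in_flight:
            now = time.monotonic()
            if now >= deadline:
                break
            # Start another attempt when its slot arrives, or immediately
            # if every earlier attempt has already failed.
            if pending and (not in_flight or now >= next_start):
                family, socktype, proto, _canonname, sockaddr = pending.pop(0)
                try:
                    s = socket.socket(family, socktype, proto)
                except OSError:
                    continue
                s.setblocking(False)
                err = s.connect_ex(sockaddr)
                if err in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
                    in_flight.append(s)
                else:
                    s.close()   # failed at once, e.g. no route to host
                next_start = time.monotonic() + delay
                continue
            # Sleep until a socket resolves, the next slot, or the deadline.
            wait = deadline - now
            if pending:
                wait = min(wait, max(0.0, next_start - now))
            _, writable, _ = select.select([], in_flight, [], wait)
            for s in writable:
                in_flight.remove(s)
                if s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                    s.setblocking(True)
                    return s    # winner; the finally clause closes the rest
                s.close()
    finally:
        for s in in_flight:
            s.close()
    raise OSError("all connection attempts failed or timed out")
```

A caller would pass it the getaddrinfo() results directly, for example
connect_staggered(socket.getaddrinfo(host, 80, type=socket.SOCK_STREAM)).
An address that fails quickly costs nothing extra, and one that hangs
delays the next attempt by `delay` rather than by a full timeout.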

The only real question is when to start the second connect()
call: after 100 ms, 200 ms or 400 ms?  From here in Sydney, Australia, I can
connect to RIPE's http server in about 400 ms.  This is close to
the longest terrestrial path in the world.

connect_to_host(www.ripe.net) -> 3 in 420 ms

The code will add at most whatever the initial delay is when there
are failing connections.

For web browsers you just maintain a database of connectivity history
and re-order the results of getaddrinfo based on that.

Order addresses by their last known state: good, unknown, returned
error, timeout.
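
One way to read the four states above is as a preference order, best
first. That reading is an assumption, as are the names Outcome and
reorder_by_history and the choice to key the history on the address
string. A small Python sketch of the re-ordering under those
assumptions:

```python
from enum import IntEnum


class Outcome(IntEnum):
    """Last observed result for an address; a lower value is tried earlier."""
    GOOD = 0
    UNKNOWN = 1
    ERROR = 2      # connect() returned an error promptly
    TIMEOUT = 3    # connect() hung until it timed out


def reorder_by_history(addrinfos, history):
    """Sort getaddrinfo() results by past connectivity.

    `history` maps an address string (sockaddr[0]) to an Outcome.
    Addresses with no entry count as UNKNOWN.  sorted() is stable, so
    getaddrinfo()'s own ordering is kept within each state.
    """
    return sorted(
        addrinfos,
        key=lambda ai: history.get(ai[4][0], Outcome.UNKNOWN),
    )
```

Timeouts sort last because they are the expensive failure: an address
that returns an error quickly costs little to retry, while one that
hangs costs the full initial delay.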

> Tony.
> -- 
> f.anthony.n.finch  <dot at dotat.at>  http://dotat.at/
> HUMBER THAMES DOVER WIGHT PORTLAND: NORTH BACKING WEST OR NORTHWEST, 5 TO 7,
> DECREASING 4 OR 5, OCCASIONALLY 6 LATER IN HUMBER AND THAMES. MODERATE OR
> ROUGH. RAIN THEN FAIR. GOOD.
> _______________________________________________
> dns-operations mailing list
> dns-operations at lists.dns-oarc.net
> https://lists.dns-oarc.net/mailman/listinfo/dns-operations
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: marka at isc.org


