[dns-operations] ECN & Juniper load balancing breaks TCP queries

Thu Aug 31 09:30:22 UTC 2017

Hi,

We are using Juniper routers in front of our anycast DNS nodes in some locations.

We noticed that if the client sets the ECN flags in a TCP query, the router sends the three-way handshake to one node but the data to a second node, which correctly responds with a reset.

It looks like this could affect others using Juniper routers in the same way for anycast.

To reproduce:

Enable ECN on Linux, then send a TCP query:

echo '1' > /proc/sys/net/ipv4/tcp_ecn

dig @nameserver zone +tcp

e.g. (a random root server that looks to be affected):

dig @b.root-servers.net . soa +tcp
;; communications error to 2001:500:200::b#53: connection reset
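
To make the comparison easier to repeat, here is a rough sketch that runs the same dig query with tcp_ecn set to 0 and then 1, and restores the original value afterwards. The function names (classify_dig, probe, ecn_compare) are made up for illustration, it needs root to write the sysctl, and "no-reset" only means dig did not report a reset (a timeout would also land there).

```shell
# Path to the Linux ECN sysctl; writing to it requires root.
ECN_SYSCTL=/proc/sys/net/ipv4/tcp_ecn

# Read dig output on stdin. Print "reset" if it reports a connection
# reset, otherwise "no-reset" (which is not proof the query succeeded).
classify_dig() {
    if grep -q 'connection reset'; then
        echo reset
    else
        echo no-reset
    fi
}

# Query the root SOA over TCP with tcp_ecn set to the given value.
# $1 = nameserver, $2 = tcp_ecn value (0 or 1)
probe() {
    echo "$2" > "$ECN_SYSCTL" || return 1
    printf 'tcp_ecn=%s: ' "$2"
    dig "@$1" . soa +tcp +tries=1 2>&1 | classify_dig
}

# Run both probes against one nameserver, then restore the old setting.
ecn_compare() {
    saved=$(cat "$ECN_SYSCTL")
    probe "$1" 0
    probe "$1" 1
    echo "$saved" > "$ECN_SYSCTL"
}
```

After sourcing this as root, "ecn_compare b.root-servers.net" prints one line per setting. A server behind an affected path would be expected to show "reset" only on the tcp_ecn=1 line.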

We've got a bug raised with Juniper at the moment. It looks pretty much identical to the one in the thread linked below, but we're using M series and the latest 15 code didn't fix it.

https://forums.juniper.net/t5/SRX-Services-Gateway/Broken-ECMP-ipv6-with-SRX1500-in-paketmode/m-p/305636

Explicitly disabling ECN on the DNS nodes themselves, i.e. changing tcp_ecn from the RHEL default of 2 (respond with ECN if the client requests it) to 0, resolves the issue: the nodes no longer negotiate ECN, and the data flows to the same node as the three-way handshake.

Fix:
echo '0' > /proc/sys/net/ipv4/tcp_ecn
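
The echo above takes effect immediately but is lost on reboot. One way to keep it, assuming a system that reads drop-in files from /etc/sysctl.d (the function name and the file name 90-disable-ecn.conf are arbitrary choices for illustration, not something from our setup):

```shell
# Write a sysctl drop-in that keeps ECN disabled after a reboot.
# $1 = drop-in directory (defaults to /etc/sysctl.d; needs root there).
# The file name 90-disable-ecn.conf is an arbitrary, illustrative choice.
persist_ecn_off() {
    dir="${1:-/etc/sysctl.d}"
    printf 'net.ipv4.tcp_ecn = 0\n' > "$dir/90-disable-ecn.conf"
}
```

Running "persist_ecn_off" as root and then "sysctl -p /etc/sysctl.d/90-disable-ecn.conf" should load the setting without waiting for a reboot.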

Anyone else seen this?

Ben