[dns-operations] [DNSOP] dnsop-any-notimp violates the DNS standards

Thu Mar 12 12:59:13 UTC 2015

Paul Wouters writes:
> So if the MX or AAAA record has expired from the cache but another
> RRtype with larger TTL (say NS) is still in there, your ANY query will
> fail to find records.

The client is behaving correctly. The ANY query isn't guaranteed to find
the MX, but you're wrong in claiming that the client is relying on this.
I realize that you don't understand how this type of DNS client works,
so let me go through the details, slowly, covering

   (1) why the queries reliably produce the desired information and
   (2) why QTYPE=ANY ends up reducing the number of queries that the
       authoritative server has to handle for producing this information.

The client begins with an ANY query to the cache. There are several
possibilities for the cache state at this moment:

   * maybe there's nothing;
   * maybe there's an MX record;
   * maybe there's an A record;
   * maybe there's some other regular record, such as NS;
   * maybe there's a mix of regular records;
   * maybe there's a CNAME record;
   * etc.

In the "nothing" case, the cache forwards the query to the server. This
could end up retrieving, e.g., an MX set, an A set, and an NS set; it
could end up retrieving a CNAME; there are many possibilities.

The cache then returns whatever it has to the client. The client notices
failure cases (such as SERVFAIL) and, in those cases, defers mail
delivery. Let's now focus on what happens in the success cases.

If the cache returns a CNAME: The client sees the CNAME, replaces the
original name with the CNAME, and starts over.

If the cache returns, e.g., an NS record: The client sees this record
and concludes that there _isn't_ a CNAME. This conclusion is justified
by the rule that a name can't have CNAME together with regular records.
An administrator who sets up a single name with both CNAME and NS (or
CNAME and MX, etc.) is entirely at fault for the resulting confusion.

The client then sends an MX query to the cache. Again, failures are
caught and mail delivery is deferred. The most common success case is that
the cache has an MX set at this point; the client sees the MX set and
acts accordingly. (The A that the MX points to is normally cached too.)
If there's a successful response without an MX (case 5 described in
http://cr.yp.to/djbdns/notes.html#response-parsing) then the client
correctly concludes that there isn't an MX and falls back to an A query
for the original name.

To summarize, this type of ANY->MX->A client correctly sees whether
there's a CNAME, correctly sees whether there's an MX, and (when there
isn't an MX) correctly sees whether there's an A. If there's a server
failure (e.g., NOTIMP) or cache failure, the client correctly defers
mail delivery.
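
The ANY->MX->A procedure described above can be sketched roughly as
follows. This is only an illustration, not qmail's actual code: the
lookup(name, qtype) interface, the "OK"/"FAIL" status strings, the
max_hops limit on CNAME chains, and the toy zone data are all
assumptions made for the example. The sketch also assumes that a
successful ANY response containing no CNAME (even an empty one) lets the
client proceed to the MX query.

```python
def resolve_mail_target(lookup, name, max_hops=8):
    """Return (action, name, records); action is 'mx', 'a', or 'defer'.

    lookup(name, qtype) is a hypothetical cache interface returning
    ("FAIL", {}) on failure or ("OK", {rrtype: [values]}) on success.
    """
    # Step 1: ANY query. A CNAME means "replace the name and start over";
    # any successful answer without a CNAME means there is no CNAME.
    for _ in range(max_hops):
        status, rrsets = lookup(name, "ANY")
        if status != "OK":
            return ("defer", name, [])
        if "CNAME" not in rrsets:
            break
        name = rrsets["CNAME"][0]
    else:
        # Assumed safety limit: give up on overly long CNAME chains.
        return ("defer", name, [])

    # Step 2: MX query. Failure defers; an MX set is used if present.
    status, rrsets = lookup(name, "MX")
    if status != "OK":
        return ("defer", name, [])
    if rrsets.get("MX"):
        return ("mx", name, rrsets["MX"])

    # Step 3: successful response without an MX, so fall back to A.
    status, rrsets = lookup(name, "A")
    if status != "OK":
        return ("defer", name, [])
    return ("a", name, rrsets.get("A", []))


def make_lookup(zone):
    """Toy stand-in for a cache: names missing from zone simulate failure."""
    def lookup(name, qtype):
        rrsets = zone.get(name)
        if rrsets is None:
            return ("FAIL", {})
        if qtype == "ANY":
            return ("OK", dict(rrsets))
        return ("OK", {t: r for t, r in rrsets.items() if t == qtype})
    return lookup


if __name__ == "__main__":
    zone = {
        "alias.example": {"CNAME": ["mail.example.net"]},
        "mail.example.net": {"NS": ["ns.example.net"],
                             "MX": ["mx1.example.net"]},
        "plain.example": {"A": ["192.0.2.1"]},
    }
    lookup = make_lookup(zone)
    for n in ("alias.example", "plain.example", "broken.example"):
        print(n, resolve_mail_target(lookup, n))
```

Note that the result of the ANY query is used only to decide whether
there is a CNAME; the MX and A decisions come from their own queries.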

A comprehensive efficiency analysis requires detailed measurements for
many years at many sites, but spot checks have consistently shown that
the following cases are most important:

   * Most common: MX in server and in cache. The ANY->MX->A client ends
     up generating 0 queries to the server: the cache answers ANY with
     MX, and answers MX with MX.

     For comparison, a CNAME->MX->A client would also end up sending 0
     queries to the server _if_ the cache were smart enough to use the
     MX as a reason to deny CNAME. But typical caches aren't that smart,
     so this type of client would end up sending 1 query to the server.

     An MX->A client uninterested in CNAME would end up sending 0
     queries to the server.

   * MX in server but not in cache: The ANY->MX->A client ends up
     generating 1 query to the server: the server answers ANY with MX
     (and typically A), and then the cache answers MX with MX.

     For comparison, a CNAME->MX->A client would end up generating 2
     queries to the server: the server answers CNAME with no data, then
     answers MX with MX.

     An MX->A client uninterested in CNAME would end up sending 1 query
     to the server: the server answers MX with MX.

   * A in server and in cache, no MX in server: The ANY->MX->A client
     ends up sending 1 query to the server. The cache answers ANY with
     A; the server answers MX with no data; the cache answers A with A.

     For comparison, a CNAME->MX->A client would end up generating 2
     queries to the server (or 1 if the cache is smarter): the server
     answers CNAME with no data; the server answers MX with no data; the
     cache answers A with A.

     An MX->A client uninterested in CNAME would end up sending 1 query
     to the server. The server answers MX with no data, and the cache
     answers A with A.

   * A in server, nothing in cache: The ANY->MX->A client ends up
     sending 2 queries to the server. The server answers ANY with A; the
     server answers MX with no data; the cache answers A with A.

     For comparison, a CNAME->MX->A client would end up generating 3
     queries to the server: the server answers CNAME with no data; the
     server answers MX with no data; the server answers A with A.

     An MX->A client uninterested in CNAME would end up sending 2
     queries to the server. The server answers MX with no data, and then
     answers A with A.

To summarize, in all of these common cases, the ANY->MX->A client sends
fewer queries than the CNAME->MX->A client (despite acquiring the same
information), and just as few as the MX->A client (despite acquiring
more information). The CNAME->MX->A client can perform fewer queries
when there _is_ a CNAME, but that's a vastly less frequent situation.
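
The query counts in the four cases above can be reproduced with a small
simulation. This is a sketch under stated assumptions, not a
measurement: the Cache class is a hypothetical model of a "typical"
cache that answers ANY from whatever positive records it holds, answers
a specific type only if that exact type is cached (positively or as
no-data), and never uses an MX or A record to deny CNAME.

```python
class Cache:
    """Hypothetical model of a typical cache in front of one server."""

    def __init__(self, server, cached_types):
        self.server = server  # rrtype -> records held by the server
        self.store = {t: server[t] for t in cached_types}
        self.server_queries = 0

    def query(self, qtype):
        if qtype == "ANY":
            # Answer ANY from any positive cached records, if there are any.
            positive = {t: r for t, r in self.store.items() if r}
            if positive:
                return positive
            self.server_queries += 1
            self.store.update(self.server)
            return dict(self.server)
        if qtype not in self.store:
            self.server_queries += 1
            # An empty list models a cached no-data response.
            self.store[qtype] = self.server.get(qtype, [])
        records = self.store[qtype]
        return {qtype: records} if records else {}


def count_server_queries(strategy, server, cached_types):
    """Run one client strategy and count queries reaching the server."""
    cache = Cache(server, cached_types)
    for qtype in strategy:
        answer = cache.query(qtype)
        if qtype == "MX" and "MX" in answer:
            break  # MX found, so no fallback to A
    return cache.server_queries


STRATEGIES = {
    "ANY->MX->A": ("ANY", "MX", "A"),
    "CNAME->MX->A": ("CNAME", "MX", "A"),
    "MX->A": ("MX", "A"),
}

CASES = [
    ("MX in server and in cache", {"MX": ["mx.example"]}, ["MX"]),
    ("MX in server, not in cache", {"MX": ["mx.example"]}, []),
    ("A in server and in cache, no MX", {"A": ["192.0.2.1"]}, ["A"]),
    ("A in server, nothing in cache", {"A": ["192.0.2.1"]}, []),
]

if __name__ == "__main__":
    for label, server, cached in CASES:
        counts = {name: count_server_queries(s, server, cached)
                  for name, s in STRATEGIES.items()}
        print(label, counts)
```

Under this model the ANY->MX->A client sends 0, 1, 1, 2 queries in the
four cases, the CNAME->MX->A client sends 1, 2, 2, 3, and the MX->A
client sends 0, 1, 1, 2, matching the counts given above.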

Packet size is harder to analyze. ANY often pulls some records that
aren't used, and if the site isn't configured carefully then ANY can
even end up falling back to TCP, costing bytes _and_ packets. On the
other hand, there are a huge number of Internet sites that don't have a
noticeable volume of unusual records and don't need TCP, and there's a
clear traffic win for every skipped query and skipped no-data response.

If IETF is serious about considering DNS protocol changes, the right way
forward is to recognize that QTYPE=ANY _does_ have efficiency benefits
and to consider protocol extensions that would do even better. In any
event, IETF has a multi-stage procedure for creating mandatory standards
in the first place, and needs to follow the same procedures for
modifying those standards.

> the result of qmail's ANY query is completely meaningless and qmail
> cannot derive any conclusion from the absence of any record from that
> query.

False. As above, _any_ regular record asserts the non-existence of a
CNAME. You're wrong in your claim that the result is meaningless _and_
you're wrong in your claim that qmail is misbehaving.

> qmail with this feature is broken
  [ etc., etc., etc. ]

There's an impressive volume of misinformation being presented here in
an attempt to justify Cloudflare unilaterally and intentionally creating
interoperability problems by violating IETF's mandatory DNS standards.
It's not reasonable to expect all the errors to be corrected within "a
few weeks".

---Dan