[dns-operations] Configurable TC=1?

Mon Dec 21 16:02:39 UTC 2015

On 12/21/15, 8:34, "dns-operations on behalf of Paul de Weerd"
<dns-operations-bounces at dns-oarc.net on behalf of pdeweerd at ripe.net> wrote:
>
>If your NS is not authoritative for a domain, queries for it are not
>legitimate.
>
>I've heard this 'hard to debug' argument before, but I don't buy it.

I'll give you a use case from personal history, because I would
ordinarily agree that it's nobody's business, except the operator's,
what's up and what's down.  (Customers ought to care and make their
choices wisely.)

I was asked (more than a decade ago) to detect faulty registrations,
specifically those that included 'lame servers.'  As some folks on this
list desire, the registry was trying to clean up its registrant
information.  Good, eh?

Due to the nature of the registry, a particular registration object would
imply a series of DNS zone delegations and all to the same set of servers.
A common malignancy was an operator configuring only the working set of
zones and leaving the rest of them unconfigured.

DJBdns was a popular implementation of the time for this registry.  As
noted earlier in the thread, DJBdns would not respond to queries for zones
it wasn't authoritative for.  Makes sense.  The result was my probing code
had to sit out a time-out for the server before deciding it wasn't going
to answer.  And since the server didn't answer, I was unable to mark the
run of zones as "lame" because a lame server is only a server that answers
negatively (for the zone's SOA).  (I did parallelize this, but all threads
quickly hung on the servers that were timing out.)
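The distinction drawn above (an answer that is not authoritative means
"lame," while silence means only "unknown") can be sketched in Python.
This is an illustration under stated assumptions, not the original probing
code: the function names, the verdict strings, and the 3-second timeout
are all hypothetical, and the sketch skips details such as checking the
reply's message ID.

```python
import socket
import struct

def build_soa_query(zone, query_id=0x1234):
    """Encode a minimal DNS query for the zone's SOA record (no EDNS)."""
    # Header: ID, flags (all zero, so RD=0), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.rstrip(".").split(".") if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 6, 1)  # QTYPE=SOA, QCLASS=IN

def classify_response(reply):
    """Map a raw reply (or None for no reply) to a hypothetical verdict."""
    if reply is None:
        return "unresponsive"  # silence: cannot be marked lame
    if len(reply) < 12:
        return "malformed"
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    aa = bool(flags & 0x0400)  # Authoritative Answer bit
    rcode = flags & 0x000F     # 0 means NOERROR
    if aa and rcode == 0 and ancount > 0:
        return "authoritative"
    return "lame"              # it answered, but not authoritatively

def probe(server_ip, zone, timeout=3.0):
    """Send one SOA query over UDP and classify whatever comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_soa_query(zone), (server_ip, 53))
            reply, _addr = sock.recvfrom(4096)
        except OSError:        # includes timeouts; treated here as no reply
            reply = None
    return classify_response(reply)
```

Only the "unresponsive" branch costs a full timeout per query, which is
what made a registry-wide run so slow.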

No response is not a conclusive indicator of a remote device's health.
It could have responded and the return path dropped the message segment.
I ran a side experiment, repeating queries to unresponsive servers.  Of
the ones that eventually answered, 91% answered the first query, 8.9%
answered the second, and the remaining 0.1% formed a long tail that
went past 10 retries.  So no response to 2, 3, or 4 queries isn't really
definitive.
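As a small worked example of those figures, the Python below computes how
many of the eventually-answering servers are still silent after a given
number of queries, and what the waiting costs.  The 3-second timeout used
in the usage note is an assumed value, not one from the experiment, and
the function names are hypothetical.  The spread of the 0.1% tail across
retries 3 and later was not reported, so beyond two attempts the result is
only an upper bound.

```python
# Share of eventually-answering servers by the attempt that first drew a
# reply, in parts per thousand (91% and 8.9% from the text above).
FIRST_REPLY_PER_MILLE = {1: 910, 2: 89}
TAIL_PER_MILLE = 1000 - sum(FIRST_REPLY_PER_MILLE.values())  # the 0.1% tail

def still_silent_per_mille(attempts):
    """Servers (per thousand) that will answer eventually but have not yet.

    Exact for 1 or 2 attempts; an upper bound beyond that, because the
    distribution inside the tail is unknown.
    """
    answered = sum(share for attempt, share in FIRST_REPLY_PER_MILLE.items()
                   if attempt <= attempts)
    return 1000 - answered

def worst_case_wait(attempts, timeout_s):
    """Seconds spent on one server that never replies within `attempts`."""
    return attempts * timeout_s
```

For instance, still_silent_per_mille(1) is 90 (9%), still_silent_per_mille(2)
is 1 (0.1%), and with an assumed 3-second timeout worst_case_wait(2, 3.0)
is 6.0 seconds per silent server.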

Having to deal with this, the probe took a lot longer to run than it
should have.   For that reason, trying to clean up the registrations was
considered untenable.

>If a name server doesn't respond, look at the machine to see why that
>is.  If it's not your machine, get in touch with the people that run
>it for you.

I do agree with this, and keep it discreet.  Name and shame isn't the
answer unless you (the reporter) just want to show off.

>Why should the protocol deal with your mishaps?  To me it seems that the
>time of "be liberal in what you accept" is long gone, with all due
>respect to Jon Postel.  The new motto should be "Be conservative in what
>you send and in what you receive".

The protocol needs to deal with individuals' mishaps.  If it doesn't, the
protocol will be unstable, be useless and at worst wreak havoc.

When developing a new protocol, I've repeatedly heard this "we shouldn't
deal with that message, no one should be sending it."  My recommendation
is "define a protocol based on how you react to stimuli.  Always know how
you are to react to any received message."  If that fails and there's some
unanticipated message that will cause your implementation to abort, some
miscreant will send that message.

The DNS has a problem: it is in its management of the transport layer.
UDP suits the needs of the protocol nicely but UDP is a bad drug mule.
TCP is overkill, and it faces longstanding discrimination (e.g., firewall
rules don't expect it).

Another problem is whether the DNS is the victim of the attacks or is an
unwitting accomplice.  DNSSEC armors the data carried in the messages,
anycast, over-provisioning, and adaptive responses fight back against DDoS
of the service.  These approaches have come to be assets to anti-social
elements targeting someone else. Our approach to defining the DNS has made
it into a weapon itself.

I've always been skeptical of RRL, but the idea of more intelligently
deciding when to respond or not has a track record of being beneficial.
Architecturally, it can be argued on paper that it assists in cache
poisoning - and I want to stress "on paper."  In the real world, it is
relied upon more and more to manage the transport layer.

Whither BCP 38 - a useful tool that is not used isn't.  Enlightenment would
be the answer, if all operators are willing (and, btw, in that case
"security" wouldn't be a skill posted in LinkedIn).  Without any leverage
based on fear and greed (the two motivating emotions, as I've learned from
sales people), the DNS has got to take matters into its own hands.