[dns-operations] DNS error reporting (was: DNS at FOSDEM 2016)

Tue Feb 9 21:47:04 UTC 2016

Hi Evan, Jim,

On 9 February 2016 at 20:46, Jim Reid <jim at rfc1035.com> wrote:
>
>> On 9 Feb 2016, at 11:21, Marek Vavruša <marek at vavrusa.com> wrote:
>>
>> - Each code SHOULD have a free-form explanation on what actually
>> happened. This is what humans want to see.
>
> If so, this is something DNS tools should do. It doesn’t belong in the protocol or server-specific code IMO. ie Whatever is returned on the wire gets turned into human-friendly text by tools which are specifically designed to do that job.
>
> That said, it would be nice if there was a discrete error code to signal validation failures. But that train left the station a long time ago and it’s unlikely to ever return.
>
>> - EDNS is okay, but the drawback is that diagnostic tools like
>> kdig/drill/dig or anything similar won't be able to interpret it, and
>> it probably won't survive forwarder hops.
>
> So develop diagnostic tools which can do that. Simple matter of programming. Says he who can’t code and is just hand-waving. :-)

I will, eventually ;-)

>> I'm inclined towards
>> RFC4892 TXT in reserved name reporting (chaos class or not), as it's
>> simple, existing tools will be able to parse and display it and it's
>> flexible enough. Something like:
>>
>> error.server. CH TXT "701"
>> error.server. CH TXT "Signature of abcd.is expired 30 days ago."
>
> Hmmm. Wouldn’t those (on the fly?) TXT records need to be signed and validated somehow?
>
>> I'll probably be implementing something along these lines in Knot Resolver.
>
> I’m not sure it’s a good idea to pile more stuff into the CHAOSNET class. It’s another step down a slippery slope: developing a “registry” of owner-names for error conditions (some of which will be implementation-specific and/or local policy-specific), perhaps defining special RRtypes and their semantics, etc, etc. How would clients know which owner-names to look up, or would these TXT-like records be automagically appended to the Additional Section of a reply?

Perhaps yes, I was thinking of adding it to the Additional section of
the final resolver answer.
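
A minimal sketch of how such records could be rendered, assuming the
reserved owner name error.server. and the numeric code 701 from the
example quoted above. The helper names (txt_rdata, error_records) are
hypothetical, and this is not Knot Resolver's implementation. It only
illustrates that a numeric code plus a free-form explanation fit into
ordinary TXT RDATA, split into character-strings of at most 255 bytes:

```python
def txt_rdata(text: str) -> str:
    """Render text as TXT RDATA in presentation format.

    Each TXT character-string holds at most 255 bytes, so longer
    text is split into several quoted strings.
    """
    raw = text.encode("utf-8")
    chunks = [raw[i:i + 255] for i in range(0, len(raw), 255)] or [b""]
    rendered = []
    for chunk in chunks:
        parts = []
        for byte in chunk:
            if byte in (0x22, 0x5C):
                # Backslash-escape '"' and '\'.
                parts.append("\\" + chr(byte))
            elif 0x20 <= byte < 0x7F:
                parts.append(chr(byte))
            else:
                # Non-printable or non-ASCII byte: decimal \DDD escape.
                parts.append("\\%03d" % byte)
        rendered.append('"' + "".join(parts) + '"')
    return " ".join(rendered)


def error_records(code: int, explanation: str,
                  owner: str = "error.server.",  # assumed reserved name
                  rrclass: str = "CH") -> list:
    """Return two TXT records: the numeric code and the explanation."""
    return [
        f"{owner} {rrclass} TXT {txt_rdata(str(code))}",
        f"{owner} {rrclass} TXT {txt_rdata(explanation)}",
    ]


if __name__ == "__main__":
    for rr in error_records(701, "Signature of abcd.is expired 30 days ago."):
        print(rr)
```

Since the output is plain presentation format, existing tools that
already display TXT records would show it without changes.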

> Surely the RDATA would need to be standardised too so that implementations behave consistently when they handle these TXT records? If these messages were left to the discretion of developers, expect inconsistencies and/or ambiguous/unclear text depending on which server answered or on its local configuration. For instance: “This answer didn’t validate. Don’t use it.”, “This answer didn’t validate. Do you feel lucky?”, “This answer didn’t validate. So you got no useful response data from me apart from this message.”. Then there’s the potential for American English v British English to add more scope for confusion. Or maybe having to feed Knot’s chatty answers in Czech (say) into Google Translate and hoping for the best.
>
> There’s already a very wide spectrum of supposedly helpful text appended to the numeric error codes in SMTP and quite probably in HTTP too. I think it would be unfortunate to repeat that mistake in DNS.

I think I have explained myself poorly. There are several channels
where better error reporting could be beneficial.

1. Auth <---> User

What I proposed is not for this case. This would be better
served by the mentioned ESD or simply by adding more eRCODEs, but
honestly I'm not very interested, because a) this is somewhat niche, as
only experts make these queries, and b) each server comes with its own
debugging/introspection facilities.

2. Auth <---> Resolver

Again, not something I'm very interested in. From a resolver's
perspective, I want to get as little information in the answer as
possible, as I'm likely going to scrub it anyway. I also think
the idea of a DNS auth being a "zone file + daemon" is not going to
fly anymore. Most of them are driven by some kind of database, some
provide generated answers, some are not real auths at all, some are
routers/filters, some are purpose-specific, ... The point is that
trying to fit numerous possible failures into codes is going to be a
mess anyway.

3. Resolver <---> User

This is interesting because a) it affects the end-users directly, and
b) the number of distinct failure reasons is limited. This is also
where you can afford to be more verbose, as the resolver might know
more about the failure. DNSSEC doesn't apply here, since a trusted
channel between consumer and resolver is assumed.
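
On the consumer side, a sketch of how a stub or application could pick
the code and the explanation out of the error.server. TXT records from
the example quoted earlier. It assumes the TXT strings arrive already
decoded (e.g. taken from the Additional section) and in no particular
order. The names ResolverError and parse_error_txt are hypothetical:

```python
from typing import Iterable, NamedTuple, Optional


class ResolverError(NamedTuple):
    code: Optional[int]          # machine-readable part, e.g. 701
    explanation: Optional[str]   # free-form text meant for humans


def parse_error_txt(strings: Iterable[str]) -> ResolverError:
    """Split TXT strings into a numeric code and an explanation.

    The first all-digit ASCII string is taken as the code; the first
    other string is taken as the explanation. Anything else is ignored.
    """
    code = None
    explanation = None
    for s in strings:
        if code is None and s.isascii() and s.isdigit():
            code = int(s)
        elif explanation is None:
            explanation = s
    return ResolverError(code, explanation)


if __name__ == "__main__":
    err = parse_error_txt(
        ["Signature of abcd.is expired 30 days ago.", "701"]
    )
    print(err.code, err.explanation)
```

A program can branch on the code alone, while the explanation is only
passed through to whoever reads the logs.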

I've been thinking about this for a while and maybe the protocol
between consumer and resolver doesn't have to be DNS at all. Honestly,
I'm not overly fond of either current frugal or "next-gen" bulky APIs
to consume DNS data. Alas, I have no better idea right now. Maybe it
would be beneficial to talk to authors of mail servers, web browsers,
load balancers, workload systems and other DNS data consumers about
what they really want to get.

Marek