[dns-operations] Using IP_RECVERR/IPV6_RECVERR on resolver client sockets
paul at nohats.ca
Wed Jan 9 00:04:16 UTC 2019
On Tue, 8 Jan 2019, Florian Weimer wrote:
> Subject: [dns-operations] Using IP_RECVERR/IPV6_RECVERR on resolver client
> Someone noticed that the Linux kernel only puts some networking-related
> errors on the socket error queue for connected UDP sockets:
Here, let me show you my 20-year-old scar:
/* Process any message on the MSG_ERRQUEUE
* This information is generated because of the IP_RECVERR socket option.
* The API is sparsely documented, and may be LINUX-only, and only on
* fairly recent versions at that (hence the conditional compilation).
* - ip(7) describes IP_RECVERR
* - recvmsg(2) describes MSG_ERRQUEUE
* - readv(2) describes iovec
* - cmsg(3) describes how to process auxiliary messages
* ??? we should link this message with one we've sent
* so that the diagnostic can refer to that negotiation.
* ??? how long can the message be?
* ??? poll(2) has a very incomplete description of the POLL* events.
* We assume that POLLIN, POLLOUT, and POLLERR are all we need to deal with
* and that POLLERR will be on iff there is a MSG_ERRQUEUE message.
* We have to code around a couple of surprises:
* - Select can say that a socket is ready to read from, and
* yet a read will hang. It turns out that a message available on the
* MSG_ERRQUEUE will cause select to say something is pending, but
* a normal read will hang. poll(2) can tell when a MSG_ERRQUEUE
* message is pending.
* This is dealt with by calling check_msg_errqueue after select
* has indicated that there is something to read, but before the
* read is performed. check_msg_errqueue will return TRUE if there
* is something left to read.
* - A write to a socket may fail because there is a pending MSG_ERRQUEUE
* message, without there being anything wrong with the write. This
* makes for confusing diagnostics.
* To avoid this, we call check_msg_errqueue before a write. True,
* there is a race condition (a MSG_ERRQUEUE message might arrive
* between the check and the write), but we should eliminate many
* of the problematic events. To narrow the window, the poll(2)
* will wait until an event happens (in the case of a write,
* POLLOUT; this should be benign for POLLIN).
*/