[dns-operations] Possible issue with Office365 DNS?
Paul.Harris at nominet.uk
Thu Oct 29 12:23:13 UTC 2020
Thanks Jeroen, some useful info there.
Just wondering if anyone from Microsoft is available to have a chat about this issue?
On 28/10/2020, 15:00, "Jeroen Massar" <jeroen at massar.ch> wrote:
One issue there is the CNAME chain. Not all recursors will actually follow that many, just in case there is a loop (e.g. PowerDNS limits it to 10).
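To make the hop limit concrete, here is a minimal Python sketch of how a resolver might cap CNAME following. It is not how any particular recursor implements it. The `RECORDS` table, the `resolve` function and the `ChainTooLong` exception are hypothetical names, the chain is copied from the answer quoted further down, and the final address is a documentation placeholder rather than real data.

```python
# Hypothetical in-memory data: owner name -> (rrtype, rdata list).
# The CNAME chain is the one quoted later in this thread; 192.0.2.1 is a
# documentation placeholder address, not a real Office 365 endpoint.
RECORDS = {
    "outlook.live.com.": ("CNAME", ["live.ms-acdc.office.com."]),
    "live.ms-acdc.office.com.": ("CNAME", ["outlook.ha.office365.com."]),
    "outlook.ha.office365.com.": ("CNAME", ["outlook.ms-acdc.office.com."]),
    "outlook.ms-acdc.office.com.": ("CNAME", ["lhr-efz.ms-acdc.office.com."]),
    "lhr-efz.ms-acdc.office.com.": ("CNAME", ["outlook-fs.office.com."]),
    "outlook-fs.office.com.": ("CNAME", ["outlook-exo.trafficmanager.net."]),
    "outlook-exo.trafficmanager.net.": ("CNAME", ["lhr-mvp.trafficmanager.net."]),
    "lhr-mvp.trafficmanager.net.": ("A", ["192.0.2.1"]),
}

MAX_CNAME_HOPS = 10  # assumed limit, matching the figure mentioned above


class ChainTooLong(Exception):
    """Raised when a CNAME chain exceeds the configured hop limit."""


def resolve(name, records, max_hops=MAX_CNAME_HOPS):
    """Follow CNAMEs starting at name; return (aliases followed, final rdata)."""
    chain = []
    current = name.lower()  # DNS names compare case-insensitively
    while True:
        if current not in records:
            raise LookupError(f"no data for {current}")
        rrtype, rdata = records[current]
        if rrtype != "CNAME":
            return chain, rdata
        if len(chain) >= max_hops:
            # A loop and a merely long chain look the same from here,
            # which is why resolvers give up at a fixed depth.
            raise ChainTooLong(f"gave up after {max_hops} CNAME hops at {current}")
        chain.append(current)
        current = rdata[0].lower()


chain, addresses = resolve("outlook.live.com.", RECORDS)
print(len(chain), addresses)
```

The quoted chain has seven aliases, so it fits under a limit of 10, but a resolver with a tighter limit (say 5) would fail it.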
Also note https://link.springer.com/chapter/10.1007/978-3-030-00470-5_7
Seems somebody at Microsoft has the CNAME bug, deploys them for everything left and right and does not know of the negative side effects that CNAMEs have.
(especially in labels you control, there is no reason to bother answering with a CNAME unless one maybe sets a low TTL for that and not the actual answer...)
For instance, they still want folks to configure autodiscover.outlook.com as the target of an SRV record for _autodiscover._tcp.<domain>
but... that name (autodiscover.outlook.com) is a CNAME chain too. And guess what, SRV targets must not point to CNAMEs...
As per https://www.ietf.org/rfc/rfc2782.txt page 3:
The domain name of the target host. There MUST be one or more address records for this name, the name MUST NOT be an alias (in the sense of RFC 1034 or RFC 2181).
https://testconnectivity.microsoft.com/ check happily declares:
Attempting to retrieve Autodiscover CNAME record for domain 'example.com'. The Autodiscover CNAME record was successfully retrieved from DNS.
The Autodiscover CNAME record 'autodiscover.outlook.com' is properly configured for your domain 'example.com' in Office 365.
Which, from an RFC perspective, is impossible ;)
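The RFC 2782 target rule quoted above can be written as a small check. This is a rough Python sketch over a hypothetical record table. `RECORDS` and `check_srv_target` are illustrative names, the CNAME target `autodiscover.example.net.` and the host `mail.example.com.` are placeholders, and only the fact that autodiscover.outlook.com is an alias comes from this thread.

```python
# Hypothetical data: owner name -> {rrtype: [rdata, ...]}.
# The CNAME target and the addresses below are placeholders for illustration.
RECORDS = {
    "autodiscover.outlook.com.": {"CNAME": ["autodiscover.example.net."]},
    "mail.example.com.": {"A": ["192.0.2.10"], "AAAA": ["2001:db8::10"]},
}


def check_srv_target(target, records):
    """Return a list of problems with an SRV target under the RFC 2782 rule.

    The rule has two parts: the target MUST have one or more address
    records, and it MUST NOT be an alias.
    """
    problems = []
    rrsets = records.get(target.lower(), {})
    if "CNAME" in rrsets:
        problems.append("target is an alias (CNAME)")
    if "A" not in rrsets and "AAAA" not in rrsets:
        problems.append("target has no address records")
    return problems


for name in ("autodiscover.outlook.com.", "mail.example.com."):
    print(name, check_srv_target(name, RECORDS) or "ok")
```

A name that owns a CNAME cannot also own A or AAAA records, so an aliased target fails both halves of the rule at once.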
> On 20201028, at 14:30, Paul Harris <Paul.Harris at nominet.uk> wrote:
> Hi All
> One of the customers of a recursive service we operate has reported issues with their users accessing Office 365. Their internal investigation has indicated that the problem probably lies with resolving outlook.live.com against our service.
> They have reported that when their local resolvers query that domain, the answer they get back varies. So they are sometimes getting this CNAME chain:
> ;; ANSWER SECTION:
> outlook.live.com. 141 IN CNAME live.ms-acdc.office.com.
> live.ms-acdc.office.com. 1729 IN CNAME outlook.ha.office365.com.
> outlook.ha.office365.com. 41 IN CNAME outlook.ms-acdc.office.com.
> outlook.ms-acdc.office.com. 21 IN CNAME LHR-efz.Ms-acDC.office.com.
> LHR-efz.ms-acdc.office.com. 36 IN CNAME outlook-fs.office.com.
> outlook-fs.office.com. 14 IN CNAME outlook-exo.trafficmanager.net.
> outlook-exo.trafficmanager.net. 28 IN CNAME lhr-mvp.trafficmanager.net.
> lhr-mvp.trafficmanager.net. 35 IN A 18.104.22.168
> lhr-mvp.trafficmanager.net. 35 IN A 22.214.171.124
> lhr-mvp.trafficmanager.net. 35 IN A 126.96.36.199
> lhr-mvp.trafficmanager.net. 35 IN A 188.8.131.52
> lhr-mvp.trafficmanager.net. 35 IN A 184.108.40.206
> lhr-mvp.trafficmanager.net. 35 IN A 220.127.116.11
> lhr-mvp.trafficmanager.net. 35 IN A 18.104.22.168
> lhr-mvp.trafficmanager.net. 35 IN A 22.214.171.124
> In the above output, the name LHR-efz.ms-acdc.office.com (5th in the chain) is a CNAME. However, sometimes it is returned with an A rrset instead:
> ;; ANSWER SECTION:
> outlook.live.com. 32 IN CNAME live.ms-acdc.office.com.
> live.ms-acdc.office.com. 1727 IN CNAME outlook.ha.office365.com.
> outlook.ha.office365.com. 39 IN CNAME outlook.ms-acdc.office.com.
> outlook.ms-acdc.office.com. 19 IN CNAME LHR-efz.ms-acdc.office.com.
> LHR-efz.ms-acdc.office.com. 1 IN A 126.96.36.199
> LHR-efz.ms-acdc.office.com. 1 IN A 188.8.131.52
> LHR-efz.ms-acdc.office.com. 1 IN A 184.108.40.206
> LHR-efz.ms-acdc.office.com. 1 IN A 220.127.116.11
> It’s not unusual for a GSLB to reply with different IPs every time, but I have not seen it before where you sometimes get a CNAME and sometimes A/AAAA. I’m not totally sure why this would be problematic, but one theory that we are tossing around is that the customer’s caching resolvers are attempting to prefetch the LHR-efz.ms-acdc.office.com record, receiving a NODATA response and therefore responding to the client with NXDOMAIN. I am wondering what people’s thoughts are on whether this theory has merit?
> I was able to replicate the issue up until an hour or two ago, but it is now responding consistently with an A rrset, so it’s possible that the problem has gone away.
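One way to spot the CNAME-versus-A flip described above is to compare two captured answer sections name by name. This is a minimal Python sketch under stated assumptions: `parse_answer` and `type_flips` are hypothetical helpers, the input is dig-style text trimmed to the two relevant names, and the A record address is a documentation placeholder.

```python
# Two abbreviated dig-style answers based on the output quoted above.
# The A record address is a placeholder (192.0.2.0/24), not captured data.
FIRST = """
;; ANSWER SECTION:
outlook.ms-acdc.office.com. 21 IN CNAME LHR-efz.Ms-acDC.office.com.
LHR-efz.ms-acdc.office.com. 36 IN CNAME outlook-fs.office.com.
"""

SECOND = """
;; ANSWER SECTION:
outlook.ms-acdc.office.com. 19 IN CNAME LHR-efz.ms-acdc.office.com.
LHR-efz.ms-acdc.office.com. 1 IN A 192.0.2.1
"""


def parse_answer(text):
    """Map each owner name (lower-cased) to the set of rrtypes seen for it."""
    types = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip comment lines and anything that is not "name ttl class type rdata".
        if line.startswith(";") or len(fields) < 5:
            continue
        owner, _ttl, _rrclass, rrtype = fields[:4]
        types.setdefault(owner.lower(), set()).add(rrtype)
    return types


def type_flips(first, second):
    """Return names present in both answers whose rrtype sets differ."""
    a, b = parse_answer(first), parse_answer(second)
    return {name: (a[name], b[name]) for name in a.keys() & b.keys() if a[name] != b[name]}


for name, (before, after) in type_flips(FIRST, SECOND).items():
    print(name, sorted(before), "->", sorted(after))
```

Run against repeated captures, a check like this would show whether the flip is still happening, independently of which addresses the GSLB hands out.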
> Paul Harris
> Senior DNS and Network Engineer
> DD: +44 (0)1865 981759 E: paul.harris at nominet.uk
> Minerva House, Edmund Halley Road, Oxford Science Park, Oxford, OX4 4DQ, United Kingdom