In light of events this week, the Solutions Engineering team here at Dyn has had a great number of conversations about deploying a multi-vendor DNS strategy, which is a very sensible thing to do in our current era of DNS-directed DDoS attacks. Eventually, such attacks will be directed at your DNS provider, your own network, or both. In the interest of time, this article covers a lot of ground quickly to give the technically inclined a full 360-degree view of multi-vendor DNS relationships. We strongly advise contacting your local Dyn Solutions Engineer to develop the solution that works best for you.
Having multiple providers is a great idea: it provides redundancy if one of them has issues, and it can potentially increase performance in focused markets. As with any layer of your network infrastructure - managed well, two is better than one.
A common misconception about Primary-Secondary DNS is that traffic goes only to the Primary DNS until it goes down, then only to the Secondary DNS until the Primary comes back up. This is false. What determines which DNS queries reach your primary or secondary provider? The short answer is that users do not perform queries themselves; they instead ask their Recursive DNS provider, usually their local ISP, to make the query. The Recursive then queries the Authoritative DNS provider, which the domain has charged with responding for all the records within the zone.
Normally, a single authoritative provider will supply multiple NS records for their network, typically 2-4. If you have two DNS providers, your delegation will contain anywhere from 4 to 8 NS records. For example, wayfair.com, which uses Dyn and Akamai together, shows 8 NS records:
$ dig ns wayfair.com +short
ns2.p01.dynect.net.
ns3.p01.dynect.net.
a13-67.akam.net.
ns4.p01.dynect.net.
ns1.p01.dynect.net.
a1-100.akam.net.
a22-65.akam.net.
a16-64.akam.net.
So how do the recursives determine which nameserver to send traffic to? Who is primary and who is secondary? From the Recursive’s perspective, there is no distinction, only nameservers in the delegation. Most recursives today keep a running performance assessment of each nameserver and send queries preferentially to the fastest ones; this process is called RTT banding.
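To make the idea concrete, here is a minimal sketch of RTT banding. It is not how any particular resolver is implemented: the smoothing factor, the 50 ms band width, the RTT values, and the function names `update_srtt` and `pick_nameserver` are all assumptions chosen for illustration. Only the nameserver names come from the delegation above.

```python
import random

def update_srtt(srtt, sample_ms, alpha=0.3):
    """Blend a new RTT sample into the running (smoothed) estimate."""
    if srtt is None:
        return sample_ms
    return (1 - alpha) * srtt + alpha * sample_ms

def pick_nameserver(srtt_by_ns, band_ms=50.0, rng=random):
    """Pick at random among nameservers within band_ms of the fastest one."""
    best = min(srtt_by_ns.values())
    in_band = sorted(ns for ns, rtt in srtt_by_ns.items() if rtt - best <= band_ms)
    return rng.choice(in_band)

# Hypothetical smoothed RTTs (ms) as seen from one recursive resolver.
srtt = {
    "ns1.p01.dynect.net.": 18.0,
    "ns2.p01.dynect.net.": 22.0,
    "a1-100.akam.net.": 35.0,
    "a13-67.akam.net.": 140.0,
}

# One slow sample nudges the estimate but does not replace it.
srtt["a1-100.akam.net."] = update_srtt(srtt["a1-100.akam.net."], 95.0)

# Count where 1000 queries from this resolver would land.
counts = {ns: 0 for ns in srtt}
rng = random.Random(0)
for _ in range(1000):
    counts[pick_nameserver(srtt, rng=rng)] += 1
```

In this sketch the nameserver that sits well outside the band receives no queries from this resolver, while the rest share the load regardless of which provider operates them. A resolver in a different network would see different RTTs and could end up with a different preferred set.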
The TL;DR here is that the nameservers with the fastest responses are (generally) the ones the recursive prefers for subsequent queries. This is exactly how something like a China Network deployment might look, with an asymmetric delegation in which some nameservers are global anycast while others are unicast in China. This means some nameservers are within China and some are outside for redundancy. Yes, individual nameservers will perform differently depending on where the query comes from, but over time each recursive will prefer the ones that are best for it, which ultimately improves resolution for users as a whole. It’s best not to get carried away with this concept, however, as adding nameservers to a delegation without considering the overall global impact can really hurt users. This is exactly why we recommend only adding the China nameservers on a China-specific domain (example.com.cn), to reduce the impact on global users.
Ok, so now you have multiple nameservers with your zone information. Great. But if you update your records at one provider and not the other, your queries may receive the wrong responses. This is why it is important to keep zones in sync between providers, as any nameserver can get a query at any time. Having a secondary DNS solution in place allows you to easily keep your zones in sync.
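As a rough illustration of what "out of sync" means, the sketch below compares two providers' copies of a zone and reports the records that differ. The record data, the dictionary layout, and the helper name `zone_drift` are assumptions for the example; in practice you would populate these from each provider's export or API.

```python
def zone_drift(zone_a, zone_b):
    """Return the records that differ between two copies of a zone.

    Each zone maps (name, record type) -> set of rdata strings.
    """
    drift = {}
    for key in sorted(set(zone_a) | set(zone_b)):
        a = zone_a.get(key, set())
        b = zone_b.get(key, set())
        if a != b:
            drift[key] = {"only_a": sorted(a - b), "only_b": sorted(b - a)}
    return drift

# Hypothetical zone contents: www was updated at provider A only.
provider_a = {
    ("www.example.com.", "A"): {"192.0.2.10"},
    ("mail.example.com.", "A"): {"192.0.2.25"},
}
provider_b = {
    ("www.example.com.", "A"): {"192.0.2.11"},  # provider B was not updated
    ("mail.example.com.", "A"): {"192.0.2.25"},
}

drift = zone_drift(provider_a, provider_b)
for (name, rtype), diff in drift.items():
    print(name, rtype, diff)
```

Any entry in `drift` is a record for which users may get different answers depending on which nameserver their recursive happens to ask.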
To keep zones in sync across nameservers and providers, our DNS forefathers developed Secondary DNS, in which a Primary nameserver, which has read/write access to the zone, provides the DNS information for the zone to a Secondary nameserver, which has read-only access. The information is sent using AXFRs and IXFRs, with SOA refreshes and NOTIFYs to cue the sync mechanism. Dyn supports all these tools, in both directions, and you should press your other vendors to do so as well for the best experience.
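The decision a secondary makes after a NOTIFY or an SOA refresh comes down to comparing serial numbers: if the primary's SOA serial is newer, the secondary requests a transfer. The sketch below shows that comparison using the 32-bit serial number arithmetic from RFC 1982, which handles wraparound. The function names and the sample serials are assumptions for illustration, and no network traffic is involved.

```python
SERIAL_BITS = 32
HALF = 2 ** (SERIAL_BITS - 1)

def serial_gt(s1, s2):
    """True if serial s1 is newer than s2 under RFC 1982 arithmetic."""
    return (s1 < s2 and s2 - s1 > HALF) or (s1 > s2 and s1 - s2 < HALF)

def needs_transfer(primary_serial, secondary_serial):
    """A secondary should pull the zone only if the primary is ahead."""
    return serial_gt(primary_serial, secondary_serial)

# Hypothetical date-style serials (YYYYMMDDnn).
print(needs_transfer(2016102102, 2016102101))  # primary is ahead
print(needs_transfer(2016102101, 2016102101))  # already in sync
print(needs_transfer(5, 4294967290))           # serial wrapped around
```

If the serial on the primary is not incremented when records change, the secondary has no reason to transfer, which is a common cause of zones silently drifting apart.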
Great! Now our primary can notify the secondary any time the zone is changed, keeping it in sync. This is good, but it really only works with flat DNS records like the ones you learned about in CS 101. Today, many DNS providers supply services like Traffic Director which can do neat things like automatic failover and geographic targeting. These are implemented in special markup, which at best means your secondary will receive a flat record with no fanciness, and at worst means a query to that nameserver will get no useful response.
Takeaway - secondary is fantastic, but not with advanced features.
The common solution, which has been in popular conversation this week, is to deploy what we call a management zone to house your advanced features. You then CNAME from each asset that should use an advanced feature to the host in the management zone that contains the advanced feature configuration. If this process seems familiar, it’s probably because this is exactly what cloud solutions, like CDNs or Cloud Hosting, do on their side. Here, you are just building that configuration in your own domain portfolio.
To build this, all you will need is a dedicated domain to house the services outside of the zones that you want to be standard Primary-Secondary, like example-mgt.com. This would then have hosts which contain your advanced features such as Traffic Director. Finally, from the original zone, you put a CNAME to the host with the advanced service, so www.example.com -> td1.example-mgt.com.
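A small sketch of how a lookup flows through this layout may help. The original zone holds only flat records and CNAMEs, while the management zone holds the advanced behavior. Everything here is hypothetical: the IP addresses, the `td1_failover` function standing in for a service like Traffic Director, and the `resolve` helper are assumptions and do not reflect any provider's actual implementation.

```python
# Original zone: only flat records and CNAMEs, so it is safe to
# replicate to a secondary with ordinary zone transfers.
EXAMPLE_COM = {
    "www.example.com.": ("CNAME", "td1.example-mgt.com."),
    "mail.example.com.": ("A", "192.0.2.25"),
}

def td1_failover(healthy):
    """Hypothetical advanced service: the first healthy endpoint wins."""
    for ip in ("192.0.2.10", "198.51.100.10"):
        if ip in healthy:
            return ip
    return "192.0.2.10"  # fall back to the first endpoint if none look healthy

# Management zone: hosts whose answers are computed, not static.
MANAGEMENT_ZONE = {"td1.example-mgt.com.": td1_failover}

def resolve(name, healthy, max_hops=8):
    """Follow CNAMEs from the original zone into the management zone."""
    chain = [name]
    for _ in range(max_hops):
        if name in MANAGEMENT_ZONE:
            return chain, MANAGEMENT_ZONE[name](healthy)
        rtype, value = EXAMPLE_COM[name]
        if rtype != "CNAME":
            return chain, value
        name = value
        chain.append(name)
    raise RuntimeError("CNAME chain too long")

print(resolve("www.example.com.", {"192.0.2.10", "198.51.100.10"}))
print(resolve("www.example.com.", {"198.51.100.10"}))  # first endpoint is down
```

The point of the split is that `EXAMPLE_COM` never changes when the failover logic does, so both providers can serve identical copies of it.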
Finally, with all the advanced features out of the original zone and only CNAMEs remaining, you can easily configure the Primary-Secondary using plain-Jane AXFR/IXFR/NOTIFY. This is illustrated end-to-end below, in full whiteboard glory:
Hey, extraordinary times call for extraordinary measures. If all you have on the road is a whiteboard, then a whiteboard you shall use!
Takeaway -- if you move advanced features out of zones, you can get back to primary-secondary and make multi-DNS easier.
If you wanted to, you could just stop there, but this does, of course, mean that all your advanced features are on one provider, which creates some risk. Still, the majority of your records are well secured, and in a pinch you can change the CNAMEs to flat records to ride out a storm.
Another option is to investigate a Primary-Primary setup for the management zone. This is when you keep a second Primary zone on the other provider, which uses their version of advanced features for their version of td1.example-mgt.com. The two would then need to be kept in sync, which could be done manually or via the API. While you could certainly do this for the entire original zone (and people do), it is vastly more effort than keeping the scope to a management zone. Complexity increases the odds that something is overlooked. That smells like risk to me.
Completely unrelated to making multi-DNS easier, management zones generally bring two nice bonuses. First, the operational plan for all your advanced services is in a single domain, which means your entire team is aligned on where everything is. Today I spoke with an organization which has dozens of region-based versions of their zones with local TLDs. Using a management zone would simplify this even if no second DNS provider is deployed. Second, once you have all the services in a single zone, you can restrict access to that domain so your interns go nowhere near it! On Dyn, you can restrict permissions by CRUD on every type of service, but this isn’t universal, so keeping things central helps accommodate other providers that might come or go over time.
Virtually every organization with an internet presence has probably met to talk about the right decision for zone management. If they haven’t, they should -- this threat is real, and it’s here to stay. Dyn was able to come back in a matter of hours, but I have talked to brands which endured attacks on their own infrastructure for days. While a multi-DNS strategy has a few moving pieces, with patience and planning each attack can be conquered. Lastly, as I have seen personally, there are some brilliant folks both here at Dyn and across this industry ready to help.