You are chewing through your internal resolver query logs and reviewing your authoritative DNS logs, on your way to understanding what is requesting your domain and what domains the systems under your supervision are requesting from the DNS. Depending on the resolvers you use or how you collect your DNS query data, you might be wondering why some of the query names look garbled. Who is going to GoOgLe.CoM? Surely no one is mistyping all of these domains on purpose! So what is going on here?
For traditional domain names, the DNS matches names without regard to case. This is why, when you type Google.com into your web browser or code your application to make API calls to WhatsApp.com, both of these names resolve without issue. Back in March of 2008, a draft was put forward as part of the DNS Operations working group to increase DNS security. The goal of the draft, “Use of Bit 0x20 in DNS Labels to Improve Transaction Identity”, was to increase entropy beyond source port randomization by randomizing the case of the question name. So instead of always asking for www.dyn.com, requests might be made for WWW.DYN.COM, wWw.DyN.CoM, www.DYN.CoM, etc. The longer the domain name, the more entropy is possible! But why do we need more entropy?
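Before answering that, here is a minimal Python sketch of the two ideas in play: names compare equal regardless of case, and every ASCII letter in a name doubles the number of possible spellings. The helper names `dns_names_equal` and `case_variants` are made up for this illustration and are not part of any DNS library.

```python
def dns_names_equal(a: str, b: str) -> bool:
    """Compare two ASCII domain names the way the DNS does: ignoring letter case."""
    # A trailing dot (the root label) is dropped so "dyn.com." equals "dyn.com".
    return a.rstrip(".").lower() == b.rstrip(".").lower()


def case_variants(qname: str) -> int:
    """Count the distinct case spellings of a name: two choices per ASCII letter."""
    letters = sum(1 for ch in qname if ch.isascii() and ch.isalpha())
    return 2 ** letters


print(dns_names_equal("wWw.DyN.CoM", "www.dyn.com"))  # True
print(case_variants("www.dyn.com"))                   # 512 (nine letters, 2**9)
```

Digits, hyphens, and dots carry no case, so only letters contribute to the count.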
2008 was a popular year for the DNS, as it was the year of the so-called “Kaminsky Attack”. This attack allowed an attacker to inject a phony answer to a DNS query. (It should be noted that it had been described by others long before; it was named after Kaminsky because he produced a compelling demonstration of it.) There was a lot of attention at the time; heck, it even made its way to the Wall Street Journal (warning: PayWall). The DNS has widespread deployment (it is part of practically every computer on the Internet), and making systemic changes to the entire distributed name resolution system is a challenge. The most impactful mitigations might be those which are incremental and require minimal change to the protocol and the supporting software ecosystem.
It is important to note that nameservers only accept responses to pending queries. The question for someone seeking to exploit such a system is “How does a nameserver know a response is expected, and what steps does it take to verify this response?” The response must arrive on the same port the query was sent from; otherwise it is dropped by the operating system’s networking stack. This consideration is what made Dan Bernstein’s suggestion of source port randomization so effective, as it added yet another thing an attacker needed to account for. Aside from the source port, the resolver currently relies on a Transaction ID (TXID), a 16-bit random number in the identification field, as one of the main means of verifying that a response is for a specific question. Given that a Pentium 100 was specced at being able to generate 100,000 guesses a second, sending a packet with every possible transaction ID is a trivial task for modern hardware.
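As a rough back-of-the-envelope sketch of those numbers, the Python below works out how long it would take to try every value at the quoted rate. The 16 bits assumed for the source port is an upper bound chosen for illustration (real resolvers use a smaller port range), and the calculation ignores the fact that an attacker only has until the genuine response arrives.

```python
TXID_BITS = 16                 # size of the DNS identification field
PORT_BITS = 16                 # assumption: an upper bound on source port randomness
GUESSES_PER_SECOND = 100_000   # the Pentium 100 figure quoted above


def search_space(bits: int) -> int:
    """Number of distinct values an attacker has to cover for `bits` random bits."""
    return 2 ** bits


def seconds_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Time to send one guess for every possible value at a constant rate."""
    return search_space(bits) / rate


print(search_space(TXID_BITS))                                     # 65536
print(f"{seconds_to_exhaust(TXID_BITS):.2f} s")                    # 0.66 s
print(f"{seconds_to_exhaust(TXID_BITS + PORT_BITS) / 3600:.1f} h")  # 11.9 h
```

Each extra random bit doubles the space, which is why even a handful of additional bits is attractive.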
One of the other steps in the process is “bailiwick checking”. If we want to find addresses associated with example.com, normally we first ask a recursive server, the distributed caching layer of the DNS. If the cache is empty, the recursive server will then ask the root nameservers, which can provide it with a referral to .com. The root hands back a list of nameservers for .com, and we then proceed to query the .com top-level domain (TLD) nameservers to find out where we can find example.com. The .com servers may hand back the NS records for the domain, ns1.example.com and ns2.example.com, which would leave things at a standstill: to reach those nameservers we would first need to resolve names under example.com. This stalemate is broken by glue records, which are required when you set the nameservers of a domain name to hostnames under the domain name itself. If the response were to also contain an A record for anythingbutexample.com, it would be ignored because it is outside of the domain in the question, or out of bailiwick.
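The bailiwick test itself can be sketched in a few lines of Python. This is a simplified illustration, not how any particular resolver implements it: the function name `in_bailiwick` is hypothetical, and real resolvers apply more nuanced rules about which records they will trust. The key detail is that the match has to fall on a label boundary, so a plain string suffix check is not enough.

```python
def normalize(name: str) -> str:
    """Lowercase a name and drop any trailing root dot."""
    return name.rstrip(".").lower()


def in_bailiwick(record_name: str, zone: str) -> bool:
    """Return True when record_name is the zone itself or a name underneath it."""
    record = normalize(record_name)
    zone = normalize(zone)
    if zone == "":
        return True  # the root zone contains every name
    # Requiring the leading dot keeps "anythingbutexample.com" out of "example.com".
    return record == zone or record.endswith("." + zone)


print(in_bailiwick("ns1.example.com", "example.com"))         # True (glue is acceptable)
print(in_bailiwick("anythingbutexample.com", "example.com"))  # False (ignored)
```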
Assuming the port matches, the response will be passed to the name resolution process, which will then verify that the question section in the reply matches the question that was originally asked. This is where use of bit 0x20 improves transaction identity, as someone seeking to spoof a response would need to know how the domain name case was randomized. This effectively encodes one random bit per ASCII letter, so the entropy added is directly proportional to the number of letters in the qname (each letter doubles the number of case patterns an attacker must guess). For example, writing 1 for an uppercase letter and 0 for a lowercase one:
WWW.DYN.COM 111 111 111
wWw.DyN.CoM 010 101 101
www.DYN.CoM 000 111 101
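A minimal Python sketch of the technique, under the assumption that the resolver remembers the exact spelling it sent and compares it byte for byte with the question echoed in the response. The function names here are invented for illustration and do not come from any resolver's code base.

```python
import secrets


def randomize_case(qname: str) -> str:
    """Flip a coin for each ASCII letter; leave digits, hyphens, and dots alone."""
    out = []
    for ch in qname:
        if ch.isascii() and ch.isalpha():
            ch = ch.upper() if secrets.randbits(1) else ch.lower()
        out.append(ch)
    return "".join(out)


def case_bits(qname: str) -> str:
    """Render each label as bits: 1 for an uppercase letter, 0 for a lowercase one."""
    return " ".join(
        "".join("1" if ch.isupper() else "0" for ch in label if ch.isalpha())
        for label in qname.split(".")
    )


def response_matches(sent_qname: str, echoed_qname: str) -> bool:
    """Case-sensitive comparison: a spoofer must reproduce the exact spelling."""
    return sent_qname == echoed_qname


sent = randomize_case("www.dyn.com")
print(sent, case_bits(sent))                    # output varies from run to run
print(case_bits("www.DYN.CoM"))                 # 000 111 101
print(response_matches(sent, sent))             # True
print(response_matches("www.DYN.CoM", "www.dyn.com"))  # False
```

The name still resolves normally because authoritative servers match it without regard to case; the scheme relies on them copying the question back with its case intact.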
After reading this, you may have a number of questions. How many DNS resolvers have implemented use of bit 0x20 in DNS labels? What percentage of recursive resolvers in the wild are making use of and verifying bit 0x20 in DNS labels? How many of your customers are using these resolvers? These are all great questions, as they help expand your mental model of the complexities inherent in the collective global DNS infrastructure. As you seek to answer them for your infrastructure, let us know what you find out by tweeting to us @Dyn!