In an effort to explain the complexities of the DNS, we are examining some abnormalities that affect how queries are received. This post provides an overview of the mechanics of the ANY query, with a focus on areas where it has been implemented incorrectly or abused.
Perhaps because the DNS specification is spread across so many RFCs, it is especially cruel to naïve understandings of the specifications. Worse, applications often can’t even tell whether the answer to a given request for the address associated with a name came from the DNS or from some other protocol. Most alternative name resolution protocols are only available on the local network, but for a general-purpose application that does not matter. If you’re trying to stream video in your campus network, you (and your video application) don’t care that “building1” and “building2” are “different local networks”. It’s your campus network. In order to get around this ambiguity, a popular Web browser released a beta version that made a bunch of queries, and then asked for “anything”: the so-called “ANY query”.
“Give Me What Matches”
ANY does not mean what it means in English. If you read this blog, you know that DNS is distributed in both administration and operation. Administrators of various zones can each operate their own DNS, so there isn’t a single entity that controls the DNS: that’s the distributed administration, and we talk about it a lot. But the operational distribution means that every cache operator is free to return whatever is in the cache. And ANY, it turns out, in the DNS means “give me what matches”. For a cache, that means, “What do you have on hand?” not, “What could I possibly know about this DNS name?”
Now, you might think, “Aha! I’ll just ask ANY first! That’ll fix everything!” Alas, no. Distributed operation of the DNS requires a way to invalidate cache entries. This is achieved through the Time To Live, or TTL, on a DNS record. Everything in the DNS comes with this TTL, and it applies to everything of the same name and type (in DNS terms, the Resource Record Set or RRset). But it doesn’t apply to everything of the same name, and an ANY query asks for everything at the same name. Different data of the same name, but different type, can have different TTLs.
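To make the TTL point concrete, here is a minimal sketch of a toy cache in Python. It is not how any real resolver is implemented; the ToyCache class, the name example.com. and the TTL values are all made up for illustration. It only shows that RRsets at the same name can expire at different times, so the same ANY question gets a different answer depending on when you ask.

```python
from dataclasses import dataclass


@dataclass
class RRset:
    name: str
    rtype: str
    ttl: int        # seconds, chosen by the zone administrator
    cached_at: int  # time (in seconds) when the cache stored the RRset


class ToyCache:
    """Hypothetical cache: one entry per (name, type), each with its own TTL."""

    def __init__(self):
        self._rrsets = {}

    def store(self, name, rtype, ttl, now):
        self._rrsets[(name, rtype)] = RRset(name, rtype, ttl, now)

    def any_query(self, name, now):
        """Return the record types still on hand for `name` at time `now`."""
        return sorted(
            rrset.rtype
            for (cached_name, _), rrset in self._rrsets.items()
            if cached_name == name and now - rrset.cached_at < rrset.ttl
        )


cache = ToyCache()
# Assumed values: the A RRset has a short TTL, the MX RRset a long one.
cache.store("example.com.", "A", ttl=30, now=0)
cache.store("example.com.", "MX", ttl=3600, now=0)
# An AAAA RRset may exist at the authoritative server, but nobody has
# asked this cache for it, so the cache has nothing to return for it.

at_10_seconds = cache.any_query("example.com.", now=10)
at_60_seconds = cache.any_query("example.com.", now=60)
print(at_10_seconds)  # ['A', 'MX']
print(at_60_seconds)  # ['MX']
```

In this sketch the A RRset has already expired by the second query, and the AAAA RRset never shows up at all, even though both may be perfectly valid data at the authoritative server.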
The Problem of ANY
The effect of all this is that the answer to an ANY query will tell you the “true data” about a name only unreliably, unless you ask the authoritative server about that name. You might get data you didn’t want. You might get data you did want. And you might not get data you did want. When you are asking ANY through a normal resolver (that is, behaving as applications almost always do, and not asking the authoritative directly), you have no idea how to interpret the response. ANY is a good way to compare, “What’s in the cache?” and, “What’s in the authoritative server?” Otherwise, nobody should use it.
“This all seems like a theoretical argument. How does this appear in reality?” you might be asking. It turns out confusion about the use of and results from ANY requests is a real problem. As I mentioned above, recently a popular Web browser was attempting to increase the application’s awareness of the TTLs. The API the Web browser developers had been using didn’t expose the TTL value returned to the stub resolver. As a result, the development team thought that they could get the data they needed by a different path, so in cases where they needed an A or a AAAA record’s TTL they also issued an ANY query to the DNS. This turned one request into two requests and, for Dyn specifically, led to a 10x increase in the number of ANY queries that we received. Lack of understanding of the DNS request types leads to confusion and in some cases misuse. The second query increased network traffic and didn’t meet the original goal of TTL awareness.
Ask a Simple Question, Get More Than You Bargained For
This may sound like a one-off oversight, but these types of DNS issues aren’t limited to Web browsers, and in some cases are calculated. The ANY request is often associated with UDP reflection attacks because of its amplification factor. You can ask a very small, simple question and receive a voluminous response: in network terms, I can use a small amount of bandwidth to create a much larger response. So, an attacker issues queries with QTYPE ANY to thousands of DNS servers (either full-service resolvers or authoritative servers), claiming that the source address is the IP address of the target victim. The servers, like good DNS servers should, immediately reply to the target, who receives a big pile of large DNS responses. As an authoritative operator, we often see this type of activity, but rather than asking only for the contents of a recursive cache, the third party sends ANY queries with the Recursion Desired bit set, presumably so that, if we permitted recursion, we would return additional data. The ANY request isn’t just a source of developer confusion but can also be weaponized because of its ability to produce bandwidth-amplification attacks.
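To put a number on “small question, voluminous response”, here is a rough sketch in Python that builds the wire format of an ANY question with the Recursion Desired bit set and measures its size. It sends nothing on the network. The helper names (encode_name, build_query, amplification_factor) are hypothetical, and the 2,900-byte response size is an assumed figure for the arithmetic, not a measurement.

```python
import struct

QTYPE_ANY = 255   # QTYPE "*" in RFC 1035
QCLASS_IN = 1
FLAG_RD = 0x0100  # Recursion Desired bit in the header flags word


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += struct.pack("B", len(raw)) + raw
    return out + b"\x00"


def build_query(name, qtype=QTYPE_ANY, rd=True, query_id=0x1234):
    """Build a DNS query message containing a single question (no EDNS)."""
    flags = FLAG_RD if rd else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


def amplification_factor(query_len, response_len):
    """Ratio of response bytes to query bytes (payload only, no IP/UDP headers)."""
    return response_len / query_len


query = build_query("example.com")
print(len(query))  # 29 bytes: 12 header + 13 name + 4 type/class

# Assumed response size, purely for illustration.
assumed_response_len = 2900
print(amplification_factor(len(query), assumed_response_len))  # 100.0
```

Under those assumptions, every byte the attacker sends turns into roughly a hundred bytes arriving at the victim. Real factors depend on what data sits at the name and on the response size limits each server enforces.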
Because of the issues with ANY, some operators have decided that the ANY request is no longer suitable for use in the public DNS, and are deprecating it. Instead of replying to the ANY request with the contents of the cache, they will now respond with an RCODE 4 (Not Implemented) response. It is not clear how well this conforms to the DNS protocol specifications; the protocol community is split on the issue. Since the point of IETF protocol specifications is interoperation, and not conformity with some scripture, the dispute may be solved by learning how recursive and stub resolvers will respond to and process the response.
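For readers who want to see what an RCODE 4 answer looks like on the wire, here is a minimal sketch in Python. It is not any operator’s actual implementation: the function names are hypothetical, and it assumes the incoming query carries only a question section (no EDNS or other additional records). It copies the ID, opcode, RD bit and question from the query, sets the QR (response) bit, and puts 4 in the RCODE field.

```python
import struct

RCODE_NOTIMP = 4        # "Not Implemented" in RFC 1035
QR_RESPONSE = 0x8000    # QR bit: this message is a response
OPCODE_RD_MASK = 0x7900 # opcode bits (0x7800) plus Recursion Desired (0x0100)


def notimp_response(query: bytes) -> bytes:
    """Build a Not Implemented response that echoes the query's question.

    Assumes the query has nothing after its question section.
    """
    query_id, flags, qdcount, _, _, _ = struct.unpack("!HHHHHH", query[:12])
    resp_flags = QR_RESPONSE | (flags & OPCODE_RD_MASK) | RCODE_NOTIMP
    header = struct.pack("!HHHHHH", query_id, resp_flags, qdcount, 0, 0, 0)
    return header + query[12:]


def rcode_of(message: bytes) -> int:
    """Extract the 4-bit RCODE from the header flags word."""
    (flags,) = struct.unpack("!H", message[2:4])
    return flags & 0x000F


# A hand-built ANY query for example.com with the RD bit set.
sample_query = (
    struct.pack("!HHHHHH", 0xBEEF, 0x0100, 1, 0, 0, 0)
    + b"\x07example\x03com\x00"
    + struct.pack("!HH", 255, 1)  # QTYPE ANY, QCLASS IN
)

response = notimp_response(sample_query)
print(rcode_of(response))  # 4
```

A resolver that receives this has no records to cache or return, so what it does next (retry elsewhere, report failure, or something else) is exactly the interoperability question raised above.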
This acts as the perfect bridge to our next post in which we will be covering negative caching semantics in the DNS and observations which seem to signal issues with different DNS resolvers.