Last September, my colleague Ben Anderson provided a nice illustration of the ways browser caching behavior can be a problem. This is a general issue that has been around for a number of years. Many browsers have wrestled with it repeatedly and the issue doesn’t seem to go away.
However, there seems to be some new interest in tackling this issue and some interested parties gathered just before the IETF meeting at the end of March to discuss it.
Why do browsers care about the DNS at all?
DNS is not something that most people think about when using the Internet. Neither should they have to: the DNS is just part of the infrastructure in the same way that IP addresses are. This is why Dyn works so hard to make our customers’ DNS infrastructure reliable: the only time a user ought to notice the DNS is when it breaks (and it should never break).
If that’s true, then we ought to expect any Internet client – including web browsers – to use the very same infrastructure as everything else and for the DNS resolution mechanisms to be the ones offered by the operating system. What makes browsers different?
If we think about the way browsers work, we can see what the difference is.
First, browsers usually have a real human in front of them and people are very sensitive to any delay in getting what they want. In order to reduce such delays, many web browsers want control over the way the DNS is used. A shared facility from the operating system doesn’t provide that control.
Second, native resolver libraries offer programmers a minimal interface called getaddrinfo(), a part of the POSIX standard. This interface blocks: if the application wants to resolve more than one name at a time, it needs to make several separate calls to getaddrinfo() in parallel, typically one per thread. That technique can overload upstream resolvers, which might regard the many parallel requests as an attack (and stop responding).
When that happens, it looks (to the user) like the Internet connection is slow or has failed. Moreover, getaddrinfo() does not provide a way to know how long the DNS record is allowed to be cached (the time to live or TTL); so the application can’t tell how long it is safe to use the result.
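To make these two limitations concrete, here is a minimal Python sketch that uses the standard library’s wrapper around the operating system’s getaddrinfo(). The helper names resolve and resolve_many are made up for illustration, and a thread pool is only one way an application might work around the blocking call.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(name, port=80):
    """Blocking lookup through the OS resolver; returns sorted unique addresses."""
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    # There is no TTL field, so the caller cannot tell how long the answer is valid.
    return sorted({info[4][0] for info in infos})


def resolve_many(names, max_workers=4):
    """Resolve several names at once by running each blocking call on its own thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(names, pool.map(resolve, names)))


if __name__ == "__main__":
    # IP literals are used here so the example runs without touching the network.
    print(resolve_many(["127.0.0.1", "192.0.2.1"]))
```

Every parallel lookup costs a thread and sends its own query upstream, which is exactly the burst of traffic that a resolver might mistake for an attack.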
Third, browsers have a security problem that most other systems don’t have. Many things that “run in the browser” are really separate clients; examples include Adobe’s Flash and every Java applet. In order to mitigate some of the inherent problems with this, browsers have adopted a strategy called “pinning” that is very similar to a DNS cache.
As attacks have become more clever, the pinning policies have become more sophisticated. The inevitable effect of pinning is that the browser holds on to DNS results for a period that differs from the TTL set by the authoritative DNS server (even if the TTL were available to the application).
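A toy Python sketch shows why pinning and TTLs pull in different directions. The PinnedCache class, its 60-second pin period and the example name are all assumptions made for illustration; real browsers use more elaborate policies that are not described here.

```python
import time


class PinnedCache:
    """Toy name-to-address cache that keeps every entry for a fixed pin period."""

    def __init__(self, pin_seconds=60.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # assumed value, not taken from any browser
        self.clock = clock              # injectable so the behavior can be tested
        self._entries = {}              # name -> (addresses, expiry time)

    def put(self, name, addresses, ttl):
        # The record's TTL is deliberately ignored: the pin period alone sets expiry.
        self._entries[name] = (list(addresses), self.clock() + self.pin_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        addresses, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[name]
            return None
        return addresses
```

With this policy a record published with a five-second TTL is still served from the cache half a minute later, which is the mismatch described above.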
The pressing need
Chromium (the basis for Chrome) now has its own experimental stub resolver inside the browser. The Mozilla project, which produces Firefox, has not gone quite that far yet but the topic comes up from time to time. Yet putting a resolver in every browser, tempting though it might be, is not a good idea. Instead of having one or two well-known but problematic ways to do end-point DNS resolution, stub resolvers inside applications give us even more. That’s more redundant code with subtle incompatibilities and bugs, surprising and inconsistent behavior and features that don’t overlap.
DNS resolution is part of the infrastructure and making it part of an application undermines the layered approach that has made networked applications so strong. The need to deliver a robust, cross-platform, reliable system that solves applications’ needs is urgent. To their credit, the browser manufacturers know all this. But they have a real problem to solve, and they need to solve it quickly.
What is to be done?
A large part of the problem as things stand is that applications – in this case, browsers – can’t get the information they need from the operating system facilities and those facilities are inadequate anyway because they block. This doesn’t even consider the services and features that could be available if applications could take advantage of knowledge about DNSSEC validation.
Application designers need a stable, high-performance, non-blocking API so that the full richness of DNS data is available to them, even under difficult performance constraints. The DNS operating environment has changed over the years and the operating system environment is also diverse. But applications are still stuck with the minimal interface offered by POSIX. That needs to change.
The way forward is for people who know about the DNS, but who are not end-user application developers, to collaborate with those end-user application developers. Together we need to develop a new widely-available, cross-platform API that solves the problems applications have. Performance needs to be tunable by applications so that timeouts do not take a long time (the “happy eyeballs” requirement).
The API must not block, so that an application does not have to sit forever waiting for an answer that may never come. All of the data available from deep in the DNS guts needs to be available for the application to use. But this API must not require application designers to become experts in the DNS, how it works, or its arcane formats.
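As a rough idea of what a non-blocking call with an application-chosen deadline looks like from the caller’s side, here is a minimal Python sketch built on asyncio. The function name resolve_with_deadline and the 0.3-second default are assumptions for illustration. This is not the API called for here: underneath, asyncio still runs the blocking getaddrinfo() on a worker thread, and it still exposes no TTL or DNSSEC information.

```python
import asyncio
import socket


async def resolve_with_deadline(name, port=443, timeout=0.3):
    """Return sorted addresses, [] if the lookup fails, or None if the deadline passes."""
    loop = asyncio.get_running_loop()
    try:
        # loop.getaddrinfo() hands the blocking call to a worker thread,
        # so the event loop stays free to do other work while it waits.
        infos = await asyncio.wait_for(
            loop.getaddrinfo(name, port, type=socket.SOCK_STREAM), timeout
        )
    except asyncio.TimeoutError:
        # The caller gives up on time; the worker thread may still be waiting.
        return None
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # An IP literal keeps the example independent of the network.
    print(asyncio.run(resolve_with_deadline("127.0.0.1", timeout=2.0)))
```

The caller decides how long an answer is worth waiting for and can move on when the deadline passes, but the richer data discussed above remains out of reach through this interface.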
This is a tall order, but it isn’t impossible. Most of the Internet today has been built up with many layers, so that applications can do complicated things without understanding all the details. We need to do this again for DNS.
Now is the ideal time to do it. DNSSEC is still in the relatively early days of widespread adoption and standards for building services on top of DNSSEC are just being finalized. Browser makers are starting to run into the limitations of their traditional approaches to many of these issues and so they may be amenable to thinking about better ways of doing things.
Ok, so just do it already.
If the issues are so obvious, what’s the problem? There are two.
- The people who work on DNS and the people who work on applications often have very different interests and problems. It is hard to get such people to work together on a problem, not because they disagree but because they talk past one another.
- In order to get something useful, a large amount of work needs to be done. Until all (or at least most) of the important work is complete, the new API will not be useful. That means that early testing and deployment will not happen.
Despite these barriers, we as an industry must find a way to get this work done. The current direction is unsustainable and as the name space expands, it will get worse. We cannot afford to wait or to keep tinkering at the edges.
Skeptics say that it can never happen. The DNS environment is too polluted. Hotels and web cafés and home gateways and a million other devices will continue to interfere in DNS transactions and make deployment impossible. But applications need origin policies actually linked to the domains they are talking to – not to lists of special domains and not to cache times that seem like a good guess, but to the real, verifiable data from the DNS.
Customers demanded Internet access in hotels in the past. If it’s important enough, customers will demand reliable Internet access in hotels in the future. We need to make reliability a core value. The root is expanding now and attention to web security is more acute than ever, so now is the time to strike. We can create a more usable, more sustainable and friendlier system to sit under every other Internet transaction.