Not surprisingly, people want their websites to be fast. I often get asked, “If I cut my TTLs [Time To Live] in half, will it double my QPS [Queries Per Second]?” I try to explain that doing this may do much more than that or it may make very small changes to your traffic.
But since that answer obviously needs some further explaining, I thought it appropriate to elaborate on the relationship between TTLs, propagation time and DNS queries. Consider this a companion piece to Chris Gonyea’s piece on what TTLs are and what they mean.
When a recursive DNS server answers from its cache instead of performing a lookup, it reduces the total number of DNS queries that reach your authoritative servers. The biggest caching gains come from the first seconds of TTL, and each additional second saves fewer queries than the one before. For the math geeks in the room, it will look roughly like a rational function (y=1/x) with respect to time. For you engineers out there, it follows a 1st order response curve.
Propagation time, on the other hand, grows linearly: it increases in direct proportion to the TTL. The longer the propagation time, the less flexibility you have in an emergency.
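To put rough numbers on that curve, here is a minimal sketch in Python. It is not a measurement of anyone's real traffic. It assumes a single recursive resolver whose clients ask for one record at random (Poisson) intervals, so each cache cycle lasts one TTL plus the average wait for the next request. The function name `upstream_qps` and the 0.5 QPS client rate are made up for illustration.

```python
def upstream_qps(client_qps: float, ttl_seconds: float) -> float:
    """Queries per second one resolver sends upstream for one record.

    Assumption: client requests arrive at random (Poisson) at client_qps.
    Each cache cycle costs one upstream query and lasts ttl_seconds plus
    the average wait (1 / client_qps) for the request that refreshes it.
    """
    if client_qps <= 0:
        return 0.0
    return client_qps / (1.0 + client_qps * ttl_seconds)


client_qps = 0.5  # hypothetical: one client request every two seconds
for ttl in (0, 30, 60, 300, 3600, 86400):
    print(f"TTL {ttl:>6}s -> {upstream_qps(client_qps, ttl):.6f} upstream QPS")
```

Under these assumptions the first 30 seconds of TTL remove most of the queries, and everything after that shaves off progressively less, which is the 1/x shape described above.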
What does that mean for your business?
As you increase the TTL, your queries will drop and your propagation time will increase. At first, the reduction in queries vastly exceeds the increase in propagation time (yay, saving money!). Eventually, the two curves intersect at a point where each further increase in TTL costs about as much in propagation time as it saves in queries. This point is different for every traffic pattern.
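One way to explore that trade-off is to ask how much of the possible caching a given TTL buys you. The sketch below is an illustration under stated assumptions, not a formula for your traffic: it models one resolver with randomly timed (Poisson) client requests, treats the TTL itself as the worst-case propagation time, and uses a made-up request rate. The names `cached_fraction` and `ttl_for_cached_fraction` are hypothetical helpers.

```python
def cached_fraction(client_qps: float, ttl_seconds: float) -> float:
    """Share of client requests answered from cache.

    Assumption: Poisson arrivals at client_qps through a single resolver,
    so on average client_qps * ttl_seconds requests are served from cache
    for every one request that goes upstream.
    """
    load = client_qps * ttl_seconds
    return load / (1.0 + load)


def ttl_for_cached_fraction(client_qps: float, target: float) -> float:
    """Smallest TTL (seconds) at which the cached share reaches target."""
    if client_qps <= 0:
        raise ValueError("client_qps must be positive")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    return target / (client_qps * (1.0 - target))


client_qps = 0.01  # hypothetical: one request every 100 seconds
for ttl in (60, 300, 3600, 43200, 86400):
    share = cached_fraction(client_qps, ttl)
    print(f"TTL {ttl:>6}s  cached {share:6.1%}  worst-case propagation {ttl}s")
```

Running it with a few request rates shows why the crossover point moves: a busy record reaches a high cached share at a short TTL, while a quiet one needs a much longer TTL (and propagation time) to get there.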
For example, someone who has one customer requesting 1000 times in a row has more caching potential than someone with 1000 customers requesting one time, as the former will have more queries to take advantage of the increased TTL to cache lookups. Because of this, it is hard to predict the effect of a TTL change on your queries.
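The two extremes in that example can be sketched with a tiny cache simulation. This is a toy under loud assumptions: each resolver keeps its own cache, the single heavy customer sends one request per second through one resolver, and each of the 1000 light customers sits behind a different resolver. The function name `authoritative_queries` and the resolver IDs are invented for illustration.

```python
from collections import defaultdict


def authoritative_queries(requests, ttl):
    """Count lookups that miss a resolver cache and go to the authoritative server.

    requests: iterable of (resolver_id, timestamp_seconds) pairs.
    ttl: record TTL in seconds.
    """
    by_resolver = defaultdict(list)
    for resolver, timestamp in requests:
        by_resolver[resolver].append(timestamp)

    misses = 0
    for timestamps in by_resolver.values():
        expires_at = float("-inf")  # each resolver starts with an empty cache
        for timestamp in sorted(timestamps):
            if timestamp >= expires_at:  # cache empty or expired: look it up
                misses += 1
                expires_at = timestamp + ttl
    return misses


# Assumption: one customer, 1000 requests, one per second, one resolver.
one_heavy_user = [("resolver-0", t) for t in range(1000)]
# Assumption: 1000 customers, one request each, each behind its own resolver.
many_light_users = [(f"resolver-{i}", i) for i in range(1000)]

for ttl in (30, 300, 3600):
    print(ttl,
          authoritative_queries(one_heavy_user, ttl),
          authoritative_queries(many_light_users, ttl))
```

In this toy setup the heavy user's lookups fall from 34 to 4 to 1 as the TTL grows, while the 1000 one-off customers cost 1000 lookups no matter what the TTL is. Real traffic sits somewhere in between, since many customers share large resolvers.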
On average, “normal” web traffic for A records hits this saturation point at about an hour. From that point on, increasing the TTL costs you more in propagation time than it saves in queries, so it’s probably not worth it for you.
To illustrate that, I whipped up a sample curve and output the numbers:
[Figure: Matt’s Totally Fake Ecomm Website’s Fake Caching Graph]
As a fake ecomm site, I might have users with multiple lookups within the same session. I might benefit by increasing my TTL to cache some of those lookups, especially at first, but not forever.
If the traffic were for Matt’s Totally Fake Adserve Company, the principle would be the same but the numbers might look dramatically different due to the different traffic patterns. You will just have to experiment to find what is right for you.
Notice that after only 1 hr, 98.4% of all caching has been achieved and your propagation time is still reasonable, whereas getting the traffic down to 1 QPS would take over a week!
If you have something like an A record or CNAME pointing to a CDN, keep your TTL at something reasonable like an hour or two. If something is more static like your NS records or a static CNAME, you can go to 12-24 hrs without much worry.