Two of the more common questions our support team receives here at Oracle Dyn are: what is DNS QPS, and how does it affect billing for managed DNS accounts?
In some cases, QPS data is used for auditing purposes, or to provide a more efficient way to free up account resources for domain space. Before we go into how best to tackle the task of compiling this information, we should first discuss what QPS is and how it relates to DNS.
QPS stands for queries per second. It measures, on average, how many DNS queries hit your account every second. Specifically, QPS counts authoritative DNS queries. It’s easiest to think of QPS the way you think of miles per gallon in your car: it is a rate rather than a raw total. Because it is a rate, QPS is a solid measure for comparing DNS volume across different time periods, and it is the way in which we measure DNS usage.
Now that we have a basic understanding of DNS QPS, we should also talk about how long responses are cached, known as time to live (TTL), which plays an important role in QPS-related traffic.
TTL is the length of time a requesting server can cache, or store, your domain information before it needs to request it again. Often one of the first troubleshooting steps we recommend is to increase the minimum TTL value of the SOA record, or to adjust the TTL of the specific record found to have increased QPS. We will dive deeper into adjusting these values as a potential solution later on.
But you may still be asking, “What does this mean for me?”
For most, DNS QPS directly relates to billing calculations and potential overages with Oracle Dyn. The raw data (actual count of queries over a real number of seconds) is calculated and stored in five-minute (300 seconds) and one-hour (3,600 seconds) data point increments. Oracle Dyn uses these data points to create all QPS reports and QPS billing calculations.
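As a rough illustration of the arithmetic, the sketch below turns a raw query count over an interval into a QPS figure and rolls five-minute counts up into one-hour counts. The function names are hypothetical, and summing twelve five-minute counts into one hour is an assumption made for illustration, not a description of how Oracle Dyn builds its stored data points.

```python
from typing import List

FIVE_MINUTES = 300   # seconds in a five-minute data point
ONE_HOUR = 3600      # seconds in a one-hour data point


def qps(query_count: int, seconds: int) -> float:
    """Average queries per second: raw query count divided by elapsed seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return query_count / seconds


def hourly_from_five_minute(counts: List[int]) -> List[int]:
    """Sum consecutive five-minute query counts into one-hour query counts.

    Assumption for illustration only: an hourly count is the sum of the
    twelve five-minute counts it covers.
    """
    per_hour = ONE_HOUR // FIVE_MINUTES  # 12 five-minute points per hour
    return [sum(counts[i:i + per_hour]) for i in range(0, len(counts), per_hour)]
```

For example, 1,500 queries in one five-minute data point works out to `qps(1500, FIVE_MINUTES)`, or 5 queries per second.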
If additional information is needed on how Billing QPS is calculated and how you can download your specific account data, please feel free to review our FAQ.
Reports and graphs
Understanding QPS levels for your account usually starts with reviewing the graphical representation of queries for your Managed DNS account. The QPS-level reporting graphs also provide a snapshot of the current query levels as a way to quickly identify problem areas such as an errant workflow process. The easiest way to retrieve that information is through the graphs and reporting metrics within the Managed DNS user interface.
Resource record graph
The resource record graph is located under the View Reports tab in the main account header and can be activated by clicking the Resource Record Graph link under the Aggregate Graphs title.
Here, the Queries per Second graph contains the DNS QPS data points for each resource record based on the time period selected for the graph. The resource records are listed individually in the legend along with their corresponding color.
The zone graphs for QPS are located under the View Reports tab in the main account header and can be accessed by clicking the Zone Graph link under the Aggregate Graphs title. The title Aggregate Zone Graphs at the top of the page means that all the zones in your account are included in the report view. This graph contains the QPS data points for each zone in the account based on the time period selected for the graph. The zones are listed individually in the legend along with their corresponding color.
Once you have a graphical understanding from a broader standpoint, it’s now time to look closer at more specific data.
Managed DNS API for QPS reporting
Oracle Dyn lets you pull raw DNS QPS statistics via the Managed DNS API at any time, which makes it simple to build an automated reporting solution for your team’s review.
To get started, if you’ve never worked with our API, you may want to review our detailed tutorial on how to successfully and efficiently communicate with your Managed DNS services. Once you get your feet wet and are up and running with a few basic calls and sessions, you can start working through the QPS Reporting call.
Diving right in, you should see the following:
The DNS QPS API report gives you the option to break down the returned data by host or subdomain, by zone or domain name, or by resource record type such as A, CNAME, or AAAA. In most cases, leaving the breakdown field blank and returning the account-wide QPS will give you the desired data. However, it’s worth noting that you do have multiple options based on other potential needs.
As an added benefit, our API gives you the option to return the requested data as an easy-to-read CSV file, which you can then sort in ascending or descending order. Doing so will give you a solid understanding of which zones or hosts are seeing the highest number of queries during the desired timeframe.
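If you would rather rank the CSV output in a script than in a spreadsheet, a minimal sketch might look like the following. The column names `zone` and `queries` and the sample rows are assumptions made up for this example; check the header row of your own report and adjust the arguments to match.

```python
import csv
import io

# Hypothetical sample data; a real report's columns may be named differently.
SAMPLE_CSV = """zone,queries
example.com,1200
example.net,45000
example.org,300
"""


def top_by_queries(csv_text, name_column="zone", count_column="queries", limit=10):
    """Total the query counts per name and return the busiest names first."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row[name_column]
        totals[name] = totals.get(name, 0) + int(row[count_column])
    # Sort by total queries, highest first, and keep the top `limit` entries.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


if __name__ == "__main__":
    for name, total in top_by_queries(SAMPLE_CSV):
        print(name, total)
```

Summing per name before sorting means the same function still works when the report contains one row per time interval rather than one row per zone or host.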
There are some considerations we should mention, however. To start, with API QPS reporting, the longer the time period requested, the less granular the returned data is. Running a QPS API call for the last 30 days will return data in one-hour increments, which is great for easier viewing and quicker run time. Extending that period past 45 days will provide you with four-hour increments, which may be suitable for long-term review but won’t provide you with billing granularity and will require more time to complete. Our recommendation is to implement an automated solution that pulls 24 hours’ worth of DNS QPS information at a time and iterates back through the last 30 days until the desired data is retrieved. This will provide the fastest results along with the most useful information, creating an efficient workflow when interacting with our API.
Now you may be saying, “I have the QPS information, but I have concerns about incurring overages,” or “I simply don’t understand where some of these queries are coming from.” Let’s take a look at some options to help investigate.
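The day-by-day approach can be sketched as a small helper that produces consecutive 24-hour windows as Unix timestamps, plus a loop that hands each window to whatever function performs your QPS report request. Both `daily_windows` and `collect` are hypothetical names, and `fetch` is a placeholder you would supply yourself; no real API call is made here.

```python
from datetime import datetime, timedelta, timezone


def daily_windows(end, days=30):
    """Return (start_ts, end_ts) Unix-timestamp pairs, oldest first.

    Each pair covers exactly 24 hours and the last pair finishes at `end`.
    Pass a timezone-aware datetime so the timestamps are unambiguous.
    """
    windows = []
    for offset in range(days, 0, -1):
        start = end - timedelta(days=offset)
        stop = start + timedelta(days=1)
        windows.append((int(start.timestamp()), int(stop.timestamp())))
    return windows


def collect(fetch, end, days=30):
    """Call `fetch(start_ts, end_ts)` once per 24-hour window.

    `fetch` is a placeholder for your own function that requests one
    report from the API and returns its data.
    """
    return [fetch(start_ts, end_ts) for start_ts, end_ts in daily_windows(end, days)]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for start_ts, end_ts in daily_windows(now, days=3):
        print(start_ts, end_ts)
```

Keeping the window generation separate from the request makes it easy to retry a single failed day without re-running the whole 30-day pull.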
Spikes vs. generic traffic increases
Often when we review reporting, and especially when we look at the graphical representation, we will see a steady increase in QPS. In these cases, this is most likely the product of generic traffic increasing for your domain or hostname. You may also see that these queries are for A or AAAA records, which is to be expected depending on the time of year, or during peak months for business or web traffic.
Where things get really interesting is when we see large spikes, with sharp increases and decreases in QPS that do not correlate with any expected high-traffic time periods. We should also be on the lookout for reported record types other than A or AAAA, which could point to an errant system such as an email server, or to a third-party service unexpectedly driving requests against your domain or hostname.
TTL values and possible adjustments can make a significant difference in overall DNS QPS. A long TTL lets requesting servers rely on cached responses for longer before refreshing your domain information, resulting in fewer queries. The tradeoff, however, is that a longer TTL will also hold a potential negative answer longer, sometimes resulting in a loss of reachability to that endpoint. A shorter TTL makes it possible to make changes to your network faster, but it increases the number of DNS queries against your hostname. A balance between the two extremes is best when considering your TTL settings in relation to your QPS reporting metrics.
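One simple way to separate spikes from steady growth in exported data points is to compare each point against the median of the series. The sketch below is only one possible heuristic: the function name and the default factor of 3 are assumptions chosen for illustration, not thresholds used by Oracle Dyn.

```python
from statistics import median


def find_spikes(qps_points, factor=3.0):
    """Return (index, value) pairs for points above `factor` times the median.

    The median is used as the baseline because a handful of extreme
    values barely moves it, unlike the mean.
    """
    if not qps_points:
        return []
    threshold = factor * median(qps_points)
    return [(i, value) for i, value in enumerate(qps_points) if value > threshold]


if __name__ == "__main__":
    print(find_spikes([10, 11, 9, 10, 95, 10]))   # one isolated burst
    print(find_spikes([10, 11, 12, 13, 14, 15]))  # gradual growth
```

A gradual climb raises the median along with the data, so it is not flagged, while a short burst stands well clear of the baseline.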
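To get a feel for the scale of the effect, the sketch below estimates the most queries a single caching resolver would send for one record over a period. It assumes the resolver honors the TTL, keeps one cache, and is asked for the record continuously; real traffic will differ, and the function name is hypothetical.

```python
import math


def max_queries_per_resolver(ttl_seconds, window_seconds=86400):
    """Upper bound on refreshes by one TTL-honoring resolver in a window.

    A resolver that caches an answer for `ttl_seconds` needs to ask
    again at most once per TTL, so the bound is the window divided by
    the TTL, rounded up. The default window is one day.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return math.ceil(window_seconds / ttl_seconds)


if __name__ == "__main__":
    for ttl in (60, 300, 3600):
        print(ttl, max_queries_per_resolver(ttl))
```

Under these assumptions a 60-second TTL allows up to 1,440 refreshes per resolver per day, while a one-hour TTL allows up to 24.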
Striking a balance can be a tricky process, and engaging support is always recommended so we can openly discuss the pros and cons in more detail.
Unexpected AAAA records
Because some operating systems request AAAA records in addition to A records, some customers may see a lot of AAAA queries even though they do not have any AAAA records. This can be caused by a low minimum value on Dyn SOA records, and it will sometimes raise concerns about driving QPS high enough to cause overages. It has been shown that it is possible to lower a zone’s overall QPS by editing the zone SOA record by hand and changing the SOA minimum value from one minute (the default) to one hour. With any changes, however, we always recommend checking in either via the API or through the QPS graphs in the UI to ensure no unintended results are triggered.
Now that we can visually review QPS, pull raw data from the API, and implement solutions, we should be able to make informed decisions and act on concerns around your Managed DNS zones.
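The reason the SOA minimum matters here is that a "no such record" answer is cached too. Under RFC 2308, resolvers cache a negative answer for the lesser of the SOA minimum field and the TTL of the SOA record itself. The helper below is a hypothetical sketch of that rule, using example values in seconds.

```python
def negative_cache_ttl(soa_record_ttl, soa_minimum):
    """Seconds a resolver may cache a negative answer (RFC 2308 rule).

    The negative-caching TTL is the smaller of the SOA record's own TTL
    and the value in the SOA minimum field.
    """
    return min(soa_record_ttl, soa_minimum)


if __name__ == "__main__":
    # SOA minimum of one minute: empty AAAA answers expire after 60 seconds.
    print(negative_cache_ttl(soa_record_ttl=3600, soa_minimum=60))
    # SOA minimum raised to one hour: the same answers are kept for 3600 seconds.
    print(negative_cache_ttl(soa_record_ttl=3600, soa_minimum=3600))
```

Note that if the SOA record's own TTL is lower than the new minimum, the lower value still applies, so it is worth checking both fields when making this change.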
Looking for an even deeper dive into your DNS QPS metrics? Please feel free to contact our technical support team, and we will be happy to review, analyze and discuss!