One of the more common questions our Support team receives here at Oracle Dyn is what Queries Per Second (QPS) is, and how it impacts billing for a Managed DNS account. In some cases, customers want this data for auditing purposes, or to find a more efficient way to free up account resources for domain space. Before we go into how best to tackle the task of compiling this information, we should first discuss what QPS is, and how it relates to DNS.
QPS is a rate that measures, on average, how many DNS queries hit your account every second. Specifically, QPS counts authoritative queries, and it’s easiest to think of QPS the way you would think of Miles per Gallon (MPG) in your car. QPS is a solid measure used to compare DNS volume across different time periods and is a unique way in which we measure DNS usage.
Now that we have a basic understanding of QPS, we should also talk about Time To Live (TTL), the measure of how long responses are cached, which plays an important role in QPS-related traffic.
Time to Live (TTL) is the length of time a requesting server can cache, or store, your domain information before it needs to request it once more. Often one of the first troubleshooting steps we like to recommend is to increase the minimum TTL value of the SOA record, or to adjust the TTL of the specific record found to have increased QPS. We will dive deeper into adjusting these values later on as a potential solution.
But you may still be asking, “What does this mean for me?”
For most, QPS directly relates to billing calculations and potential overages with Oracle Dyn. The raw data (actual count of queries / real number of seconds) is calculated and stored in 5-minute (300 seconds) and 1-hour (3600 seconds) data point increments. Oracle Dyn uses these data points to create all QPS reports and QPS billing calculations.
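To make the arithmetic concrete, here is a minimal Python sketch of the "count of queries / number of seconds" calculation for a single data point. The function name and the query counts are made up for illustration; this is not how Oracle Dyn implements its billing calculation, only the basic ratio described above.

```python
def average_qps(query_count: int, interval_seconds: int) -> float:
    """Average queries per second over one data point interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return query_count / interval_seconds


# Hypothetical query counts, for illustration only.
five_minute_qps = average_qps(45_000, 300)    # one 5-minute data point
one_hour_qps = average_qps(540_000, 3600)     # one 1-hour data point
print(five_minute_qps, one_hour_qps)          # 150.0 150.0
```

Both hypothetical data points work out to the same rate, which is why QPS is handy for comparing intervals of different lengths.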
If additional information is needed on how Billing QPS is calculated and how you can download your specific account data, please feel free to review the second half of the following help documentation:
Reports & Graphs
Understanding QPS levels for your account usually starts with reviewing the graphical representation of queries for your Managed DNS account. The QPS level reporting graphs also provide a snapshot of the current query levels as a way to quickly identify problem areas such as an errant workflow process. The easiest way to retrieve that information is through the graphs and reporting metrics within the Managed DNS UI:
Resource Record Graph
The Resource Record Graph is located under the “View Reports” tab in the main account header and can be accessed by clicking the “Resource Record Graph” link under the “Aggregate Graphs” title.
Here, the Queries per Second graph contains the QPS data points for each Resource Record based on the Time Period selected for the graph. The Resource Records are listed individually in the Legend along with their indicating color.
The Zone Graphs for QPS are located under the “View Reports” tab in the main account header and can be accessed by clicking the “Zone Graph” link under the “Aggregate Graphs” title. The title “Aggregate Zone Graphs” at the top of the page means that all the zones in your account are included in the report view. This graph contains the QPS data points for each Zone in the account based on the Time Period selected for the graph. The Zones are listed individually in the Legend along with their indicating color.
Once you have a graphical understanding from a broader standpoint, it’s now time to look closer at more specific data.
Managed DNS API for QPS Reporting
In case you weren’t already aware, Oracle Dyn offers a way to easily pull the raw QPS statistics via the Managed DNS API at any time, which helps create a simple automated solution for your team’s review.
To get started, if you’ve never worked with our API you may want to head over to our “Getting Started” page to review our detailed tutorial on how to successfully and efficiently communicate with your Managed DNS services:
Once you get your feet wet and are up and running with a few basic calls and sessions, you can then work through the QPS Reporting call illustrated here:
Diving right in, you should see the following:
As you may have guessed, the QPS API report gives you the option to break down the returned data by “Host” (subdomain), “Zone” (domain name), or Resource Record type such as A, CNAME, or AAAA. In most cases, leaving the breakdown field blank, which returns the account-wide QPS, will give you the desired data. However, it is worth noting that you do have multiple options based on other potential needs.
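As a rough illustration, the request body for such a report could be assembled like this in Python. The field names (start_ts, end_ts, breakdown) and the breakdown values (hosts, zones, rrecs) are assumptions made for this sketch and should be checked against the QPS Report API documentation before use. The helper only builds the parameters and does not contact the API.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed breakdown values; verify them against the API documentation.
VALID_BREAKDOWNS = {"hosts", "zones", "rrecs"}


def build_qps_report_params(start: datetime, end: datetime,
                            breakdown: Optional[str] = None) -> dict:
    """Build a parameter dict for a QPS report request (hypothetical helper).

    Pass timezone-aware datetimes so the Unix timestamps are unambiguous.
    """
    if end <= start:
        raise ValueError("end must be later than start")
    if breakdown is not None and breakdown not in VALID_BREAKDOWNS:
        raise ValueError(f"unknown breakdown: {breakdown!r}")
    params = {"start_ts": int(start.timestamp()), "end_ts": int(end.timestamp())}
    if breakdown is not None:
        # Leaving the breakdown out asks for the account-wide view.
        params["breakdown"] = breakdown
    return params


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
end = datetime(2020, 1, 2, tzinfo=timezone.utc)
print(build_qps_report_params(start, end, breakdown="zones"))
```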
As an added benefit, our API gives you the option to return the requested data as an easy-to-read .CSV file, which you can then sort in ascending or descending order. Doing so will give you a solid understanding of which zones or hosts are incurring the highest number of queries during the desired time frame.
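If you would rather rank the CSV data in a script than in a spreadsheet, a minimal Python sketch might look like the following. The column names (timestamp, zone, queries) and the sample rows are assumptions made for illustration; adjust them to match the header row of the file you actually receive.

```python
import csv
import io


def top_by_queries(csv_text: str, key_column: str = "zone",
                   count_column: str = "queries", limit: int = 10) -> list:
    """Sum query counts per zone (or host) and return the busiest first."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row[key_column]
        totals[key] = totals.get(key, 0) + int(row[count_column])
    # Highest total first, so the heaviest zones or hosts are at the top.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical sample data, for illustration only.
sample = (
    "timestamp,zone,queries\n"
    "1577836800,example.com,1200\n"
    "1577836800,example.org,300\n"
    "1577840400,example.com,900\n"
)
print(top_by_queries(sample))  # [('example.com', 2100), ('example.org', 300)]
```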
There are some considerations we should mention, however. To start, with API QPS reporting, the longer the time period, the less granular the returned data. Running a QPS API call for the last 30 days will return data in 1-hour increments, which is great for easier viewing and quicker run time. Extending that period past 45 days, as indicated above, will provide you with 4-hour increments and may be suitable for long-term review, but won’t provide you with billing granularity and will require more time to complete. Our recommendation is to implement an automated solution that pulls 24 hours’ worth of QPS information at a time and iterates through the last 30 days until the desired data is retrieved. This will provide the fastest results mixed with the most useful information, creating an efficient workflow when interacting with our API.
Now you may be saying, “I have the QPS information, but I have concerns about incurring overages,” or “I simply don’t understand where some of these queries are coming from.” Let’s take a look at some options to help investigate.
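One way to sketch the "24 hours at a time over the last 30 days" approach is to generate the time windows first and then issue one report request per window. The Python helper below is hypothetical and only produces the windows as Unix timestamps. How each window is submitted depends on your own API client.

```python
from datetime import datetime, timedelta, timezone


def daily_windows(end: datetime, days: int = 30) -> list:
    """Return (start_ts, end_ts) pairs covering the `days` before `end`,
    oldest first, one 24-hour window per pair."""
    windows = []
    for offset in range(days, 0, -1):
        window_start = end - timedelta(days=offset)
        window_end = window_start + timedelta(days=1)
        windows.append((int(window_start.timestamp()), int(window_end.timestamp())))
    return windows


now = datetime.now(timezone.utc)
for start_ts, end_ts in daily_windows(now, days=30):
    # Placeholder: request the QPS report for this window with your client.
    pass
```

Keeping each request to a single day is intended to keep the returned data at the finer granularity described above, while the loop still covers the full 30-day period.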
Solutions and Recommendations
Spikes vs Generic Traffic Increases
Oftentimes when we review reporting, and especially when we look at the graphical representation, we will see a steady increase in QPS. In these cases, this is most likely the product of generic traffic increasing for your domain or hostname. You may also see that these queries are for A or AAAA records, which is normal and to be expected depending on the time of year, or during peak months for business or web traffic.
Where things get really interesting is when we see large spikes, with increases and decreases in QPS statistics that do not correlate with any expected high-traffic time periods. We should also be on the lookout for reported record types other than A or AAAA, as there could be an errant system, such as an email server or a third-party service, unexpectedly driving requests against your domain or hostname.
As we talked about earlier, TTL values and possible adjustments can make a significant difference in overall total QPS. A long TTL will cause requesting servers to wait longer before refreshing domain information due to cached responses, resulting in fewer queries. The trade-off, however, is that a longer TTL will hold a potential negative answer longer, sometimes resulting in a loss of reachability to that endpoint. A shorter TTL makes it possible to make changes to your network faster, but it increases the number of queries against your hostname. A balance between the two extremes is best when considering your TTL settings in relation to your QPS reporting metrics.
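If you are working from the raw data points rather than the graphs, a simple way to separate spikes from a steady climb is to compare each data point against the median of the series. The Python sketch below is only one possible heuristic. The threshold of three times the median is an arbitrary assumption and the sample values are made up.

```python
from statistics import median


def find_spikes(qps_values: list, threshold: float = 3.0) -> list:
    """Return the indexes of data points above `threshold` times the median."""
    if not qps_values:
        return []
    baseline = median(qps_values)
    if baseline <= 0:
        return []
    return [i for i, value in enumerate(qps_values) if value > threshold * baseline]


# Hypothetical hourly QPS values, for illustration only.
steady_growth = [100, 110, 120, 130, 140, 150]
with_spike = [100, 110, 105, 900, 108, 102]
print(find_spikes(steady_growth))  # []
print(find_spikes(with_spike))     # [3]
```

The flagged indexes tell you which time periods to look at more closely, for example by re-running the report for that window with a breakdown by record type.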
Striking a balance can be a tricky process, and engaging support is always recommended so we can openly discuss the pros and cons in more detail.
A Large Number of Unexpected AAAA Queries:
Because some operating systems request AAAA records in addition to A records, some customers may see a lot of AAAA queries even though they do not have any AAAA records. This can be caused by the low minimum value on the Dyn SOA records, and it sometimes raises concern about driving QPS high enough for overages. It has been shown that it is possible to lower a zone’s overall QPS by editing the zone SOA record by hand and changing the SOA minimum value from 1 minute (the default) to 1 hour. With any change, however, we always recommend checking in, either via the API or through the QPS graphs in the UI, to ensure no unintended results are triggered as a consequence of an update.
Now that we have the ability to visually review QPS, pull raw data from the API, and implement solutions, we should be able to make informed decisions and act on concerns around your Managed DNS zones.
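To see why that change can matter, here is a back-of-the-envelope Python sketch. It assumes that the SOA minimum controls how long a resolver caches the empty AAAA answer, and that a busy resolver asks again as soon as the cached answer expires. Real resolver behavior varies, so treat the numbers as an upper bound for a single resolver and a single hostname, not a prediction.

```python
SECONDS_PER_DAY = 86_400


def max_refreshes_per_day(cache_seconds: int) -> int:
    """Upper bound on repeat queries per day from one resolver for one name."""
    if cache_seconds <= 0:
        raise ValueError("cache_seconds must be positive")
    return SECONDS_PER_DAY // cache_seconds


for label, seconds in (("1 minute", 60), ("1 hour", 3600)):
    print(label, max_refreshes_per_day(seconds))
# 1 minute 1440
# 1 hour 24
```

Under these assumptions, raising the minimum from 1 minute to 1 hour cuts the worst-case repeat queries from each resolver by a factor of 60.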
Looking for an even deeper dive into your QPS metrics? Please feel free to contact our Technical Support Team, and we will be happy to review, analyze and discuss with your team!