Quality Data Makes Internet Performance Solutions Intellectually Satisfying

The great chef Julia Child said, “I enjoy cooking with wine, sometimes I even put it in the food I’m cooking.” When she did use wine in dishes such as Boeuf Bourguignon or Coq au Vin, she also said that a dish is only as good as the wine you use.

The same can be said for data quality when measuring Internet Performance. With more attention focusing on ‘Big Data’ analytics, the temptation is to believe that more is better, and the Internet certainly generates tons of data. Cisco forecasts that Internet traffic will surpass the zettabyte threshold (1,000 billion gigabytes) by the end of 2016. That’s roughly 140 gigabytes for every man, woman, and child on earth. The volume needed for Big Data insight is certainly there. But assuming you have access to data and a plan to use it, how do you make sure it is useful and usable? Because the Internet is so ubiquitous and commercially critical, many Internet Performance solutions are appearing on the market. All of them will consume and create tons of data for analytics, but the ones that rise to the top will be those that use high-quality data. What, then, are the attributes of high-quality, useful Internet Performance data?

1. To date, Internet Performance has focused primarily on the inner workings of Internet-facing assets. This is typically the role of Application Performance Management (APM), which covers functions such as measuring page load times, server response times, and memory utilization, and tracing transactions. However, the Internet actually exists outside these assets, as the connectivity that binds everything together. True Internet Performance data needs to come from the path between customers/users and these Internet assets: from network providers, cloud companies, CDNs, and ISPs. There are hundreds of these companies across the globe that need to be measured.

2. Data can be collected from Internet sources in several ways. For example, there are Internet consortiums such as RIPE (Réseaux IP Européens) and RouteViews, crowd-sourcing approaches that collect Internet connection data by enabling personal computers and enterprises to monitor performance, and direct source collection. The first two approaches are not commercially reliable data sources, since they are voluntary, geographically random, inconsistent, and not necessarily timely. Only direct collection is commercially viable. The best solution is to collect from within the source, or as close to the source as you can get, to have broad global coverage, and to make sure the data covers the markets where your customers/users are.

3. Directly collected data comes with several considerations of its own. It needs to be generated or collected quickly enough to drive real-time or near real-time notifications. The data needs to be appropriate to the analysis. For example, for connectivity issues BGP data may be most appropriate, but for performance and content, Real User Monitoring (RUM) data may be the right fit. The data also needs to be as accurate as possible. For example, IP addresses are constantly being deployed and moved around. Just because an IP address is registered at a headquarters location doesn’t mean it is deployed there. Diligent screening, checking, cleansing, and triangulation of data are necessary to support accurate geolocation services, market analysis, and vendor comparisons.

4. Finally, if all this great data feeds multiple environments and can’t be brought together into a consolidated analysis, then it may be unusable, or at least really difficult to use. Apps, web pages, and services do not run on a single technology. They are increasingly complex integration points that may make hundreds of Internet connections per service. Travel sites regularly pull data from hundreds of sites to populate a page. Retail sites might be getting feeds from dozens of product sites to show a single page view. To really see the impact of Internet Performance on your business, you need all analysis combined in a single Internet Performance tool across cloud, CDN, and hosting assets.
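
To make the triangulation idea from point 3 a little more concrete, here is a minimal Python sketch of cross-checking where an IP address is actually deployed. It is an illustration only, not how any particular product works: the function name, the source names, and the simple majority-vote rule are all assumptions made up for this example.

```python
from collections import Counter


def triangulate_country(estimates, min_agreement=2):
    """Cross-check independent location estimates for one IP address.

    `estimates` maps a (hypothetical) source name to a country code,
    or to None when that source has no answer. Returns a tuple of
    (country, trusted). The country is None when there is no clear
    winner; trusted is True only when at least `min_agreement`
    sources agree on it.
    """
    votes = Counter(code for code in estimates.values() if code)
    if not votes:
        return None, False

    ranked = votes.most_common(2)
    country, count = ranked[0]
    # A tie for first place means the sources contradict each other.
    tied = len(ranked) > 1 and ranked[1][1] == count
    if tied:
        return None, False
    return country, count >= min_agreement


# Hypothetical estimates for a single address: the registry says
# headquarters, but two measurement-based sources say otherwise.
sample = {
    "registry": "US",
    "latency_probe": "DE",
    "rum_beacon": "DE",
}
print(triangulate_country(sample))
```

In this sample the registry record alone would place the address at headquarters, while the two measurement-based estimates outvote it. A real cleansing pipeline would weigh sources by reliability and freshness rather than counting them equally.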

While drinking a lot of wine (regardless of quality) may have its place, drinking and cooking with quality wine that interests your guests on many levels is the goal of every chef. The Big Data revolution gives us access to lots of data that has the potential to make a significant business impact on your online relationship with customers. But to really get value, make sure the data is quality data and that you can exploit its full value with a tool specifically designed for Internet Performance measurement.

Whois: Michael Kane

Michael Kane is a Senior Product Marketing Manager at Dyn. Michael leads the product marketing team in educating the industry about Dyn's suite of Internet Performance products. You can connect with Michael on LinkedIn.