This post previously appeared as part of a series on Network World.
When I think about the Internet, I think about the General Motors bankruptcy of 2009. Okay, maybe it’s not the first thing that pops to mind. But there’s a lesson in it for builders of networks.
It is hard not to draw an analogy between the rise of North American car culture and the development of the Internet. In the earliest days of car culture, it was a lot of work to use a car. You needed to be a reasonably good mechanic, and you were using a mode of transportation that was just as uncomfortable as any other, but unreliable and experimental as well. But this didn’t matter, because other enthusiasts like you were trying out the same things, and if the new technology turned out to work it would be a really big deal. Similarly, in the earliest days of the network, the users were mostly also developers of the technology. Only pretty geeky people could have thought of telnet or FTP as user-friendly.
But starting in the 1950s, North America reshaped itself around a particular car culture. Governments — through land-use rules and road-building programs and so on — encouraged suburban dwelling. People moved from crowded cities to subdivisions with yards. Multi-lane highways allowed quick and easy access to downtown. Gradually, the cars got better, too. Cars of the 1950s were impressive machines, but by modern lights they were mechanically unsophisticated, and they were terribly inefficient.
Internetworked computers of the late 1990s and early 2000s were reminiscent of those 1950s cars: ever more powerful, and impressive in their day, but now laughably inefficient and dangerous. Operating systems used to crash all the time. Services routinely ran as superuser, which meant that if an attacker could get into the service, the attacker could completely take over the computer: no seat belts! And those computers used an enormous amount of electricity and generated far too much heat.
Something happened to the car culture starting in the late 1960s. First, challenges from once-unimaginable places started to undermine the basic car culture. Cars were more ubiquitous than ever, but consumers began to buy smaller (and less profitable) models. Exhaust emissions and fuel economy became issues to worry about. Reliability gained importance. And governments — who had been enthusiastic boosters — started to look more carefully at automotive output. Then, they started regulating.
The challenges that hit some automakers in the late 2000s had lots of causes, but at least part of the cause was a certainty that “normal” would reassert itself. Some of the thinking about network infrastructure today shares this same problem.
Despite the amount of data people have about their systems, too many decisions are made on the basis of what everyone else is doing or what people have always done. Today we are accustomed to believing that “cloud” models rule. Internet of Things devices all call back to a central point. Companies take data that once would have been locked in a vault and store that information with cloud providers who might become competitors. People optimize network performance according to rules of thumb and usage patterns that may not reflect users’ experiences.
The selection of the cloud model, and the exact way that companies use it, is too often founded on poor metrics about how the applications in question will respond. Too many deployments are undertaken without a clear idea of how moving to the cloud affects the applications, and without a plan to rethink how the application needs to be designed and deployed. Just as introducing freeways and cars remade the city, moving a network service or application to the cloud means rethinking everything about what is being done. Taking a system designed to be in your own data center, and moving it to someone else’s data center, is just a good way to spend more money on data centers — in this case, someone else’s. The new technology environment requires a new mindset.
All of this is why better measurement — not just more, but better, in that it measures the right things — is so important for any successful cloud deployment. The irony is that, the more you outsource network functions on which you depend, the more active measurement of those vendors you need to do. Otherwise, you are just blindly following the current fashion, and badly.
When figuring out what and how to measure, you should attend to three issues:
- Does this piece of data tell me exactly one thing about a user’s experience? If not, it needs to be broken down into smaller pieces. Always measure one thing at a time.
- If I measure this thing in this way, am I measuring performance of the application itself, or am I accidentally measuring effects of the cloud application environment instead? It is easy to measure the wrong thing in the cloud, because the environment is so much more dynamic than your own data center.
- Is the measurement instrument an accurate model of what it is supposed to be measuring? For instance, many HTTP and DNS measurement systems measure behavior no application ever sees, or contain implicit assumptions about the way the network operates.
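To make the first principle concrete, here is a minimal sketch of measuring one thing at a time: timing the DNS lookup and the TCP handshake as two separate numbers, instead of one blended “response time” that hides which component is slow. The target is a stand-in local listener so the sketch runs anywhere; a real probe would point at your own service.

```python
# Sketch: separate DNS resolution time from TCP connect time,
# so each number tells you exactly one thing.
import socket
import threading
import time

def measure_dns(host, port):
    """Time only name resolution; return (resolved address, seconds)."""
    start = time.perf_counter()
    addr = socket.getaddrinfo(
        host, port, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )[0][4][0]
    return addr, time.perf_counter() - start

def measure_connect(addr, port):
    """Time only the TCP handshake, using a pre-resolved address."""
    start = time.perf_counter()
    with socket.create_connection((addr, port), timeout=5):
        return time.perf_counter() - start

# Local stand-in server so the sketch needs no outside network access.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=server.accept, daemon=True).start()

addr, dns_s = measure_dns("localhost", port)
tcp_s = measure_connect(addr, port)
print(f"DNS resolve: {dns_s * 1000:.2f} ms")
print(f"TCP connect: {tcp_s * 1000:.2f} ms")
```

If the connect number spikes while the resolve number stays flat, you know where to look — which is exactly what a single combined timing cannot tell you.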
In future installments here, I’ll turn to these questions and look at practical things that you can do, both if you’re directly involved in the technical work and also if you’re a technical manager.
The automakers who failed to thrive in the 2000s were not facing a crisis for the first time. Instead, they were weakened by years of ignoring the rich data that they had, or using it to tell themselves stories about a future recovery that didn’t come in time. Don’t let information bankruptcy get you! Understand the data you collect. Make sure you collect the right data rather than just using the data you happen to have. On a regular schedule, review how your measurements fit your application. Then you will be profiting from what you measure, and building better networks.