Cloud computing and the DevOps culture movement are teaching engineers new ways to deliver services to their end users through a phenomenal paradigm shift. The cloud has brought developers, infrastructure engineers (a.k.a. system administrators) and management a new set of advantages and disadvantages unlike anything we’ve seen before.
I thought it might be interesting to muse on the impact of the cloud on each of these archetypes: the rock star developer, the “get it back up and running” infrastructure engineer and the traditional corporate manager.
What are the lessons learned for each archetype as the shift to cloud has occurred?
The Rock Star Developer’s Perspective
For the rock star developer, two words come to mind: opportunity and quality. Opportunity comes in the form of being able to spin up computing resources with the swipe of a credit card, never worrying about available capacity, and being able to scale up continuously as the business grows.
The cloud has also offered developers a ubiquitous set of tools, making infrastructure suddenly company agnostic – meaning less time spent getting educated and ramped up on how a particular company’s infrastructure runs, and more agility for developers in making career decisions.
With respect to quality, it’s the frail nature of cloud systems that has resulted in more robust software being put into the open source world. It is now expected that a system will respond gracefully even if a number of computing resources disappear with no warning or good reason and certainly without apology.
The system is, instead, expected to “self-heal,” pick up the pieces and merrily move along.
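That expectation can be sketched in code. The sketch below is a hypothetical client that fails over across replicas and retries with exponential backoff when a compute resource vanishes mid-request; the replica functions and names are invented for illustration, not any particular library’s API.

```python
import random
import time

def call_with_failover(replicas, request, retries=3, backoff=0.1):
    """Try each surviving replica; back off and retry if all fail.

    `replicas` is a list of callables standing in for service endpoints;
    any of them may raise ConnectionError to simulate a compute resource
    disappearing with no warning and certainly without apology.
    """
    for attempt in range(retries):
        # Shuffle so no single replica becomes a hidden point of failure.
        for replica in random.sample(replicas, len(replicas)):
            try:
                return replica(request)
            except ConnectionError:
                continue  # this replica is gone; pick up the pieces, move along
        time.sleep(backoff * 2 ** attempt)  # exponential backoff before retrying
    raise RuntimeError("all replicas failed")

# Simulated endpoints: one instance has been terminated without warning.
def healthy(req):
    return f"handled {req}"

def vanished(req):
    raise ConnectionError("instance terminated")

print(call_with_failover([vanished, healthy, vanished], "GET /"))
```

The point is that the failure handling lives in the application itself, not in a RAID array or a redundant switch underneath it.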
Gone are the days of “reliable” RAID disk arrays, network connections controlled by a local network engineering team and someone driving to the data center to clean fibers when things seem flaky. Responsibility for reliability has shifted away from the traditional systems administrator and onto the developer’s shoulders, to be encoded directly into the application.
In this regard, we’ve effectively raised the bar for one to define himself or herself as a rock star developer. This means that the software engineers of yesterday are incompatible with the systems of today, and a skill-set evolution is required.
Given the complexity of a cloud environment, I’d question whether our current educational standards for computer science programs are going to yield the engineers we need 5-10 years from now to keep this stuff running.
Have we worked ourselves into an “unbootstrappable” corner? Is it possible for an engineer to understand how to develop for the cloud when they’ve never had to make something work on bare metal first?
The Infrastructure Engineer’s Perspective
The cloud has certainly evolved the technology platform from which we provide services today; the sheer scale of available computing resources has led to an explosion in the raw number of compute elements behind any given service.
For the infrastructure engineer, this means a shift in the way that these resources are deployed, provisioned, monitored and maintained. There’s no such thing as “doing it by hand” in the cloud world, especially since there is no way to guarantee a particular steady state of the cloud. Your resources may disappear at any time and you’ll be expected to rebuild them rapidly.
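That rapid rebuilding is usually automated as a reconciliation loop: compare the desired state against what actually exists and recreate the difference. Here is a minimal, hypothetical sketch; the `provision` callable stands in for a real cloud API call and is not any specific provider’s interface.

```python
def reconcile(desired, actual, provision):
    """Recreate any instance named in `desired` that is missing from `actual`.

    `provision` is a stand-in for a cloud API call that launches a server;
    returns the set of instance names running after reconciliation.
    """
    missing = desired - actual
    for name in sorted(missing):
        provision(name)  # in real life: an API request, not a local function
    return actual | missing

desired = {"web-1", "web-2", "web-3"}
actual = {"web-1"}  # web-2 and web-3 vanished overnight, no apology given
running = reconcile(desired, actual, provision=lambda n: print("launching", n))
print(sorted(running))
```

Because the loop is idempotent, it can run continuously: a steady state is never guaranteed, so the automation simply converges toward one, over and over.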
The work of managing the “mechanical” side of hardware systems – racking hardware, installing operating systems and provisioning accounts – has shifted to efforts that are nearly 100% software engineering oriented.
As a result, these engineers have become far more closely aligned with developers and management. It’s the infrastructure engineers who have survived massive failures in the bare metal server world who carry the institutional knowledge required to educate developers about how to survive failure in the cloud world.
The days of carefully selecting and testing the most optimal hardware system for a particular application have been replaced with a selection of a few popular server templates from any given cloud provider. Delivering availability through network redundancy, using protocols like BGP, OSPF, HSRP, VRRP, ECMP and the like, has been replaced with machines that have a single network connection and a single routable IP address.
Instead, the job is to ensure that these concepts become encoded into the software launched into the cloud as opposed to being maintained in a separate hardware layer.
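One common software-layer stand-in for that protocol-level redundancy is a health-checked endpoint pool: instead of VRRP or ECMP steering traffic beneath the application, the client itself rotates across addresses and skips the ones that stop answering. A hypothetical sketch, with invented addresses:

```python
class EndpointPool:
    """Round-robin over endpoints, skipping any marked unhealthy.

    A software substitute for network-layer redundancy: each endpoint
    here is a single-homed host with no HSRP/VRRP failover beneath it.
    """
    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self.healthy = set(endpoints)
        self._i = 0

    def next(self):
        # Scan at most one full rotation looking for a healthy endpoint.
        for _ in range(len(self.endpoints)):
            ep = self.endpoints[self._i % len(self.endpoints)]
            self._i += 1
            if ep in self.healthy:
                return ep
        raise RuntimeError("no healthy endpoints")

    def mark_down(self, ep):
        self.healthy.discard(ep)  # a health check would call this

pool = EndpointPool(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
pool.mark_down("10.0.0.2")  # single-homed host dropped off the network
print([pool.next() for _ in range(4)])
```

The redundancy the network team once maintained in a separate hardware layer now lives in a few dozen lines shipped with the application.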
The last word that characterizes this archetype is invisibility. With the buzz, hype and excitement of developing and living in the cloud, are we missing the emphasis that we need cloud builders to help run and maintain the systems we are all living in?
If we’re shifting to a world where servers exist after typing “knife ec2 server create,” who are the people who are truly launching the hardware resources “into the cloud” itself?
To me, the task of building a cloud, public or private, seems to be a one-time operation – the hardware platform is selected, the core code is written and the cloud is built. Every follow-on operation is merely designed to scale out the resources available to those consuming them.
Over time, as the only objective becomes to scale out the cloud, the engineers and designers most familiar with the hardware and software engineering functions may have moved on to new opportunities or have just forgotten how these things work.
Then what happens when something fails in an unexpected fashion? Are we doing what we need to do to get our next generation of cloud builders excited, or are we just teaching today’s graduates that “getting dirty” in the data center is something that someone else does for us?
The Corporate Manager’s Perspective
In this archetype, I’d argue that the core words that come to mind are risk and reward. Neither is new as a result of the cloud, but the cloud changes their scale dramatically.
On one hand, seeking the reward for efficiency and scale in the business is easily found in the cloud. Bills go up only when resources are needed and can come back down when they aren’t. It’s a huge business benefit to have this type of resource flexibility, akin to being able to fill up your vehicle’s gas tank in $1 increments.
This means that more conservative slow-start models can be employed in business, giving entrepreneurs and large companies insurance against the risk of launching new efforts.
However, in the risk category, the cloud moves a whole slew of previously known knowns into the category of known unknowns.
The cloud doesn’t always clearly define where data is stored, exactly how it is secured and how resources are allocated. The ability to react to situations is limited to what can be done over a series of API commands, as opposed to sending a team down to the data center.
In some ways, the ability to mitigate risk has been transferred from a company’s internal management team onto the shoulders of the cloud provider, but with no shift of accountability – as it should be.
What it all means
As Andrew Blum’s book Tubes: A Journey to the Center of the Internet demonstrates, the Internet is a series of layers, just like the OSI model. The clouds of today are still built on the same fundamental pieces of infrastructure we’ve always used: fiber optics, physical buildings, power systems, computing hardware and operating systems.
Cloud itself is an abstraction layer designed to shim certain complexities away from hardware and into software. It has pushed our developers and infrastructure engineers to create highly robust software capable of surviving significant disasters. And it has created alignment, which has given rise to the DevOps culture movement.
Undoubtedly, cloud computing is here to stay. It’s an efficient, eco-friendly and scalable way to operate today’s Internet and business applications.
But what if we take it too far? If all the masses ever know is the cloud, who will build and maintain it for us? Are we too focused on teaching our next generation of workers that the cloud is the only thing they will ever touch?