(This article originally appeared in Data Center Solutions Europe.)
Walk through a full data center and you may be surprised. You might expect to find racks overflowing with equipment, but instead you will see rows and rows of racks that are only one-third populated. The data center is nevertheless full, because it has run out of power and cooling. The new constraint on a full data center is physics, not physical space.
As a result, companies have to rethink how they scale. They can no longer afford to do as they have done in the past. In previous years, companies would deploy their infrastructure into a single data center. When their growth exceeded the space, they went looking for a larger data center and would pick everything up and move to a new facility.
This isn’t only happening to companies based in Silicon Valley. It is a global problem, and eventually there won’t be any larger facilities to move to. Suddenly, you no longer have a growth strategy. What happens then?
Well, what happens is that companies will have to adopt horizontal scaling of their data centers — a method in which you break up your infrastructure into small pieces and strategically place them around the world.
By load balancing across them, you can have access to all of your important data while always having a clear growth strategy, because you can always open up a new facility.
A great example of a company that does this is Iovation, which specializes in online fraud protection. They deployed facilities in Portland, Oregon, and Seattle, Washington, which provided tremendous benefits.
This approach gives them resiliency in the face of a natural disaster. If one of those facilities is knocked offline, the other is still up and running and the experience of their customers is not interrupted. Additionally, it is much easier to maintain. Since they have two facilities, they can take the pieces out of one that they need to work on and still share the load with their other facility.
If they only had one facility, all of their proverbial eggs would be in a single basket. This makes maintenance quite tricky. It would also make them vulnerable to downtime.
Of course, there are still challenges with horizontal scaling, the biggest being data replication and consistency. There are different approaches to dividing your data between facilities, and the level of consistency your business needs will play a role in which approach you take.
Think about your bank account as an example. The amount of money in your account should be exactly the same whether you’re accessing that information from a data center in Seattle or New York. However, for something like your home address (which is attached to your bank account) changing when you move, a bit of lag while the change replicates between data centers is acceptable.
Building your infrastructure so that it is capable of horizontal scaling across physical data centers or cloud regions is much easier when done from the beginning. You will have to make some tradeoffs, but at least you will not have backed yourself into a corner.
Of course, if you’re reading this and have all of your infrastructure in a single facility, you shouldn’t panic. There are plenty of examples of companies who were able to evolve from vertical to horizontal scaling.
Here is an example of how one popular tech giant did it.
The most common modes of operation for HTTP are “writing data” and “reading data.” What some companies have done is create global load balancing around their application so that when clients “write data” the request goes only to a single data center, while clients requesting to read data are balanced equally across all global data centers.
In many applications, there are far more reads than writes, and this scheme scales well. By doing so, they are able to spread out the reads and consolidate the changes, and rely on eventually replicating all changes made to the single data center to all other data centers for reading.
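The routing rule described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the tech giant’s actual implementation: the data center names and the use of the HTTP method to distinguish writes from reads are my own assumptions.

```python
import random

# Illustrative data center names (assumptions, not real facilities).
WRITE_DC = "dc-west"                       # the single "write data" facility
READ_DCS = ["dc-west", "dc-east", "dc-eu"]  # every facility serves reads

def route(http_method: str) -> str:
    """Pick the data center that should handle this request."""
    if http_method in ("POST", "PUT", "PATCH", "DELETE"):
        return WRITE_DC                    # consolidate all changes in one place
    return random.choice(READ_DCS)         # spread reads across the globe

print(route("POST"))   # always the write facility
print(route("GET"))    # any facility
```

Because reads vastly outnumber writes in many applications, adding a read facility adds capacity where it is needed most, while the write path stays simple.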
Folks familiar with these strategies will quickly point out a flaw: how can you guarantee that a user who has just made a change will see that new information on subsequent reads, rather than a stale view?
The solution deployed was rather clever. For a few minutes after “writing data,” any user that made a change was essentially “pinned” to the single “write data” data center for reading data as well, ensuring all renders of the application were performed with the most updated data available. After a few moments, allowing sufficient time for the change to replicate to other data centers (often referred to as an “eventually consistent” strategy), the user accessing the application would be directed to the other “read only” facilities.
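The pinning logic can be sketched as follows. Again, this is a simplified assumption-laden illustration: the five-minute pin window, the data center names, and the in-memory table of last-write times are all placeholders for whatever the real system used.

```python
import random
import time

WRITE_DC = "dc-west"                        # illustrative names (assumptions)
READ_DCS = ["dc-west", "dc-east", "dc-eu"]
PIN_SECONDS = 300                           # time allowed for replication to catch up

_last_write = {}                            # user id -> timestamp of last write

def record_write(user: str) -> None:
    """Note that this user just wrote data; their reads get pinned."""
    _last_write[user] = time.time()

def route_read(user: str) -> str:
    """Pin recent writers to the write facility; balance everyone else."""
    pinned_until = _last_write.get(user, 0) + PIN_SECONDS
    if time.time() < pinned_until:
        return WRITE_DC                     # guaranteed-fresh reads
    return random.choice(READ_DCS)          # eventually consistent reads

record_write("alice")
print(route_read("alice"))   # pinned to the write facility
print(route_read("bob"))     # any facility
```

Once the pin expires, the user quietly rejoins the global read pool, and the extra load on the write facility is limited to recent writers.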
This approach, while requiring some application awareness in the load balancing strategy, can be a highly effective way to scale horizontally across multiple data centers while ensuring data consistency. However, it is not a silver bullet. There are tradeoffs, but it is a good stepping stone toward a horizontally scaled architecture across many data centers, depending on volume.
The future of scaling is happening now. It is lighter, smaller, and more nimble. It is an excellent model for ensuring that your services are always available, easy to service, and resilient against natural disasters.
Now if only growing the business was so easy.