Load Balancing at the Frontend

We serve many millions of requests every second and, as you may have already guessed, we use more than a single computer to handle this demand. But even if we did have a supercomputer that was somehow able to handle all these requests (imagine the network connectivity such a configuration would require!), we still wouldn’t employ a strategy that relied upon a single point of failure; when you’re dealing with large-scale systems, putting all your eggs in one basket is a recipe for disaster.

This lesson focuses on high-level load balancing—how we balance user traffic between datacenters.

Power Isn’t the Answer

For the sake of argument, let’s assume we have an unbelievably powerful machine and a network that never fails. Would that configuration be sufficient to meet Google’s needs? No. Even this configuration would still be limited by the physical constraints associated with our networking infrastructure. For example, the speed of light is a limiting factor on the communication speeds for fiber optic cable, which creates an upper bound on how quickly we can serve data based upon the distance it has to travel. Even in an ideal world, relying on an infrastructure with a single point of failure is a bad idea.
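
To put a rough number on that constraint: light in fiber travels at roughly two-thirds the speed of light in a vacuum, about 200,000 km/s, so distance alone sets a floor on round-trip latency. The sketch below works through the arithmetic; the speed figure is an approximation and the route distances are illustrative great-circle estimates, not measured fiber paths.

    # Back-of-the-envelope bound: light in fiber travels at roughly
    # two-thirds of c, about 200,000 km/s, so distance alone puts a
    # floor on round-trip latency. Real fiber routes are longer than
    # great-circle distance, so actual latency is higher still.
    SPEED_IN_FIBER_KM_PER_S = 200_000  # approximate

    def min_rtt_ms(distance_km: float) -> float:
        """Lower bound on round-trip time, in milliseconds."""
        return 2 * distance_km / SPEED_IN_FIBER_KM_PER_S * 1000

    for route, km in [("New York -> London", 5_600),
                      ("San Francisco -> Sydney", 12_000)]:
        print(f"{route}: at least {min_rtt_ms(km):.0f} ms round trip")

That works out to a hard floor of roughly 56 ms round trip between New York and London, and about 120 ms between San Francisco and Sydney. No amount of server horsepower changes those numbers; only serving users from somewhere closer does.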

In reality, Google has thousands of machines and even more users, many of whom issue multiple requests at a time. Traffic load balancing is how we decide which of the many, many machines in our ...
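
Although the excerpt cuts off here, the decision it describes can be sketched in a few lines: route each user to the nearest datacenter that still has spare capacity. Everything in this sketch is invented for illustration (the datacenter names, distances, capacity figures, and the pick_datacenter helper); it is not Google's algorithm, which layers DNS, anycast, and global policy on top of decisions like this one.

    from __future__ import annotations
    from dataclasses import dataclass

    @dataclass
    class Datacenter:
        name: str
        distance_km: float  # distance from this user (hypothetical)
        spare_qps: int      # remaining capacity (hypothetical)

    def pick_datacenter(candidates: list[Datacenter],
                        needed_qps: int) -> Datacenter | None:
        """Choose the closest datacenter with enough spare capacity."""
        viable = [dc for dc in candidates if dc.spare_qps >= needed_qps]
        return min(viable, key=lambda dc: dc.distance_km, default=None)

    # One user's view of the fleet: dc-east is closest but full, so
    # traffic spills over to the next-closest site with headroom.
    fleet = [
        Datacenter("dc-east", distance_km=300, spare_qps=0),
        Datacenter("dc-central", distance_km=1_200, spare_qps=50_000),
        Datacenter("dc-west", distance_km=4_000, spare_qps=80_000),
    ]
    choice = pick_datacenter(fleet, needed_qps=1_000)
    print(choice.name if choice else "no capacity")  # -> dc-central

Note the built-in failure mode this toy avoids: preferring proximity alone would send everyone to dc-east, the single basket the opening paragraph warns against. Capacity-aware spillover is what makes the choice a load-balancing decision rather than a routing one.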
