System Architecture: Load Balancing
Load Balancers (LBs) are a critical component of any distributed system. They distribute incoming client requests or network traffic efficiently across multiple servers, ensuring that no single server bears too much demand.
1. Core Functions of a Load Balancer
By balancing application requests across multiple servers, a load balancer reduces individual server load and prevents any one application server from becoming a single point of failure (SPOF).
- Health Checking: LBs continuously monitor the "health" of backend servers (usually via a /health HTTP endpoint). If a server fails a health check, the LB automatically removes it from the pool and reroutes traffic to healthy servers.
- SSL Termination: LBs often handle the decryption of incoming HTTPS traffic, relieving backend servers of SSL/TLS overhead.
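- Session Persistence (Sticky Sessions): Routes a given client to the same backend server on every request, typically via a cookie, when session state is held on that server.
  - Note: This is often considered an anti-pattern in modern stateless microservices architectures, because it ties a client to one server, skews load distribution, and loses the session if that server fails.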
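
The health-checking behavior described above can be sketched as a small pool that tracks consecutive probe failures. This is a minimal illustration, not how any particular load balancer implements it: the class name `HealthCheckedPool`, the `probe` callback (standing in for an HTTP request to the health endpoint), and the threshold of three failures are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HealthCheckedPool:
    """Tracks which backends are currently eligible to receive traffic."""
    backends: List[str]
    # Hypothetical probe: returns True when the backend answers its health endpoint.
    probe: Callable[[str], bool]
    # Assumed threshold: consecutive failures before a backend is evicted.
    unhealthy_after: int = 3
    _failures: Dict[str, int] = field(default_factory=dict)

    def run_checks(self) -> None:
        """Probe every backend once and update its failure streak."""
        for backend in self.backends:
            if self.probe(backend):
                self._failures[backend] = 0  # one success restores the backend
            else:
                self._failures[backend] = self._failures.get(backend, 0) + 1

    def healthy(self) -> List[str]:
        """Backends whose failure streak is below the eviction threshold."""
        return [b for b in self.backends
                if self._failures.get(b, 0) < self.unhealthy_after]


# Usage: simulate one backend going down, then recovering.
down = {"10.0.0.2"}
pool = HealthCheckedPool(["10.0.0.1", "10.0.0.2"], probe=lambda b: b not in down)
for _ in range(3):
    pool.run_checks()
print(pool.healthy())  # ['10.0.0.1']
down.clear()
pool.run_checks()
print(pool.healthy())  # ['10.0.0.1', '10.0.0.2']
```

Requiring several consecutive failures before eviction is a common way to avoid removing a server because of a single dropped probe; the right threshold and probe interval depend on the deployment.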
2. Placement of Load Balancers
In massive-scale architectures, LBs are deployed at multiple layers to ensure redundancy and low-latency routing.
graph TD
Client --> LB1[Edge Load Balancer]
LB1 --> API1[API Gateway 1]
LB1 --> API2[API Gateway 2]
API1 --> LB2[Internal Load Balancer]
API2 --> LB2
LB2 --> WebA[Web Server A]
LB2 --> WebB[Web Server B]
WebA --> LB3[Database Load Balancer]
WebB --> LB3
LB3 --> DB1[(Primary DB)]
LB3 --> DB2[(Read Replica)]
3. Load Balancing Algorithms
| Algorithm | Mechanism | Best Use Case |
|---|---|---|
| Round Robin | Cycles through servers in order (A -> B -> C -> A) | Equal hardware servers with similar request costs |
| Weighted Round Robin | Assigns more requests to servers with higher weight | Heterogeneous clusters (some servers faster than others) |
| Least Connections | Routes to server with fewest active connections | Varying request durations (e.g., long WebSockets) |
| IP Hash | Hashes client IP to determine server | Cookie-free session affinity per client IP (many users behind one NAT address share a server) |
| Consistent Hashing | Maps keys and servers onto a hash ring, so adding or removing a server remaps only a fraction of keys | Distributed caching and storage (see Architecture Patterns) |
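
The algorithms in the table can be sketched in a few lines each. The following is an illustrative sketch only: the function and class names are invented for this example, the weighted variant is the naive "repeat each server `weight` times" form (production balancers often interleave picks more smoothly), and the ring uses 100 virtual nodes per server as an assumed default.

```python
import bisect
import hashlib
import itertools
from typing import Dict, Iterator, List


def _stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in a fixed repeating order: A, B, C, A, ..."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighting: each server appears `weight` times per cycle."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server with the fewest open connections (ties: first listed)."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server; stable only while the server list is unchanged."""
    return servers[_stable_hash(client_ip) % len(servers)]


class HashRing:
    """Consistent hashing: each server owns many points on a sorted ring."""

    def __init__(self, servers: List[str], vnodes: int = 100) -> None:
        ring = sorted((_stable_hash(f"{s}#{i}"), s)
                      for s in servers for i in range(vnodes))
        self._points = [point for point, _ in ring]
        self._owners = [owner for _, owner in ring]

    def lookup(self, key: str) -> str:
        """Return the owner of the first ring point clockwise from the key."""
        idx = bisect.bisect(self._points, _stable_hash(key)) % len(self._points)
        return self._owners[idx]


# Usage
rr = round_robin(["A", "B", "C"])
print([next(rr) for _ in range(4)])                   # ['A', 'B', 'C', 'A']
wrr = weighted_round_robin({"big": 2, "small": 1})
print([next(wrr) for _ in range(3)])                  # ['big', 'big', 'small']
print(least_connections({"A": 12, "B": 3, "C": 7}))   # B
```

The difference between the last two rows shows up when the pool changes: with `ip_hash`, changing `len(servers)` can move almost every client, whereas with `HashRing`, removing a server only reassigns the keys that server owned.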
4. Redundancy: Active-Passive vs. Active-Active
To ensure the Load Balancer itself is not a single point of failure:
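- Active-Passive: One LB handles all traffic while the secondary (passive) LB monitors the primary, typically through heartbeats. If the active LB fails, the passive one takes over, usually by claiming a shared virtual IP, so clients see only a brief interruption.
- Active-Active: Both LBs route traffic simultaneously, which raises aggregate throughput (up to roughly double). Each LB should still be sized to carry the full load alone, otherwise a failure overloads the survivor. This setup requires multiple public IPs or DNS-based load balancing (Global Server Load Balancing).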
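
The active-passive takeover can be sketched as a standby node that promotes itself when heartbeats from the active LB stop arriving. This is a simplified model under stated assumptions: the `StandbyMonitor` name, the 3-second timeout, and passing timestamps in explicitly are choices made for the example, and the actual takeover step (for instance, claiming a virtual IP) is only marked by a comment.

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    """The passive LB's view of the active LB, driven by heartbeat timestamps."""
    timeout_s: float = 3.0       # assumed: silence longer than this triggers failover
    last_heartbeat: float = 0.0  # time of the most recent heartbeat, in seconds
    is_active: bool = False      # True once this node has taken over

    def on_heartbeat(self, now: float) -> None:
        """Record that the active LB was alive at time `now`."""
        self.last_heartbeat = now

    def tick(self, now: float) -> bool:
        """Promote this node if the active LB has been silent for too long."""
        if not self.is_active and now - self.last_heartbeat > self.timeout_s:
            # A real system would claim the shared virtual IP at this point.
            self.is_active = True
        return self.is_active


# Usage: the last heartbeat arrives at t=10s, then the primary goes silent.
standby = StandbyMonitor()
standby.on_heartbeat(10.0)
print(standby.tick(12.0))  # False (2.0s of silence, within the timeout)
print(standby.tick(13.5))  # True  (3.5s of silence, standby takes over)
```

The timeout is a trade-off: a short value shortens the outage after a real failure but raises the risk of both nodes believing they are active if heartbeats are merely delayed.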
5. Practical Implementation
Explore the implementation of proxying, routing, and distribution logic in the repository:
- Rate Limiting & Proxying: Infrastructure Challenges: Redis Rate Limiter
- Job Scheduling & Worker Distribution: Infrastructure Challenges: Dockerized Job Scheduler