Okay, let's continue with 2.1.b Load Balancing (within the Scalability section). We touched on load balancers in Phase 1, but now we'll go into more depth, as they are critical for horizontal scaling.

Review: A load balancer distributes incoming network traffic across multiple servers (a server pool or cluster). This prevents any single server from becoming a bottleneck, improving performance, availability, and scalability.

Algorithms (Review and Expansion):
- Round Robin: Simple, distributes requests sequentially.
- Least Connections: Sends requests to the server with the fewest active connections. Good when requests have varying processing times.
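- IP Hash: Hashes the client's IP address to choose the server. Gives a basic form of session persistence (requests from the same client IP go to the same server), but only while the server pool stays the same, and clients that share an IP (e.g., behind NAT) all land on the same server.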
- Weighted Round Robin: Servers are assigned weights; higher-weight servers get more requests.
- Weighted Least Connections: Combines least connections with server weights.
- Resource-Based (Adaptive): Considers server resource utilization (CPU, memory) when making decisions. More complex but can be more efficient.
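- URL Hash: Hashes the request URL to choose the server, so requests for the same URL always go to the same server. Useful in front of caching servers, since each URL is cached in one place.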
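To make a few of these concrete, here is a minimal Python sketch of Round Robin, Weighted Round Robin, Least Connections, and IP Hash. The class names, the `pick`/`release` methods, and the server labels are made up for illustration; a real load balancer tracks connections itself and handles concurrency, which this sketch ignores.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in order, wrapping around at the end."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self, client_ip=None):
        return next(self._cycle)


def weighted_round_robin(weights):
    """Weighted Round Robin as plain Round Robin over a repeated list.

    `weights` maps server -> positive integer weight. This naive version
    sends a server's share in a burst (a, a, a, b); production balancers
    usually interleave the picks more smoothly.
    """
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return RoundRobin(expanded)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self, client_ip=None):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # caller must call release() when the request ends
        return server

    def release(self, server):
        self.active[server] -= 1


class IPHash:
    """Map a client IP to a server with a stable hash."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip):
        # Use hashlib rather than built-in hash(), which is randomized per process.
        digest = hashlib.sha256(client_ip.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]
```

Note that `IPHash` uses a plain modulo, so adding or removing a server remaps most clients. That is the reason the persistence it gives only holds while the pool is stable.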

Session Persistence (Stickiness):
- Problem: Some applications require that all requests from a particular client be handled by the same server (e.g., to maintain session state).
- Solutions:
  - IP Hash: (As mentioned above)
  - Sticky Sessions (using cookies): The load balancer sets a cookie on the client's browser that identifies the server. Subsequent requests with that cookie are routed to the same server.
  - Centralized Session Store: Store session data in a separate, shared location (e.g., a database or cache like Redis) that all servers can access. This is the most scalable and fault-tolerant approach: any server can handle any request, and losing a server doesn't lose its sessions.
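Here is a rough Python sketch contrasting the cookie-based and centralized-store approaches. Everything in it is a stand-in: the cookie name `lb_server`, the `StickyBalancer` and `AppServer` classes, and the plain dict playing the role of Redis or a database are all hypothetical.

```python
import itertools

COOKIE_NAME = "lb_server"  # hypothetical cookie name


class StickyBalancer:
    """Cookie-based stickiness: remember the chosen server in a cookie."""

    def __init__(self, servers):
        servers = list(servers)
        self.servers = set(servers)
        self._cycle = itertools.cycle(servers)

    def route(self, cookies):
        """Return (server, cookies_to_set) for a request's cookie dict."""
        server = cookies.get(COOKIE_NAME)
        if server in self.servers:
            return server, {}  # known client: keep it on its server
        # New client, or the cookie names a server that left the pool.
        server = next(self._cycle)
        return server, {COOKIE_NAME: server}


class AppServer:
    """Centralized store: the server keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared dict standing in for Redis or a database

    def handle(self, session_id):
        session = self.store.setdefault(session_id, {"views": 0})
        session["views"] += 1
        return session["views"]
```

With the shared store, two different `AppServer` instances handling the same `session_id` see the same counter, so the load balancer is free to use any algorithm and no stickiness is needed.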

Health Checks:
- Purpose: Load balancers need to know which servers are healthy and available to handle requests.
- Mechanism: The load balancer periodically sends requests (health checks) to each server. If a server fails to respond correctly, the load balancer stops routing traffic to it, and adds it back to the pool once it passes its checks again.
- Types of Health Checks:
  - Simple Ping: Checks if the server is reachable over the network (ICMP). It says nothing about whether the application itself is running.
  - TCP Connection Check: Attempts to establish a TCP connection.
  - HTTP Status Code Check: Sends an HTTP request and checks for a successful response (e.g., 200 OK).
  - Custom Health Check: A more complex check that verifies the application's specific health status.
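As a sketch of the mechanism, the Python below runs a probe against each server and moves servers in and out of the pool. The `HealthChecker` class and the `fall`/`rise` thresholds (how many consecutive failures or successes flip a server's state) are assumptions for illustration; many real load balancers have similar settings, but the names and defaults here are made up. `tcp_check` shows one possible probe, the TCP connection check.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """TCP connection check: healthy if a connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthChecker:
    """Track which servers are healthy based on repeated probe results."""

    def __init__(self, servers, probe, fall=2, rise=2):
        self.probe = probe  # callable: server -> bool
        self.fall = fall    # consecutive failures before removal (assumed default)
        self.rise = rise    # consecutive successes before re-adding (assumed default)
        self.healthy = {s: True for s in servers}
        # Consecutive probe results that disagree with the current state.
        self._streak = {s: 0 for s in servers}

    def run_once(self):
        """Probe every server once; call this on a timer in a real system."""
        for server, is_healthy in list(self.healthy.items()):
            ok = self.probe(server)
            if ok == is_healthy:
                self._streak[server] = 0
                continue
            self._streak[server] += 1
            needed = self.rise if ok else self.fall
            if self._streak[server] >= needed:
                self.healthy[server] = ok
                self._streak[server] = 0

    def pool(self):
        """Servers that should currently receive traffic."""
        return [s for s, ok in self.healthy.items() if ok]
```

Requiring several consecutive results before changing state keeps one dropped probe from pulling a healthy server out of rotation. To use the TCP probe, something like `HealthChecker(servers, probe=lambda s: tcp_check(s, 80))` would work, with port 80 as a placeholder.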

SSL Termination (SSL Offloading):
- Purpose: Decrypt HTTPS traffic at the load balancer instead of on the individual servers. This reduces the load on the servers and simplifies certificate management, since certificates live in one place.
- How it Works: The load balancer handles the SSL/TLS encryption and decryption. Communication between the load balancer and the backend servers can then be unencrypted (HTTP) if they are on a secure private network, or re-encrypted if end-to-end encryption is required.

Layer 4 vs. Layer 7 Load Balancing:
- Layer 4 (Transport Layer): Load balancers operate on the transport layer (TCP/UDP). They make decisions based on IP addresses and ports only. Simpler and faster, because they never inspect the request content.
- Layer 7 (Application Layer): Load balancers operate on the application layer (HTTP/HTTPS). They can make decisions based on the content of the request (e.g., URL, headers, cookies). More flexible and powerful. Allows for things like:
  - Content-Based Routing: Routing requests to different servers based on the URL path (e.g., sending requests for /images to a different server pool than requests for /api).
  - Header Manipulation: Adding, removing, or modifying HTTP headers.
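A small Python sketch of both Layer 7 features. The pool names, the `ROUTES` table, and the function names are hypothetical. `X-Forwarded-For` is a widely used header for passing the client's IP to the backend, chosen here only as an example of a header a load balancer might add.

```python
from urllib.parse import urlsplit

# Hypothetical routing table: path prefix -> server pool name.
ROUTES = [
    ("/images", "image-pool"),
    ("/api", "api-pool"),
]
DEFAULT_POOL = "web-pool"


def choose_pool(url):
    """Content-based routing: pick a pool from the URL path."""
    path = urlsplit(url).path
    for prefix, pool in ROUTES:
        # Match "/api" and "/api/...", but not "/apiary".
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return DEFAULT_POOL


def add_forwarding_headers(headers, client_ip):
    """Header manipulation: return a copy with the client IP appended.

    Header names are matched exactly here for brevity; real HTTP header
    names are case-insensitive.
    """
    out = dict(headers)
    prior = out.get("X-Forwarded-For")
    out["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    return out
```

A Layer 4 load balancer couldn't do either of these, because it never parses the HTTP request and so never sees the path or headers.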

Types (revisited):
- Hardware, Software, Cloud-Based: As discussed earlier. For Google, be familiar with cloud-based load balancers (GCP's offerings).