Scenario: When one server is not enough

Guided System Design Lesson: Load Balancing

A small app gets a launch spike that one server cannot handle cleanly.

Briefing

Load balancing

Adding servers multiplies capacity only when a load balancer spreads traffic between them.

  • Watch horizontal scaling in the simulator.
  • Watch load balancing in the simulator.
  • Watch server bottlenecks in the simulator.

Contract

Uptime: 95%

P95 latency: 260 ms

Budget: $430/mo

Traffic shape

A viral spike that ramps up quickly and decays unevenly over 24 hours. Baseline: 120 users; peak: around 1,450 users.
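
To make the gap between baseline and peak concrete, here is a minimal sizing sketch. The scenario does not state how many users one server can handle, so USERS_PER_SERVER (400) and the 70% target utilization are assumptions chosen only for illustration, and servers_needed is a hypothetical helper, not part of the simulator.

```python
import math

BASELINE_USERS = 120     # from the scenario
PEAK_USERS = 1450        # from the scenario ("around 1,450")
USERS_PER_SERVER = 400   # ASSUMPTION: not given by the scenario


def servers_needed(users: int, users_per_server: int,
                   target_utilization: float = 0.7) -> int:
    """Smallest server count that keeps each server at or below the
    target utilization, assuming traffic is spread evenly."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_per_server = users_per_server * target_utilization
    return max(1, math.ceil(users / usable_per_server))


baseline_servers = servers_needed(BASELINE_USERS, USERS_PER_SERVER)
peak_servers = servers_needed(PEAK_USERS, USERS_PER_SERVER)
print(f"baseline: {baseline_servers} server(s), peak: {peak_servers} server(s)")
```

Under these assumed numbers a single server covers the baseline but the peak needs several, and that estimate only holds if requests are actually spread evenly across the servers.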

Available components

Server

HTTP request handler. Every web app needs at least one server. More servers let you handle more simultaneous requests before latency starts climbing.

Postgres

Primary data store. Without a database, your app has no memory. Most dynamic requests eventually depend on it.

LB

Load balancer. If you run more than one server, something needs to decide where each request goes. That is the load balancer.
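
One simple way a load balancer can decide where each request goes is round-robin: hand requests to the servers in a fixed rotation. The sketch below illustrates that idea only. The lesson does not say which strategy the simulator's LB uses, and RoundRobinBalancer and the server names are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that rotates through servers in order."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list indefinitely, in order.
        self._rotation = cycle(self._servers)

    def route(self, request_id):
        """Return the server that should handle this request.

        request_id is ignored by round-robin; it is kept in the signature
        so other strategies (e.g. hashing) could use it.
        """
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = Counter(balancer.route(request_id) for request_id in range(9))
print(dict(assignments))
```

With nine requests and three servers, each server receives three requests. Round-robin ignores how busy each server is, so real balancers often use other strategies such as least-connections; it is shown here because it is the easiest to reason about.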

Common mistakes

  • Adding capacity after the bottleneck has already saturated.

Interview adjacency

  • Scale a web service
  • Explain horizontal scaling
  • Design a social feed