Black Friday Traffic: Designing for Flash Sale Load
E-commerce platform. You know the spike is coming. Budget is generous but the SLA is brutal.
Briefing
Predictable spikes & overprovisioning
Predictable spike planning means provisioning enough capacity before a known surge arrives, then judging whether that headroom was worth the cost.
The architecture can be technically correct and still fail if capacity arrives after the peak or consumes too much budget.
The optimal architecture is not the biggest one. It is the cheapest one that still survives the peak.
- Design for peak traffic, not the quiet baseline.
- Use cache, CDN, queue, and replicas to keep one tier from absorbing the entire spike.
- Keep enough budget margin for live incidents during the surge.
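One way to make "the cheapest one that still survives the peak" concrete is to filter candidate designs by peak capacity and budget margin, then take the lowest cost. This is a minimal sketch: the candidate names, capacities, costs, and the 15% incident margin are made-up illustrative values. Only the 6,500-user peak and the $800/mo budget come from this scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    peak_capacity_users: int  # hypothetical: users sustained within the latency target
    monthly_cost: float       # hypothetical: USD per month


def cheapest_survivor(candidates, peak_users, budget, incident_margin=0.15):
    """Return the cheapest candidate that covers the peak and leaves budget margin.

    incident_margin is an assumed fraction of budget held back for live incidents.
    Returns None when no candidate qualifies.
    """
    spend_limit = budget * (1 - incident_margin)
    viable = [
        c for c in candidates
        if c.peak_capacity_users >= peak_users and c.monthly_cost <= spend_limit
    ]
    return min(viable, key=lambda c: c.monthly_cost, default=None)


candidates = [
    Candidate("two servers, no cache", 3_000, 240.0),
    Candidate("four servers + cache + CDN", 7_500, 560.0),
    Candidate("ten servers, no cache", 9_000, 790.0),
]

best = cheapest_survivor(candidates, peak_users=6_500, budget=800.0)
print(best.name if best else "no viable design")
```

With these made-up numbers, the smallest design fails on capacity and the largest fails on budget margin, which is the trade-off the briefing describes.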
Contract
Availability: 99.9%
Latency: 150ms
Budget: $800/mo
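To get a feel for how tight the contract is, it can help to convert the percentage into minutes. The sketch below assumes the 99.9% figure is an availability target and that it is measured over a fixed window; the page does not state the measurement window, so both windows shown are assumptions.

```python
def downtime_budget_minutes(availability_pct: float, window_hours: float) -> float:
    """Minutes of allowed unavailability in a window at the given availability target."""
    return window_hours * 60 * (100 - availability_pct) / 100


# Assumed measurement windows: a 30-day month and the 48-hour sale itself.
monthly = downtime_budget_minutes(99.9, 30 * 24)
sale = downtime_budget_minutes(99.9, 48)

print(f"30-day month: {monthly:.1f} min")
print(f"48-hour sale: {sale:.2f} min")
```

Measured over the sale alone, the allowance is under three minutes, so a single slow restart could consume the whole budget.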
Traffic shape
Flash-sale surge with a steep ramp and sustained peak. Baseline 200 users; peak around 6,500 users over 48 hours.
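The gap between baseline and peak drives the sizing. As a rough sketch, the function below estimates how many servers are needed once a cache or CDN absorbs part of the load. The per-server capacity of 500 users and the 60% offload figure are hypothetical; only the 200-user baseline and the 6,500-user peak come from the scenario.

```python
import math


def servers_needed(users: int, users_per_server: int, offload_pct: int = 0) -> int:
    """Servers required after offload_pct percent of load is served by cache/CDN."""
    if not 0 <= offload_pct <= 100:
        raise ValueError("offload_pct must be between 0 and 100")
    remaining = users * (100 - offload_pct)  # user load still reaching servers, times 100
    return max(1, math.ceil(remaining / (100 * users_per_server)))


BASELINE_USERS = 200
PEAK_USERS = 6_500
USERS_PER_SERVER = 500  # hypothetical capacity per server within the latency target

print(PEAK_USERS / BASELINE_USERS)                                  # peak-to-baseline ratio
print(servers_needed(BASELINE_USERS, USERS_PER_SERVER))             # baseline fleet
print(servers_needed(PEAK_USERS, USERS_PER_SERVER))                 # peak, no offload
print(servers_needed(PEAK_USERS, USERS_PER_SERVER, offload_pct=60)) # peak, 60% offloaded
```

Under these assumptions, offloading reads roughly halves the server count at peak, which is where the budget headroom comes from.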
Available components
Server
HTTP request handler. Every web app needs at least one server. More servers let you handle more simultaneous requests before latency starts climbing.
Postgres
Primary data store. Without a database, your app has no memory. Most dynamic requests eventually depend on it.
Redis
In-memory cache layer. Popular pages, profiles, and product data often get requested again and again. Serving those from memory is much faster and cheaper.
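A common way to put a cache in front of the database is the cache-aside pattern: check the cache, fall back to the database on a miss, then store the result with a time-to-live. The sketch below uses a plain dict as a stand-in for Redis so that it runs without a server; `CacheAside`, `load_product`, and the 30-second TTL are hypothetical names and values.

```python
import time


class CacheAside:
    """Cache-aside reads with a TTL. A dict stands in for Redis in this sketch."""

    def __init__(self, load_from_db, ttl_seconds=30.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit: no database work
        value = self._load(key)                  # cache miss: read from the database
        self._store[key] = (now + self._ttl, value)
        return value


db_calls = []


def load_product(product_id):
    """Hypothetical database read; records each call so the offload is visible."""
    db_calls.append(product_id)
    return {"id": product_id, "name": f"product-{product_id}"}


cache = CacheAside(load_product)
for _ in range(1000):
    cache.get(42)

print(f"requests: 1000, database reads: {len(db_calls)}")
```

The TTL is the knob that trades freshness for offload: a longer TTL means fewer database reads but staler prices or stock counts.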
LB
Load balancer. If you run more than one server, something needs to decide where each request goes. That is the load balancer.
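The simplest routing decision is round-robin, where each request goes to the next server in turn. This is only one of several possible policies, and the server names below are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical server names
counts = Counter(lb.pick() for _ in range(9))
print(dict(counts))
```

Round-robin ignores how busy each server is; policies such as least-connections may behave better when request costs vary widely.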
Queue
Async job buffer. Moving background work out of the request path keeps the app responsive even when extra processing is needed.
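As a minimal sketch of the idea, the handler below accepts a checkout, puts the slow follow-up work on a queue, and returns immediately while a worker drains the queue in the background. It uses Python's in-process `queue.Queue` as a stand-in for a real message broker; `handle_checkout`, the receipt job, and the buffer size are hypothetical.

```python
import queue
import threading

jobs = queue.Queue(maxsize=1000)  # bounded buffer; size is an assumed value
sent = []


def worker():
    """Drain jobs in the background until a None sentinel arrives."""
    while True:
        order_id = jobs.get()
        try:
            if order_id is None:
                return
            sent.append(f"receipt for order {order_id}")  # stand-in for slow work
        finally:
            jobs.task_done()


def handle_checkout(order_id):
    """Request path: enqueue the follow-up work and respond without waiting for it."""
    jobs.put_nowait(order_id)  # raises queue.Full if the buffer is exhausted
    return {"order_id": order_id, "status": "accepted"}


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_checkout(i) for i in range(3)]
jobs.join()      # demo only: wait for the backlog to drain
jobs.put(None)   # stop the worker
thread.join()

print(responses[0]["status"], len(sent))
```

The bounded queue matters during a spike: when it fills, the handler gets an explicit `queue.Full` to act on instead of unbounded memory growth.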
Replica
Read-only DB copy. Many applications read far more often than they write. Replicas let you spread those reads across more machines.
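Spreading reads usually means routing at the application or proxy layer. The sketch below sends statements that start with `SELECT` to replicas in rotation and everything else to the primary. The prefix check is a deliberate simplification, and the connection names are hypothetical.

```python
from itertools import cycle


class ReadWriteRouter:
    """Route reads to replicas in rotation and writes to the primary."""

    def __init__(self, primary, replicas):
        self._primary = primary
        replicas = list(replicas)
        self._replicas = cycle(replicas) if replicas else None

    def route(self, sql: str):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self._primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])  # hypothetical names
print(router.route("SELECT * FROM products WHERE id = 42"))
print(router.route("UPDATE inventory SET stock = stock - 1 WHERE id = 42"))
```

Replicas that replicate asynchronously can lag behind the primary, so reads that must see a just-committed write (for example, stock right after a purchase) are often pinned to the primary.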
CDN
Static asset edge cache. Images, scripts, stylesheets, and some API responses do not need to travel all the way back to origin every time.
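Whether a CDN can keep a response at the edge is commonly controlled by the `Cache-Control` header the origin sends. The sketch below picks a header value by path; the path prefixes and the specific lifetimes are hypothetical choices, while the directives themselves are standard HTTP caching directives.

```python
def cache_control_for(path: str) -> str:
    """Choose a Cache-Control value by request path (prefixes are illustrative)."""
    if path.startswith("/static/"):
        # Fingerprinted assets: safe to cache for a long time at edge and browser.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/products"):
        # Shared caches may hold catalog responses briefly; browsers revalidate.
        return "public, max-age=0, s-maxage=30"
    # Carts, checkout, and anything per-user should never be cached.
    return "private, no-store"


for path in ("/static/app.js", "/api/products/42", "/api/cart"):
    print(path, "->", cache_control_for(path))
```

Even a 30-second shared-cache lifetime on catalog responses can remove most repeat reads from origin during a surge, at the cost of briefly stale data.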
Rate limiter
Request throttle. During abuse events, legitimate traffic competes with junk traffic for server capacity. Filtering noisy traffic at the edge protects the rest of the stack.
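One widely used throttling technique is the token bucket, which allows short bursts while capping the sustained rate. The sketch below is a single-process illustration with an injectable clock; the rate and burst values are hypothetical, and a production limiter would typically keep one bucket per client.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilling at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self._rate = rate_per_sec
        self._capacity = float(burst)
        self._tokens = float(burst)
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


fake_now = [0.0]  # controllable clock so the demo is deterministic
bucket = TokenBucket(rate_per_sec=5, burst=10, clock=lambda: fake_now[0])

burst_results = [bucket.allow() for _ in range(12)]
fake_now[0] = 1.0  # one second later, five tokens have been refilled
after_refill = [bucket.allow() for _ in range(6)]

print(burst_results.count(True), after_refill.count(True))
```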
Common mistakes
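- Adding capacity only after the spike has started. Traffic spikes need pre-positioned headroom and fast offload paths.
- Watching only raw database capacity. Connection pools can fail before raw database capacity reaches 100%.
- Handling abusive traffic at the app tier. DDoS response works best before traffic reaches app servers.
- Calling third-party services inline with no fallback. Decoupling and fallback paths keep third-party failures from becoming total outages.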
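The connection-pool point can be checked with Little's law: the average number of connections in use is roughly the query rate multiplied by the mean query time. The sketch below uses hypothetical numbers (a 100-connection pool, 2,000 queries per second) to show how slower queries exhaust the pool without any increase in traffic.

```python
def connections_in_use(queries_per_sec: float, mean_query_ms: float) -> float:
    """Little's law: average concurrent connections = arrival rate x time held."""
    return queries_per_sec * mean_query_ms / 1000


POOL_SIZE = 100  # hypothetical pool limit
QPS = 2_000      # hypothetical peak query rate

for mean_query_ms in (20, 40, 80):
    in_use = connections_in_use(QPS, mean_query_ms)
    status = "ok" if in_use <= POOL_SIZE else "pool exhausted"
    print(f"{mean_query_ms} ms/query -> {in_use:.0f} connections ({status})")
```

With these numbers, query time rising from 20 ms to 80 ms at a constant request rate pushes demand past the pool limit, which is why the pool can become the bottleneck before the database itself is saturated.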
Interview adjacency
- Design an e-commerce checkout
- Design a flash sale system
- Design a ticketing platform