Black Friday Traffic: Designing for Flash Sale Load

E-commerce platform. You know the spike is coming. Budget is generous but the SLA is brutal.

Briefing

Predictable spikes & overprovisioning

Predictable spike planning means provisioning enough capacity before a known surge arrives, then judging whether that headroom was worth the cost.

The architecture can be technically correct and still fail if capacity arrives after the peak or consumes too much budget.

The optimal architecture is not the biggest one. It is the cheapest one that still survives the peak.

Design for peak traffic, not the quiet baseline.
Use cache, CDN, queue, and replicas to keep one tier from absorbing the entire spike.
Keep enough budget margin for live incidents during the surge.
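
As a rough sizing sketch for the points above, the function below estimates how many servers the peak requires once cache and CDN absorb part of the load. Only the 6,500-user peak comes from the contract. The per-server capacity (500 users), the 60% offload ratio, and the 25% headroom are illustrative assumptions.

```python
import math


def servers_needed(peak_users, users_per_server, offload_ratio, headroom=0.25):
    """Estimate server count for the peak.

    offload_ratio: fraction of peak traffic served by cache/CDN and never
    reaching the servers (assumed value, measure it in practice).
    headroom: extra capacity kept in reserve for incidents during the surge.
    """
    if not 0 <= offload_ratio < 1:
        raise ValueError("offload_ratio must be in [0, 1)")
    origin_users = peak_users * (1 - offload_ratio)
    return math.ceil(origin_users * (1 + headroom) / users_per_server)


# Hypothetical numbers: 500 users per server, 60% of traffic offloaded.
without_offload = servers_needed(6500, 500, offload_ratio=0.0)
with_offload = servers_needed(6500, 500, offload_ratio=0.6)
print(without_offload, with_offload)
```

Under these assumptions the offload paths cut the server tier by more than half, which is where the budget margin for live incidents comes from.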

Contract

Uptime: 99.9%

P95 latency: 150ms

Budget: $800/mo

Traffic shape

Flash-sale surge with a steep ramp and sustained peak. Baseline 200 users; peak around 6,500 users over 48 hours.
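
To make the contract numbers concrete, the short calculation below turns the uptime target into a downtime allowance and expresses the peak as a multiple of baseline. Whether the SLA is measured over the 48-hour surge or over a 30-day month is not stated in the scenario, so both windows are shown as assumptions.

```python
def downtime_budget_minutes(uptime_pct, window_hours):
    """Minutes of downtime allowed within the window at the given uptime."""
    return window_hours * 60 * (1 - uptime_pct / 100)


surge_budget = downtime_budget_minutes(99.9, 48)       # 48-hour surge window
month_budget = downtime_budget_minutes(99.9, 30 * 24)  # assumed 30-day month
spike_multiplier = 6500 / 200                          # peak users / baseline users

print(f"48h window: {surge_budget:.2f} min of downtime allowed")
print(f"30-day window: {month_budget:.1f} min of downtime allowed")
print(f"Peak is {spike_multiplier:.1f}x baseline")
```

If the 48-hour window applies, the allowance is under three minutes, so a single slow failover could consume the entire budget.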

Available components

Server

HTTP request handler. Every web app needs at least one server. More servers let you handle more simultaneous requests before latency starts climbing.

Postgres

Primary data store. Without a database, your app has no memory. Most dynamic requests eventually depend on it.

Redis

In-memory cache layer. Popular pages, profiles, and product data often get requested again and again. Serving those from memory is much faster and cheaper.
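
A minimal sketch of the cache-aside pattern this describes, using an in-process dictionary as a stand-in for Redis. The class name, the TTL value, and the `loader` callback are hypothetical. A real deployment would call a Redis client instead of the local dictionary.

```python
import time


class TTLCache:
    """Cache-aside helper: return a fresh cached value or load and store it."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._store = {}         # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired: fall through to the slower source (the database).
        self.misses += 1
        value = loader(key)
        self._store[key] = (value, now + self._ttl)
        return value


def load_product_from_db(sku):
    # Placeholder for a real database query.
    return {"sku": sku, "name": "example product"}


cache = TTLCache(ttl_seconds=30)
cache.get_or_load("sku-1", load_product_from_db)  # miss, hits the loader
cache.get_or_load("sku-1", load_product_from_db)  # hit, served from memory
```

A short TTL keeps product data reasonably fresh during a sale while still letting repeated requests skip the database.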

LB

Load balancer. If you run more than one server, something needs to decide where each request goes. That is the load balancer.

Queue

Async job buffer. Moving background work out of the request path keeps the app responsive even when extra processing is needed.
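
One way to picture this, as a sketch only: the request handler confirms the order immediately and pushes the slow work (here a hypothetical receipt email) onto a bounded queue that a worker drains later. Python's standard `queue.Queue` stands in for a real message broker. The function names and the choice to skip optional work when the queue is full are assumptions.

```python
import queue


def handle_checkout(order_id, jobs):
    """Confirm the order right away; defer the receipt email to a worker."""
    try:
        jobs.put_nowait({"type": "send_receipt", "order_id": order_id})
        receipt = "queued"
    except queue.Full:
        # Shed optional work instead of blocking the request path.
        receipt = "skipped"
    return {"order_id": order_id, "status": "confirmed", "receipt": receipt}


def drain(jobs, process, max_jobs=100):
    """Worker side: process up to max_jobs queued jobs, return the count."""
    processed = 0
    while processed < max_jobs:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        process(job)
        processed += 1
    return processed


jobs = queue.Queue(maxsize=1000)
response = handle_checkout(42, jobs)   # returns without sending any email
drain(jobs, process=lambda job: None)  # a worker would send the email here
```

The bound on the queue matters during a surge: an unbounded buffer only moves the overload somewhere less visible.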

Replica

Read-only DB copy. Many applications read far more often than they write. Replicas let you spread those reads across more machines.
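
A minimal routing sketch for this idea, assuming a hypothetical `ReadWriteRouter` that sends writes to the primary and rotates plain reads across replicas. Detecting reads by the `SELECT` prefix is a simplification, and the `needs_fresh_read` flag is an assumption added because replicas can lag behind the primary.

```python
import itertools


class ReadWriteRouter:
    """Pick a database target: primary for writes, replicas for reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; None when no replica is configured.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql, needs_fresh_read=False):
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None and not needs_fresh_read:
            return next(self._replicas)
        # Writes, and reads that must see the latest write, go to the primary.
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
router.target_for("SELECT name FROM products WHERE id = 7")     # a replica
router.target_for("UPDATE stock SET qty = qty - 1 WHERE id = 7")  # primary
```

In a flash sale, stock checks right before purchase are a typical case for `needs_fresh_read=True`, since a stale replica could oversell.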

CDN

Static asset edge cache. Images, scripts, stylesheets, and some API responses do not need to travel all the way back to origin every time.

Rate limiter

Request throttle. During abuse events, legitimate traffic competes with junk traffic for server capacity. Filtering noisy traffic at the edge protects the rest of the stack.
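
Throttling can be implemented in several ways. The token bucket below is one common technique, shown as a sketch rather than as what this platform's rate limiter actually does. The rate, burst size, and class name are assumptions. A real edge limiter would keep one bucket per client or IP.

```python
import time


class TokenBucket:
    """Allow short bursts up to `burst`, then a steady `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)   # start with a full bucket
        self._clock = clock          # injectable so refill can be tested
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self._last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_sec=5, burst=10)
accepted = sum(bucket.allow() for _ in range(50))  # most of a 50-request burst is rejected
```

Requests that return `False` would typically receive an HTTP 429 response before they reach the app servers.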

Common mistakes

Traffic spikes need pre-positioned headroom and fast offload paths.
Connection pools can fail before raw database capacity reaches 100%.
DDoS response works best before traffic reaches app servers.
Decoupling and fallback paths keep third-party failures from becoming total outages.
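
The connection-pool point in the list above can be illustrated with Little's law: the average number of connections in use equals the query rate multiplied by how long each query holds a connection. The query rate, hold times, and pool size below are hypothetical. They are chosen only to show a pool running out while the database itself still has capacity.

```python
def connections_in_use(queries_per_sec, avg_hold_ms):
    """Little's law: average busy connections = arrival rate x hold time."""
    return queries_per_sec * avg_hold_ms / 1000


def pool_exhausted(queries_per_sec, avg_hold_ms, pool_size):
    """True when average demand for connections exceeds the pool size."""
    return connections_in_use(queries_per_sec, avg_hold_ms) > pool_size


POOL_SIZE = 100  # assumed pool limit

# Normal day: 400 queries/s, each holding a connection for 50 ms.
normal = connections_in_use(400, 50)
# Same query rate, but slower queries under load hold connections for 300 ms.
under_load = connections_in_use(400, 300)

print(normal, pool_exhausted(400, 50, POOL_SIZE))
print(under_load, pool_exhausted(400, 300, POOL_SIZE))
```

The query rate is identical in both cases. Only the hold time changed, which is why a pool can fail while database CPU still looks healthy.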

Interview adjacency

Design an ecommerce checkout
Design a flash sale system
Design a ticketing platform