Circuit Breakers & Graceful Degradation Practice

This scenario covers a content recommendation feed whose ML ranking service becomes unstable. The feed must keep serving, even if recommendations are less personalized.

Briefing

Graceful degradation & fallback responses

Graceful degradation intentionally reduces functionality to preserve core availability during dependency failures.

A slow dependency is often worse than a down one. A failed call returns immediately, while a slow call holds threads and connections open and pushes that pressure onto the callers upstream.
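One way to put numbers on this is Little's law: the average number of calls in flight equals the arrival rate times the latency. The sketch below uses assumed figures (100 requests per second, 50 ms healthy latency, 5 s degraded latency) that are illustrative only and not part of the scenario.

```python
def in_flight_calls(requests_per_second, latency_seconds):
    """Little's law: average concurrent calls = arrival rate * time in system."""
    return requests_per_second * latency_seconds


# Assumed, illustrative numbers (not taken from the scenario).
RATE = 100  # requests per second reaching the ranking service

healthy = in_flight_calls(RATE, 0.05)  # ranking answers in 50 ms
degraded = in_flight_calls(RATE, 5.0)  # ranking answers in 5 s

print(f"healthy:  ~{healthy:.0f} calls in flight")
print(f"degraded: ~{degraded:.0f} calls in flight")
```

Under these assumptions the same traffic holds roughly 100 times as many threads or connections open, which is how a slow dependency can exhaust a pool that a fast failure would leave untouched.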

A slow dependency can take down healthy systems unless timeouts and fallbacks isolate it.
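A circuit breaker is one common way to provide that isolation. The following is a minimal single-threaded sketch, not a production implementation: the class name, the thresholds, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Minimal breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.reset_after = reset_after              # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: fail fast, do not touch the dependency
            # Otherwise half-open: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self.failures += 1
            half_open = self.opened_at is not None
            if half_open or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip, or re-open after a failed trial
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers get the fallback immediately instead of waiting on the unstable service, so the degraded path stays fast. A real deployment would also need thread safety and a timeout on `func` itself.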

  • Use fallback responses for dependency failures.
  • Keep stale cache or default responses ready for critical paths.
  • Prefer fast degraded responses over slow perfect responses.
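The three points above can be sketched as a single request path: call the ranking service with a hard timeout, fall back to the last good result for that user, and fall back to a default list if nothing is cached. All names here (`get_feed`, `DEFAULT_FEED`, `slow_rank`, `fast_rank`) and the 200 ms timeout are hypothetical, chosen only to sit under the 250ms P95 target.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=8)
_last_good = {}  # user_id -> last successful ranking (the stale cache)
DEFAULT_FEED = ["popular-1", "popular-2", "popular-3"]  # non-personalized default


def get_feed(user_id, rank, timeout_s=0.2):
    """Return (items, source); source is 'ranked', 'stale', or 'default'."""
    future = _pool.submit(rank, user_id)
    try:
        items = future.result(timeout=timeout_s)
    except Exception:
        # Timeout or ranking error. cancel() only helps if the call has not
        # started yet; a running call keeps its worker until it returns.
        future.cancel()
        if user_id in _last_good:
            return _last_good[user_id], "stale"
        return list(DEFAULT_FEED), "default"
    _last_good[user_id] = items
    return items, "ranked"


# Stand-ins for the ML ranking service (hypothetical).
def fast_rank(user_id):
    return ["a", "b"]


def slow_rank(user_id):
    time.sleep(0.3)
    return ["x"]
```

Because a timed-out call still occupies a worker thread, a timeout alone bounds caller latency but not resource use. Pairing it with a bounded pool or a circuit breaker addresses the second half.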

Contract

  • Uptime: 99.5%
  • P95 latency: 250ms
  • Budget: $550/mo
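To make the uptime target concrete, it can be converted into a downtime allowance. This is a small sketch with a hypothetical helper name. The 30-day month is an assumption, and the 48-hour window is the duration given under the traffic shape.

```python
def downtime_budget_minutes(uptime_pct, window_hours):
    """Minutes of allowed downtime for an uptime target over a window."""
    return (100 - uptime_pct) / 100 * window_hours * 60


monthly = downtime_budget_minutes(99.5, 30 * 24)  # assumed 30-day month
scenario = downtime_budget_minutes(99.5, 48)      # 48-hour traffic window

print(f"30 days:  {monthly:.1f} minutes")   # about 216 minutes (3.6 hours)
print(f"48 hours: {scenario:.1f} minutes")  # about 14.4 minutes
```

On that reading, a ranking outage lasting more than a quarter of an hour would break the contract on its own if the feed had no degraded path.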

Traffic shape

Daily traffic curve with a predictable high-traffic window. Baseline 180 users; peak around 2,150 users over 48 hours.

Available components

Server

HTTP request handler. Every web app needs at least one server. More servers let you handle more simultaneous requests before latency starts climbing.

Postgres

Primary data store. Without a database, your app has no memory. Most dynamic requests eventually depend on it.

Redis

In-memory cache layer. Popular pages, profiles, and product data often get requested again and again. Serving those from memory is much faster and cheaper.

LB

Load balancer. If you run more than one server, something needs to decide where each request goes. That is the load balancer.

Queue

Async job buffer. Moving background work out of the request path keeps the app responsive even when extra processing is needed.

Worker

Background job processor. Separating background work keeps checkout, page loads, and other user actions from competing with batch processing.

Replica

Read-only DB copy. Many applications read far more often than they write. Replicas let you spread those reads across more machines.

CDN

Static asset edge cache. Images, scripts, stylesheets, and some API responses do not need to travel all the way back to origin every time.

Rate limiter

Request throttle. During abuse events, legitimate traffic competes with junk traffic for server capacity. Filtering noisy traffic at the edge protects the rest of the stack.

Common mistakes

  • Letting a third-party failure become a total outage. Decoupling and fallback paths contain it.
  • Treating timeouts and fallbacks as polish. They are reliability controls, and slow dependencies need hard limits.
  • Sending query-heavy paths into peak traffic without indexes, replicas, or search offload.
  • Running caches without warm-up, jittered expiry, or fallback capacity.
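Jittered expiry, mentioned in the last point, can be sketched in a few lines. The helper name, the 300-second base TTL, and the 10% spread are assumptions for illustration. The idea is that entries written at the same moment do not all expire at the same moment and hit the origin together.

```python
import random


def jittered_ttl(base_seconds, jitter_fraction=0.1, rng=random):
    """Return base_seconds shifted by up to +/- jitter_fraction of itself."""
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)


# Example: TTLs for five cache entries written in the same instant.
ttls = [jittered_ttl(300) for _ in range(5)]
print([round(t) for t in ttls])  # values scattered between 270 and 330
```

The resulting value would be passed as the expiry when writing each cache entry, for example as the TTL on a Redis key.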

Interview adjacency

  • Design a resilient feed
  • Handle dependency failures
  • Explain circuit breakers