Guided System Design Lesson: Caching Repeated Reads

Briefing

Caching repeated reads

A cache answers repeated reads from memory so the database only does work that is genuinely new.
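To make that sentence concrete, here is a minimal sketch of a read path that checks memory first and only calls the database on a miss. The names `ReadCache` and `fake_db_read` are hypothetical and stand in for whatever the simulator uses internally. The sketch has no eviction or expiry, which a real cache would need.

```python
from typing import Callable, Dict, List


class ReadCache:
    """Answers repeated reads from memory; only misses reach the database."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._load_from_db = load_from_db  # hypothetical database read function
        self._store: Dict[str, str] = {}   # in-memory copy of values already read
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:
            # Repeated read: answered from memory, no database work.
            self.hits += 1
            return self._store[key]
        # Genuinely new read: go to the database once, then remember the result.
        self.misses += 1
        value = self._load_from_db(key)
        self._store[key] = value
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


db_calls: List[str] = []


def fake_db_read(key: str) -> str:
    """Stand-in for a database query; records every call it receives."""
    db_calls.append(key)
    return f"row-for-{key}"


cache = ReadCache(fake_db_read)
for key in ["home", "home", "profile:1", "home"]:
    cache.get(key)

print(cache.hit_rate(), len(db_calls))
```

Four reads arrive, but only the two distinct keys reach the database. The other two are answered from memory.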

Watch cache hit rate in the simulator.
Watch database load in the simulator.
Watch read-heavy traffic in the simulator.
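Hit rate and database load move together, and the arithmetic is short enough to sketch. The function below assumes every cache miss becomes exactly one database read. The request rates in the example are illustrative numbers, not values from the simulator.

```python
def database_reads_per_second(total_reads_per_second: float, hit_rate: float) -> float:
    """Reads that still reach the database, assuming one database read per cache miss."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return total_reads_per_second * (1.0 - hit_rate)


# Illustrative traffic: 1,000 reads per second at three different hit rates.
for rate in (0.0, 0.5, 0.75):
    print(rate, database_reads_per_second(1000, rate))
```

Under this assumption, raising the hit rate from 50% to 75% halves the reads the database still has to serve.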

Contract

Uptime

95%

P95 latency

320ms

Budget

$500/mo

Traffic shape

A morning surge that tests capacity during a narrow peak. Baseline is 140 users, with a peak of around 1,750 users, across a 26-hour window.
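One way to read the contract is as a pass/fail check on a finished run. The sketch below uses the 320ms and 95% targets from above. How the simulator actually samples latency and uptime is not stated here, so the nearest-rank percentile and the `up_ticks` / `total_ticks` counters are assumptions.

```python
from typing import Sequence

P95_LIMIT_MS = 320.0   # from the contract
UPTIME_TARGET = 0.95   # from the contract


def p95(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer form of ceil(0.95 * n)
    return ordered[rank - 1]


def meets_contract(latencies_ms: Sequence[float], up_ticks: int, total_ticks: int) -> bool:
    """True when both the latency and the uptime targets are met."""
    uptime = up_ticks / total_ticks  # assumed: uptime is measured as a share of ticks
    return p95(latencies_ms) <= P95_LIMIT_MS and uptime >= UPTIME_TARGET


# 20 samples with a single slow request: P95 ignores the one outlier.
one_slow = [100.0] * 19 + [900.0]
# 20 samples with two slow requests: now the slow tail sets the P95.
two_slow = [100.0] * 18 + [900.0, 900.0]

print(p95(one_slow), meets_contract(one_slow, up_ticks=96, total_ticks=100))
print(p95(two_slow), meets_contract(two_slow, up_ticks=96, total_ticks=100))
```

P95 tolerates a few slow requests but not a slow tail. Once more than 5% of reads are slow, the contract fails even if the average looks fine.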

Available components

Server

HTTP request handler. Every web app needs at least one server. More servers let you handle more simultaneous requests before latency starts climbing.

Postgres

Primary data store. Without a database, your app has no memory. Most dynamic requests eventually depend on it.

LB

Load balancer. If you run more than one server, something needs to decide where each request goes. That is the load balancer.
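As a rough illustration of "deciding where each request goes", here is a sketch of round-robin selection, one common strategy. The lesson does not say which strategy the simulator's load balancer uses, so treat round-robin as an assumption. `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle
from typing import List, Sequence


class RoundRobinBalancer:
    """Hands each new request to the next server in a fixed rotation."""

    def __init__(self, servers: Sequence[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        self._rotation = cycle(list(servers))

    def pick(self) -> str:
        # Each call advances the rotation by one server.
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-a", "server-b"])
picks: List[str] = [balancer.pick() for _ in range(4)]
print(picks)
```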

Redis

In-memory cache layer. Popular pages, profiles, and product data often get requested again and again. Serving those from memory is much faster and cheaper than querying the database each time.
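The speed-up can be estimated as a weighted average of the two read paths. In the sketch below, the 2ms cache read and 50ms database read are made-up illustrative figures, not measurements from Redis, Postgres, or the simulator.

```python
def expected_read_latency_ms(hit_rate: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when hits cost cache_ms and misses cost db_ms."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * db_ms


# Assumed latencies for illustration only.
CACHE_MS = 2.0
DB_MS = 50.0

for rate in (0.0, 0.5, 0.75):
    print(rate, expected_read_latency_ms(rate, CACHE_MS, DB_MS))
```

This is an average, so it says nothing direct about P95. Misses still pay the full database cost, which is why the slow tail depends on how many reads miss.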

Common mistakes

Adding capacity after the bottleneck has already saturated.

Interview adjacency

Design a read-heavy app
Explain cache hit rate
Reduce database load