System design prep
One dense reference page for a system design interview: the framework to run the 45 minutes, the latency numbers and capacity math to estimate with, the building blocks to assemble, the database choices, and the trade-offs that separate a senior answer from a junior one. Skim it the night before, or use it as a checklist while you practise.
A system design interview is open-ended on purpose, and candidates who freeze are usually the ones without a process. Run these six steps in order and narrate as you go. The timings are a guide for a 45-minute round, not a rule, but the order matters: each step feeds the next.
Separate functional requirements (what the system does) from non-functional ones (scale, latency, availability, consistency). Pin down the read/write ratio, the expected number of users, and the one or two features the interview is really about. Do not start drawing until you and the interviewer have agreed on scope.
Do the back-of-the-envelope math: daily active users, requests per second, storage per year, and bandwidth. These numbers decide whether you need sharding, a cache, or a CDN, so they are not busywork; they drive every later decision.
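As a rough illustration of that arithmetic, the sketch below turns a handful of assumed inputs into requests per second, storage per year, and bandwidth. Every number in `example` is a made-up assumption for a hypothetical service, and the `Workload` and `estimate` names are illustrative, not part of any library.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    # All fields are illustrative assumptions you would agree with the interviewer.
    daily_active_users: int
    requests_per_user_per_day: float
    write_fraction: float       # share of requests that store new data
    bytes_per_write: int        # payload stored per write
    bytes_per_response: int     # average response size
    peak_factor: float = 5.0    # peak traffic as a multiple of the average


def estimate(w: Workload) -> dict:
    """Return average/peak RPS, yearly storage, and average egress bandwidth."""
    requests_per_day = w.daily_active_users * w.requests_per_user_per_day
    avg_rps = requests_per_day / SECONDS_PER_DAY
    writes_per_day = requests_per_day * w.write_fraction
    storage_per_year_bytes = writes_per_day * w.bytes_per_write * 365
    return {
        "avg_rps": avg_rps,
        "peak_rps": avg_rps * w.peak_factor,
        "storage_per_year_tb": storage_per_year_bytes / 1e12,
        "avg_egress_mb_per_s": avg_rps * w.bytes_per_response / 1e6,
    }


# Hypothetical service: 10M DAU, 20 requests each, 10% of requests are writes.
example = Workload(
    daily_active_users=10_000_000,
    requests_per_user_per_day=20,
    write_fraction=0.1,
    bytes_per_write=1_000,
    bytes_per_response=10_000,
)
result = estimate(example)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

With these assumed inputs the average load is a few thousand requests per second and storage grows by single-digit terabytes a year, which is the kind of result that tells you whether sharding or a CDN is worth discussing at all.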
Sketch the handful of endpoints (or RPCs) the core features need. Naming the API forces you to commit to the data that flows in and out, which makes the data model fall out naturally.
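To make the point concrete, here is a minimal sketch of an API for a hypothetical URL shortener. The endpoint paths in the comments, the `CreateLinkRequest`, `Link`, and `LinkService` names, and the in-memory storage are all assumptions chosen for illustration; the useful part is that the request and response types already look like the data model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CreateLinkRequest:
    """Body of a hypothetical POST /links call."""
    long_url: str
    owner_id: str


@dataclass(frozen=True)
class Link:
    """Response body, and in effect the entity you will need to store."""
    short_code: str
    long_url: str
    owner_id: str


class LinkService:
    """In-memory stand-in for the two endpoints; a real service would persist."""

    def __init__(self) -> None:
        self._links: Dict[str, Link] = {}

    def create_link(self, req: CreateLinkRequest) -> Link:
        # Sequential hex codes keep the sketch deterministic.
        code = format(len(self._links) + 1, "x")
        link = Link(code, req.long_url, req.owner_id)
        self._links[code] = link
        return link

    def resolve(self, short_code: str) -> Optional[str]:
        """Hypothetical GET /links/{short_code}: return the long URL, if any."""
        link = self._links.get(short_code)
        return link.long_url if link else None
```

Writing down `Link` forces the questions the next step needs answered: what the primary key is (`short_code` here) and which field you would look records up by.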
Choose SQL vs NoSQL from the access patterns, not from habit. Define the main entities, the primary keys, and how you would shard or index them at the scale you just estimated.
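One simple way to shard by primary key is a stable hash taken modulo the shard count. The sketch below assumes string keys and a fixed number of shards; `shard_for` is an illustrative name. It uses `hashlib` rather than the built-in `hash()`, because Python randomises string hashes per process, which would route the same key to different shards on different servers.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a primary key to a shard index in [0, num_shards) with a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # The first 8 bytes are plenty of entropy for picking a shard.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):
        print(user_id, "->", shard_for(user_id, 16))
```

The trade-off to name: with plain modulo hashing, changing `num_shards` remaps most keys, which is why consistent hashing or a lookup directory is often proposed when shards are added frequently.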
Draw the request path: clients, load balancer, application servers, cache, database, and any async workers or queues. Walk one read and one write through the diagram out loud.
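Walking one read and one write can be rehearsed in code. The sketch below shows one common arrangement, cache-aside reads with invalidate-on-write; the source does not prescribe this pattern, and `FakeDatabase` and `AppServer` are hypothetical stand-ins, with plain dictionaries in place of a real database and a remote cache.

```python
from typing import Dict, Optional


class FakeDatabase:
    """Stand-in for the source-of-truth store; counts reads for the demo."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.reads = 0

    def get(self, key: str) -> Optional[str]:
        self.reads += 1
        return self.rows.get(key)

    def put(self, key: str, value: str) -> None:
        self.rows[key] = value


class AppServer:
    def __init__(self, db: FakeDatabase) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}  # stand-in for a remote cache

    def read(self, key: str) -> Optional[str]:
        if key in self.cache:            # cache hit: the database is not touched
            return self.cache[key]
        value = self.db.get(key)         # cache miss: fall through to the database
        if value is not None:
            self.cache[key] = value      # populate the cache for the next reader
        return value

    def write(self, key: str, value: str) -> None:
        self.db.put(key, value)          # write the source of truth first
        self.cache.pop(key, None)        # then invalidate so the next read reloads


db = FakeDatabase()
app = AppServer(db)
app.write("user:1", "Ada")
first = app.read("user:1")    # miss, goes to the database
second = app.read("user:1")   # hit, served from the cache
print(first, second, "database reads:", db.reads)
```

Invalidating instead of updating the cache on write keeps the write path simple, at the cost of one extra miss per write; a concurrent read can still repopulate a stale value, which is the kind of trade-off worth saying out loud.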
Pick the one or two hardest parts the interviewer cares about and go deep: the hot-key problem, consistency on writes, the cache eviction policy, or how you scale the bottleneck. Name the trade-off for each choice.
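If the deep dive lands on cache eviction, least-recently-used (LRU) is a common policy to discuss. The sketch below is a minimal single-threaded LRU cache built on `collections.OrderedDict`; the `LRUCache` class is illustrative, and a production cache would also need thread safety and usually expiry times.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)         # a write also counts as a use
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", 3)    # over capacity, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

The trade-off to name: LRU is cheap and works well when recent keys are likely to be read again, but a large one-off scan can flush the hot set, which is one reason frequency-based policies get proposed instead.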
You will never be asked to recite these numbers, but you use them constantly to justify a cache, a CDN, or a different storage tier. Know the orders of magnitude. The latency figures are the classic "latency numbers every programmer should know" set, rounded to the scale that matters in an interview.
| Operation | Approximate latency |
|---|---|
| L1 cache reference | ~1 ns |
| Branch mispredict | ~3 ns |
| L2 cache reference | ~4 ns |
| Main memory (RAM) reference | ~100 ns |
| Read 1 MB sequentially from memory | ~3 µs |
| SSD random read | ~16 µs |
| Read 1 MB sequentially from SSD | ~49 µs |
| Round trip within the same datacenter | ~0.5 ms |
| Read 1 MB sequentially from disk (HDD) | ~825 µs |
| Disk (HDD) seek | ~2-10 ms |
| Round trip CA to Netherlands and back | ~150 ms |

| Quantity | Rule of thumb |
|---|---|
| Seconds in a day | ~86,400 (~10^5) |
| Requests/sec from 1M daily users at one request each (even spread) | ~12 RPS |
| Same load at peak (~5x the average) | ~60 RPS |
| Size of a typical tweet/post (text only) | ~140-280 bytes |
| One modern server | thousands of QPS, ~64-256 GB RAM |
| Read-heavy systems | cache aggressively, replicate reads |
| Write-heavy systems | shard, batch, use a queue |
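The latency figures above are what let you justify a cache with a number rather than a hunch. The sketch below computes the expected read latency for a given cache hit ratio. It assumes a cache hit costs one same-datacenter round trip (~0.5 ms) and that a miss additionally pays an assumed ~10 ms database read that has to seek on disk; both constants and the function name are illustrative assumptions.

```python
CACHE_HIT_MS = 0.5   # assumption: one round trip to a cache in the same datacenter
DB_READ_MS = 10.0    # assumption: a database read that seeks on disk (upper end of 2-10 ms)


def expected_read_latency_ms(hit_ratio: float) -> float:
    """Average read latency when a miss pays the cache lookup plus the database read."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_cost_ms = CACHE_HIT_MS + DB_READ_MS
    return hit_ratio * CACHE_HIT_MS + (1.0 - hit_ratio) * miss_cost_ms


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.9, 0.99):
        print(f"hit ratio {ratio:.2f}: {expected_read_latency_ms(ratio):.2f} ms")
```

Under these assumptions a 90% hit ratio brings the average read from about 10.5 ms to about 1.5 ms, and the remaining cost is dominated by the misses, which is why the hit ratio matters more than the speed of the cache itself.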
Most designs are assembled from the same small kit. Know what each piece does and the specific problem it solves, so you add it for a reason rather than out of reflex.
Pick storage from the access pattern, not from familiarity. Say what you are optimising for and which store fits, then name the cost.
| Type | Reach for it when |
|---|---|
| Relational (PostgreSQL, MySQL) | Strong consistency, transactions, and rich queries with joins. Default choice unless scale or access patterns force otherwise. Scales vertically, then with read replicas and careful sharding. |
| Key-value (Redis, DynamoDB) | Simple, high-throughput lookups by key. Great for sessions, caches, and counters; predictable single-digit-millisecond reads at scale. |
| Document (MongoDB) | Flexible, nested records with no fixed schema. Good when the entity shape varies or evolves and queries are mostly by a single document. |
| Wide-column (Cassandra, Bigtable) | Massive write throughput and horizontal scale with tunable consistency. Reach for it on write-heavy, time-series, or feed workloads. |
| Search (Elasticsearch) | Full-text search, ranking, and aggregations. A secondary index alongside the source-of-truth database, not a replacement for it. |
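The "tunable consistency" in the wide-column row can be made concrete with the usual quorum rule. In stores like Cassandra, each request chooses how many replicas must acknowledge it; with N replicas, a write acknowledged by W and a read that consults R are guaranteed to share at least one replica when R + W > N. The sketch below only checks that inequality; the function name is illustrative, and it deliberately ignores node failures, repair, and conflict resolution.

```python
def read_overlaps_latest_write(n: int, w: int, r: int) -> bool:
    """True when every read quorum intersects every write quorum (R + W > N)."""
    if n < 1 or not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("need 1 <= w <= n and 1 <= r <= n")
    return r + w > n


if __name__ == "__main__":
    # Replication factor 3 with three common choices of write/read quorum.
    for w, r in ((2, 2), (1, 1), (3, 1)):
        print(f"N=3 W={w} R={r}:", read_overlaps_latest_write(3, w, r))
```

The trade-off to name: raising W or R buys overlap at the cost of latency and availability, because more replicas must respond before the request completes.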
The component is the easy half; the trade-off is what earns the senior signal. For every choice you make, say what you are giving up.
A cheat sheet is the index, not the whole course. Work the coding lists, the worked system-design guides, and the questions specific companies ask.