What the senior bar actually tests
At senior level the system design round is not checking whether you can name Kafka or Redis. It checks whether you can take a vague prompt, turn it into a scoped problem, propose an architecture that survives follow-up questions, and explain the cost of every choice. Junior candidates often try to recall a reference architecture. Senior candidates reason from requirements.
The interviewer is looking for a few specific signals. Can you find the real constraint hiding in a loose prompt. Do you separate what must be consistent from what can lag. Do you know where your design breaks and say so before being pushed. Do you connect a technical choice to an operational consequence, like on-call load or cost per request.
If you only practise drawing boxes, you will pass the first five minutes and stall on the deep dive, which is where senior offers are won or lost.
A structure you can repeat under pressure
Use the same opening every time so you do not freeze. The structure below takes about forty minutes and leaves room for the interviewer to redirect.
- Clarify the product and the users. Ask who uses this, how often, and what they expect.
- Pin down scale. Get rough numbers for daily active users, read and write ratio, and payload size.
- State explicit non-goals for the session.
- Define the core entities and the main API calls.
- Sketch the high level data flow.
- Choose storage by access pattern, not by brand.
- Add caching, queues, and the handling of uneven load.
- Cover reliability, abuse, and observability.
- Name what you would improve with more time.
Write these steps on the whiteboard or in the shared doc. It tells the interviewer you have a method, and it gives you a checklist when your mind goes blank.
Turn a vague prompt into numbers
If the prompt is "design a notification system," do not start designing. Ask questions that produce numbers.
- How many notifications per day at peak.
- Do users get email, push, SMS, or all three.
- Is a one minute delay acceptable, or must some be near instant.
- Can a notification be dropped, or must every one be delivered at least once.
Suppose you land on fifty million notifications per day, mixed channels, most can tolerate a minute of delay, and none can be silently dropped. That single paragraph already implies a queue, a per-channel sender with retries, idempotency keys, and a dead letter path.
Make storage choices defensible
The weakest senior answers say "use a NoSQL database" with no reasoning. The strongest ones describe the access pattern first, then pick a store that serves it.
For a notification system the access patterns might be:
- Append a notification record for a user, very high write volume.
- Read recent notifications for one user, ordered by time.
- Mark a notification as read, single row update.
- Fan out one event to many recipients.
That pattern fits a wide-column or key-value store partitioned by user id, with time as the sort key. A relational database can work at smaller scale, and you should say at what point you would move off it. Naming the migration point is a senior signal. It shows you know the choice has a lifespan.
type Notification = {
userId: string;
notificationId: string;
channel: "push" | "email" | "sms";
createdAtMs: number;
readAtMs?: number;
};
Keep records compact. Store a reference and hydrate the full payload in batches rather than copying large bodies into every recipient row.
Treat tradeoffs as the main event
Every interesting design has a tension. Strong reads versus cheap writes. Freshness versus cost. Simplicity versus scale. Your job is to surface the tension and pick a side with a reason.
Take fanout for an activity feed. Fanout on write makes reads cheap but punishes accounts with millions of followers. Fanout on read keeps writes cheap but makes every feed load expensive. The senior answer is usually a hybrid: precompute for normal accounts, compute at read time for the small set of very large accounts, and merge the two. State the threshold as a tunable config, not a constant, because real systems tune it with live metrics.
function chooseFanout(followerCount: number, threshold: number) {
return followerCount >= threshold ? "fanout_on_read" : "fanout_on_write";
}
When the interviewer pushes back, do not defend the answer to the death. Say what new information would change your mind. That flexibility reads as seniority, not weakness.
Show that you have run things in production
Senior interviewers care about what happens after launch. Bring up the parts that only show up on call.
- Rate limit writes so one misbehaving client cannot flood the queue.
- Make retries idempotent with a natural key, so a replayed job does not double send.
- Put fanout work on a queue so a post does not block on millions of writes.
- Track lag between event creation and delivery, queue depth, and the failure rate per channel.
Mentioning a dead letter queue, a poison message policy, or a backpressure plan separates people who have operated systems from people who have only read about them.
Common ways senior candidates lose the round
A few mistakes show up again and again.
- Drawing microservices before establishing requirements. Boxes without numbers look like memorisation.
- Treating a tool name as an explanation. "Use Kafka" is not a design, it is a noun.
- Ignoring the skewed cases, like the celebrity account or the one tenant that is a hundred times larger than the rest.
- Going silent during the deep dive. Think out loud so the interviewer can follow and help.
- Refusing to admit a weakness. Every design has one, and naming it first builds trust.
How to practise before the loop
Pick five canonical problems and run the full structure on each, out loud, on a timer. A rate limiter, a URL shortener, a chat system, a news feed, and a metrics pipeline cover most of the patterns you will be asked about. Record yourself, then check whether you scoped, sized, chose storage by access pattern, and named tradeoffs. The goal is not a perfect diagram. The goal is a calm, repeatable method that holds up when the interviewer changes the requirements halfway through.
Continue your prep
Pair this with role-specific question sets and sample answer outlines: