Design a Chat System (WhatsApp / Messenger) | System Design

Step 1 - Clarify the requirements

Never start by drawing boxes. A strong candidate spends the first few minutes scoping the problem so the design that follows is justified. For a chat system, the questions worth asking are:

One-to-one only, or group chats too, and how large can groups get?
Do we need delivery and read receipts, and online presence?
Is message history persisted forever, or only until delivered?
Do we need end-to-end encryption?

Functional requirements

Send and receive one-to-one and group messages in real time.
Deliver messages to offline recipients when they reconnect.
Show delivery/read receipts and online presence.

Non-functional requirements

Low end-to-end latency for delivery while both parties are online.
Reliable delivery and consistent per-conversation ordering.
Scale to tens of millions of concurrent long-lived connections.

Step 2 - Back-of-the-envelope estimates

Sizing the system tells you which parts are hard. Round aggressively and state your assumptions out loud; the numbers matter less than showing you can reason about scale.

Metric	Estimate	Reasoning
Concurrent connections	tens of millions	Each online user holds a persistent connection, so connection count, not RPS, is the scaling axis.
Messages/day	tens of billions	Active messaging products move enormous message volume; storage and fan-out must keep up.
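
To make the estimates concrete, here is a rough worked calculation. Every input below is an assumption picked for illustration (20 billion messages/day, 50 million concurrent connections, 100,000 sockets per gateway, 200 bytes per stored message); the table above only says "tens of billions" and "tens of millions", so substitute whatever numbers you agree with the interviewer.

```python
# Back-of-the-envelope sizing. All inputs are assumptions for illustration.
MESSAGES_PER_DAY = 20_000_000_000     # assumed point inside "tens of billions"
CONCURRENT_CONNECTIONS = 50_000_000   # assumed point inside "tens of millions"
CONNECTIONS_PER_GATEWAY = 100_000     # assumed socket capacity of one connection server
AVG_MESSAGE_BYTES = 200               # assumed average stored size of one message
SECONDS_PER_DAY = 86_400


def average_messages_per_second(messages_per_day: int) -> float:
    """Average write rate; real peaks will be some multiple of this."""
    return messages_per_day / SECONDS_PER_DAY


def gateways_needed(connections: int, per_gateway: int) -> int:
    """Ceiling division: how many connection servers hold all the sockets."""
    return -(-connections // per_gateway)


def storage_bytes_per_day(messages_per_day: int, avg_bytes: int) -> int:
    """Raw message storage per day, before replication and indexes."""
    return messages_per_day * avg_bytes


if __name__ == "__main__":
    print(f"{average_messages_per_second(MESSAGES_PER_DAY):,.0f} messages/s on average")
    print(f"{gateways_needed(CONCURRENT_CONNECTIONS, CONNECTIONS_PER_GATEWAY)} gateways")
    print(f"{storage_bytes_per_day(MESSAGES_PER_DAY, AVG_MESSAGE_BYTES) / 1e12:.0f} TB/day")
```

With these assumed inputs the average write rate is a few hundred thousand messages per second and the gateway fleet is in the hundreds, which is why connection count and fan-out, rather than request rate, dominate the design.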

Step 3 - Data model and API

A compact data model and a small API surface anchor the rest of the discussion. Keep both minimal; you can always extend them when the interviewer pushes.

Core entities

messages

message_id (sortable), conversation_id, sender_id, body, created_at, status

Partition by conversation_id; use a time-sortable id for ordering.

conversations

conversation_id (PK), type (1:1/group), member_ids, last_message_at

Membership list drives fan-out for group sends.

user_sessions

user_id -> connection server id

Routing table so a sender's server can find the recipient's gateway.
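
The three entities above could be written down as plain record types. This is only a sketch: the field names come from the list above, while the concrete types, the enum values, and the `partition_for` helper (a stable hash of `conversation_id`) are assumptions added for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class MessageStatus(Enum):
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"


class ConversationType(Enum):
    ONE_TO_ONE = "1:1"
    GROUP = "group"


@dataclass(frozen=True)
class Message:
    message_id: int            # time-sortable, so sorting by id gives conversation order
    conversation_id: str       # partition key
    sender_id: str
    body: str
    created_at: float          # assumed: Unix timestamp in seconds
    status: MessageStatus = MessageStatus.SENT


@dataclass
class Conversation:
    conversation_id: str
    type: ConversationType
    member_ids: List[str] = field(default_factory=list)   # drives group fan-out
    last_message_at: Optional[float] = None


# user_sessions: user_id -> connection server id (the routing table).
UserSessions = Dict[str, str]


def partition_for(conversation_id: str, partitions: int) -> int:
    """Hypothetical helper: map a conversation to a partition with a stable hash."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions
```

Because every message of a conversation hashes to the same partition, loading a conversation's history touches one partition and ordering by `message_id` needs no cross-partition merge.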

API sketch

GET ws://.../connect - Open a WebSocket; the server registers the user's session.
POST /api/v1/messages - Send a message (also flows over the socket).
GET /api/v1/conversations/{id}/messages - Load history, paginated by message id.
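
"Paginated by message id" usually means cursor pagination: the client passes the oldest id it already has and the server returns the page just before it. The sketch below shows that logic over an in-memory list; the parameter names `before_id` and `limit` are assumptions, not part of the API above.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

StoredMessage = Tuple[int, str]  # (message_id, body), an assumed minimal shape


def load_history(
    messages: List[StoredMessage],
    before_id: Optional[int] = None,
    limit: int = 50,
) -> Tuple[List[StoredMessage], Optional[int]]:
    """Return one page, newest first, plus the cursor for the next (older) page.

    `messages` must be sorted ascending by message_id, which a time-sortable
    id gives you for free within one conversation.
    """
    ids = [message_id for message_id, _ in messages]
    # Everything strictly older than the cursor; no cursor means "latest page".
    end = len(messages) if before_id is None else bisect_left(ids, before_id)
    start = max(0, end - limit)
    page = messages[start:end][::-1]
    # The oldest id on this page becomes the next cursor, if anything older exists.
    next_cursor = page[-1][0] if start > 0 else None
    return page, next_cursor
```

Paging by id rather than by offset keeps pages stable while new messages arrive, because a new message never shifts what "older than id X" means.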

Step 4 - High-level design

Sketch the happy path end to end before optimising anything. This is the architecture you would draw on the whiteboard first:

1. Clients hold a persistent WebSocket to a connection (chat) server through a load balancer.
2. A presence/session service maps each online user to the server holding their connection.
3. On send, the message is persisted, then routed to the recipient's connection server, which pushes it down the socket.
4. If the recipient is offline, the message is stored and pushed (or pulled) on reconnect; a push notification nudges them.
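
The four steps above can be sketched as a single in-memory class. It is a toy model under stated assumptions: dictionaries stand in for the message store, the session registry, and the gateways, an incrementing counter stands in for the time-sortable id, and the class and attribute names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import count


class ChatBackend:
    """In-memory stand-in for the store, session registry and connection servers."""

    def __init__(self):
        self._next_id = count(1)           # placeholder for a time-sortable id
        self.messages = {}                 # message_id -> message (the "database")
        self.sessions = {}                 # user_id -> connection server id
        self.pushed = defaultdict(list)    # server id -> messages pushed down its sockets
        self.pending = defaultdict(deque)  # user_id -> undelivered messages, oldest first
        self.notifications = []            # user ids nudged by a push notification

    def connect(self, user_id, server_id):
        """Register the session, then flush anything stored while offline, in order."""
        self.sessions[user_id] = server_id
        while self.pending[user_id]:
            self.pushed[server_id].append(self.pending[user_id].popleft())

    def disconnect(self, user_id):
        self.sessions.pop(user_id, None)

    def send(self, sender_id, recipient_id, body):
        message = {
            "message_id": next(self._next_id),
            "sender_id": sender_id,
            "recipient_id": recipient_id,
            "body": body,
        }
        self.messages[message["message_id"]] = message   # persist before routing
        server_id = self.sessions.get(recipient_id)
        if server_id is not None:
            self.pushed[server_id].append(message)        # recipient online: push
        else:
            self.pending[recipient_id].append(message)    # recipient offline: store
            self.notifications.append(recipient_id)       # and nudge
        return message["message_id"]
```

For example, if `alice` sends to `bob` while he is offline, the message lands in `pending["bob"]` and a notification is recorded; when `bob` later calls `connect("bob", "gw-2")`, the queued messages appear in `pushed["gw-2"]` in send order.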

Step 5 - Deep dives that separate strong answers

The high-level design is table stakes. Interviewers spend most of the time here, probing the decisions that actually carry the system. These are the ones to be ready for.

Real-time transport: WebSockets vs polling

HTTP request/response cannot push, so naive polling wastes resources and adds latency. Long polling is a stopgap. The right answer is a persistent bidirectional connection, normally a WebSocket (or a platform push channel on mobile). The server keeps the socket open and pushes messages as they arrive. This makes chat servers stateful: a given user's messages must route to the exact server holding their live connection, which is why you need a session registry mapping user to connection server.
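
A quick calculation shows why naive polling is wasteful. The inputs are assumptions for illustration (50 million online users, a 5-second poll interval, 20 billion messages/day); only the shape of the comparison matters.

```python
SECONDS_PER_DAY = 86_400


def polling_requests_per_second(online_users: int, poll_interval_s: float) -> float:
    """Every online user issues one request per interval, whether or not mail is waiting."""
    return online_users / poll_interval_s


def push_deliveries_per_second(messages_per_day: int) -> float:
    """With a persistent socket the server only sends when a message actually exists."""
    return messages_per_day / SECONDS_PER_DAY


def compare(online_users: int, poll_interval_s: float, messages_per_day: int) -> dict:
    polls = polling_requests_per_second(online_users, poll_interval_s)
    pushes = push_deliveries_per_second(messages_per_day)
    return {
        "polling_requests_per_s": polls,
        "push_deliveries_per_s": pushes,
        "polling_overhead_factor": polls / pushes,
        "worst_case_polling_delay_s": poll_interval_s,  # message arrives just after a poll
    }


if __name__ == "__main__":
    for name, value in compare(50_000_000, 5, 20_000_000_000).items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions polling generates tens of times more requests than there are messages to deliver, and still adds up to a full poll interval of latency. Shortening the interval trades one cost for the other, which is the case for a persistent connection.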

Delivery, ordering, and offline messages

Each message gets a server-assigned, time-sortable id so a conversation has a single agreed order even if clients send concurrently. Persist the message before acknowledging the sender (so it survives a crash), then deliver. Track per-message status: sent, delivered (recipient's device received it), read (recipient opened it). For offline recipients, store undelivered messages and deliver them in order on reconnect; trigger a push notification through the notification service. Idempotent message ids let clients de-duplicate retried sends.
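
One common way to get a time-sortable id is a Snowflake-style layout: millisecond timestamp in the high bits, then a server id, then a per-millisecond sequence. The sketch below pairs such a generator with de-duplication of retried sends. The bit widths, the class names, and the `client_msg_id` field are assumptions; the text above does not prescribe a specific scheme. Note that ids from different servers are only ordered to clock accuracy, so a strict single order per conversation assumes one generator handles each conversation.

```python
import threading
import time


class SortableIdGenerator:
    """Assumed layout: 41-bit ms timestamp | 10-bit server id | 12-bit sequence."""

    def __init__(self, server_id: int, clock=time.time):
        if not 0 <= server_id < 1024:
            raise ValueError("server_id must fit in 10 bits")
        self._server_id = server_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    def next_id(self) -> int:
        with self._lock:
            now_ms = int(self._clock() * 1000)
            if now_ms < self._last_ms:
                now_ms = self._last_ms            # clock stepped back: never go backwards
            if now_ms == self._last_ms:
                self._seq = (self._seq + 1) & 0xFFF
                if self._seq == 0:
                    now_ms = self._last_ms + 1    # sequence exhausted: move to the next ms
            else:
                self._seq = 0
            self._last_ms = now_ms
            return (now_ms << 22) | (self._server_id << 12) | self._seq


class MessageLog:
    """Assigns a server id once per client_msg_id, so retries are harmless."""

    def __init__(self, ids: SortableIdGenerator):
        self._ids = ids
        self._by_client_id = {}   # client_msg_id -> message_id
        self.persisted = []       # (message_id, body), stand-in for durable storage

    def accept(self, client_msg_id: str, body: str) -> int:
        known = self._by_client_id.get(client_msg_id)
        if known is not None:
            return known          # retried send: acknowledge again, store nothing new
        message_id = self._ids.next_id()
        self.persisted.append((message_id, body))   # persist before acknowledging
        self._by_client_id[client_msg_id] = message_id
        return message_id
```

The client retries `accept` with the same `client_msg_id` until it sees an acknowledgement; the server returns the same `message_id` each time, so a lost acknowledgement never produces a duplicate message.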

Group chat fan-out and presence

A group send looks up the member list and routes a copy to each online member's connection server, persisting once per recipient (or once with per-member delivery state). Very large groups become a fan-out problem similar to a feed and may warrant pull-based catch-up rather than pushing to thousands of sockets. Presence (online/last-seen) is high-churn and best kept in an in-memory store with a heartbeat; treat it as soft state that can be slightly stale rather than a strongly consistent fact.
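
The fan-out decision can be sketched as a small planning function: group online members by connection server so each gateway gets one batched request, collect offline members for stored delivery, and fall back to pull for very large groups. The threshold of 500 and the function name are assumptions for illustration; the right cut-over depends on your measurements.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

PUSH_FANOUT_LIMIT = 500  # assumed cut-over between push and pull delivery


def plan_group_delivery(
    member_ids: List[str],
    sender_id: str,
    sessions: Dict[str, str],
    push_limit: int = PUSH_FANOUT_LIMIT,
) -> Tuple[str, Dict[str, List[str]], List[str]]:
    """Return (mode, recipients grouped by connection server, recipients not pushed to)."""
    recipients = [user_id for user_id in member_ids if user_id != sender_id]
    if len(recipients) > push_limit:
        # Too large to push: persist once and let members catch up by pulling.
        return "pull", {}, recipients

    by_server: Dict[str, List[str]] = defaultdict(list)
    offline: List[str] = []
    for user_id in recipients:
        server_id = sessions.get(user_id)
        if server_id is None:
            offline.append(user_id)              # deliver on reconnect
        else:
            by_server[server_id].append(user_id)  # one batched call per gateway
    return "push", dict(by_server), offline
```

Grouping by server means a 200-member group spread over 10 gateways costs 10 internal calls rather than 200, which is usually the first optimisation worth mentioning.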

Step 6 - Bottlenecks and how to scale past them

Naming where the design breaks, and the specific fix, is what signals seniority. For a chat system the pressure points are:

Connection servers run out of socket capacity.

Horizontally scale gateways; route by session registry; shed and reconnect gracefully.

Presence updates overwhelm the store.

Heartbeat with batching; keep presence in memory with short TTLs.

Large-group fan-out.

Switch big groups to pull-based catch-up; cap per-message push fan-out.
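
The presence fix above (batched heartbeats, in-memory state, short TTLs) can be sketched as follows. The 30-second TTL, the class name, and the injectable clock are assumptions; in production this role is typically played by an in-memory store with key expiry rather than a Python dictionary.

```python
import time
from typing import Callable, Dict, Iterable


class PresenceStore:
    """Soft-state presence: a user is online if a heartbeat arrived within the TTL."""

    def __init__(self, ttl_s: float = 30.0, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def heartbeat_batch(self, user_ids: Iterable[str]) -> None:
        """A gateway reports all of its live users in one call instead of one call each."""
        now = self._clock()
        for user_id in user_ids:
            self._last_seen[user_id] = now

    def is_online(self, user_id: str) -> bool:
        seen = self._last_seen.get(user_id)
        return seen is not None and (self._clock() - seen) <= self._ttl_s

    def sweep(self) -> int:
        """Drop expired entries so memory tracks online users only; returns how many."""
        now = self._clock()
        expired = [u for u, seen in self._last_seen.items() if now - seen > self._ttl_s]
        for user_id in expired:
            del self._last_seen[user_id]
        return len(expired)
```

No explicit "went offline" write is needed: a crashed client or gateway simply stops sending heartbeats and the entry ages out, at the cost of presence being stale by up to one TTL.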

Step 7 - Key tradeoffs

There is rarely one right answer. State the tradeoff, then commit to a side with a reason tied to the requirements you clarified in step one.

Transport

WebSocket (true push, stateful)

Long polling (simpler, higher latency)

Guidance: WebSocket for real-time; long polling only as a fallback where sockets are blocked.

Group delivery

Push to every member (low latency)

Pull/catch-up (scales to huge groups)

Guidance: Push for small groups; pull for very large ones.

Common follow-up questions

When you finish the core design, expect the interviewer to pull on one of these threads. Have a one-paragraph answer ready for each.

How would you add end-to-end encryption and what does it cost you (search, multi-device)?: Encrypt on the sending client and decrypt on the receiving client, so the servers store and route only ciphertext plus the routing metadata they need (conversation id, message id, timestamps). The cost is that the server can no longer read message bodies: server-side search is lost and has to move to an on-device index, and each of a user's devices needs its own key, so the sender encrypts for every recipient device and a newly added device cannot read old history unless another device transfers it. Tie the choice back to the requirements you clarified, rather than reaching for the most complex option.
How do you sync history across a user's multiple devices?: Persist every message server-side under its time-sortable id and let the session registry hold one entry per device rather than one per user. Each device remembers the last message id it has seen in each conversation; on reconnect it pulls everything after that cursor, and live messages are pushed to every connected device of the recipient, including the sender's other devices. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified.
How do you guarantee ordering when a client sends while offline?: The client queues outgoing messages locally, each with a client-generated idempotent id, and flushes the queue in order on reconnect. The server assigns the time-sortable id when it persists each message, so the agreed conversation order is the order in which messages reach the server, not the order in which they were typed; retried sends are de-duplicated by the client-generated id. State that tradeoff explicitly rather than promising a global order the system cannot provide.
How do you handle a thundering herd when a popular server restarts?: When a connection server restarts, every client it held tries to reconnect at once. Have clients reconnect with exponential backoff plus random jitter, let the load balancer spread them across the remaining gateways, and rate-limit new connections and session-registry writes so the wave is absorbed gradually. For planned restarts, drain connections in batches instead of dropping them all together.
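
For the thundering-herd question, a widely used client-side mitigation is exponential backoff with full jitter: each reconnect attempt waits a random time up to an exponentially growing ceiling, so clients that dropped together do not return together. The function name and the default base and cap values are assumptions for illustration.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0, rng=random) -> float:
    """Full jitter: a uniform random delay in [0, min(cap_s, base_s * 2**attempt)].

    `attempt` is 0 for the first retry. `rng` is injectable so tests can seed it.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    seeded = random.Random(0)
    for attempt in range(8):
        print(f"attempt {attempt}: wait {reconnect_delay(attempt, rng=seeded):.2f}s")
```

The cap keeps a long outage from pushing waits to absurd lengths, and the randomness spreads the reconnect load across the whole window instead of concentrating it at the 1 s, 2 s, 4 s marks.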

Last reviewed by the site editor: June 2026
