Design a Notification System | System Design

Step 1 - Clarify the requirements

Never start by drawing boxes. A strong candidate spends the first few minutes scoping the problem so the design that follows is justified. For a notification system, the questions worth asking are:

Which channels: push, SMS, email, in-app, or all of them?
What volume and what is the latency expectation (transactional vs marketing)?
Do users have preferences and quiet hours we must honour?
Do we need delivery tracking and analytics?

Functional requirements

Accept notification requests from many internal services.
Deliver across push, SMS, email, and in-app channels.
Respect per-user preferences, opt-outs, and rate limits.

Non-functional requirements

Reliable, at-least-once delivery with retries and dead-letter handling.
High throughput with bursty load (e.g. a product launch blast).
No duplicate notifications for the same logical event.

Step 2 - Back-of-the-envelope estimates

Sizing the system tells you which parts are hard. Round aggressively and state your assumptions out loud; the numbers matter less than showing you can reason about scale.

Metric	Estimate	Reasoning
Peak send rate	bursty, 100x average	Marketing blasts and incident alerts create spikes; the pipeline must buffer them.
Provider latency	100s of ms, variable	Third-party APNs/FCM/SMS/email gateways are slow and occasionally fail, so calls must be async.
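To make the buffering claim concrete, here is a rough worked calculation. Only the 100x peak multiplier comes from the table; the average rate, drain rate, and burst length are made-up figures for illustration, and both function names are hypothetical.

```python
def backlog_after_burst(avg_rate: float, peak_multiplier: float,
                        drain_rate: float, burst_s: float) -> float:
    """Messages left in the queue after a burst lasting burst_s seconds."""
    peak_rate = avg_rate * peak_multiplier
    return max(0.0, (peak_rate - drain_rate) * burst_s)


def time_to_drain(backlog: float, avg_rate: float, drain_rate: float) -> float:
    """Seconds to clear the backlog while average traffic keeps arriving."""
    spare = drain_rate - avg_rate
    if spare <= 0:
        raise ValueError("drain rate must exceed the average arrival rate")
    return backlog / spare


# Assumed figures: 100 msg/s average, 2,000 msg/s provider-safe drain rate,
# and a 60-second burst at 100x the average.
backlog = backlog_after_burst(avg_rate=100, peak_multiplier=100,
                              drain_rate=2_000, burst_s=60)
seconds = time_to_drain(backlog, avg_rate=100, drain_rate=2_000)
print(f"backlog={backlog:.0f} messages, drained in about {seconds:.0f} s")
```

With these assumptions a one-minute burst leaves 480,000 messages queued and takes a little over four minutes to clear, which is the kind of statement worth saying out loud in the interview.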

Step 3 - Data model and API

A compact data model and a small API surface anchor the rest of the discussion. Keep both minimal; you can always extend them when the interviewer pushes.

Core entities

notifications

notification_id, user_id, template_id, channel, status, dedup_key, created_at

dedup_key prevents duplicates for the same logical event.

preferences

user_id, channel, enabled, quiet_hours, frequency_cap

Checked before queuing a send.

device_tokens

user_id, platform, token, updated_at

Push tokens per device; prune invalid ones on provider feedback.
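The three entities could be sketched as Python dataclasses. The field names come from the lists above; the types, the Channel and Status enums, the daily unit for frequency_cap, and the make_dedup_key helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timezone
from enum import Enum
from typing import Optional, Tuple


class Channel(str, Enum):
    PUSH = "push"
    SMS = "sms"
    EMAIL = "email"
    IN_APP = "in_app"


class Status(str, Enum):  # assumed lifecycle states
    QUEUED = "queued"
    SENT = "sent"
    DELIVERED = "delivered"
    FAILED = "failed"


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


def make_dedup_key(event_id: str, user_id: str, channel: Channel) -> str:
    """Event id plus user plus channel identifies one logical notification."""
    return f"{event_id}:{user_id}:{channel.value}"


@dataclass
class Notification:
    notification_id: str
    user_id: str
    template_id: str
    channel: Channel
    dedup_key: str
    status: Status = Status.QUEUED
    created_at: datetime = field(default_factory=_utc_now)


@dataclass
class Preference:
    user_id: str
    channel: Channel
    enabled: bool = True
    quiet_hours: Optional[Tuple[time, time]] = None  # (start, end), user-local
    frequency_cap: Optional[int] = None  # assumed to mean max sends per day


@dataclass
class DeviceToken:
    user_id: str
    platform: str  # e.g. "ios" or "android"
    token: str
    updated_at: datetime = field(default_factory=_utc_now)
```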

API sketch

POST /api/v1/notify - Internal services enqueue a notification (idempotent via dedup_key).
PUT /api/v1/preferences - Update a user's channel preferences.
GET /api/v1/notifications/{id} - Check delivery status.

Step 4 - High-level design

Sketch the happy path end to end before optimising anything. This is the architecture you would draw on the whiteboard first:

1. A notification service validates the request, applies preferences and rate limits, then enqueues per-channel jobs.
2. Channel workers pull from queues and call the relevant provider (APNs/FCM, an SMS gateway, an email provider).
3. Failures are retried with backoff; permanent failures go to a dead-letter queue for inspection.
4. Delivery callbacks update status and feed analytics.
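The happy path can be sketched in a few functions. This is a single-process illustration using the standard library queue module in place of a durable broker; the function names, the job shape, and the allowed callback are all hypothetical, and retries (step 3) are left out to keep it short.

```python
import queue
from typing import Callable, Dict

CHANNELS = ("push", "sms", "email")
queues: Dict[str, queue.Queue] = {c: queue.Queue() for c in CHANNELS}
status: Dict[str, str] = {}


def intake(job: dict, allowed: Callable[[dict], bool]) -> bool:
    """Step 1: validate, apply the preference and rate-limit check, enqueue."""
    if job.get("channel") not in queues or not allowed(job):
        return False
    status[job["id"]] = "queued"
    queues[job["channel"]].put(job)
    return True


def run_worker(channel: str, send: Callable[[dict], None]) -> int:
    """Step 2: a channel worker pulls jobs and calls its provider."""
    sent = 0
    while True:
        try:
            job = queues[channel].get_nowait()
        except queue.Empty:
            return sent
        send(job)
        status[job["id"]] = "sent"
        sent += 1


def on_delivery_callback(job_id: str) -> None:
    """Step 4: a provider callback marks the job as delivered."""
    status[job_id] = "delivered"
```

Note that the preference check runs inside intake, so a suppressed notification never reaches a queue at all.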

Step 5 - Deep dives that separate strong answers

The high-level design is table stakes. Interviewers spend most of the time here, probing the decisions that actually carry the system. These are the ones to be ready for.

Queue-based fan-out and reliability

Decouple the request from delivery with a durable message queue per channel. The intake service is fast and just enqueues; slow, flaky provider calls happen in workers that can scale independently. Use at-least-once delivery with idempotent processing keyed on the dedup_key, so a retried job never sends twice. Apply exponential backoff with jitter on retries, cap the attempts, and route exhausted messages to a dead-letter queue rather than silently dropping or looping forever. This is the structural reason notifications are reliable despite unreliable downstreams.
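A minimal sketch of capped retries with exponential backoff and jitter, ending in a dead-letter list. The constants, the TransientError type, and the process function are assumptions for illustration; the jitter variant shown picks a uniform delay between zero and the exponential ceiling.

```python
import random
from typing import Callable, List

MAX_ATTEMPTS = 5     # assumed cap on attempts
BASE_DELAY_S = 0.5   # assumed base delay
MAX_DELAY_S = 60.0   # assumed ceiling on any single delay


class TransientError(Exception):
    """A provider failure worth retrying (timeout, 5xx, throttling)."""


def backoff_delay(attempt: int, rng: random.Random) -> float:
    """Uniform delay in [0, min(MAX_DELAY_S, BASE_DELAY_S * 2**attempt)]."""
    ceiling = min(MAX_DELAY_S, BASE_DELAY_S * 2 ** attempt)
    return rng.uniform(0.0, ceiling)


def process(job: dict, send: Callable[[dict], None], dead_letters: List[dict],
            sleep: Callable[[float], None], rng: random.Random) -> bool:
    """Send a job, retrying transient failures; dead-letter it when exhausted."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            send(job)
            return True
        except TransientError:
            if attempt < MAX_ATTEMPTS - 1:
                sleep(backoff_delay(attempt, rng))
    dead_letters.append(job)
    return False
```

Passing in sleep and rng keeps the sketch testable. A production worker would more likely re-enqueue the job with a delay than sleep in place, so one slow job does not hold up a worker.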

Deduplication and rate limiting

Two events can describe the same notification (a retry from the caller, a duplicate trigger). Store a dedup_key (event id plus user plus channel) and reject or coalesce repeats within a window. Separately, protect users from spam and providers from overload with frequency caps per user and global rate limits per provider; a notification the user has effectively already received should be suppressed. Honour quiet hours and per-channel opt-outs before anything is queued, not after.
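These checks can be combined into a single gate that runs before anything is queued. The sketch below keeps state in memory; the window lengths, the Gate class, and the returned labels are assumptions, and a real system would typically hold this state in a shared store with expiry.

```python
from datetime import datetime, time, timedelta
from typing import Dict, List, Optional, Tuple

DEDUP_WINDOW = timedelta(minutes=10)  # assumed dedup window
CAP_WINDOW = timedelta(hours=24)      # assumed frequency-cap period


def in_quiet_hours(now: time, quiet: Optional[Tuple[time, time]]) -> bool:
    """True if now falls in the quiet window, which may wrap past midnight."""
    if quiet is None:
        return False
    start, end = quiet
    if start <= end:
        return start <= now < end
    return now >= start or now < end


class Gate:
    """Decides whether a notification may be queued."""

    def __init__(self) -> None:
        self._seen: Dict[str, datetime] = {}         # dedup_key -> accepted at
        self._sends: Dict[str, List[datetime]] = {}  # user_id -> accepted times

    def check(self, dedup_key: str, user_id: str, now: datetime,
              quiet: Optional[Tuple[time, time]], frequency_cap: int) -> str:
        first = self._seen.get(dedup_key)
        if first is not None and now - first < DEDUP_WINDOW:
            return "duplicate"
        if in_quiet_hours(now.time(), quiet):
            return "quiet_hours"
        recent = [t for t in self._sends.get(user_id, []) if now - t < CAP_WINDOW]
        if len(recent) >= frequency_cap:
            return "capped"
        self._seen[dedup_key] = now
        self._sends[user_id] = recent + [now]
        return "queue"
```

Whether a notification blocked by quiet hours is dropped or deferred until the window ends is a product decision; the gate only reports the reason.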

Channel-specific concerns

Each channel has its own provider, payload format, and failure modes. Push requires valid device tokens that expire and must be pruned on provider feedback; SMS has strict length and cost considerations and country-specific routing; email needs templating, unsubscribe links, and deliverability/reputation management. Abstract these behind a common worker interface but keep channel adapters separate so one provider outage degrades only its channel, and design templates so the same logical notification renders correctly per channel.
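One way to express the common worker interface with separate channel adapters is an abstract base class. Everything here is illustrative: the class names, the per-channel template dictionary, the truncation policy for long SMS text, and the unsubscribe URL are assumptions, and the send methods record messages instead of calling a real provider.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class ChannelAdapter(ABC):
    """Common interface the worker depends on; one subclass per channel."""

    @abstractmethod
    def render(self, template: Dict[str, str], params: Dict[str, str]) -> str: ...

    @abstractmethod
    def send(self, recipient: str, payload: str) -> None: ...


class SmsAdapter(ChannelAdapter):
    MAX_LEN = 160  # one GSM-7 segment; truncating is an assumed policy

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def render(self, template: Dict[str, str], params: Dict[str, str]) -> str:
        text = template["sms"].format(**params)
        if len(text) <= self.MAX_LEN:
            return text
        return text[: self.MAX_LEN - 3] + "..."

    def send(self, recipient: str, payload: str) -> None:
        self.sent.append((recipient, payload))  # stand-in for a gateway call


class EmailAdapter(ChannelAdapter):
    def __init__(self, unsubscribe_url: str) -> None:
        self.unsubscribe_url = unsubscribe_url
        self.sent: List[Tuple[str, str]] = []

    def render(self, template: Dict[str, str], params: Dict[str, str]) -> str:
        body = template["email"].format(**params)
        return f"{body}\n\nUnsubscribe: {self.unsubscribe_url}"

    def send(self, recipient: str, payload: str) -> None:
        self.sent.append((recipient, payload))  # stand-in for a provider call


def dispatch(adapters: Dict[str, ChannelAdapter], channel: str, recipient: str,
             template: Dict[str, str], params: Dict[str, str]) -> None:
    """Render the logical notification for one channel and send it."""
    adapter = adapters[channel]
    adapter.send(recipient, adapter.render(template, params))
```

Because the worker only sees ChannelAdapter, a failing SMS gateway affects the SMS adapter alone, and the same template renders differently per channel.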

Step 6 - Bottlenecks and how to scale past them

Naming where the design breaks, and the specific fix, is what signals seniority. For a notification system the pressure points are:

Bursty load (launch blasts) overwhelms providers.

Buffer in queues and drain at a provider-safe rate; auto-scale workers.
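Draining at a provider-safe rate is commonly done with a token bucket. The sketch below is one such implementation under assumed parameters; the class, the drain_at_rate helper, and the explicit now argument (used instead of a real clock so the behaviour is deterministic) are all illustrative.

```python
from typing import List


class TokenBucket:
    """Allows rate_per_s sends per second on average, with bursts up to burst."""

    def __init__(self, rate_per_s: float, burst: int, now: float) -> None:
        self.rate = rate_per_s
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now

    def try_acquire(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, up to capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def drain_at_rate(backlog: List[dict], bucket: TokenBucket, now: float) -> List[dict]:
    """Send as many queued jobs as the bucket allows; leave the rest queued."""
    sent = []
    while backlog and bucket.try_acquire(now):
        sent.append(backlog.pop(0))
    return sent
```

The backlog stays in the queue while the bucket is empty, so a launch blast lengthens the queue rather than pushing the provider past its limit.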

Provider outage stalls a channel.

Per-channel isolation, retries with backoff, dead-letter queue, and failover providers.
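Failover within a channel can be as simple as trying an ordered list of providers. This is a sketch under assumptions: ProviderError and send_with_failover are hypothetical names, and a real system would usually add health tracking so a provider known to be down is skipped rather than tried on every job.

```python
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Raised by a provider client when a send is not accepted."""


def send_with_failover(job: dict,
                       providers: List[Tuple[str, Callable[[dict], None]]]) -> str:
    """Try providers in order; return the name of the one that accepted the job."""
    errors = []
    for name, send in providers:
        try:
            send(job)
            return name
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    # Every provider refused: let the caller retry with backoff or dead-letter.
    raise ProviderError("all providers failed: " + "; ".join(errors))
```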

Duplicate sends.

Idempotency via dedup_key plus a short-window suppression check.

Step 7 - Key tradeoffs

There is rarely one right answer. State the tradeoff, then commit to a side with a reason tied to the requirements you clarified in step one.

Delivery guarantee

At-least-once (reliable, needs dedup)

At-most-once (no dupes, may drop)

Guidance: At-least-once with idempotent consumers is the standard; never silently drop transactional alerts.

Intake coupling

Synchronous send (simple, fragile)

Queue then async send (resilient)

Guidance: Always queue, because provider latency and failures must not block callers.

Common follow-up questions

When you finish the core design, expect the interviewer to pull on one of these threads. Have a one-paragraph answer ready for each.

How do you prioritise transactional alerts over marketing blasts in the same pipeline? Split each channel's queue by priority so transactional alerts have their own queue and workers and never wait behind a marketing blast. Marketing traffic drains at whatever provider-safe rate is left over, and it is the only traffic throttled or delayed under load. This follows from the latency expectations clarified in step 1: transactional sends are latency-sensitive, marketing sends are not.
How would you implement digest notifications (batch many into one)? Extend the coalescing idea from deduplication. Rather than enqueueing each event immediately, group pending events by user, channel, and digest type, and hold them until a window closes (for example hourly or daily, according to the user's preferences). A scheduled job then renders one notification from the group and enqueues it like any other send, with a dedup_key derived from the group and window so a retried job cannot send the digest twice.
How do you track delivery and click-through across channels? Delivery callbacks already update status in the high-level design; normalise them into a common set of states so push, SMS, and email can be compared, accepting that each provider reports different things. For click-through, a common approach is to route links through a redirect that carries the notification_id and logs the click before forwarding. Feed both streams into analytics.
How do you handle a provider that silently drops messages? Do not treat provider acceptance as delivery. Where the provider offers delivery receipts, mark a notification as delivered only when the callback arrives and treat a missing callback after a timeout as a failure. Then retry or fall back to another provider or channel, accepting that a retry may occasionally duplicate a message that was in fact delivered. Monitor the accepted-to-delivered ratio per provider and shift traffic to a failover provider when it falls.
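For the prioritisation question, one option is a pair of lanes per channel with strict priority between them. The class and field names below are hypothetical, and strict priority is only one policy; a weighted split would stop marketing traffic from starving during a long transactional spike.

```python
from collections import deque
from typing import Deque, Optional


class PriorityLanes:
    """Two queues for one channel; transactional jobs are always served first."""

    def __init__(self) -> None:
        self.transactional: Deque[dict] = deque()
        self.marketing: Deque[dict] = deque()

    def enqueue(self, job: dict) -> None:
        # Anything not explicitly marked transactional goes to the marketing lane.
        if job.get("priority") == "transactional":
            self.transactional.append(job)
        else:
            self.marketing.append(job)

    def next_job(self) -> Optional[dict]:
        if self.transactional:
            return self.transactional.popleft()
        if self.marketing:
            return self.marketing.popleft()
        return None
```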

Last reviewed by the site editor: June 2026
