How would you implement surge pricing per region?

The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.

How would you compute accurate ETAs?

The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.

Design a Ride-Sharing Service (Uber / Lyft) | System Design

Step 1 - Clarify the requirements

Never start drawing boxes. A strong candidate spends the first few minutes scoping the problem so the design that follows is justified. For a ride-sharing service, the questions worth asking are:

What is the matching latency expectation (riders wait seconds, not minutes)?
How frequently do drivers report location, and how many are online per city?
Do we need surge pricing, pooling, and scheduled rides, or just on-demand?
Is matching nearest-driver, or optimised for ETA and acceptance probability?

Functional requirements

Riders request a ride; the system finds and assigns a nearby driver.
Track driver and trip location in real time and share ETA.
Handle the trip lifecycle (request, match, pickup, in-trip, complete, pay).

Non-functional requirements

Low matching latency so riders are not left waiting.
Handle very high-frequency location updates from active drivers.
High availability; an outage strands riders and drivers mid-trip.

Step 2 - Back-of-the-envelope estimates

Sizing the system tells you which parts are hard. Round aggressively and state your assumptions out loud; the numbers matter less than showing you can reason about scale.

Metric	Estimate	Reasoning
Location update rate	every ~4 sec/driver	Active drivers ping frequently; with hundreds of thousands online this is a huge write stream.
Active drivers (large city)	100K+	Location index and matching must handle dense, moving populations per region.

Step 3 - Data model and API

A compact data model and a small API surface anchor the rest of the discussion. Keep both minimal; you can always extend them when the interviewer pushes.

Core entities

driver_locations

driver_id, lat, lng, geohash, status, updated_at

Hot, frequently overwritten; kept in an in-memory geospatial index.

trips

trip_id (PK), rider_id, driver_id, state, origin, dest, fare, timestamps

Stateful lifecycle; the source of truth for billing.

riders

rider_id (PK), payment_method, rating

Profile and payment data.

API sketch

POST/api/v1/rides- Request a ride; triggers matching.
PUT/api/v1/drivers/location- Driver reports current location (high frequency).
GET/api/v1/trips/{id}- Poll/subscribe to trip state and driver ETA.

Step 4 - High-level design

Sketch the happy path end to end before optimising anything. This is the architecture you would draw on the whiteboard first:

1Drivers stream location updates into a location service backed by a geospatial index.
2A ride request hits a matching service that queries the index for nearby available drivers.
3The matcher offers the trip to candidate drivers; on acceptance, a trip record is created and tracked.
4A trip service drives the state machine through pickup, in-trip, and completion, then triggers payment.

Step 5 - Deep dives that separate strong answers

The high-level design is table stakes. Interviewers spend most of the time here, probing the decisions that actually carry the system. These are the ones to be ready for.

Geospatial indexing of moving drivers

The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. The standard solution indexes locations by a spatial scheme: geohash (encode lat/lng into a string prefix so nearby points share prefixes), quadtree, or Uber's H3 hexagonal grid. A request maps to a cell, and you scan that cell plus its neighbours, which bounds the candidate set. Because positions change constantly, this index lives in memory (e.g. Redis with geo commands) and is updated on every ping rather than persisted relationally.

Handling the location-update write storm

Hundreds of thousands of drivers pinging every few seconds is an enormous write rate that no relational database should take directly. Absorb it with an in-memory store updated in place (the latest position overwrites the old; history, if needed, streams to a separate analytics pipeline). Partition the location service by region (city or geohash prefix) so each shard handles a bounded area and matching stays local. Updates are fire-and-forget where a missed ping is harmless because another arrives seconds later.

Matching, trip state, and surge

Matching is more than nearest driver: it weighs ETA, driver acceptance likelihood, direction of travel, and fairness, and must avoid offering the same driver to two riders at once, which needs a short lock or offer-then-confirm protocol. The trip itself is a state machine (requested, matched, en route, in trip, completed, cancelled) that must be durable and consistent because it drives billing. Surge pricing is computed per region from the live supply/demand ratio in each cell and applied at request time. Payment runs after completion through a separate payment service, ideally idempotent.

Step 6 - Bottlenecks and how to scale past them

Naming where the design breaks, and the specific fix, is what signals seniority. For a ride-sharing service the pressure points are:

Location write volume.

In-memory geo store, region sharding, overwrite-in-place, async history stream.

Matching latency in dense areas.

Cell-based candidate scoping (geohash/H3) plus per-region matchers.

Double-assigning a driver.

Offer-and-confirm with a short reservation lock per driver.

Step 7 - Key tradeoffs

There is rarely one right answer. State the tradeoff, then commit to a side with a reason tied to the requirements you clarified in step one.

Spatial index

Geohash (simple, string prefixes)

H3 / quadtree (uniform cells, richer neighbour queries)

Guidance: Geohash is easy to explain; H3 gives more uniform cells for dense, non-rectangular areas.

Matching objective

Nearest driver (lowest latency)

Optimised (ETA + acceptance + fairness)

Guidance: Start with nearest; layer in scoring to improve completion rate and balance.

Common follow-up questions

When you finish the core design, expect the interviewer to pull on one of these threads. Have a one-paragraph answer ready for each.

How would you implement surge pricing per region?: The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.
How do you support carpooling/pooled rides?: Hundreds of thousands of drivers pinging every few seconds is an enormous write rate that no relational database should take directly. Absorb it with an in-memory store updated in place (the latest position overwrites the old; history, if needed, streams to a separate analytics pipeline). Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.
How do you make the trip state machine resilient to a server crash mid-trip?: Matching is more than nearest driver: it weighs ETA, driver acceptance likelihood, direction of travel, and fairness, and must avoid offering the same driver to two riders at once, which needs a short lock or offer-then-confirm protocol. The trip itself is a state machine (requested, matched, en route, in trip, completed, cancelled) that must be durable and consistent because it drives billing. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.
How would you compute accurate ETAs?: The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.

Last reviewed by the site editor: June 2026

Step 1 - Clarify the requirements

Never start drawing boxes. A strong candidate spends the first few minutes scoping the problem so the design that follows is justified. For a ride-sharing service, the questions worth asking are:

What is the matching latency expectation (riders wait seconds, not minutes)?
How frequently do drivers report location, and how many are online per city?
Do we need surge pricing, pooling, and scheduled rides, or just on-demand?
Is matching nearest-driver, or optimised for ETA and acceptance probability?

Functional requirements

Riders request a ride; the system finds and assigns a nearby driver.
Track driver and trip location in real time and share ETA.
Handle the trip lifecycle (request, match, pickup, in-trip, complete, pay).

Non-functional requirements

Low matching latency so riders are not left waiting.
Handle very high-frequency location updates from active drivers.
High availability; an outage strands riders and drivers mid-trip.

Step 2 - Back-of-the-envelope estimates

Sizing the system tells you which parts are hard. Round aggressively and state your assumptions out loud; the numbers matter less than showing you can reason about scale.

Metric	Estimate	Reasoning
Location update rate	every ~4 sec/driver	Active drivers ping frequently; with hundreds of thousands online this is a huge write stream.
Active drivers (large city)	100K+	Location index and matching must handle dense, moving populations per region.

Step 3 - Data model and API

A compact data model and a small API surface anchor the rest of the discussion. Keep both minimal; you can always extend them when the interviewer pushes.

Core entities

driver_locations

driver_id, lat, lng, geohash, status, updated_at

Hot, frequently overwritten; kept in an in-memory geospatial index.

trips

trip_id (PK), rider_id, driver_id, state, origin, dest, fare, timestamps

Stateful lifecycle; the source of truth for billing.

riders

rider_id (PK), payment_method, rating

Profile and payment data.

API sketch

POST/api/v1/rides- Request a ride; triggers matching.
PUT/api/v1/drivers/location- Driver reports current location (high frequency).
GET/api/v1/trips/{id}- Poll/subscribe to trip state and driver ETA.

Step 4 - High-level design

Sketch the happy path end to end before optimising anything. This is the architecture you would draw on the whiteboard first:

1Drivers stream location updates into a location service backed by a geospatial index.
2A ride request hits a matching service that queries the index for nearby available drivers.
3The matcher offers the trip to candidate drivers; on acceptance, a trip record is created and tracked.
4A trip service drives the state machine through pickup, in-trip, and completion, then triggers payment.

Step 5 - Deep dives that separate strong answers

The high-level design is table stakes. Interviewers spend most of the time here, probing the decisions that actually carry the system. These are the ones to be ready for.

Geospatial indexing of moving drivers

Handling the location-update write storm

Matching, trip state, and surge

Step 6 - Bottlenecks and how to scale past them

Naming where the design breaks, and the specific fix, is what signals seniority. For a ride-sharing service the pressure points are:

Location write volume.

In-memory geo store, region sharding, overwrite-in-place, async history stream.

Matching latency in dense areas.

Cell-based candidate scoping (geohash/H3) plus per-region matchers.

Double-assigning a driver.

Offer-and-confirm with a short reservation lock per driver.

Step 7 - Key tradeoffs

There is rarely one right answer. State the tradeoff, then commit to a side with a reason tied to the requirements you clarified in step one.

Spatial index

Geohash (simple, string prefixes)

H3 / quadtree (uniform cells, richer neighbour queries)

Guidance: Geohash is easy to explain; H3 gives more uniform cells for dense, non-rectangular areas.

Matching objective

Nearest driver (lowest latency)

Optimised (ETA + acceptance + fairness)

Guidance: Start with nearest; layer in scoring to improve completion rate and balance.

Common follow-up questions

When you finish the core design, expect the interviewer to pull on one of these threads. Have a one-paragraph answer ready for each.

How would you implement surge pricing per region?: The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.
How do you support carpooling/pooled rides?: Hundreds of thousands of drivers pinging every few seconds is an enormous write rate that no relational database should take directly. Absorb it with an in-memory store updated in place (the latest position overwrites the old; history, if needed, streams to a separate analytics pipeline). Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.
How do you make the trip state machine resilient to a server crash mid-trip?: Matching is more than nearest driver: it weighs ETA, driver acceptance likelihood, direction of travel, and fairness, and must avoid offering the same driver to two riders at once, which needs a short lock or offer-then-confirm protocol. The trip itself is a state machine (requested, matched, en route, in trip, completed, cancelled) that must be durable and consistent because it drives billing. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.
How would you compute accurate ETAs?: The core problem is 'find available drivers near this point' over objects that move every few seconds. A naive distance scan over all drivers is O(n) and far too slow. Sketch the change against the high-level design above and tie your choice back to the requirements you clarified, rather than reaching for the most complex option.

Last reviewed by the site editor: June 2026

Design a Ride-Sharing Service (Uber / Lyft)

Step 1 - Clarify the requirements

Functional requirements

Non-functional requirements

Step 2 - Back-of-the-envelope estimates

Step 3 - Data model and API

Core entities

API sketch

Step 4 - High-level design

Step 5 - Deep dives that separate strong answers

Geospatial indexing of moving drivers

Handling the location-update write storm

Matching, trip state, and surge

Step 6 - Bottlenecks and how to scale past them

Step 7 - Key tradeoffs

Common follow-up questions

Keep practising

Design a Ride-Sharing Service (Uber / Lyft)

Step 1 - Clarify the requirements

Functional requirements

Non-functional requirements

Step 2 - Back-of-the-envelope estimates

Step 3 - Data model and API

Core entities

API sketch

Step 4 - High-level design

Step 5 - Deep dives that separate strong answers

Geospatial indexing of moving drivers

Handling the location-update write storm

Matching, trip state, and surge

Step 6 - Bottlenecks and how to scale past them

Step 7 - Key tradeoffs

Common follow-up questions

Keep practising