Stripe is known for real-world, rigorous coding rather than abstract puzzles. Onsite rounds include practical implementation work and a bug-bash style round where you fix and extend a small codebase. The written-communication bar is high across every role, and system design rounds expect production realism.
Process timeline
Reported timeline: 2-4 weeks
1. Recruiter screen: Background and role fit.
2. Practical coding: Implementing real features rather than solving riddles.
3. Bug squash: Debugging and extending an existing codebase under time.
4. System design: Production-grade design with real failure handling.
5. Behavioural: Collaboration, judgement, and written-communication signal.
What Stripe looks for
What they value
Writing clean, working code in a realistic setting
Fast, careful debugging of unfamiliar code
Clear writing and reasoning under pressure
Culture signals
Rigor and getting the details exactly right
Strong written communication as a core skill
Caring about developers and end users of the API
Reported questions
Questions candidates report for this role at this company.
As asked
Design the backend for Twitter's home timeline. The system serves 500M monthly active users and must show recent tweets from followed accounts with low latency.
Sample answer outline
Two architectures to compare: fan-out-on-write (push: insert each new tweet into each follower's timeline cache - fast reads, expensive writes for high-follower accounts) vs fan-out-on-read (pull: query followed accounts at read time - flexible but slow). Real systems hybridise: fan-out-on-write for normal users, fan-out-on-read for celebrities (the Justin Bieber problem). Discuss storage (Redis for hot timelines, blob storage for media), caching layers, ranking, and the celebrity-account special case.
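Reference sketch (typescript)
The hybrid described above can be illustrated with a small in-memory model. This is a minimal sketch, not a production design: the class name, the method names, and the tiny CELEBRITY_THRESHOLD are all assumptions made for illustration, and the maps stand in for whatever cache and tweet store a real system would use.

```typescript
// In-memory sketch of hybrid fan-out. All names and the threshold are illustrative assumptions.
type Tweet = { id: number; authorId: string; createdAt: number };

// Hypothetical cut-off; a real system would use a much larger, tuned value.
const CELEBRITY_THRESHOLD = 3;

function getOrInit<K, V>(map: Map<K, V>, key: K, init: () => V): V {
  let value = map.get(key);
  if (value === undefined) {
    value = init();
    map.set(key, value);
  }
  return value;
}

class TimelineService {
  private followers = new Map<string, Set<string>>(); // author -> users following them
  private following = new Map<string, Set<string>>(); // user -> authors they follow
  private pushed = new Map<string, Tweet[]>(); // user -> tweets fanned out on write
  private authored = new Map<string, Tweet[]>(); // author -> their own tweets
  private nextId = 1;

  follow(user: string, author: string): void {
    getOrInit(this.followers, author, () => new Set<string>()).add(user);
    getOrInit(this.following, user, () => new Set<string>()).add(author);
  }

  private isCelebrity(author: string): boolean {
    return (this.followers.get(author)?.size ?? 0) >= CELEBRITY_THRESHOLD;
  }

  post(authorId: string, createdAt: number): Tweet {
    const tweet: Tweet = { id: this.nextId++, authorId, createdAt };
    getOrInit(this.authored, authorId, () => [] as Tweet[]).push(tweet);
    // Push path: only normal accounts pay the per-follower write cost.
    if (!this.isCelebrity(authorId)) {
      this.followers.get(authorId)?.forEach((follower) => {
        getOrInit(this.pushed, follower, () => [] as Tweet[]).push(tweet);
      });
    }
    return tweet;
  }

  homeTimeline(user: string, limit = 20): Tweet[] {
    // Keyed by id so a tweet pushed before its author became a celebrity is not shown twice.
    const byId = new Map<number, Tweet>();
    (this.pushed.get(user) ?? []).forEach((t) => byId.set(t.id, t));
    // Pull path: celebrity tweets are merged in at read time.
    this.following.get(user)?.forEach((author) => {
      if (this.isCelebrity(author)) {
        (this.authored.get(author) ?? []).forEach((t) => byId.set(t.id, t));
      }
    });
    return Array.from(byId.values())
      .sort((a, b) => b.createdAt - a.createdAt || b.id - a.id)
      .slice(0, limit);
  }
}
```

The sketch leaves out backfilling a timeline when a user follows a normal account after it has already tweeted, which is one of the details worth raising when discussing the trade-off.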
Expect these follow-ups
How do you handle a celebrity tweeting to 100M followers?
What is your cache invalidation strategy when a user unfollows?
How do you rank the timeline beyond pure recency?
Tags: fanout, caching, scale
As asked
Design a URL shortening service. It must accept long URLs and return short URLs, redirect short URLs to long ones with low latency, and support hundreds of millions of redirects per day.
Sample answer outline
Key generation: pre-generate a pool of 7-char base62 IDs (62^7 = 3.5T combinations) in a separate counter service, or use a hash of the URL truncated and checked for collisions. Storage: KV store keyed by short ID, with the long URL plus metadata (created_at, owner, click_count). Reads: cached in a CDN edge layer because redirect is the hot path. Writes: relatively rare. Discuss custom aliases, analytics, abuse prevention, and link expiry.
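Reference sketch (typescript)
The counter-based option in the outline can be sketched as a base62 encoding of a numeric ID. This is a minimal sketch under stated assumptions: the alphabet order and the function names are illustrative choices, and the counter that hands out IDs is not shown. Since 62^7 = 3,521,614,606,208 is well below Number.MAX_SAFE_INTEGER, plain numbers are enough for 7-character IDs.

```typescript
// Illustrative alphabet order; any fixed ordering of 62 distinct characters works.
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const BASE = ALPHABET.length; // 62

// Encode a counter value as a fixed-width base62 short ID.
function encodeBase62(n: number, width = 7): string {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  let out = "";
  let rest = n;
  do {
    out = ALPHABET.charAt(rest % BASE) + out;
    rest = Math.floor(rest / BASE);
  } while (rest > 0);
  if (out.length > width) {
    throw new RangeError("id does not fit in " + width + " characters");
  }
  return out.padStart(width, ALPHABET.charAt(0));
}

// Decode a short ID back to the counter value it was generated from.
function decodeBase62(shortId: string): number {
  let n = 0;
  for (let i = 0; i < shortId.length; i++) {
    const digit = ALPHABET.indexOf(shortId.charAt(i));
    if (digit < 0) throw new RangeError("invalid character in short id");
    n = n * BASE + digit;
  }
  return n;
}
```

Sequential counters produce guessable IDs, so a design that needs unlisted links would typically shuffle the pre-generated pool or permute the counter before encoding.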
Expect these follow-ups
How do you prevent two users from picking the same custom alias simultaneously?
How do you handle a phishing link that spreads virally?
What if a long URL contains a session token - is shortening it a privacy issue?
Tags: kv-store, cdn, scale
As asked
Walk me through the biggest production incident you've personally been on the front line of. What happened, what did you do, and what changed afterwards?
Sample answer outline
Use the STAR structure but lean into the technical specifics. Briefly describe the impact (users affected, revenue, duration). Walk through detection (what alerted you and how), diagnosis (what you ruled out, the dead-end paths), mitigation (the actual fix), and the postmortem outcome (what changed in the system or process). Interviewers look for ownership, calm under pressure, root-cause thinking (not 'we restarted it'), and the discipline to convert pain into permanent fixes.
Expect these follow-ups
What would you do differently if you saw the same symptoms tomorrow?
Whose fault was it? How did you handle that conversation?
Did the postmortem actions actually land, or did they slip?
Tags: incidents, ownership, postmortems
As asked
Implement cursor-based pagination for a feed API in TypeScript. Explain why cursor beats offset for this case.
Sample answer outline
Offset pagination is O(offset) per page on the database (the query still scans the skipped rows) and breaks when items are inserted or deleted mid-paging: rows shift, so a client sees duplicates or misses items. Cursor pagination uses an opaque token encoding the last-seen sort key (e.g. created_at + id for uniqueness). The next query is WHERE (created_at, id) < (last_created_at, last_id) ORDER BY created_at DESC, id DESC LIMIT N. With an index on (created_at, id), each page costs one index seek plus N rows regardless of how deep the client has paged, and the results stay stable under inserts. Encode the cursor as base64 of a small JSON object so clients treat it as opaque. Discuss edge cases: ties on the sort key, bidirectional paging.
Reference implementation (typescript)
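import { Buffer } from "node:buffer";

// `db` is assumed to be a query helper that resolves to an array of rows.
type Row = { id: string; created_at: string | Date; body: string };
declare const db: { query(sql: string, params: unknown[]): Promise<Row[]> };

type Cursor = { createdAt: string; id: string };

export function decodeCursor(token: string | null): Cursor | null {
  if (!token) return null;
  // The token is client-supplied: a malformed value throws here and should map to a 400.
  return JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
}

export function encodeCursor(c: Cursor): string {
  return Buffer.from(JSON.stringify(c)).toString("base64url");
}

export async function fetchFeed(after: string | null, limit = 20) {
  const cursor = decodeCursor(after);
  // Fetch one extra row so the last page can return nextCursor = null.
  const rows = await db.query(
    `SELECT id, created_at, body FROM posts
     WHERE ($1::timestamptz IS NULL OR (created_at, id) < ($1::timestamptz, $2))
     ORDER BY created_at DESC, id DESC LIMIT $3`,
    [cursor?.createdAt ?? null, cursor?.id ?? null, limit + 1],
  );
  const items = rows.slice(0, limit);
  const last = rows.length > limit ? items[items.length - 1] : undefined;
  return {
    items,
    // If the driver returns Date objects, precision is milliseconds; select created_at
    // as text when the column stores microseconds, or the cursor can skip rows.
    nextCursor: last
      ? encodeCursor({ createdAt: new Date(last.created_at).toISOString(), id: last.id })
      : null,
  };
}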
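Reference sketch (typescript)
The claim that offset paging breaks under inserts while a cursor stays stable can be shown on an in-memory array. This is a minimal sketch for illustration only: the Post shape and the function names pageByOffset and pageByCursor are assumptions, and the array sort stands in for the database's ORDER BY.

```typescript
type Post = { id: number; createdAt: number };

// Newest first with id as tie-breaker, mirroring ORDER BY created_at DESC, id DESC.
function sortFeed(posts: Post[]): Post[] {
  return posts.slice().sort((a, b) => b.createdAt - a.createdAt || b.id - a.id);
}

// Offset paging: skip a fixed number of rows from the top of the current ordering.
function pageByOffset(posts: Post[], offset: number, limit: number): Post[] {
  return sortFeed(posts).slice(offset, offset + limit);
}

// Cursor paging: return rows strictly after the last-seen (createdAt, id) pair.
function pageByCursor(posts: Post[], after: Post | null, limit: number): Post[] {
  return sortFeed(posts)
    .filter(
      (p) =>
        after === null ||
        p.createdAt < after.createdAt ||
        (p.createdAt === after.createdAt && p.id < after.id),
    )
    .slice(0, limit);
}

const feed: Post[] = [
  { id: 1, createdAt: 10 },
  { id: 2, createdAt: 20 },
  { id: 3, createdAt: 30 },
  { id: 4, createdAt: 40 },
];

// Page 1 is the same for both strategies: posts 4 and 3.
const firstPage = pageByCursor(feed, null, 2);
const lastSeen = firstPage[firstPage.length - 1];

// A new post arrives before the client asks for page 2.
feed.push({ id: 5, createdAt: 50 });

const offsetPage2 = pageByOffset(feed, 2, 2); // post 3 appears again
const cursorPage2 = pageByCursor(feed, lastSeen, 2); // continues with posts 2 and 1
```

With the insert, the offset version returns post 3 a second time because every row shifted down by one, while the cursor version continues from where the client left off.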
Expect these follow-ups
How do you implement bidirectional pagination (back and forth)?
What if the sort field is a non-unique field like score?
When is offset still the right choice?
Tags: pagination, api-design, sql
As asked
A page loads in 4 seconds. The database shows 200 queries per page load. You suspect an ORM N+1. Walk me through how you would fix it.
Sample answer outline
Confirm the diagnosis: turn on query logging and find the call site that generates the chatty queries. It is usually a loop that accesses a lazy-loaded relation. Fix it with the ORM's eager-loading primitive (include, with, populate, prefetch_related). Alternatively, batch the lookup with a single IN query. After the fix, verify the page issues one or two queries. Watch for the second-order N+1, where the eagerly loaded set itself has a lazy relation. In Node services, a DataLoader can batch lookups inside a request scope.
Reference implementation (typescript)
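import DataLoader from "dataloader";
import { PrismaClient, type User } from "@prisma/client";

const prisma = new PrismaClient();
declare const feedId: string; // assumed to come from the request handler

// Bad: N+1 (one query per post for its author)
const posts = await prisma.post.findMany({ where: { feed: feedId } });
for (const post of posts) {
  const author = await prisma.user.findUnique({ where: { id: post.authorId } });
  // ...
}

// Good: eager load the relation in one call (a fixed number of queries, not one per post)
const postsWithAuthors = await prisma.post.findMany({
  where: { feed: feedId },
  include: { author: true },
});

// Alternative: batch with DataLoader; create one loader per request so its cache is request-scoped
const userLoader = new DataLoader<string, User>(async (ids) => {
  const users = await prisma.user.findMany({ where: { id: { in: [...ids] } } });
  const byId = new Map(users.map((u) => [u.id, u]));
  // DataLoader expects one result per key, in key order; return an Error for missing rows
  return ids.map((id) => byId.get(id) ?? new Error(`User ${id} not found`));
});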
Expect these follow-ups
Why does an ORM lazy-load by default?
When is a JOIN slower than two queries?
How would you catch an N+1 in CI before it ships?
Tags: orm, n-plus-one, performance
As asked
Tell me about the biggest incident you handled that lived in the platform layer: a broken deploy pipeline, an infrastructure change gone wrong, a certificate expiry, or a cluster failure. Walk me through detection, mitigation, and what changed in the platform afterwards.
Sample answer outline
Frame it through a platform lens, where the blast radius is every team that depends on you. Describe the impact across consumers, not just one service. Detection: what alerted you, and whether it was your monitoring or a downstream team that noticed first. Mitigation: the rollback or break-glass procedure, and whether it existed before the incident or had to be improvised. The strong answer ends with platform-level prevention: a guardrail in the pipeline, a pre-deploy check, an expiry alert, automated rollback. Interviewers listen for ownership of shared infrastructure and the discipline to turn one painful event into a control that protects every team.
Expect these follow-ups
Did a self-service guardrail exist, or did you have to build one after?
How did you communicate with the many teams affected at once?
What pipeline or infrastructure check would have caught this earlier?
Tags: incidents, platform, pipelines, ownership
Backend engineer interview detail at Stripe
How the Stripe loop applies to backend engineer candidates
Stripe is a late-stage unicorn headquartered in South San Francisco, and the same 5-stage process described above is what a backend engineer candidate walks through, with the technical stages tuned to the engineering discipline.
For a backend engineer, the load concentrates on practical coding and system design. Those are the stages where the engineering signal is read most closely, so they are where preparation pays off most. The other stages still gate the offer: the recruiter screen and behavioural round assess fit and communication, and bug squash tests debugging of an unfamiliar codebase rather than role-specific depth.
What the backend engineer question mix signals
The 6 most-reported backend engineer questions cluster around behavioural (2), system design (2), and backend (1), with the remaining question on databases. That distribution is the clearest read on what Stripe actually probes for this role: the more a topic recurs, the more reliably it shows up in the loop, so it is worth weighting practice the same way.
The set spans an easy-to-hard difficulty range. Beyond the headline topics, the long tail touches databases, so a backend engineer who only drills the top area will still hit unfamiliar ground in the onsite.
What moves a backend engineer offer forward at Stripe
Across the loop, the traits that consistently move a Stripe backend engineer offer forward are: writing clean, working code in a realistic setting; fast, careful debugging of unfamiliar code; and clear writing and reasoning under pressure. These are not abstract values. Interviewers score against them, so a backend engineer who demonstrates them explicitly (naming the tradeoff, stating the assumption, checking the edge case out loud) reads stronger than one who only reaches the right answer silently.
The behavioural and culture stages check for three things: rigor and getting the details exactly right; strong written communication as a core skill; and caring about developers and end users of the API. For a backend engineer, the most credible way to show these is through specific, recent examples from real engineering work rather than rehearsed generalities.
How to read the backend engineer salary band
The salary signal shown for this role is the approximate senior median of $291,000 in San Francisco, reported as total compensation including bonus and equity and sourced from BLS, ONS, and Levels.fyi reference data. It is a market band for the backend engineer role and city, not a Stripe offer.
San Francisco carries a cost-of-living index of 112 on the scale where New York City equals 100, so read the headline figure alongside that index when comparing it with another market. Individual pay at Stripe varies by level, team, equity refresh, and negotiation, which the open salary breakdown for this role lays out city by city.