As asked
You are designing an orders topic that receives 50,000 events per second. How do you choose the partition key, and what can go wrong?
Sample answer outline
The partition key determines ordering, load distribution, and consumer parallelism, so choose it to match the ordering requirement first. If all events for an order must be processed in sequence, orderId is a natural key, and its high cardinality tends to spread load evenly. If ordering must hold per customer, customerId may be required, but a few high-volume customers can then create hot partitions. The partition count should allow enough consumer parallelism for the target throughput without creating unnecessary broker overhead. A strong answer also covers skew, key cardinality, compaction needs (a compacted topic keeps the latest record per key, so the key must identify the entity) and the pain of repartitioning later. Candidates often propose random keys for balance and miss that this destroys per-entity ordering.
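The trade-off between orderId and customerId can be made concrete with a small simulation. The following is a minimal sketch, not a broker client: it assumes a hash-mod partitioner (MD5 stands in for whatever hash the real client uses), a hypothetical count of 12 partitions, and an invented traffic mix in which one customer produces 30% of all events.

```python
import hashlib
import random
from collections import Counter

NUM_PARTITIONS = 12  # hypothetical partition count, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable hash-mod partitioner. MD5 is a stand-in for the client's real hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def load_by_partition(keys, num_partitions: int = NUM_PARTITIONS):
    """Count how many events land on each partition."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(loads) -> float:
    """Busiest partition's load divided by the mean load; 1.0 is perfectly even."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


def simulate(num_events: int = 50_000, seed: int = 7):
    """Return (skew with orderId keys, skew with customerId keys)."""
    rng = random.Random(seed)
    # Every order id is unique, so cardinality is high.
    order_keys = [f"order-{i}" for i in range(num_events)]
    # Assumed traffic mix: one large customer accounts for ~30% of events.
    customer_keys = [
        "customer-big" if rng.random() < 0.30 else f"customer-{rng.randrange(5_000)}"
        for _ in range(num_events)
    ]
    return (
        skew_ratio(load_by_partition(order_keys)),
        skew_ratio(load_by_partition(customer_keys)),
    )


if __name__ == "__main__":
    order_skew, customer_skew = simulate()
    print(f"orderId skew:    {order_skew:.2f}")
    print(f"customerId skew: {customer_skew:.2f}")
```

Under these assumptions the orderId run stays close to 1.0, while the customerId run shows one partition carrying several times the mean load. A candidate could use the same reasoning to argue for a composite key or a dedicated topic for the heavy customer, provided the ordering requirement still holds.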
Expect these follow-ups
- How would you detect a hot partition in production?
- What happens to ordering when you increase partition count?
- When would you split one logical stream across multiple topics?
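The first two follow-ups can be explored with a short sketch. It again assumes a hash-mod partitioner with MD5 as a stand-in hash. The helper names (moved_fraction, rates_from_offsets, hot_partitions), the offset samples, and the "2x the median" threshold are all hypothetical choices for illustration, not part of any broker API.

```python
import hashlib
from statistics import median


def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash-mod partitioner. MD5 is a stand-in for the client's real hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose partition changes when the partition count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    )
    return moved / len(keys)


def rates_from_offsets(start_offsets, end_offsets, window_seconds: float):
    """Per-partition events per second from two offset samples taken window_seconds apart."""
    return [
        (end - start) / window_seconds
        for start, end in zip(start_offsets, end_offsets)
    ]


def hot_partitions(rates, factor: float = 2.0):
    """Indexes of partitions whose rate exceeds factor times the median rate."""
    baseline = median(rates)
    return [p for p, rate in enumerate(rates) if rate > factor * baseline]


if __name__ == "__main__":
    keys = [f"order-{i}" for i in range(10_000)]
    print(f"keys remapped going from 12 to 16 partitions: {moved_fraction(keys, 12, 16):.0%}")

    # Hypothetical offset samples taken 60 seconds apart on a 4-partition topic.
    rates = rates_from_offsets([0, 0, 0, 0], [6_000, 6_600, 5_700, 54_000], 60)
    print("hot partitions:", hot_partitions(rates))
```

With a uniform hash, growing from 12 to 16 partitions remaps about three quarters of the keys. Records already written stay in their old partition, so new events for a remapped key land in a different partition and can be consumed out of order relative to the older ones. That is the core of a good answer to the second follow-up.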