As asked
Design the backend for Twitter's home timeline. The system serves 500M monthly active users and must show recent tweets from followed accounts with low latency.
Sample answer outline
Compare two architectures. Fan-out-on-write (push) inserts each new tweet into every follower's timeline cache at post time: reads are fast, but writes are expensive for high-follower accounts. Fan-out-on-read (pull) queries the followed accounts at read time: flexible, but reads are slow because each request must gather and merge tweets from many accounts. Real systems hybridise, using fan-out-on-write for normal users and fan-out-on-read for celebrities (the Justin Bieber problem). Also discuss storage (Redis for hot timelines, blob storage for media), caching layers, ranking, and the celebrity-account special case.
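A minimal sketch of the hybrid approach may help when talking through the outline. It is an illustration under stated assumptions, not how any production system is implemented: in-memory dicts and deques stand in for Redis and the tweet store, and the names `TimelineService`, `CELEBRITY_THRESHOLD`, and `TIMELINE_LEN` are hypothetical.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; a real system would use a far larger value
TIMELINE_LEN = 800       # hypothetical cap on cached entries per user


class TimelineService:
    """Toy hybrid fan-out: push for normal authors, pull for celebrities."""

    def __init__(self):
        self.followers = defaultdict(set)          # author -> users following them
        self.following = defaultdict(set)          # user -> authors they follow
        self.tweets_by_author = defaultdict(list)  # author -> [(ts, tweet_id)]
        # Per-user timeline cache; stands in for a capped Redis list.
        self.timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LEN))

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, tweet_id, ts):
        self.tweets_by_author[author].append((ts, tweet_id))
        if self.is_celebrity(author):
            return  # pull path: skip the expensive write fan-out
        for follower in self.followers[author]:
            self.timelines[follower].appendleft((ts, tweet_id))  # push path

    def home(self, user, limit=10):
        pushed = set(self.timelines[user])
        # Pull recent tweets from followed celebrities at read time.
        pulled = {
            entry
            for author in self.following[user]
            if self.is_celebrity(author)
            for entry in self.tweets_by_author[author]
        }
        # The set union removes duplicates, e.g. tweets pushed before an
        # author crossed the celebrity threshold.
        merged = sorted(pushed | pulled, reverse=True)
        return [tweet_id for _, tweet_id in merged[:limit]]
```

The design point to call out is that `post` does constant work for a celebrity regardless of follower count, while `home` pays a small extra merge cost only for the celebrities a given user follows.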
Expect these follow-ups
- How do you handle a celebrity tweeting to 100M followers?
- What is your cache invalidation strategy when a user unfollows?
- How do you rank the timeline beyond pure recency?
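One possible way to answer the unfollow and ranking follow-ups can be sketched in a few lines. Everything here is an assumption for illustration: the entry fields (`author`, `ts`, `likes`), the function names, the one-hour half-life, and the scoring formula itself, which is a simple recency decay boosted by engagement and not any real platform's ranking model. Unfollows are handled lazily by filtering cached entries at read time instead of rewriting the cache.

```python
import math


def filter_unfollowed(entries, following):
    """Lazy invalidation: drop cached entries whose author is no longer followed."""
    return [e for e in entries if e["author"] in following]


def score(entry, now, half_life_s=3600.0, engagement_weight=1.0):
    """Recency halves every `half_life_s` seconds; likes add a log-scaled boost."""
    age = max(0.0, now - entry["ts"])
    recency = 0.5 ** (age / half_life_s)
    return recency * (1.0 + engagement_weight * math.log1p(entry["likes"]))


def rank(entries, following, now):
    """Filter out unfollowed authors, then order by score, highest first."""
    live = filter_unfollowed(entries, following)
    return sorted(live, key=lambda e: score(e, now), reverse=True)
```

The trade-off worth stating aloud: lazy filtering keeps unfollow cheap and immediately consistent for the reader, at the cost of stale entries occupying cache space until they age out.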