As asked
A customer says their weekly summary report takes 4 minutes to load. It used to be 30 seconds. The report runs a single SQL query across 3 joined tables. Walk me through how you find and fix the regression.
Sample answer outline
Start with EXPLAIN ANALYZE to identify sequential scans or hash joins on large tables. Check whether indexes exist on the join columns and the WHERE clause predicates. Look for data volume growth (did the customer's dataset grow 10x since the query was written?). Fixes might include adding a composite index, rewriting a correlated subquery as a join, or adding a materialized view for the aggregation. Candidates should also mention query plan caching and statistics freshness (ANALYZE in Postgres).
Expect these follow-ups
- What if adding the index makes writes noticeably slower for this customer? How do you present that tradeoff?
- How do you prevent this from happening to other customers whose data will eventually reach the same size?