As asked
You find application code that runs SELECT * FROM orders WHERE user_id = $1 in a loop for each user. The loop runs 5,000 times per request. Rewrite this as a single query and explain the join strategy the database should choose.
Sample answer outline
Replace the loop with SELECT o.* FROM orders o WHERE o.user_id = ANY($1) where $1 is an array of all user IDs, or join directly to a users query. The planner should choose a hash join on user_id if the user list is large, using the user_id array as the build side. The candidate should mention adding an index on orders.user_id and ensuring statistics are fresh so the planner estimates join cardinality correctly.
Reference implementation (sql)
-- Before: N+1 pattern
FOR user_id IN (SELECT id FROM users WHERE active = true) LOOP
SELECT * FROM orders WHERE user_id = user_id;
END LOOP;
-- After: single query
SELECT o.*
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.active = true;Expect these follow-ups
- When would you use a lateral join instead of a plain join here?