ML engineer analytics interview questions

Four analytics questions for ML engineer candidates. Each entry has the question as asked, a sample answer outline, common follow-ups, and a reference implementation where applicable.

As asked

You ran an A/B test on a new feature in Google Search. The experiment group shows a 0.5% increase in click-through rate with a p-value of 0.03. Your team wants to ship it. What questions do you ask before saying yes?

Sample answer outline

A strong answer goes beyond the p-value. Was the sample size fixed in advance, or did the team peek and stop early (which inflates the false-positive rate)? What is the practical significance of a 0.5% lift, and is that lift relative or absolute? Is there a novelty effect (did engagement spike only because the feature is new)? What is the confidence interval, not just the point estimate? Do segment breakdowns show a uniform lift, or does it help only one demographic? Did any guardrail metrics (latency, ad revenue, user satisfaction) move in the wrong direction?
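To make the confidence-interval point concrete, here is a minimal Python sketch, assuming a fixed-horizon test and using only the standard library. The function name `ctr_lift_summary` and the click and impression counts are hypothetical, not figures from the experiment in the question. It runs a two-sided two-proportion z-test and reports a Wald interval for the absolute CTR difference.

```python
from math import sqrt
from statistics import NormalDist


def ctr_lift_summary(clicks_c, views_c, clicks_t, views_t, alpha=0.05):
    """Two-sided two-proportion z-test plus a Wald interval for the CTR difference."""
    p_c = clicks_c / views_c
    p_t = clicks_t / views_t
    diff = p_t - p_c

    # The test uses the pooled standard error, valid under H0: p_t == p_c.
    p_pool = (clicks_c + clicks_t) / (views_c + views_t)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / views_c + 1 / views_t))
    z = diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The interval uses the unpooled standard error around the observed difference.
    se = sqrt(p_c * (1 - p_c) / views_c + p_t * (1 - p_t) / views_t)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    return {
        "diff": diff,                      # absolute difference in CTR
        "relative_lift": diff / p_c,       # lift relative to control
        "z": z,
        "p_value": p_value,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }


# Hypothetical counts, for illustration only.
summary = ctr_lift_summary(clicks_c=3000, views_c=100_000,
                           clicks_t=3300, views_t=100_000)
print(summary)
```

An interval whose lower bound sits just above zero tells a different story from one that is comfortably positive, even when both give a p-value under 0.05. Note that this sketch does not correct for peeking; a test that was stopped early needs a sequential method instead.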

Expect these follow-ups

What would you do if the lift was consistent across all segments but the p-value was 0.08?
How does the multiple comparisons problem affect this if you tested 10 variants simultaneously?
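For the multiple-comparisons follow-up, the arithmetic can be sketched as below. This is an illustration under stated assumptions: the ten tests are treated as independent, and Holm's step-down procedure is used as one common correction (the question does not prescribe a method). The function names and the list of p-values are hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: returns a reject/keep flag per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank 0-based) is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are kept as well
    return reject


# With 10 variants at alpha = 0.05, the family-wise error rate is about 0.40.
print(round(familywise_error_rate(0.05, 10), 3))

# Hypothetical p-values: one variant at 0.03, nine clearly null variants.
p_values = [0.03] + [0.5] * 9
print(holm_reject(p_values))
```

Under this correction the smallest p-value must clear 0.05 / 10 = 0.005, so a variant at p = 0.03 would no longer count as significant.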

company:google, statistics, ab-testing, analytics, product-sense

As asked

You are running an A/B test on a new ranking feature for the TikTok feed. After two weeks, the treatment group shows a 3% lift in watch time but a 2% drop in next-day retention. How do you interpret this and what do you do?

Sample answer outline

A strong answer recognizes the tension between a short-term engagement metric (watch time) and a longer-term health metric (retention), and does not simply ship because watch time went up. The candidate should check for novelty effects (does the retention drop stabilize after week 3?), segment the results by user cohort (are new users being burned out?), and verify that the watch time increase is not from auto-play inflation or low-quality binge patterns. They should recommend holding the experiment longer and checking 7-day retention before deciding.

Expect these follow-ups

What is the minimum detectable effect size you would plan for when designing this experiment?
How do you detect if the treatment group and control group are interfering with each other in a social product?
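For the minimum-detectable-effect follow-up, one way to frame the answer is the standard normal-approximation formula for a two-arm test on a continuous metric such as watch time. The sketch below assumes equal arm sizes and a known standard deviation; the function name and the numbers passed to it are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(sigma, n_per_arm, alpha=0.05, power=0.8):
    """Smallest true difference in means detectable with the given power.

    Normal approximation, two-sided test, equal-sized arms, common std dev sigma.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sigma * sqrt(2 / n_per_arm)


# Hypothetical inputs: watch-time std dev of 30 minutes, 50,000 users per arm.
print(minimum_detectable_effect(sigma=30.0, n_per_arm=50_000))
```

Because the MDE shrinks with the square root of the sample size, halving it requires roughly four times as many users per arm.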

company:bytedance, ab-testing, experimentation, metrics, recommendation, statistics

As asked

You want to A/B test a new lane-change assist behavior that you plan to roll out via OTA. How do you design the experiment, assign vehicles to treatment versus control, define your metrics, and decide when you have enough data to call the test?

Sample answer outline

A strong answer covers: randomizing at the vehicle level (not trip level) to avoid within-vehicle contamination, stratifying by vehicle model and driving environment (urban versus highway) since the feature behaves differently in each, choosing primary metrics (successful lane-change rate, disengagement rate) and guardrail metrics (safety incident rate, user disable rate), computing required sample size via power analysis before starting, and using a sequential testing approach to allow early stopping without inflating the false-positive rate. The candidate should discuss why driver behavior change (novelty effect) can confound early results.
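The vehicle-level, stratified assignment described above can be sketched as follows. This is a minimal illustration, not a production assignment service: the `(vehicle_id, model, environment)` tuple schema, the function name, and the fleet data are all hypothetical. Each vehicle is assigned once, so every trip it makes stays in the same arm.

```python
import random
from collections import defaultdict


def stratified_assignment(vehicles, seed=0):
    """Assign each vehicle to one arm, balancing arms within each stratum.

    vehicles: iterable of (vehicle_id, model, environment) tuples (hypothetical schema).
    """
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible

    # Group vehicles by stratum: (model, driving environment).
    strata = defaultdict(list)
    for vehicle_id, model, environment in vehicles:
        strata[(model, environment)].append(vehicle_id)

    assignment = {}
    for key in sorted(strata):
        ids = sorted(strata[key])
        rng.shuffle(ids)
        # Random starting arm, so odd-sized strata do not always favour one arm.
        offset = rng.randrange(2)
        for i, vehicle_id in enumerate(ids):
            arm = "treatment" if (i + offset) % 2 == 0 else "control"
            assignment[vehicle_id] = arm
    return assignment


# Hypothetical fleet: two models, two environments, ten vehicles in each stratum.
fleet = [
    (f"{model}-{env}-{i}", model, env)
    for model in ("model_a", "model_b")
    for env in ("urban", "highway")
    for i in range(10)
]
arms = stratified_assignment(fleet, seed=42)
print(sum(arm == "treatment" for arm in arms.values()), "of", len(arms), "in treatment")
```

Alternating arms after a shuffle keeps each stratum within one vehicle of a perfect split, which simple coin-flip randomization does not guarantee in small strata.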

Expect these follow-ups

How do you prevent drivers from opting out of the experiment in ways that bias your results?
The test shows a statistically significant improvement in lane-change success but also a small (non-significant) increase in near-miss events. What do you do?

company:tesla, ab-testing, statistics, experimentation, autopilot, ota

As asked

Uber's growth team wants to test whether offering new riders a 20% discount on their first three trips increases 90-day retention. Walk me through how you would design this experiment, including randomization, metrics, and statistical analysis.

Sample answer outline

A strong answer covers the randomization unit (rider ID, not trip, to avoid the same rider getting mixed treatments), holdout sizing using power analysis at the expected effect size, the primary metric (90-day retention or trips in 90 days), guardrail metrics (margin, fraud rate), duration to run before reading results, and the statistical test (two-proportion z-test for retention, t-test for trips). The candidate should flag the novelty effect risk and the need to wait for users to complete their trial trips.
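The power-analysis step for the retention metric can be sketched with the usual normal-approximation sample-size formula for comparing two proportions. The baseline and target retention rates below are hypothetical placeholders, as is the function name; a real design would plug in the team's own baseline and the smallest lift worth paying for.

```python
from math import ceil, sqrt
from statistics import NormalDist


def riders_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Riders needed in each arm for a two-sided two-proportion z-test.

    Normal approximation with equal-sized arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2

    # Null-hypothesis term uses the average rate; alternative term uses each arm's rate.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_power * sqrt(
        p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    )
    return ceil((null_term + alt_term) ** 2 / (p_treatment - p_control) ** 2)


# Hypothetical rates: 30% baseline 90-day retention, 32% expected with the discount.
print(riders_per_arm(p_control=0.30, p_treatment=0.32))
```

The required sample grows roughly with the inverse square of the effect size, so an over-optimistic expected lift leads to a badly underpowered holdout.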

Expect these follow-ups

You see a significant lift in trips taken during the promotion but no lift in 90-day retention. How do you explain this and what does it mean for the decision?
How do you handle spillover if a treated rider refers a control-group rider and they both share a trip together?

company:uber, analytics, ab-testing, statistics, experimentation, product

Related questions

Common A/B testing pitfalls in recommendation systems

Design an A/B test for a new Autopilot feature delivered via OTA

Design an A/B test to measure the effect of a rider discount promotion

Tell me about the biggest production incident you've handled

Questions

How do you decide if an A/B test result is trustworthy? (Analytics, medium, very common)
Common A/B testing pitfalls in recommendation systems (Analytics, hard, common)
Design an A/B test for a new Autopilot feature delivered via OTA (Analytics, hard, common)
Design an A/B test to measure the effect of a rider discount promotion (Analytics, medium, common)
