As asked
You join a startup with 50 microservices and no observability. What do you put in place in the first 90 days?
Sample answer outline
The three pillars, but in the right order. Logs first: ship structured logs to a single store (Loki, Elastic, or a vendor), with a trace ID field on every line. Metrics next: Prometheus, with a service-level dashboard per team covering the four golden signals (latency, traffic, errors, saturation). Tracing last, because it requires app instrumentation: OpenTelemetry SDKs in every service, head sampling at 1 to 10 percent, plus tail sampling so that traces containing errors are always kept. Tie it all together with the trace ID: propagate it across service calls, write it into every log line, and attach it to metrics as exemplars rather than labels, since a per-request label would explode cardinality. Get an on-call rota and a paging policy in place at the same time, so the new signals have someone to act on them.
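If the interviewer asks what "structured logs with a trace ID field" and head sampling look like in practice, a minimal sketch using only the Python standard library might look like the following. The names `JsonFormatter`, `build_logger`, `keep_trace`, the `service` field, and the 5 percent default rate are all illustrative assumptions. The sampling function is a simplified stand-in for what an OpenTelemetry sampler and collector would do, not their actual algorithm.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            # 'service' and 'trace_id' are hypothetical field names, supplied
            # by the caller through the logging `extra` argument.
            "service": getattr(record, "service", "unknown"),
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(stream, name="checkout"):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def keep_trace(trace_id, sample_rate=0.05, has_error=False):
    """Decide whether to keep a trace.

    Errored traces are always kept (the tail-sampling rule). Otherwise the
    first 32 bits of the hex trace ID are mapped to [0, 1) and compared with
    the rate, so every service reaches the same decision for the same ID.
    """
    if has_error:
        return True
    bucket = int(trace_id[:8], 16) / 0x100000000
    return bucket < sample_rate


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger(buffer)
    trace_id = uuid.uuid4().hex  # 32 hex characters, like a W3C trace ID
    log.info("payment authorised", extra={"service": "checkout", "trace_id": trace_id})
    print(buffer.getvalue().strip())
    print("sampled:", keep_trace(trace_id))
```

Deriving the decision from the trace ID, rather than rolling a random number in each service, keeps a trace either fully sampled or fully dropped across all 50 services. The error override only works where the whole trace is visible, which is why tail sampling usually lives in a collector rather than in the application.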
Expect these follow-ups
- How do you sell the observability investment to a CEO who wants features?
- What is your retention policy and why?
- When is OpenTelemetry not the right choice?