As asked
Design an internal LLM gateway used by ten product teams. It must support provider fallback, prompt versioning, cost controls, and audit logging.
Sample answer outline
The gateway should be a policy and observability layer, not just an HTTP proxy. It needs typed request contracts, prompt and model version identifiers, tenant-level budgets, retries with idempotency, and provider fallback rules that are explicit about quality and compliance tradeoffs. Audit logs should capture inputs, outputs, metadata, and redaction state according to data policy. The answer should include latency budgets and failure modes, such as a fallback model producing different JSON shape or safety behaviour. Strong candidates avoid automatic fallback for high-risk flows unless the evals prove equivalence.
Expect these follow-ups
- How do you prevent fallback from masking a provider outage?
- What data do you store when prompts contain personal data?
- How do teams override defaults without bypassing governance?