Observability engineer coding interview questions

11 questions on coding for observability engineer candidates. Expect prompts such as “Write a PromQL recording rule and alert for high HTTP error rate” and “Implement a function to compute per-second rate from a Prometheus-style counter”, each with a worked answer outline and the follow-ups interviewers push on.


As asked

Write a Prometheus recording rule that pre-computes the 5-minute HTTP error rate per service, and then write an alert rule that fires when this rate exceeds 5% for more than 2 minutes. Show me the YAML.

Sample answer outline

The recording rule uses rate() over the error counter divided by rate() over the total counter, aggregated by service label. The alert rule references the pre-computed metric, sets a threshold of 0.05, uses a 'for: 2m' clause to avoid flapping, and includes meaningful labels and annotations including a runbook URL. A strong answer notes that dividing two rate() calls can produce NaN when the denominator is 0 and handles it with 'or on()' or a guard clause.

Reference implementation (yaml)

YAML

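groups:
  - name: http_error_rate
    interval: 1m
    rules:
      # Rule name follows level:metric:operations; the level is "service"
      # because the output is aggregated by the service label.
      # The "> 0" filter on the denominator drops services with no traffic,
      # so the ratio is absent rather than NaN when the denominator is 0.
      - record: service:http_error_rate:rate5m
        expr: |
          sum by (service) (
            rate(http_requests_total{status=~"5.."}[5m])
          )
          /
          (
            sum by (service) (
              rate(http_requests_total[5m])
            ) > 0
          )

      - alert: HighHTTPErrorRate
        expr: service:http_error_rate:rate5m > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on {{ $labels.service }}"
          runbook: "https://wiki.example.com/runbooks/http-errors"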

Expect these follow-ups

How do you prevent the alert from firing during a deployment when error counts briefly spike?
What is the difference between 'for: 2m' and 'keep_firing_for: 2m'?

prometheus, promql, alerting, yaml, recording-rules

As asked

Given a list of (timestamp, value) pairs representing a monotonically increasing counter with possible resets, write a function that computes the per-second rate for each interval, handling counter resets correctly.

Sample answer outline

A strong solution detects a reset when the current value is less than the previous value, treats the post-reset value as the entire increase for that interval (assuming the counter restarted from 0), computes delta / duration_seconds for each interval, and handles edge cases like the first sample (no previous value) and duplicate timestamps.

Reference implementation (python)

Python

def compute_rate(samples: list[tuple[float, float]]) -> list[tuple[float, float]]:
    # samples: list of (timestamp_seconds, counter_value), sorted by timestamp
    # returns: list of (timestamp, rate_per_second), one entry per interval
    rates = []
    for i in range(1, len(samples)):
        t_prev, v_prev = samples[i - 1]
        t_curr, v_curr = samples[i]
        dt = t_curr - t_prev
        if dt <= 0:
            # duplicate or out-of-order timestamp: no valid interval, skip it
            continue
        # On a reset (value dropped), assume the counter restarted from 0,
        # so the current value is the whole increase for this interval.
        delta = v_curr - v_prev if v_curr >= v_prev else v_curr
        rates.append((t_curr, delta / dt))
    return rates
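
The per-interval logic above extends naturally to a single rate over the whole window, which is closer in spirit to what a range-vector rate reports. The following is a minimal sketch under that assumption: `windowed_rate` is a hypothetical helper name, and it deliberately omits the boundary extrapolation that Prometheus applies, so it is not a reimplementation of `rate()`.

```python
from typing import Optional


def windowed_rate(samples: list[tuple[float, float]]) -> Optional[float]:
    """Average per-second rate across all samples, correcting for resets.

    Sums the reset-adjusted increase between consecutive samples and divides
    by the time between the first and last sample. Returns None when fewer
    than two samples, or no elapsed time, make a rate undefined.
    """
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    for (_, v_prev), (_, v_curr) in zip(samples, samples[1:]):
        # A drop in value is treated as a restart from 0.
        increase += v_curr - v_prev if v_curr >= v_prev else v_curr
    return increase / elapsed


# The counter resets between t=10 and t=20 (150 -> 30).
samples = [(0.0, 100.0), (10.0, 150.0), (20.0, 30.0), (30.0, 60.0)]
print(windowed_rate(samples))
```

Here the adjusted increases are 50, 30, and 30, so the total increase of 110 is spread over 30 seconds.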

Expect these follow-ups

How does Prometheus handle counter resets in the rate() function differently from irate()?
What if a counter resets to a non-zero value?

counters, prometheus, rate, algorithms, metrics

As asked

Given a list of request outcomes (success or failure) with timestamps over a rolling 30-day window and an SLO target of 99.9%, write a function that returns the remaining error budget as a percentage and as a count of allowed failures.

Sample answer outline

The solution filters requests to the last 30 days, counts total and failed requests, computes actual reliability, compares to the SLO target to get consumed budget, and subtracts from the total budget (total_requests * 0.001) to get the remaining count. A strong answer uses a sliding window efficiently (binary search or deque) rather than re-scanning the full list each call.

Reference implementation (python)

Python

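from datetime import datetime, timedelta, timezone
from typing import Optional

def remaining_error_budget(
    requests: list[tuple[datetime, bool]],  # (timezone-aware UTC timestamp, is_success)
    slo_target: float = 0.999,
    window_days: int = 30,
    now: Optional[datetime] = None,  # injectable for tests; defaults to the current UTC time
) -> dict:
    # datetime.utcnow() is deprecated and returns a naive datetime, so use an
    # aware "now" and expect aware timestamps in the input.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=window_days)
    window = [(t, s) for t, s in requests if t >= cutoff]
    total = len(window)
    failures = sum(1 for _, s in window if not s)
    if total == 0:
        # No traffic in the window: nothing consumed, nothing allowed.
        return {"remaining_pct": 100.0, "remaining_count": 0}
    allowed_failures = total * (1 - slo_target)
    # Both values go negative once the budget is overspent.
    remaining_count = allowed_failures - failures
    remaining_pct = (remaining_count / allowed_failures) * 100 if allowed_failures > 0 else 0.0
    return {"remaining_pct": remaining_pct, "remaining_count": remaining_count}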
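
The outline mentions using a deque so the window is not re-scanned on every call. One possible shape for that is sketched below. `ErrorBudgetTracker` and its method names are hypothetical, timestamps are plain unix seconds for simplicity, and the sketch assumes events are recorded in non-decreasing timestamp order.

```python
from collections import deque


class ErrorBudgetTracker:
    """Rolling-window error budget with amortized O(1) updates.

    Assumes record() is called with non-decreasing unix timestamps.
    """

    def __init__(self, slo_target: float = 0.999, window_seconds: float = 30 * 86400):
        self.slo_target = slo_target
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, is_success), oldest first
        self._failures = 0      # running failure count inside the window

    def record(self, timestamp: float, is_success: bool) -> None:
        self._events.append((timestamp, is_success))
        if not is_success:
            self._failures += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events that fell out of the window, keeping the count in sync.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            _, was_success = self._events.popleft()
            if not was_success:
                self._failures -= 1

    def remaining_count(self) -> float:
        allowed = len(self._events) * (1 - self.slo_target)
        return allowed - self._failures


tracker = ErrorBudgetTracker(slo_target=0.9, window_seconds=100)
for i in range(10):
    tracker.record(float(i), True)
tracker.record(10.0, False)
print(tracker.remaining_count())  # budget left after 1 failure in 11 requests
tracker.record(200.0, True)       # everything before t=100 is evicted
print(tracker.remaining_count())
```

Each event is appended once and popped at most once, which is where the amortized constant cost per update comes from.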

Expect these follow-ups

How would you extend this to an O(1) update function for real-time budget tracking?
What changes if the SLO window is calendar-based rather than rolling?

slo, error-budget, algorithms, sliding-window, reliability

As asked

Write a function that parses nginx combined log format lines into structured dicts, extracting method, path, status code, response bytes, and response time. Handle malformed lines without raising an exception.

Sample answer outline

A correct solution uses a regex with named groups matching the combined log format, returns None or an empty dict for lines that don't match rather than raising, and correctly parses the quoted request field to extract method, path, and protocol separately. A strong answer pre-compiles the regex outside the function to avoid re-compilation on every call.

Reference implementation (python)

Python

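import re
from typing import Optional

# Combined log format, pre-compiled once at import time.
# The stock combined format carries no response time, so the optional trailing
# field assumes a custom log_format that appends $request_time.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[.+?\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"[^"]*" "[^"]*"'
    r'(?: (?P<request_time>\d+(?:\.\d+)?))?'
)

def parse_nginx_log(line: str) -> Optional[dict]:
    m = LOG_PATTERN.match(line.strip())
    if not m:
        # Malformed line: signal with None instead of raising.
        return None
    request_time = m.group("request_time")
    return {
        "remote_addr": m.group("remote_addr"),
        "method": m.group("method"),
        "path": m.group("path"),
        "protocol": m.group("protocol"),
        "status": int(m.group("status")),
        "bytes": int(m.group("bytes")) if m.group("bytes") != "-" else 0,
        # None when the log format does not include a response time
        "response_time": float(request_time) if request_time is not None else None,
    }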

Expect these follow-ups

How would you extend this to handle both combined and common log formats in the same stream?
How would you use this parser to feed log data into a Prometheus counter?

logs, parsing, regex, structured-logging, python

As asked

You have a JSON file where each line is {"metric": "name", "labels": {"key": "value", ...}}. Write a function that reads it and returns the top K metric names by number of unique label combinations (series count).

Sample answer outline

A correct solution streams the file line by line (not loading all into memory), uses a defaultdict to count unique label tuples per metric name, then uses heapq.nlargest for efficient top-K retrieval. A strong answer hashes the label dict (sorted items tuple) to deduplicate identical series.

Reference implementation (python)

Python

import json
import heapq
from collections import defaultdict

def top_k_by_cardinality(filepath: str, k: int) -> list[tuple[str, int]]:
    series_sets: dict[str, set] = defaultdict(set)
    with open(filepath) as f:
        for line in f:
            try:
                obj = json.loads(line)
                metric = obj["metric"]
                label_key = tuple(sorted(obj.get("labels", {}).items()))
                series_sets[metric].add(label_key)
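            except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
                # Skip blank or invalid lines, non-object JSON, missing "metric",
                # non-dict "labels", and unhashable label values.
                continue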
    counts = {m: len(s) for m, s in series_sets.items()}
    return heapq.nlargest(k, counts.items(), key=lambda x: x[1])

Expect these follow-ups

How would you modify this to also output the top label KEY by cardinality within each metric?
How does Prometheus expose this data natively via its API?

cardinality, algorithms, python, metrics, tsdb

As asked

Given an SLO target (e.g., 99.9%), a compliance window in hours (e.g., 720 hours for 30 days), and a desired alert window in hours (e.g., 1 hour), write a function that returns the burn rate multiple that would exhaust the entire error budget in exactly the alert window.

Sample answer outline

The burn rate that exhausts the budget in the alert window is compliance_window / alert_window. For a 30-day window and a 1-hour alert window, the burn rate is 720. A function implementing this correctly handles the edge case of alert_window = 0 (division by zero) and documents that the actual Prometheus alert threshold typically uses a fraction of the budget to leave headroom.

Reference implementation (python)

Python

def burn_rate_threshold(
    slo_target: float,       # e.g. 0.999
    compliance_window_hr: float,  # e.g. 720 for 30 days
    alert_window_hr: float,       # e.g. 1.0
) -> float:
    """
    Returns the burn rate multiple that, if sustained over alert_window_hr,
    would exhaust the entire error budget of the compliance window.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if alert_window_hr <= 0:
        raise ValueError("alert_window_hr must be positive")
    # A burn rate of 1 spends the budget in exactly one compliance window, so
    # at burn_rate x that pace the budget is exhausted in:
    #   compliance_window / burn_rate = alert_window
    #   => burn_rate = compliance_window / alert_window
    # The multiple does not depend on slo_target; the target only fixes the
    # absolute error rate it corresponds to: burn_rate * (1 - slo_target).
    return compliance_window_hr / alert_window_hr
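
The outline notes that real alert thresholds usually target a fraction of the budget rather than all of it. A minimal sketch of that generalization follows. `burn_rate_for_budget_fraction` is a hypothetical name, and the only change from the function above is a `budget_fraction` multiplier.

```python
def burn_rate_for_budget_fraction(
    budget_fraction: float,       # share of the budget to spend, e.g. 0.02 for 2%
    compliance_window_hr: float,  # e.g. 720 for 30 days
    alert_window_hr: float,       # e.g. 1.0
) -> float:
    """Burn rate that consumes budget_fraction of the budget in alert_window_hr."""
    if not 0 < budget_fraction <= 1:
        raise ValueError("budget_fraction must be in (0, 1]")
    if alert_window_hr <= 0:
        raise ValueError("alert_window_hr must be positive")
    return budget_fraction * compliance_window_hr / alert_window_hr


# 2% of a 30-day budget in 1 hour, and 5% in 6 hours.
print(burn_rate_for_budget_fraction(0.02, 720, 1))
print(burn_rate_for_budget_fraction(0.05, 720, 6))
```

With a 30-day window, 2% in 1 hour works out to a burn rate of about 14.4 and 5% in 6 hours to about 6, while a fraction of 1.0 reduces to the full-budget case of 720.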

Expect these follow-ups

Why does the Google SRE workbook recommend setting the alert at 2% budget consumed, not 100%?
How do you implement this alert in Prometheus given only a counter metric?

slo, burn-rate, alerting, python, error-budget

As asked

Our application emits unstructured logs like: '2024-01-15 ERROR [payment] code=E1042 msg=timeout'. Write a LogQL query that extracts the error code and counts occurrences per code over the last hour.

Sample answer outline

A correct solution selects the log stream by label (job, app), filters to ERROR lines, uses the pattern or regexp parser to extract the code field, and uses sum by (code) (count_over_time(...[1h])) to aggregate. A strong answer distinguishes between the pattern parser (faster, positional) and the regexp parser (more flexible) and notes that the extracted field is a string label, so a value like E1042 can be matched with string or regex matchers but not with numeric comparisons.

Reference implementation (logql)

logql

# LogQL query to extract and count error codes
sum by (code) (
  count_over_time(
    {app="payment-service"}
    |= "ERROR"
    | regexp `code=(?P<code>\w+)`
    [1h]
  )
)

Expect these follow-ups

How would you create a Grafana panel that shows a time series of each error code?
What is the performance difference between pattern and regexp parsers in Loki at scale?

loki, logql, logs, parsing, alerting

As asked

Given a dict of upper bound to cumulative count for a Prometheus histogram, plus the total count and sum, implement a function that returns the estimated value at a given quantile using linear interpolation within the containing bucket.

Sample answer outline

The solution must sort buckets by upper bound, find the first bucket whose cumulative count reaches or exceeds quantile * total_count, then linearly interpolate between the previous bucket boundary and the current bucket boundary using the proportion of the quantile rank within that bucket. Edge cases include the +Inf bucket (which has no finite upper bound to interpolate toward), empty histograms, and quantile=0 or quantile=1.

Reference implementation (python)

Python

import math

def histogram_quantile(q: float, buckets: dict[float, float]) -> float:
    """
    buckets: {upper_bound: cumulative_count}
    q: quantile in [0, 1]
    Returns estimated value at quantile q, or NaN for an empty histogram.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    if not buckets:
        return math.nan
    sorted_bounds = sorted(buckets.keys())
    total = buckets[float('inf')] if float('inf') in buckets else max(buckets.values())
    if total == 0:
        return math.nan
    target_count = q * total
    prev_bound = 0.0
    prev_count = 0.0
    for bound in sorted_bounds:
        count = buckets[bound]
        if count >= target_count:
            if math.isinf(bound):
                # No finite upper edge to interpolate toward: fall back to
                # the highest finite bound.
                return prev_bound
            # interpolate within [prev_bound, bound]
            fraction = (target_count - prev_count) / (count - prev_count) if count != prev_count else 0
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound = bound
        prev_count = count
    return sorted_bounds[-1]
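
To make the interpolation step concrete, here is the arithmetic for one bucket written out on its own. The bucket layout is made up for illustration: cumulative counts of 10, 60, 90, and 100 at upper bounds 0.1, 0.5, 1.0, and +Inf, with the median requested.

```python
# Hypothetical histogram: {0.1: 10, 0.5: 60, 1.0: 90, inf: 100}, q = 0.5
total = 100
rank = 0.5 * total             # the 50th observation is the target

# The first bucket whose cumulative count reaches 50 is (0.1, 0.5].
lower, upper = 0.1, 0.5
count_below = 10               # cumulative count at the previous bound
count_through = 60             # cumulative count at this bucket's bound

# Share of this bucket's observations that sit below the target rank.
fraction = (rank - count_below) / (count_through - count_below)

# Assume observations are spread evenly across the bucket.
estimate = lower + fraction * (upper - lower)
print(round(estimate, 4))
```

The target rank sits 40 observations into a bucket holding 50, so the estimate lands 80% of the way from 0.1 to 0.5, at roughly 0.42.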

Expect these follow-ups

Why does this estimation become inaccurate when a bucket contains most of the observations?
How does the Native Histogram (float64 schema) in Prometheus 2.x improve quantile accuracy?

prometheus, histogram, quantile, algorithms, statistics

As asked

Write a Go function that wraps an OpenTelemetry SpanExporter, retrying failed exports with exponential backoff up to a max number of attempts. It should not retry on permanent errors (e.g., HTTP 400).

Sample answer outline

A correct solution implements the go.opentelemetry.io/otel/sdk/trace.SpanExporter interface, wraps the ExportSpans method in a retry loop using context-aware sleep (to respect cancellation), and distinguishes retryable errors (connection refused, 500, 503) from permanent ones (400, 401). A strong answer uses exponential backoff with jitter to avoid thundering herd on recovery.

Reference implementation (go)

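package retryexporter

import (
    "context"
    "errors"
    "time"

    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// RetryExporter wraps a SpanExporter and retries failed exports with
// exponential backoff.
type RetryExporter struct {
    wrapped    sdktrace.SpanExporter
    maxRetries int
    baseDelay  time.Duration
}

// Compile-time check that RetryExporter satisfies the SpanExporter interface.
var _ sdktrace.SpanExporter = (*RetryExporter)(nil)

func (r *RetryExporter) ExportSpans(ctx context.Context, spans []sdktrace.ReadOnlySpan) error {
    var err error
    for attempt := 0; attempt <= r.maxRetries; attempt++ {
        if attempt > 0 {
            // Delay doubles on each retry: baseDelay, 2x, 4x, ...
            delay := r.baseDelay << (attempt - 1)
            timer := time.NewTimer(delay)
            select {
            case <-timer.C:
            case <-ctx.Done():
                // Stop the timer so it does not linger after cancellation.
                timer.Stop()
                return ctx.Err()
            }
        }
        err = r.wrapped.ExportSpans(ctx, spans)
        if err == nil || isPermanentError(err) {
            return err
        }
    }
    return err
}

// isPermanentError reports whether retrying cannot help. Context cancellation
// and deadline expiry are treated as permanent. Classifying HTTP 4xx responses
// depends on the error types the wrapped exporter returns and is left open.
func isPermanentError(err error) bool {
    if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
        return true
    }
    // TODO: check for HTTP 4xx or non-retryable error types
    return false
}

func (r *RetryExporter) Shutdown(ctx context.Context) error {
    return r.wrapped.Shutdown(ctx)
}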
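
The outline credits strong answers with adding jitter, which the reference implementation leaves out. The delay schedule itself is language-neutral, so it is sketched here in Python rather than Go. `backoff_delays` is a hypothetical helper, the "full jitter" variant (a uniform draw between 0 and the exponential ceiling) is one common choice among several, and the `max_delay` cap is an added assumption.

```python
import random
from typing import Optional


def backoff_delays(
    base_delay: float,
    max_retries: int,
    max_delay: float = 30.0,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Delays in seconds to wait before each retry, using full jitter.

    The ceiling doubles per retry (base, 2x, 4x, ...) up to max_delay, and
    each delay is drawn uniformly from [0, ceiling] so that clients which
    failed together do not all retry at the same instant.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        delays.append(rng.uniform(0, ceiling))
    return delays


# A seeded generator makes the schedule reproducible in tests.
print(backoff_delays(0.5, 5, max_delay=4.0, rng=random.Random(1)))
```

Passing the random generator in keeps the function deterministic under test while still randomizing in production use.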

Expect these follow-ups

How does the OTel Collector handle retry differently from the SDK exporter?
What span data is lost if the context is cancelled mid-retry?

opentelemetry, go, retry, exporter, reliability

As asked

You receive a stream of alert events as (alert_name, labels_dict, timestamp). The same alert may fire many times within a 5-minute window. Write a function that returns only the first occurrence of each unique (alert_name, labels) pair within any 5-minute tumbling window.

Sample answer outline

The solution assigns each event to a 5-minute tumbling window bucket by integer-dividing the timestamp by 300, uses a set of (bucket, alert_name, frozenset(labels.items())) tuples for deduplication, and returns only events not yet seen in that bucket. A strong answer discusses whether to use tumbling or sliding windows and the memory implication of keeping seen sets indefinitely.

Reference implementation (python)

Python

from typing import NamedTuple

class Alert(NamedTuple):
    name: str
    labels: dict
    timestamp: float  # unix seconds

def deduplicate_alerts(events: list[Alert], window_seconds: int = 300) -> list[Alert]:
    seen: set = set()
    result = []
    for event in events:
        bucket = int(event.timestamp // window_seconds)
        key = (bucket, event.name, frozenset(event.labels.items()))
        if key not in seen:
            seen.add(key)
            result.append(event)
    return result
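
The outline raises the memory cost of keeping the seen set forever. If events can be assumed to arrive in timestamp order, one way to bound it is to hold keys for the current tumbling window only. The sketch below takes that assumption. `deduplicate_stream` is a hypothetical name, and out-of-order events would need a different expiry scheme.

```python
from typing import Iterable, Iterator, NamedTuple


class Alert(NamedTuple):
    name: str
    labels: dict
    timestamp: float  # unix seconds


def deduplicate_stream(events: Iterable[Alert], window_seconds: int = 300) -> Iterator[Alert]:
    """Yield the first occurrence of each (name, labels) pair per tumbling window.

    Assumes events arrive in non-decreasing timestamp order, so once a new
    window starts, keys from the previous window can be discarded.
    """
    current_bucket = None
    seen = set()
    for event in events:
        bucket = int(event.timestamp // window_seconds)
        if bucket != current_bucket:
            # New window: older keys can never match again, so drop them.
            current_bucket = bucket
            seen.clear()
        key = (event.name, frozenset(event.labels.items()))
        if key not in seen:
            seen.add(key)
            yield event


events = [
    Alert("HighCPU", {"host": "a"}, 0.0),
    Alert("HighCPU", {"host": "a"}, 100.0),  # duplicate within window 0
    Alert("DiskFull", {"host": "a"}, 200.0),
    Alert("HighCPU", {"host": "a"}, 310.0),  # first occurrence in window 1
]
print([e.timestamp for e in deduplicate_stream(events)])
```

Memory is then bounded by the number of distinct alerts in a single window rather than by the length of the stream.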

Expect these follow-ups

How would you handle a sliding window instead of a tumbling window?
What happens to the seen set over time and how do you expire entries?

alerting, deduplication, algorithms, sliding-window, python

As asked

You have two OTel-style exponential histograms represented as sparse dicts mapping bucket_index to count. Write a function that merges them into a single histogram, handling the case where the two histograms have different scales (resolution levels).

Sample answer outline

A correct solution handles the same-scale case (merge counts for matching bucket indexes, add unique buckets), and the different-scale case by downscaling the higher-resolution histogram to match the lower one (right-shifting bucket indexes by the scale difference). The candidate should handle the zero_count and sum fields separately (they are additive at any scale) and note that downscaling is lossy (buckets are merged, reducing resolution).

Reference implementation (python)

Python

def merge_exponential_histograms(
    hist_a: dict,  # {"scale": int, "buckets": {int: int}, "sum": float, "count": int}
    hist_b: dict,
) -> dict:
    scale_a = hist_a["scale"]
    scale_b = hist_b["scale"]
    target_scale = min(scale_a, scale_b)

    def downscale(buckets: dict, from_scale: int, to_scale: int) -> dict:
        shift = from_scale - to_scale
        result: dict[int, int] = {}
        for idx, count in buckets.items():
            new_idx = idx >> shift  # integer right-shift merges buckets
            result[new_idx] = result.get(new_idx, 0) + count
        return result

    buckets_a = downscale(hist_a["buckets"], scale_a, target_scale)
    buckets_b = downscale(hist_b["buckets"], scale_b, target_scale)
    merged = dict(buckets_a)
    for idx, count in buckets_b.items():
        merged[idx] = merged.get(idx, 0) + count
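    return {
        "scale": target_scale,
        "buckets": merged,
        # sum, count, and zero_count are additive at any scale
        "sum": hist_a["sum"] + hist_b["sum"],
        "count": hist_a["count"] + hist_b["count"],
        "zero_count": hist_a.get("zero_count", 0) + hist_b.get("zero_count", 0),
    }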
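
Why a right shift is the correct downscale becomes clearer from the bucket boundaries. In the OTel exponential histogram model the base is 2 ** (2 ** -scale), and positive bucket `index` covers (base ** index, base ** (index + 1)]. The sketch below prints the bounds of a few scale-1 buckets next to the scale-0 bucket they shift into. `bucket_bounds` is a hypothetical helper written for this illustration.

```python
def bucket_bounds(index: int, scale: int) -> tuple[float, float]:
    """Return (lower, upper) for a positive-range bucket at the given scale."""
    base = 2.0 ** (2.0 ** -scale)
    return base ** index, base ** (index + 1)


# Dropping one scale level halves the resolution: buckets 2k and 2k+1 at
# scale 1 both land in bucket k at scale 0, which is what idx >> 1 computes.
# Python's >> floors for negative ints, so negative indexes map correctly too.
for idx in (4, 5, -1, -2):
    fine = bucket_bounds(idx, scale=1)
    coarse = bucket_bounds(idx >> 1, scale=0)
    print(f"scale 1 idx {idx}: {fine} -> scale 0 idx {idx >> 1}: {coarse}")
```

Each fine bucket falls entirely inside the coarse bucket it maps to, which is why counts can be added after the shift but not recovered afterwards.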

Expect these follow-ups

Why is it not valid to simply concatenate the two bucket dicts when scales differ?
How does the OTel Collector merge exponential histograms from multiple sources?

algorithms, histogram, opentelemetry, python, merging


Questions

Write a PromQL recording rule and alert for high HTTP error rate (Coding, medium, very common)

Implement a function to compute per-second rate from a Prometheus-style counter (Coding, medium, common)

Compute remaining error budget from a list of request outcomes (Coding, medium, common)

Parse unstructured nginx access logs into structured JSON (Coding, easy, common)

Find the top-K most expensive metrics by series count from TSDB metadata (Coding, easy, common)

Calculate the burn rate threshold for a given SLO and alert window (Coding, medium, common)

Write a LogQL query to extract and count error codes from unstructured logs (Coding, medium, common)

Implement histogram_quantile interpolation from bucket data (Coding, hard, occasional)

Implement a retry wrapper for an OTel span exporter with exponential backoff (Coding, hard, occasional)

Deduplicate a stream of alert events within a time window (Coding, medium, occasional)

Merge two exponential histogram bucket maps into one (Coding, hard, rare)
