As asked
How does the CAP theorem apply to a lakehouse built on object storage? When you have concurrent readers and writers on an Iceberg table, what guarantees do you actually get, and what trade-offs does the table format's atomic commit mechanism represent?
Sample answer outline
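Object storage is not a distributed database in the CAP sense; the guarantees come from the table format and its catalog, not from the storage layer. In Iceberg, every commit produces a new immutable snapshot, and the commit itself is an atomic swap of the table's current-metadata pointer in the catalog. This is not an eventually consistent design: a commit is either fully visible or not visible at all. Readers get isolation at snapshot granularity: a reader pins the snapshot that was current when it loaded the table and sees a consistent point-in-time view for the whole query, even if newer commits land in the meantime. Writers use optimistic concurrency: they write new data and metadata files without holding locks, then attempt the swap. If another writer committed first, the loser must retry against the new table state, and the commit fails outright if its changes conflict with what was committed (for example, both operations rewrote the same files). The trade-off is that reads never block and commits are serialized through the catalog, at the cost of write availability: a writer may have to retry under contention and cannot commit at all while the catalog is unreachable. The candidate should note that the catalog backend (Hive Metastore, Nessie, REST) is the actual consistency boundary.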
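A strong answer can be backed by a small sketch of the commit protocol. The Python below is a toy model under stated assumptions, not Iceberg's API: `InMemoryCatalog`, `Snapshot`, `commit`, and `CommitConflict` are hypothetical names, the catalog is a single in-process pointer guarded by a lock, and conflict detection is reduced to "the files I want to replace must still be live". Real validation rules are richer and depend on the operation and the configured isolation level.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Immutable table state: a snapshot id plus the set of live data files."""
    snapshot_id: int
    files: frozenset


class CommitConflict(Exception):
    """Raised when a writer's changes no longer apply to the current table state."""


class InMemoryCatalog:
    """Hypothetical catalog holding one pointer to the current snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = Snapshot(0, frozenset())

    def load(self) -> Snapshot:
        with self._lock:
            return self._current

    def compare_and_swap(self, expected: Snapshot, new: Snapshot) -> bool:
        """Move the pointer only if nobody has committed since `expected`."""
        with self._lock:
            if self._current.snapshot_id != expected.snapshot_id:
                return False
            self._current = new
            return True


def commit(catalog, added=frozenset(), removed=frozenset(), max_retries=5):
    """Optimistic commit: rebuild on the latest snapshot, then try the swap."""
    for _ in range(max_retries):
        base = catalog.load()
        # Simplified validation: every file we replace must still be live.
        if not removed <= base.files:
            raise CommitConflict("files to replace were already removed")
        new = Snapshot(base.snapshot_id + 1, (base.files - removed) | added)
        if catalog.compare_and_swap(base, new):
            return new
        # Lost the race: loop, reload the table state, and re-validate.
    raise CommitConflict("gave up after repeated lost races")


catalog = InMemoryCatalog()
commit(catalog, added=frozenset({"a.parquet", "b.parquet"}))

pinned = catalog.load()                          # a reader pins snapshot 1
commit(catalog, added=frozenset({"c.parquet"}))  # a writer commits snapshot 2
print(sorted(pinned.files))                      # the reader's pinned view
print(sorted(catalog.load().files))              # the table's current state
```

The pinned reader still lists only the first two files after the second commit, which is the snapshot-granularity isolation described above. Because the catalog's `compare_and_swap` is the only step that must be atomic, it is also the single place where a commit can be refused.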
Expect these follow-ups
- How does using Nessie as a catalog change the consistency model?
- What happens if the catalog's atomic swap fails partway through?