Durability and Consistency
Durability
An append is acknowledged once a majority of voters has replicated it. A single node failure cannot lose an acknowledged write. Acknowledged data lives in replicated hot state across the cluster, and a background worker flushes older segments to S3 on a configurable interval, after which the data inherits S3-grade durability.
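
The acknowledgement rule above can be sketched as a simple quorum check. This is a minimal illustration, not the system's implementation: the function names and the representation of replicas as sets of node IDs are assumptions made for the example.

```python
def majority(voter_count: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voter_count // 2 + 1


def tolerated_voter_failures(voter_count: int) -> int:
    """Voters that can fail while a majority can still be formed."""
    return voter_count - majority(voter_count)


def is_acknowledged(replicated_on: set[str], voters: set[str]) -> bool:
    """True once a majority of voters holds the entry.

    Node IDs are hypothetical. Replicas that are not voters are ignored,
    so extra copies on non-voting nodes never count toward the quorum.
    """
    return len(replicated_on & voters) >= majority(len(voters))
```

With three voters, for example, `majority(3)` is 2 and `tolerated_voter_failures(3)` is 1, which matches the claim that a single node failure cannot lose an acknowledged write.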
In a cross-region deployment with a 5-second flush interval, per-message durability is approximately 9-10 nines. Acknowledged-but-unflushed data is at risk only from a simultaneous multi-region failure, and that exposure window is measured in seconds.
Consistency
Writes are linearizable. Each Raft group serializes appends through its current leader, which assigns a total order.
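
As a toy illustration of how funnelling appends through one leader yields a total order, the sketch below assigns each append the next position under a lock. It deliberately leaves out replication, terms, and leader election, and the class and method names are hypothetical rather than taken from the system.

```python
import threading


class GroupLeader:
    """Toy stand-in for one group's leader (no replication or elections)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._log: list[bytes] = []

    def append(self, payload: bytes) -> int:
        # The lock serializes concurrent appends, so every payload gets a
        # unique position and positions form a gap-free total order.
        with self._lock:
            position = len(self._log)
            self._log.append(payload)
            return position
```

Because every append passes through the same critical section, two concurrent callers can never observe the same position, regardless of how their requests interleave.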
Catch-up reads may be served by any replica that has applied the relevant stream state. Followers can return already-replicated historical bytes locally because stream positions are immutable. Tail-sensitive reads still preserve protocol-visible semantics: responses to HEAD and offset=now requests, and any response carrying Stream-Up-To-Date: true or Stream-Closed: true, are generated by the leader path unless the follower has applied a terminal closed state. Requests that need a write-side access transition, such as expiry or TTL touch, are also routed to the leader.
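
The routing rules in this paragraph can be summarized as a small decision function. This is a sketch under stated assumptions: `ReadRequest`, its fields, and `route_read` are hypothetical names for illustration, and falling back to the leader when a follower lacks the requested range is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRequest:
    is_head: bool = False        # HEAD request
    offset_now: bool = False     # offset=now
    reaches_tail: bool = False   # response would carry Stream-Up-To-Date: true
                                 # or Stream-Closed: true
    needs_write_side_transition: bool = False  # e.g. expiry or TTL touch


def route_read(req: ReadRequest,
               follower_has_range: bool,
               follower_applied_closed: bool) -> str:
    """Return "leader" or "follower" for a read arriving at a follower."""
    # Write-side access transitions always go to the leader.
    if req.needs_write_side_transition:
        return "leader"
    tail_sensitive = req.is_head or req.offset_now or req.reaches_tail
    if tail_sensitive:
        # A follower may answer only once it has applied the terminal
        # closed state, because the tail can no longer move after that.
        return "follower" if follower_applied_closed else "leader"
    # Historical bytes are immutable, so a local copy is safe to serve.
    # Assumption: without a local copy, the request goes to the leader.
    return "follower" if follower_has_range else "leader"
```

For instance, `route_read(ReadRequest(offset_now=True), True, False)` yields `"leader"`, while a plain catch-up read over a range the follower already holds yields `"follower"`.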
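Stream-Up-To-Date on each read tells the client whether more committed data exists past next_offset. A value of true means the client has reached the end of committed data; false means more data exists and the client should keep paging from next_offset.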
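
A client-side paging loop built on this signal might look like the sketch below. The `Page` tuple and the `fetch` callable are assumptions standing in for whatever transport the client uses; only the stop condition (Stream-Up-To-Date is true) and the use of next_offset come from the text above.

```python
from typing import Callable, Iterator, NamedTuple


class Page(NamedTuple):
    data: bytes        # bytes returned by this read
    next_offset: str   # offset to request next
    up_to_date: bool   # parsed value of Stream-Up-To-Date


def read_all(fetch: Callable[[str], Page], start_offset: str) -> Iterator[bytes]:
    """Yield chunks until the server reports the reader is up to date.

    `fetch` is a hypothetical function that performs one read at the
    given offset and returns the parsed response.
    """
    offset = start_offset
    while True:
        page = fetch(offset)
        if page.data:
            yield page.data
        if page.up_to_date:
            return  # nothing committed past next_offset
        offset = page.next_offset  # false means keep paging
```

A caller that wants to follow the live tail afterwards would keep the last `next_offset` and resume from it, rather than stopping as this sketch does.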
Availability and durability (standard layout)
Three voting replicas across availability zones, plus two non-voting replicas in a second region.
| Property | What it means |
|---|---|
| Write availability | Writes continue as long as a majority of voters is healthy. The layout tolerates any single voting-AZ failure. Non-voting replicas hold extra copies but do not vote. |
| Read availability | Any reachable replica can serve replicated historical catch-up data locally. Reads that need fresh tail metadata, leader-owned state changes, or live watcher ownership still track write availability per group, not per cluster. |
| Per-message durability | ~9-10 nines with a 5-second flush interval. |
| Annual zero-loss probability | ~3-4 nines. The probability that no data-loss event occurs in a given year. |
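
To read the "nines" figures in the table as probabilities, the usual convention is that n nines means a success probability of 1 - 10^-n. The helpers below apply that convention; the function names are illustrative, and the code only converts units. It does not derive the table's figures.

```python
import math


def nines_to_probability(nines: float) -> float:
    """Convert a count of nines to a probability, e.g. 3 nines is 0.999."""
    return 1.0 - 10.0 ** (-nines)


def probability_to_nines(probability: float) -> float:
    """Inverse conversion; requires 0 <= probability < 1."""
    return -math.log10(1.0 - probability)
```

Under this convention, 3 nines of annual zero-loss probability corresponds to a 0.1% chance of at least one data-loss event in a year, and 4 nines to a 0.01% chance.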