The Durability Problem
Here's the worst-case scenario: your producer sends 10,000 events to a StreamHouse agent. The agent buffers them in memory, preparing a segment for S3 upload. Then the process crashes. The segment never reaches S3. Are those 10,000 events gone?
Without the Write-Ahead Log (WAL), yes. With it, no.
The Data Path
To understand the WAL, you need to understand the data path:
Producer → Agent (gRPC) → WAL (disk) → SegmentBuffer (RAM) → S3
Every record that enters an agent is written to the WAL before it enters the in-memory segment buffer. The WAL is a sequential, append-only file on the agent's local disk. If the agent crashes, the WAL file survives, and on restart, records are replayed from the WAL back into memory.
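That ordering can be sketched in a few lines of Python. This is an illustrative shape, not StreamHouse's actual implementation — the Agent class and file handling here are assumptions. The key invariant is that the record reaches the WAL on disk before it touches the in-memory buffer (the fsync shown corresponds to the strictest sync policy, covered below).

```python
import os
import tempfile

class Agent:
    def __init__(self, wal_path):
        # Append-only WAL file on the agent's local disk.
        self.wal = open(wal_path, "ab")
        # In-memory segment buffer, keyed by (topic, partition).
        self.buffer = {}

    def produce(self, topic, partition, payload: bytes):
        # 1. Durable first: append the record to the WAL.
        self.wal.write(payload)
        self.wal.flush()              # push to the OS buffer cache
        os.fsync(self.wal.fileno())   # force to disk before acknowledging
        # 2. Only then stage it in RAM for the next S3 segment.
        self.buffer.setdefault((topic, partition), []).append(payload)

path = os.path.join(tempfile.mkdtemp(), "streamhouse.wal")
agent = Agent(path)
agent.produce("orders", 0, b"event-1")
print(os.path.getsize(path))  # 7 -- the record is on disk before any S3 upload
```

If the process dies between steps 1 and 2, the record is already on disk and will be replayed on restart; if it dies before step 1 completes, the produce request was never acknowledged, so the producer retries.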
WAL Record Format
Each entry in the WAL contains everything needed to reconstruct the record:
┌─────────────────────────────────┐
│ WAL Entry                       │
│ - Length: u32                   │
│ - CRC32: u32                    │
│ - Topic: string                 │
│ - Partition: u32                │
│ - Key: bytes                    │
│ - Value: bytes                  │
│ - Timestamp: i64                │
│ - Headers: [(string, bytes)]    │
└─────────────────────────────────┘
The CRC32 checksum covers the entire entry, including the length field. This catches partial writes, bit flips, and filesystem corruption. During recovery, any entry with a bad CRC is discarded — it represents an incomplete write that was interrupted by the crash.
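A minimal sketch of this entry layout, using the field names from the diagram — the exact wire encoding (byte order, length prefixes) is an assumption, not StreamHouse's actual format. What it demonstrates is the durability property: because the CRC32 covers the length field plus the body, a torn write at any offset fails validation.

```python
import struct
import zlib

def encode_entry(topic, partition, key, value, timestamp, headers=()):
    body = struct.pack("<H", len(topic)) + topic.encode()
    body += struct.pack("<I", partition)
    body += struct.pack("<I", len(key)) + key
    body += struct.pack("<I", len(value)) + value
    body += struct.pack("<q", timestamp)
    body += struct.pack("<H", len(headers))
    for hk, hv in headers:
        body += struct.pack("<H", len(hk)) + hk.encode()
        body += struct.pack("<I", len(hv)) + hv
    length = struct.pack("<I", len(body))
    # CRC covers the length field AND the body, as described above.
    crc = struct.pack("<I", zlib.crc32(length + body))
    return length + crc + body

def decode_entry(buf):
    """Return the entry body, or None if the CRC does not match."""
    (length,) = struct.unpack_from("<I", buf, 0)
    (crc,) = struct.unpack_from("<I", buf, 4)
    body = buf[8:8 + length]
    if len(body) < length or zlib.crc32(buf[:4] + body) != crc:
        return None  # partial write or corruption -- discard
    return body

entry = encode_entry("orders", 3, b"k1", b"v1", 1700000000000)
assert decode_entry(entry) is not None
assert decode_entry(entry[:-1]) is None  # truncated tail fails the CRC
```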
Three Sync Policies
The critical question is: when do we fsync the WAL to disk? StreamHouse offers three policies to match different durability-performance tradeoffs:
Always Sync
export WAL_SYNC_POLICY=always
Every record is fsync'd to disk before the produce request is acknowledged. Zero data loss even on power failure. Throughput: 50,000-100,000 records/sec.
This is the safest option. The latency cost is 100-500 microseconds per fsync on SSD, which is acceptable for most workloads.
Interval Sync (Recommended)
export WAL_SYNC_POLICY=interval
export WAL_SYNC_INTERVAL_MS=100
The WAL is fsync'd every 100ms. Records written between syncs may be lost on a power failure (not a process crash — the OS buffer cache survives process crashes on Linux). At risk: everything written since the last completed sync, i.e. up to 100ms of traffic.
This is the recommended default. It achieves 1-2 million records/sec while losing at most 100ms of data on a catastrophic hardware failure.
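The interval policy can be sketched as a background timer that issues one fsync per interval — the threading approach here is an assumption for illustration, not StreamHouse's actual implementation. Appends hit only the page cache on the fast path, which is why throughput is an order of magnitude higher than always-sync.

```python
import os
import threading

class IntervalSyncWal:
    def __init__(self, path, interval_ms=100):
        self.f = open(path, "ab")
        self.interval = interval_ms / 1000.0
        self.stop = threading.Event()
        self.t = threading.Thread(target=self._sync_loop, daemon=True)
        self.t.start()

    def append(self, record: bytes):
        # Fast path: no fsync, just the OS buffer cache
        # (survives process crashes, not power failures).
        self.f.write(record)
        self.f.flush()

    def _sync_loop(self):
        # One fsync per interval bounds power-failure loss
        # to roughly interval_ms of records.
        while not self.stop.wait(self.interval):
            os.fsync(self.f.fileno())

    def close(self):
        self.stop.set()
        self.t.join()
        os.fsync(self.f.fileno())  # final sync on clean shutdown
        self.f.close()
```

The amortization is the whole trick: one fsync (100-500 microseconds) covers every record appended during the interval, instead of one fsync per record.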
Never Sync
export WAL_SYNC_POLICY=never
The WAL relies entirely on the OS buffer cache for durability. Data is durable against process crashes but may be lost on power failure. Throughput: 2+ million records/sec.
Use this for development or workloads where occasional data loss is acceptable (metrics, debug logs).
The Recovery Process
When a StreamHouse agent starts, it checks for an existing WAL file. If one exists, recovery runs automatically:
- Open the WAL file and read from the beginning
- Validate each entry by computing the CRC32 and comparing it to the stored checksum
- Replay valid entries into the in-memory SegmentBuffer, partitioned by topic and partition
- Skip invalid entries — these represent partially written records from the crash point
- Resume normal operation — the next produce request appends to the existing WAL
- Flush recovered segments to S3 following the normal segment lifecycle
Agent startup:
[INFO] WAL file found: /data/wal/streamhouse.wal (24MB)
[INFO] Replaying WAL entries...
[INFO] Recovered 48,231 records across 12 partitions
[INFO] Skipped 3 entries with invalid CRC (partial writes)
[INFO] Recovery complete in 340ms
[INFO] Agent ready to accept connections
The recovery process is fast — it reads sequentially from disk at SSD speed, typically recovering millions of records per second.
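The replay loop can be sketched as a sequential scan over the file — the framing here (u32 length, u32 CRC over length plus body, then the body) follows the entry format described earlier, though the exact encoding is an assumption. Note the two distinct failure modes: a truncated tail ends the scan, while a CRC mismatch mid-file skips one entry.

```python
import struct
import zlib

def replay_wal(data: bytes):
    """Scan WAL bytes; return (valid_bodies, skipped_count)."""
    valid, skipped, off = [], 0, 0
    while off + 8 <= len(data):
        (length,) = struct.unpack_from("<I", data, off)
        (crc,) = struct.unpack_from("<I", data, off + 4)
        body = data[off + 8: off + 8 + length]
        if len(body) < length:
            skipped += 1  # truncated tail: the crash hit mid-write
            break
        if zlib.crc32(data[off:off + 4] + body) != crc:
            skipped += 1  # bit flip or torn write: discard this entry
        else:
            valid.append(body)
        off += 8 + length
    return valid, skipped

def frame(body: bytes) -> bytes:
    length = struct.pack("<I", len(body))
    return length + struct.pack("<I", zlib.crc32(length + body)) + body

wal = frame(b"rec-1") + frame(b"rec-2") + frame(b"rec-3")[:-2]  # torn tail
records, skipped = replay_wal(wal)
print(records, skipped)  # [b'rec-1', b'rec-2'] 1
```

In the real agent, each valid body would then be decoded and routed into the SegmentBuffer by topic and partition.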
Failure Scenarios
Scenario 1: Agent Process Crash
The agent receives a SIGSEGV, OOM kill, or unhandled panic.
- WAL entries already synced: Recovered on restart. Zero loss.
- WAL entries in OS buffer cache: Recovered on restart (process crash doesn't clear the page cache). Zero loss.
- Segment buffer (RAM): Anything already in the WAL is safe. The segment that was building in memory is reconstructed from WAL replay.
Scenario 2: Agent Crash During S3 Upload
The agent crashes while uploading a segment to S3.
- S3 upload is atomic — either the full object lands or it doesn't. Partial uploads don't create visible objects.
- On restart, the WAL replay reconstructs the segment buffer. The agent re-uploads the segment.
- Duplicate segments are prevented by checking the metadata store for existing segment registrations before uploading.
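The check-before-upload logic can be sketched as follows. The metadata store interface here (a plain set of registered segment IDs, a dict standing in for S3) is a deliberate simplification, not StreamHouse's actual schema — the point is only the ordering: consult metadata first, upload, then register.

```python
def flush_segment(segment_id, data, object_store, metadata):
    """Upload a segment idempotently; return True if uploaded."""
    if segment_id in metadata:
        # Already registered: a previous run uploaded this segment
        # before crashing. Skip the duplicate.
        return False
    object_store[segment_id] = data  # atomic object PUT
    metadata.add(segment_id)         # register only after the upload succeeds
    return True

store, meta = {}, set()
assert flush_segment("orders-0-000123", b"...", store, meta) is True
assert flush_segment("orders-0-000123", b"...", store, meta) is False  # no-op
```

Registering only after a successful upload means a crash between the two steps leaves an orphan object at worst, never a registered segment with missing data.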
Scenario 3: Power Failure
The physical machine loses power, wiping both RAM and the OS buffer cache.
- With always sync: Zero loss. Every acknowledged record is on disk.
- With interval sync: Loss of up to one sync interval (default 100ms) of data.
- With never sync: Loss of all unflushed WAL data.
The Full Durability Stack
The WAL is one layer in a multi-layer durability strategy:
Layer 1: WAL (local disk) → survives process crashes
Layer 2: S3 (object storage) → survives hardware failure (11 nines)
Layer 3: PostgreSQL (metadata) → survives with automated backups
Layer 4: CRC32 checksums → detects corruption at every level
Once a segment is flushed to S3 and registered in the metadata store, the WAL entries for those records are no longer needed. The WAL is periodically truncated to reclaim disk space.
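One way to picture truncation — this sketch assumes the agent tracks a "flushed up to" byte offset, which is an illustrative simplification: copy the unflushed suffix into a new file, fsync it, and atomically rename it over the old WAL.

```python
import os

def truncate_wal(path, flushed_up_to: int):
    """Drop the WAL prefix whose records are already durable in S3."""
    with open(path, "rb") as f:
        f.seek(flushed_up_to)
        remainder = f.read()  # entries not yet flushed to S3
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(remainder)
        f.flush()
        os.fsync(f.fileno())
    # Atomic rename: a crash at any point leaves either the old
    # or the new WAL fully intact, never a half-truncated file.
    os.replace(tmp, path)
```

The rename is the safety hinge: the WAL is never modified in place, so recovery always finds a self-consistent file.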
Monitoring the WAL
Keep an eye on these metrics:
streamhouse_wal_size_bytes # Current WAL file size
streamhouse_wal_entries_total # Total entries written
streamhouse_wal_recovery_records # Records recovered on last startup
streamhouse_wal_sync_duration_ms # Time spent in fsync
streamhouse_wal_corruption_detected # CRC failures (should be 0)
Alert on streamhouse_wal_corruption_detected > 0 — it indicates a disk issue that needs investigation.
The Bottom Line
StreamHouse's WAL guarantees that acknowledged events survive agent crashes with configurable durability. Combined with S3's 11-nines durability and CRC32 checksums at every level, the system provides end-to-end data integrity from producer to consumer.
Choose your sync policy based on your workload:
- Financial transactions: Always sync
- General production: Interval sync (100ms)
- Dev and metrics: Never sync
Your data is safe.