Retention Policies
Configure how long StreamHouse retains data.
Data Retention
Retention policies control how long messages are kept in a topic before being deleted. StreamHouse supports time-based retention, size-based retention, or both. When a retention limit is reached, the oldest segments are deleted from S3 and their metadata is removed.
Time-Based Retention
Time-based retention deletes segments older than the specified duration. This is the most common retention strategy.
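To make the duration format concrete, here is a minimal sketch of how a retention value could be turned into seconds and compared against a segment's age. The helpers `retention_to_seconds` and `segment_expired` are hypothetical and not part of `streamctl`. The sketch assumes segment age is measured from a creation timestamp in epoch seconds, which this page does not specify.

```bash
# Hypothetical helpers for illustration; not part of streamctl.

# Convert a retention value (e.g. 90d) to seconds. Prints -1 for "infinite".
retention_to_seconds() {
  spec=$1
  if [ "$spec" = "infinite" ]; then
    printf '%s\n' -1
    return 0
  fi
  count=${spec%?}            # everything except the last character
  unit=${spec#"$count"}      # the last character: m, h, or d
  case "$count" in
    ''|*[!0-9]*|0?*)         # reject empty, non-numeric, or zero-padded counts
      echo "unsupported retention: $spec" >&2
      return 1 ;;
  esac
  case "$unit" in
    m) echo $(( count * 60 )) ;;
    h) echo $(( count * 3600 )) ;;
    d) echo $(( count * 86400 )) ;;
    *) echo "unsupported retention: $spec" >&2
       return 1 ;;
  esac
}

# Succeeds (exit 0) if a segment created at $1 is older than retention $3
# at time $2. Both timestamps are epoch seconds (an assumption of this sketch).
segment_expired() {
  limit=$(retention_to_seconds "$3") || return 2
  [ "$limit" -ge 0 ] && [ $(( $2 - $1 )) -gt "$limit" ]
}
```

For example, `segment_expired "$created" "$(date +%s)" 7d && echo "eligible for deletion"` reports a segment only once it is more than seven days old, and never reports one when the retention is `infinite`.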
bash
# Set 30-day retention
streamctl topic create --name events --retention 30d
# Supported units: m (minutes), h (hours), d (days)
# Examples: 1h, 7d, 90d
# Set infinite retention (never delete)
streamctl topic create --name audit-log --retention infinite
Size-Based Retention
Size-based retention caps the total size of a topic. When the limit is exceeded, the oldest segments are deleted.
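The selection of segments can be modeled roughly as follows. This is a sketch under stated assumptions, not StreamHouse's implementation: the function name `segments_to_delete`, the three-column input format, and the use of `-1` to disable a limit are all invented for illustration. Sizes are given in plain bytes because this page does not say whether `GB` is decimal or binary. When both an age limit and a size cap are set, the sketch deletes a segment as soon as either limit applies.

```bash
# Hypothetical model of retention; not part of streamctl.
# Reads "segment_id created_epoch size_bytes" lines, oldest first, on stdin.
# Arguments: now_epoch max_age_seconds max_bytes (use -1 to disable a limit).
# Prints the ids of the segments that either limit would delete.
segments_to_delete() {
  awk -v now="$1" -v max_age="$2" -v max_bytes="$3" '
    # First pass: remember every segment and the total topic size.
    { id[NR] = $1; created[NR] = $2; size[NR] = $3; total += $3 }
    END {
      # Walk from the oldest segment forward, stopping at the first
      # segment that neither limit applies to.
      for (i = 1; i <= NR; i++) {
        too_old = (max_age >= 0 && now - created[i] > max_age)
        too_big = (max_bytes >= 0 && total > max_bytes)
        if (!too_old && !too_big) break
        print id[i]
        total -= size[i]
      }
    }'
}
```

With three 40-byte segments and a 100-byte cap, `printf 'seg-1 0 40\nseg-2 100 40\nseg-3 200 40\n' | segments_to_delete 250 -1 100` prints only `seg-1`, since removing the oldest segment brings the total down to 80 bytes.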
bash
# Cap topic at 100GB
streamctl topic create --name metrics --retention-bytes 100GB
# Combine time and size (whichever triggers first)
streamctl topic create --name logs \
  --retention 7d \
  --retention-bytes 500GB
Log Compaction
For topics that represent state (like a changelog), log compaction keeps only the latest value for each key. This is useful for maintaining a materialized view that can be rebuilt from the topic.
- Compaction runs as a background process on the agent
- Only the most recent record per key is retained
- Tombstone records (null value) mark a key for deletion
- Compacted topics can still set time-based retention, which then governs how long tombstones are kept
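The rules above can be modeled as a small filter over a stream of records. This is only a sketch: the real agent works on segments, and the function name `compact_log`, the tab-separated `key<TAB>value` text format, the empty value standing in for a null value, and the `drop` argument are all assumptions made for illustration.

```bash
# Hypothetical model of log compaction; not part of streamctl.
# Reads "key<TAB>value" records in offset order on stdin.
# An empty value stands in for a null value (a tombstone) in this sketch.
# Pass "drop" as the first argument to also remove tombstones, modeling the
# point at which their retention period has elapsed.
compact_log() {
  awk -F '\t' -v mode="${1:-keep}" '
    # Remember every record and the last offset seen for each key.
    { key[NR] = $1; value[NR] = $2; line[NR] = $0; last[$1] = NR }
    END {
      for (i = 1; i <= NR; i++) {
        if (last[key[i]] != i) continue                   # superseded by a newer record
        if (mode == "drop" && value[i] == "") continue    # expired tombstone
        print line[i]
      }
    }'
}
```

Given the records `user-1=a`, `user-2=b`, `user-1=c`, and a tombstone for `user-2`, the filter keeps `user-1=c` and the tombstone. In `drop` mode only `user-1=c` remains, so a view rebuilt from the topic would no longer contain `user-2`.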