Compression
How StreamHouse compresses data for efficient storage.
Compression in StreamHouse
StreamHouse compresses segment data before uploading it to S3. Compression lowers storage costs and improves read performance, because less data has to be transferred from S3. The default algorithm is LZ4, chosen for its balance of speed and compression ratio.
Supported Algorithms
StreamHouse supports multiple compression algorithms to suit different workloads.
- LZ4 (default): Fast compression and decompression. ~2:1 ratio for JSON data. Best for low-latency workloads.
- Zstd: Higher compression ratio (~4:1 for JSON) at the cost of slightly higher CPU usage. Best for cost-sensitive workloads with high data volumes.
- None: No compression. Use when data is already compressed or CPU is constrained.
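To make the ratios above concrete, here is a rough sketch that estimates stored size for a given raw volume. The function name `estimate_stored_gb` is hypothetical, and the ratios are the approximate JSON figures quoted above, not measurements; actual ratios depend on the data.

```bash
# Hypothetical helper: estimate stored size in GB from raw size and compression ratio.
# $1 = raw size in GB, $2 = compression ratio (e.g. 2 for ~2:1).
estimate_stored_gb() {
  # Integer division is adequate for a rough estimate.
  echo $(( $1 / $2 ))
}

estimate_stored_gb 1000 2   # LZ4,  ~2:1 -> prints 500
estimate_stored_gb 1000 4   # Zstd, ~4:1 -> prints 250
estimate_stored_gb 1000 1   # None       -> prints 1000
```

Under these assumed ratios, Zstd roughly halves the stored volume relative to LZ4, which is why it suits cost-sensitive, high-volume topics.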
Configuration
Compression is configured per topic, so different topics can use different algorithms.
bash
# Set compression when creating a topic
streamctl topic create --name logs --partitions 6 --compression zstd
# Change compression for an existing topic (applies to new segments only)
streamctl topic update --name logs --compression lz4
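When several topics need different algorithms, it can help to preview the commands before running them. The sketch below is a dry-run helper that only prints the `streamctl topic update` command shown above. The function name `compression_cmd` and the topic name `metrics` are hypothetical, and it accepts only the flag values that appear in the examples above (`lz4`, `zstd`).

```bash
# Hypothetical dry-run helper: prints the update command instead of running it.
# $1 = topic name, $2 = compression algorithm.
compression_cmd() {
  case "$2" in
    lz4|zstd)
      echo "streamctl topic update --name $1 --compression $2"
      ;;
    *)
      # Reject values not shown in the examples above.
      echo "unsupported compression: $2" >&2
      return 1
      ;;
  esac
}

# Topic names are placeholders.
compression_cmd logs zstd
compression_cmd metrics lz4
```

Because a change applies to new segments only, existing segments keep the algorithm they were written with.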