Architecture · January 8, 2024 · 7 min read

Why We Built an S3-Native Streaming Platform

Traditional message brokers weren't designed for the cloud. Here's why we rethought streaming from first principles with object storage at the core.

The Problem with Traditional Brokers

When we started building StreamHouse, we asked ourselves a simple question: if you were designing a streaming platform today, with cloud-native infrastructure available, would you still use local disks?

The answer was clearly no.

The Hidden Costs of Disk-Based Streaming

Traditional brokers like Kafka were designed around local disks attached to each broker, which made sense when clusters ran on self-managed hardware. In cloud environments, that architecture carries significant hidden costs:

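1. Replication Overhead

Durability comes from keeping several copies of every partition on broker disks, typically three. That means:

  • 3x the EBS storage costs
  • Significant (and expensive) inter-AZ network traffic
  • Complex leader election and ISR management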
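
To make the storage multiplier concrete, here is a rough back-of-the-envelope sketch. The function name, the data volume, and both prices are placeholders chosen for illustration; they are not quoted AWS rates or StreamHouse figures, so substitute your own numbers.

```python
def monthly_storage_cost(logical_gb: float, price_per_gb_month: float, copies: int = 1) -> float:
    """Monthly bill for `logical_gb` of data when `copies` full replicas are stored."""
    if logical_gb < 0 or price_per_gb_month < 0 or copies < 1:
        raise ValueError("sizes and prices must be non-negative, copies must be >= 1")
    return logical_gb * copies * price_per_gb_month


# Hypothetical inputs: replace with your own retention volume and current prices.
LOGICAL_GB = 10_000        # data you actually want to retain
BLOCK_PRICE = 0.10         # assumed $/GB-month for broker volumes
OBJECT_PRICE = 0.02        # assumed $/GB-month for object storage

replicated = monthly_storage_cost(LOGICAL_GB, BLOCK_PRICE, copies=3)
object_store = monthly_storage_cost(LOGICAL_GB, OBJECT_PRICE)

print(f"3x replicated block storage: ${replicated:,.2f}/month")
print(f"single logical copy in object storage: ${object_store:,.2f}/month")
```

The sketch covers storage only. It leaves out inter-AZ replication traffic and the spare disk headroom brokers usually need, both of which widen the gap further.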

2. Capacity Planning Nightmares

Because storage is tied to the brokers, you have to:

  • Over-provision to handle traffic spikes
  • Accept under-utilized capacity most of the time
  • Rebalance manually when adding capacity

3. Operational Complexity

Every broker holds data on its own disks, so someone has to:

  • Monitor disk usage and IOPS
  • Plan partition migrations
  • Handle broker failures and recovery

The S3 Insight

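Object storage already provides most of what brokers work hard to build on top of local disks:

  • Designed for 11 nines of durability (99.999999999%)
  • Effectively unlimited capacity
  • No replication management
  • Pay only for what you store (plus per-request charges)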

The question became: can we build a streaming platform that leverages S3 as the source of truth?

How StreamHouse Works

StreamHouse uses a disaggregated architecture:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Producers  │────▶│   Agents    │────▶│     S3      │
└─────────────┘     └─────────────┘     └─────────────┘
                          │
                          ▼
                    ┌─────────────┐
                    │  Metadata   │
                    │  (Postgres) │
                    └─────────────┘

Agents:

  • Buffer incoming events in memory
  • Flush segments to S3 periodically
  • Serve reads from cache or S3

Metadata store (Postgres):

  • Topic and partition configurations
  • Consumer group offsets
  • Agent leases and watermarks

No event data touches the metadata store. This keeps it lightweight and fast.
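
The split between event data and metadata can be sketched in a few lines. This is a minimal illustration, not StreamHouse's implementation: the `Agent` and `SegmentPointer` names, the object key layout, the JSON segment format, and the count-based flush threshold are all assumptions, and a plain dict and list stand in for the S3 bucket and the Postgres table. A real agent would also flush on a timer.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SegmentPointer:
    """What the metadata store keeps: where a segment lives, never its contents."""
    base_offset: int
    record_count: int
    object_key: str


@dataclass
class Agent:
    topic: str
    partition: int
    object_store: Dict[str, bytes]       # stand-in for an S3 bucket
    metadata: List[SegmentPointer]       # stand-in for a Postgres table
    max_buffered: int = 3                # hypothetical flush threshold
    _buffer: List[str] = field(default_factory=list, init=False)
    _next_offset: int = field(default=0, init=False)
    _cache: Dict[str, List[str]] = field(default_factory=dict, init=False)

    def append(self, event: str) -> int:
        """Buffer an event in memory and return its offset."""
        offset = self._next_offset
        self._buffer.append(event)
        self._next_offset += 1
        if len(self._buffer) >= self.max_buffered:
            self.flush()
        return offset

    def flush(self) -> None:
        """Write buffered events as one segment object, then record a pointer to it."""
        if not self._buffer:
            return
        base = self._next_offset - len(self._buffer)
        key = f"{self.topic}/{self.partition}/{base:020d}.json"
        self.object_store[key] = json.dumps(self._buffer).encode("utf-8")
        self.metadata.append(SegmentPointer(base, len(self._buffer), key))
        self._buffer = []

    def read(self, offset: int) -> str:
        """Serve a flushed event from cache, falling back to the object store."""
        for seg in self.metadata:
            if seg.base_offset <= offset < seg.base_offset + seg.record_count:
                if seg.object_key not in self._cache:
                    raw = self.object_store[seg.object_key]
                    self._cache[seg.object_key] = json.loads(raw)
                return self._cache[seg.object_key][offset - seg.base_offset]
        raise KeyError(f"offset {offset} has not been flushed yet")


bucket: Dict[str, bytes] = {}
table: List[SegmentPointer] = []
agent = Agent("orders", 0, bucket, table)
for event in ["a", "b", "c", "d"]:
    agent.append(event)
agent.flush()  # push the final partial segment
```

After this runs, `bucket` holds the segment objects while `table` holds only offsets and object keys, which is the property that keeps the metadata store small.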

The Results

Building on S3 gives us:

  • 80% lower costs vs. traditional Kafka deployments
  • Zero disk management: no provisioning, no monitoring
  • Instant scaling: add agents without data migration
  • Built-in durability: S3's 11 nines out of the box

Trade-offs

S3-native streaming isn't without trade-offs:

Latency: S3 writes add roughly 50-100 ms compared with local disk. For most use cases, this is acceptable. For ultra-low-latency requirements, a hybrid architecture is a better fit.

S3 API costs: S3 charges per request, so high-throughput workloads can incur significant API charges. StreamHouse mitigates this by batching many events into each segment write and by serving repeated reads from cache.
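
To see why batching matters for request charges, the arithmetic can be sketched as below. The function names, the event rate, the segment size, and the per-request price are hypothetical placeholders, not StreamHouse defaults or quoted AWS pricing.

```python
import math

SECONDS_PER_DAY = 86_400


def put_requests_per_day(events_per_second: float, events_per_segment: int) -> int:
    """Segment uploads per day when each upload carries `events_per_segment` events."""
    if events_per_second < 0 or events_per_segment < 1:
        raise ValueError("rate must be non-negative, segment size must be >= 1")
    events_per_day = events_per_second * SECONDS_PER_DAY
    return math.ceil(events_per_day / events_per_segment)


def request_cost(requests: int, price_per_1000: float) -> float:
    """Cost of `requests` API calls at an assumed price per 1,000 requests."""
    return requests / 1000 * price_per_1000


# Hypothetical workload and price: replace with your own figures.
EVENTS_PER_SECOND = 10_000
ASSUMED_PRICE_PER_1000_PUTS = 0.005

unbatched = put_requests_per_day(EVENTS_PER_SECOND, events_per_segment=1)
batched = put_requests_per_day(EVENTS_PER_SECOND, events_per_segment=5_000)

print(f"one PUT per event:   {unbatched:,} requests/day, "
      f"${request_cost(unbatched, ASSUMED_PRICE_PER_1000_PUTS):,.2f}/day")
print(f"5,000 events per PUT: {batched:,} requests/day, "
      f"${request_cost(batched, ASSUMED_PRICE_PER_1000_PUTS):,.2f}/day")
```

Request count falls in direct proportion to segment size. The cost of larger segments is that events wait longer in the buffer before they are durable, so segment size trades API cost against write latency.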

Conclusion

The cloud changed the economics of storage. StreamHouse embraces this reality by building on S3 from day one. The result is a simpler, cheaper, and more reliable streaming platform.

Try it yourself—get started free.

Tags: architecture, s3, cloud-native, design
