Tutorial · January 2, 2024 · 6 min read

Migrating from Kafka to StreamHouse: A Step-by-Step Guide

A practical guide to migrating your existing Kafka workloads to StreamHouse with zero downtime and minimal code changes.

Overview

Migrating from Kafka to StreamHouse is straightforward thanks to our Kafka-compatible protocol. This guide walks through the process step by step.

Prerequisites

  • StreamHouse 0.6+ deployed
  • Access to your Kafka cluster
  • Producer/consumer applications ready to update

Step 1: Deploy StreamHouse

Start with a minimal StreamHouse deployment:

# Start infrastructure
docker compose up -d minio postgres

# Run an agent
cargo run --bin agent

Step 2: Create Topics

Mirror your Kafka topic configuration:

streamctl topics create orders --partitions 8
streamctl topics create events --partitions 16
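If you have many topics, you can script the mirroring instead of typing each command. A minimal sketch that builds the `streamctl` invocations from a list of (topic, partitions) pairs; the pairs themselves would come from your Kafka cluster's topic descriptions, and `mirror_commands` is an illustrative helper, not part of any CLI:

```rust
// Build `streamctl topics create` commands for a set of existing Kafka
// topics. The (name, partitions) pairs would be read from your Kafka
// cluster's topic metadata.
fn mirror_commands(topics: &[(&str, u32)]) -> Vec<String> {
    topics
        .iter()
        .map(|(name, partitions)| {
            format!("streamctl topics create {} --partitions {}", name, partitions)
        })
        .collect()
}
```

You can then review the generated commands before running them, which makes it easy to spot partition-count mismatches up front.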

Step 3: Set Up Dual-Write

Update producers to write to both Kafka and StreamHouse:

// Pseudo-code for dual-write: both sends must succeed, so the
// function propagates errors with `?` and returns a Result
async fn produce(event: Event) -> Result<()> {
    // Write to Kafka (existing)
    kafka_producer.send(event.clone()).await?;

    // Write to StreamHouse (new)
    streamhouse_producer.send(event).await?;

    Ok(())
}

Monitor both systems to ensure data consistency.
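One lightweight way to monitor consistency is to tap a sample of payloads from each side and compare order-sensitive digests. A self-contained sketch; the `digest` helper is illustrative and not part of either client library:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Fold a sequence of payloads into a single order-sensitive digest.
// Run this over the same sample window from Kafka and StreamHouse;
// matching digests mean matching content and order for that window.
fn digest(payloads: &[Vec<u8>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for payload in payloads {
        payload.hash(&mut hasher);
    }
    hasher.finish()
}
```

Note that `DefaultHasher` is only deterministic within a single process, so compute both digests in the same monitoring job rather than comparing values across machines.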

Step 4: Migrate Consumers

Once dual-write is stable, migrate consumers one at a time:

// Point the consumer at StreamHouse: endpoint, topic, and consumer group
let consumer = StreamHouseConsumer::new(
    "streamhouse://agent:9090",
    "orders",
    "order-processor"
).await?;
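One pattern that makes per-service migration easy is hiding the concrete client behind a small trait, so each service switches sources with a one-line change (or a config flag) instead of a rewrite. A minimal sketch under that assumption; `MessageSource` and both source types are hypothetical, with the in-memory `Vec` standing in for real client calls:

```rust
// Hide the concrete consumer behind a trait so each service can be
// flipped from Kafka to StreamHouse independently.
trait MessageSource {
    fn next(&mut self) -> Option<Vec<u8>>;
}

// Stand-ins for the real clients; each would wrap its library's poll call.
struct KafkaSource { messages: Vec<Vec<u8>> }
struct StreamHouseSource { messages: Vec<Vec<u8>> }

impl MessageSource for KafkaSource {
    fn next(&mut self) -> Option<Vec<u8>> { self.messages.pop() }
}

impl MessageSource for StreamHouseSource {
    fn next(&mut self) -> Option<Vec<u8>> { self.messages.pop() }
}

// Business logic only sees the trait, so it is untouched by the migration.
fn process(source: &mut dyn MessageSource) -> usize {
    let mut handled = 0;
    while let Some(_msg) = source.next() {
        handled += 1;
    }
    handled
}
```

With this shape, rolling back a misbehaving consumer is just swapping the source back to Kafka, which keeps the per-service cutover low-risk.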

Step 5: Disable Kafka Writes

After all consumers are migrated, remove Kafka from producers:

async fn produce(event: Event) -> Result<()> {
    streamhouse_producer.send(event).await?;
    Ok(())
}

Step 6: Decommission Kafka

  1. Stop Kafka consumers
  2. Verify no active producers
  3. Backup Kafka data if needed
  4. Shut down Kafka cluster

Common Issues

Consumer Lag After Migration

If you see consumer lag after migration, it's likely due to offset differences. Reset offsets to earliest:

streamctl consumer-groups reset order-processor --topic orders --to-earliest

Missing Messages

Enable dual-read temporarily to verify data consistency:

// Poll both systems and compare payloads.
// Note: this assumes both consumers see messages in the same order.
let kafka_msg = kafka_consumer.poll().await?;
let sh_msg = streamhouse_consumer.poll().await?;
assert_eq!(kafka_msg.payload, sh_msg.payload);

Performance Tuning

StreamHouse defaults work well for most workloads, but you may want to tune:

[agent]
segment_flush_interval = "5s"  # Increase for better batching
cache_size_mb = 512            # Increase for read-heavy workloads

Conclusion

Migrating from Kafka to StreamHouse can be completed in a few hours for simple deployments, or a few days for complex production systems. The key is the dual-write phase: take your time verifying data consistency before cutting over.

Questions? Join our Discord for help.

Tags: kafka, migration, tutorial, getting-started
