Producers
How to produce messages to StreamHouse topics.
Producer Overview
Producers are client applications that write messages to StreamHouse topics. Messages are sent to the StreamHouse agent via gRPC or the REST API. Each message consists of an optional key, a value (the payload), optional headers, and a timestamp.
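The four message fields described above can be modeled as a small record type. The following is an illustrative Python sketch only; the field names and types are assumptions, not StreamHouse's wire format:

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Message:
    """One produced message: optional key, payload, optional headers, timestamp.
    Field names are illustrative, not the actual StreamHouse wire format."""
    value: bytes                                      # the payload (required)
    key: Optional[bytes] = None                       # optional; drives partition assignment
    headers: Dict[str, bytes] = field(default_factory=dict)  # optional metadata
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))

msg = Message(value=b'{"event": "login"}', key=b"user-123",
              headers={"schema-id": b"1"})
```

If the producer does not set a timestamp, this sketch defaults it to the current wall-clock time in milliseconds.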
Partition Assignment
When producing a message, the partition is determined by the message key. If a key is provided, it is hashed using murmur2 to determine the target partition (consistent with Kafka's default partitioner), so all messages with the same key land on the same partition and per-key ordering is preserved. If no key is provided, messages are distributed round-robin across all partitions.
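Since the hash is stated to be Kafka-compatible, the keyed branch can be sketched in Python using Kafka's murmur2 variant. This is an illustrative implementation, not StreamHouse client code; `partition_for_key` is a hypothetical helper name:

```python
def murmur2(data: bytes) -> int:
    """32-bit murmur2 as used by Kafka's default partitioner."""
    m, r = 0x5BD1E995, 24
    h = 0x9747B28C ^ len(data)       # seed XOR length
    i = 0
    while len(data) - i >= 4:        # mix 4 bytes at a time, little-endian
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & 0xFFFFFFFF
        k ^= k >> r
        k = (k * m) & 0xFFFFFFFF
        h = ((h * m) & 0xFFFFFFFF) ^ k
        i += 4
    rem = len(data) - i              # mix the 0-3 trailing bytes
    if rem == 3:
        h ^= data[i + 2] << 16
    if rem >= 2:
        h ^= data[i + 1] << 8
    if rem >= 1:
        h ^= data[i]
        h = (h * m) & 0xFFFFFFFF
    h ^= h >> 13                     # final avalanche
    h = (h * m) & 0xFFFFFFFF
    h ^= h >> 15
    return h

def partition_for_key(key: bytes, num_partitions: int) -> int:
    # Mask to the positive range before the modulo, as Kafka's
    # default partitioner does.
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions
```

The keyless round-robin branch is simply a per-producer counter taken modulo the partition count.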
// Partition assignment pseudocode
if message.key != null {
    partition = murmur2(message.key) % topic.num_partitions
} else {
    partition = round_robin_counter++ % topic.num_partitions
}

Write Batching
For performance, the StreamHouse agent buffers incoming messages in memory before flushing to S3. The writer pool maintains per-partition buffers that are flushed when either the buffer reaches 64MB or 10 seconds have elapsed since the first message in the batch. This batching amortizes the cost of S3 uploads across many messages.
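The flush policy above (size threshold or age of the oldest buffered message, whichever comes first) can be sketched as a per-partition buffer. This is a simplified model, not the agent's actual writer pool; `upload` stands in for the real S3 writer:

```python
import time
from typing import Callable, List, Optional

class PartitionBuffer:
    """Buffers records for one partition; flushes when the buffer reaches
    MAX_BYTES or MAX_AGE_SECONDS have passed since the first buffered record.
    Thresholds mirror the 64 MB / 10 s defaults described in the text."""

    MAX_BYTES = 64 * 1024 * 1024
    MAX_AGE_SECONDS = 10.0

    def __init__(self, upload: Callable[[List[bytes]], None]):
        self.upload = upload            # callable that ships a batch to S3
        self.records: List[bytes] = []
        self.size = 0
        self.first_at: Optional[float] = None  # arrival time of first record

    def append(self, record: bytes, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        if self.first_at is None:
            self.first_at = now
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.MAX_BYTES or now - self.first_at >= self.MAX_AGE_SECONDS:
            self.flush()

    def flush(self) -> None:
        if self.records:
            self.upload(self.records)   # one S3 upload amortized over the batch
        self.records, self.size, self.first_at = [], 0, None
```

A real writer pool would also flush on a background timer so an idle partition does not hold its last batch past the age limit.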
Schema Validation
If a topic has an associated schema in the Schema Registry, producers can opt into schema validation. The producer sends the schema ID in the message header, and the agent validates the message payload against the registered schema before accepting it. This catches data quality issues at write time.
# Produce with schema validation
streamctl produce --topic user-events \
  --schema-id 1 \
  --key "user-123" \
  --message '{"user_id": "123", "event": "login", "timestamp": 1706000000}'
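On the agent side, the accept/reject decision could look like the following deliberately simplified Python sketch. The schema shape and the `accept` helper are assumptions for illustration; a real registry would hold a full schema document (e.g. JSON Schema or Avro) rather than a field-to-type map:

```python
import json
from typing import Dict

# Illustrative, simplified schema: field name -> required Python type,
# loosely matching the user-events payload in the CLI example above.
USER_EVENTS_SCHEMA: Dict[str, type] = {
    "user_id": str,
    "event": str,
    "timestamp": int,
}

def accept(payload: bytes, schema: Dict[str, type]) -> bool:
    """Accept the payload only if it parses as JSON and every schema
    field is present with the expected type (simplified validation)."""
    try:
        record = json.loads(payload)
    except json.JSONDecodeError:
        return False                       # not valid JSON: reject at write time
    return isinstance(record, dict) and all(
        isinstance(record.get(name), typ) for name, typ in schema.items()
    )
```

Rejecting at write time keeps malformed records out of the topic entirely, rather than surfacing them to every consumer later.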