Creating Streams
Define SQL streams over StreamHouse topics.
Creating a Stream
A stream is a SQL view over a StreamHouse topic. Streams define the schema of the data in a topic, enabling typed queries and validation.
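To make "typed queries and validation" concrete, here is a minimal Python sketch of the kind of check a schema implies. It is not StreamHouse code and does not describe how StreamHouse validates messages. It reuses the column names of the `user_events` example in this section; the mapping from SQL types to Python checks, the ISO 8601 timestamp encoding, and the rule that every column is required are all assumptions made for illustration.

```python
import json
from datetime import datetime


def _is_timestamp(value) -> bool:
    """Assumption: timestamps arrive as ISO 8601 strings."""
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


# Column name -> check function, mirroring the user_events columns.
SCHEMA = {
    "user_id": lambda v: isinstance(v, str),            # VARCHAR
    "event": lambda v: isinstance(v, str),              # VARCHAR
    "page": lambda v: isinstance(v, str),               # VARCHAR
    "timestamp": _is_timestamp,                         # TIMESTAMP
    "metadata": lambda v: isinstance(v, (dict, list)),  # JSON
}


def validate(raw: bytes) -> list[str]:
    """Return a list of problems; an empty list means the message fits."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["top-level value is not an object"]

    problems = []
    for column, check in SCHEMA.items():
        if column not in record:
            # Assumption: every column is required (no NULLs).
            problems.append(f"missing column: {column}")
        elif not check(record[column]):
            problems.append(f"wrong type for column: {column}")
    return problems
```

A message that passes returns an empty list; a message with a numeric `user_id` and no `timestamp` returns one problem for each.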
```sql
-- Create a stream over an existing topic
CREATE STREAM user_events (
  user_id   VARCHAR,
  event     VARCHAR,
  page      VARCHAR,
  timestamp TIMESTAMP,
  metadata  JSON
) WITH (
  topic = 'user-events',
  format = 'json',
  timestamp_field = 'timestamp'
);

-- Query the stream
SELECT user_id, event, count(*)
FROM user_events
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY user_id, event;
```

Continuous Queries
Continuous queries run indefinitely, processing each new message as it arrives and writing results to an output topic.
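The example query in this section groups events with `TUMBLE`, that is, into fixed, non-overlapping one-minute windows. The Python sketch below illustrates that bucketing. It is a simulation for intuition only: it assumes windows are aligned to the Unix epoch and are half-open (`[window_start, window_end)`), neither of which this page states for StreamHouse, and it processes a finite list rather than an unbounded stream.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts: datetime, size: timedelta = WINDOW) -> datetime:
    """Floor ts to the start of its window (assumed epoch-aligned)."""
    return ts - (ts - EPOCH) % size


def count_per_window(events) -> Counter:
    """Count events per (event name, window_start, window_end).

    events is an iterable of (event_name, timezone-aware timestamp) pairs.
    """
    counts = Counter()
    for name, ts in events:
        start = window_start(ts)
        counts[(name, start, start + WINDOW)] += 1
    return counts
```

Two `click` events at 12:00:05 and 12:00:59 land in the same window, while a `click` at exactly 12:01:00 starts the next one. A real continuous query would emit each window's row as results become available instead of returning everything at the end.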
```sql
-- Create a continuous query that counts events per minute
CREATE CONTINUOUS QUERY event_counts AS
SELECT
  event,
  count(*) AS event_count,
  window_start,
  window_end
FROM TUMBLE(user_events, timestamp, INTERVAL '1 minute')
GROUP BY event, window_start, window_end
OUTPUT TO 'event-counts-per-minute';
```

Data Formats
Streams support multiple data serialization formats.
- JSON: Self-describing format, flexible schema. Best for development and mixed-type data.
- Avro: Binary format with schema registry integration. Best for production with strict schemas.
- Protobuf: Google's binary format. Best for gRPC-heavy environments.
- CSV: Simple text format. Useful for log data and integration with legacy systems.
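To illustrate the trade-off between a self-describing format and a positional one, the following Python sketch encodes the same record as JSON and as CSV using only the standard library. The record and the helper names (`to_json`, `to_csv`) are hypothetical; Avro and Protobuf are left out because they need third-party libraries and a schema definition.

```python
import csv
import io
import json

# Hypothetical sample record, for illustration only.
record = {"user_id": "u42", "event": "page_view", "page": "/pricing"}


def to_json(rec: dict) -> str:
    """JSON carries the field names inside every message."""
    return json.dumps(rec, separators=(",", ":"))


def to_csv(rec: dict, columns: list[str]) -> str:
    """CSV carries values only; the column order must be agreed separately."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(rec[c] for c in columns)
    return buf.getvalue()


print(to_json(record))
print(to_csv(record, ["user_id", "event", "page"]), end="")
```

The JSON line can be decoded without any outside knowledge, at the cost of repeating field names in each message. The CSV line is shorter but only meaningful to a reader that already knows the column order, which is the role a stream's declared schema plays.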