Stream Joins

Join multiple streams together in real-time.

8 min readSQL Processing

Joining Streams

StreamHouse supports joining two or more streams together based on matching keys and time constraints. Stream joins are useful for enriching events with context, correlating events from different sources, and detecting patterns across streams.

Windowed Joins

Windowed joins match events from two streams that fall within the same time window.

sql
-- Join clicks with impressions within a 1-hour window
SELECT
  c.user_id,
  c.ad_id,
  i.campaign_id,
  c.timestamp as click_time,
  i.timestamp as impression_time
FROM clicks c
JOIN impressions i
  ON c.ad_id = i.ad_id
  AND c.timestamp BETWEEN i.timestamp AND i.timestamp + INTERVAL '1 hour';

Stream-Table Joins

A stream can be joined with a compacted topic (treated as a table). The table represents the latest state for each key, and each stream event is enriched with the current table value.

sql
-- Enrich orders with latest customer data
CREATE TABLE customers (
  customer_id VARCHAR PRIMARY KEY,
  name VARCHAR,
  tier VARCHAR
) WITH (topic = 'customers', format = 'json');

SELECT
  o.order_id,
  o.amount,
  c.name as customer_name,
  c.tier as customer_tier
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;