Introduction
Real-time analytics pipelines traditionally require multiple systems: Kafka for ingestion, Flink for processing, and a database for serving. StreamHouse consolidates these roles into a single platform.
Architecture Overview
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Events │────▶│ StreamHouse │────▶│ Dashboard │
│ (API) │ │ SQL │ │ (Grafana) │
└─────────────┘ └─────────────┘ └─────────────┘
Step 1: Set Up Event Ingestion
Create a topic for raw events:
streamctl topics create raw_events --partitions 4
Produce events from your API:
// Express middleware: emit an event when the response finishes,
// so the measured latency covers the full request. (Express does not
// set a res.responseTime property; we measure it ourselves.)
app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    producer.send('raw_events', {
      timestamp: start,
      path: req.path,
      method: req.method,
      user_id: req.user?.id,
      response_time_ms: Date.now() - start
    });
  });
  next();
});
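At high traffic, sending one event per request adds overhead. A minimal batching sketch that flushes on a size threshold or timeout; the batch-send function is an assumption here, not part of any documented StreamHouse API:

```javascript
// Buffer events and flush them in batches to cut per-request send overhead.
// `sendBatch` is any function that ships an array of events (hypothetical).
class EventBatcher {
  constructor(sendBatch, { maxSize = 100, maxWaitMs = 1000 } = {}) {
    this.sendBatch = sendBatch;
    this.maxSize = maxSize;
    this.maxWaitMs = maxWaitMs;
    this.buffer = [];
    this.timer = null;
  }

  add(event) {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxSize) {
      this.flush(); // size threshold reached: ship immediately
    } else if (!this.timer) {
      this.timer = setTimeout(() => this.flush(), this.maxWaitMs);
    }
  }

  flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null; }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.sendBatch(batch);
  }
}
```

The middleware above would then call `batcher.add(event)` instead of `producer.send(...)` directly.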
Step 2: Create Aggregation Streams
Use SQL to compute real-time aggregates:
-- Requests per minute by endpoint
CREATE STREAM rpm_by_endpoint AS
SELECT
  path,
  COUNT(*) AS request_count,
  AVG(response_time_ms) AS avg_latency,
  TUMBLE_START(timestamp, INTERVAL '1' MINUTE) AS window_start
FROM raw_events
GROUP BY
  path,
  TUMBLE(timestamp, INTERVAL '1' MINUTE);
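Under the hood, a tumbling window simply buckets each event by truncating its timestamp to the nearest window boundary. A rough JavaScript sketch of the same per-minute aggregation (names are illustrative, not StreamHouse internals):

```javascript
const WINDOW_MS = 60 * 1000; // one-minute tumbling windows

// Truncate a timestamp to the start of its window, like TUMBLE_START.
const windowStart = (ts) => ts - (ts % WINDOW_MS);

// Group events by (path, window) and compute count + average latency.
function aggregate(events) {
  const buckets = new Map();
  for (const e of events) {
    const key = `${e.path}|${windowStart(e.timestamp)}`;
    const b = buckets.get(key) ?? {
      path: e.path,
      window_start: windowStart(e.timestamp),
      request_count: 0,
      latency_sum: 0,
    };
    b.request_count += 1;
    b.latency_sum += e.response_time_ms;
    buckets.set(key, b);
  }
  return [...buckets.values()].map((b) => ({
    path: b.path,
    window_start: b.window_start,
    request_count: b.request_count,
    avg_latency: b.latency_sum / b.request_count,
  }));
}
```

Because each event falls into exactly one bucket, tumbling windows never double-count, unlike hopping or sliding windows.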
Step 3: Connect to Grafana
StreamHouse exposes a Prometheus endpoint for metrics:
# prometheus.yml
scrape_configs:
  - job_name: 'streamhouse'
    static_configs:
      - targets: ['agent:8080']
Create Grafana dashboards to visualize your streams.
Step 4: Add Alerting
Configure alerts for anomalies:
CREATE STREAM high_latency_alerts AS
SELECT *
FROM rpm_by_endpoint
WHERE avg_latency > 500;
Connect alerts to PagerDuty or Slack via webhooks.
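The webhook side is plain HTTP. A minimal sketch that turns an alert row into a Slack incoming-webhook payload and posts it; the field names follow the stream above, and the webhook URL is whatever your Slack app issues:

```javascript
// Format a high-latency alert row as a Slack incoming-webhook payload.
function formatSlackAlert(alert) {
  return {
    text: `:warning: High latency on ${alert.path}: ` +
      `${alert.avg_latency.toFixed(0)} ms avg over ` +
      `${alert.request_count} requests ` +
      `(window ${new Date(alert.window_start).toISOString()})`,
  };
}

// Post the payload; fetch is built into Node 18+.
async function sendAlert(webhookUrl, alert) {
  await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(formatSlackAlert(alert)),
  });
}
```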
Performance Considerations
For high-throughput analytics:
- Partition wisely: Use high-cardinality keys for even distribution
- Window size: Larger windows reduce output volume but increase latency
- Materialized views: Cache expensive aggregations
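To see why key choice matters: events are routed to partitions by hashing the key, so a high-cardinality key like user_id spreads load across all partitions, while a low-cardinality one like HTTP method funnels everything into a few. A toy sketch of key-based routing (the hash is illustrative, not StreamHouse's actual partitioner):

```javascript
// Simple FNV-1a string hash, used here only to illustrate key-based routing.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply mod 2^32, keep unsigned
  }
  return h;
}

// An event's partition is its key hashed modulo the partition count.
const partitionFor = (key, partitions) => fnv1a(key) % partitions;

// Count how evenly a set of keys lands across partitions.
function distribution(keys, partitions) {
  const counts = new Array(partitions).fill(0);
  for (const k of keys) counts[partitionFor(k, partitions)]++;
  return counts;
}
```

With many distinct user IDs, the counts come out roughly even across the 4 partitions created in Step 1; keying by a handful of values would leave most partitions idle.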
Conclusion
StreamHouse's SQL processing eliminates the need for separate stream-processing infrastructure. Build real-time analytics pipelines with familiar SQL syntax and minimal operational overhead.