Deep Dive · February 19, 2024 · 8 min read

Taming S3 at Scale: Circuit Breakers and Intelligent Rate Limiting

At high throughput, S3 will throttle you. Here's how StreamHouse uses token buckets, circuit breakers, and adaptive backoff to handle 10,000+ S3 operations per second without dropping events.

The S3 Throttling Problem

S3 is not infinitely fast. AWS documents specific request rate limits:

  • PUT/POST/DELETE: 3,500 requests/sec per prefix
  • GET/HEAD: 5,500 requests/sec per prefix

When you exceed these limits, S3 returns HTTP 503 (Service Unavailable) or 429 (Too Many Requests). In a high-throughput streaming platform, hitting these limits is not a theoretical concern — it's a Tuesday.

A single StreamHouse agent flushing 64MB segments every 10 seconds generates modest S3 traffic. But 20 agents across hundreds of partitions? That's thousands of PUT and GET operations per second.

Our Three-Layer Defense

StreamHouse implements three complementary mechanisms to handle S3 throttling:

Layer 1: Token Bucket Rate Limiter

Before any S3 operation, the agent must acquire a token from a rate limiter. Tokens replenish at a configured rate, spreading operations evenly over time.

Default limits:
  PUT operations:    3,000/sec (below S3's 3,500 limit)
  GET operations:    5,000/sec (below S3's 5,500 limit)
  DELETE operations: 3,000/sec

We set defaults below S3's documented limits to leave headroom for burst absorption. The token bucket allows short bursts (up to 2x the rate for 1 second) while maintaining the average rate.
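The mechanics can be shown with a minimal token-bucket sketch in Python. This is illustrative, not StreamHouse's actual implementation; the class name and `burst_multiplier` parameter are assumptions, but the behavior matches the description above: tokens replenish continuously at the configured rate, and the bucket capacity allows bursts up to 2x the rate.

```python
import time

class TokenBucket:
    """Illustrative token-bucket rate limiter (not StreamHouse's actual code)."""

    def __init__(self, rate_per_sec: float, burst_multiplier: float = 2.0):
        self.rate = rate_per_sec
        # Capacity of 2x the rate lets short bursts through while
        # the long-run average stays at rate_per_sec.
        self.capacity = rate_per_sec * burst_multiplier
        self.tokens = self.capacity
        self.last_refill = time.monotonic()

    def try_acquire(self, tokens: float = 1) -> bool:
        """Take tokens if available; returns False when the caller should wait."""
        now = time.monotonic()
        # Replenish at the configured rate since the last check, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False
```

An agent would construct one bucket per operation type (PUT at 3,000/sec, GET at 5,000/sec) and call `try_acquire()` before each S3 request, queueing or delaying the request when it returns `False`.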

Layer 2: Circuit Breaker

If S3 starts returning throttle responses despite rate limiting, the circuit breaker activates:

States:
  CLOSED  → Normal operation. All requests pass through.
  OPEN    → S3 is throttling. All requests are queued/rejected.
  HALF_OPEN → Testing recovery. Limited requests allowed.

Transitions:
CLOSED → OPEN: 5 throttle responses in 10 seconds
OPEN → HALF_OPEN: After 30 second cooldown
HALF_OPEN → CLOSED: 3 consecutive successes
HALF_OPEN → OPEN: Any throttle response

When the circuit breaker opens, segment flushes are paused and records accumulate in the agent's memory buffer. This is safe because the WAL ensures durability. When S3 recovers, the circuit breaker closes and buffered segments flush.

Layer 3: Adaptive Backoff

Individual retries use exponential backoff with jitter:

retry_delay = min(base_delay * 2^(attempt - 1) + random_jitter, max_delay)

Attempt 1: 100ms + jitter
Attempt 2: 200ms + jitter
Attempt 3: 400ms + jitter
Attempt 4: 800ms + jitter
...
Max delay: 30 seconds
Max attempts: 10

The jitter prevents thundering herd — without it, all agents would retry simultaneously after a throttle event, causing another wave of throttling.
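In code, the formula above is a few lines. This is a hedged sketch (the jitter distribution is an assumption; StreamHouse's exact jitter strategy isn't specified here), using the defaults from the table: 100ms base, 30-second cap.

```python
import random

def retry_delay(attempt: int, base_delay_ms: int = 100,
                max_delay_ms: int = 30_000) -> float:
    """Exponential backoff with jitter; attempt is 1-based (attempt 1 -> ~100ms)."""
    exp = base_delay_ms * (2 ** (attempt - 1))   # 100, 200, 400, 800, ...
    jitter = random.uniform(0, exp)              # desynchronizes the agents' retries
    return min(exp + jitter, max_delay_ms)
```

Because each agent draws its own jitter, retries after a shared throttle event spread out over the backoff window instead of arriving in a synchronized wave.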

S3 Prefix Optimization

S3 rate limits are per-prefix. StreamHouse's path layout naturally distributes load:

s3://bucket/topics/user-events/partitions/0/segments/...
s3://bucket/topics/user-events/partitions/1/segments/...
s3://bucket/topics/order-events/partitions/0/segments/...

Each partition is a separate prefix, so a topic with 6 partitions gets 6x the rate limit. This is why increasing partition count can improve throughput — it's not just about consumer parallelism.
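The arithmetic is worth making concrete. The helpers below are hypothetical (the exact path segments are taken from the example layout above; the function names are mine), but they show why partition count multiplies the effective limit.

```python
# AWS's documented per-prefix PUT/POST/DELETE limit.
S3_PUT_LIMIT_PER_PREFIX = 3_500

def segment_prefix(bucket: str, topic: str, partition: int) -> str:
    """Build the per-partition prefix from the layout shown above."""
    return f"s3://{bucket}/topics/{topic}/partitions/{partition}/segments/"

def aggregate_put_limit(partitions: int) -> int:
    """Each partition is its own prefix, so per-prefix limits add up."""
    return partitions * S3_PUT_LIMIT_PER_PREFIX
```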

Monitoring Throttle Health

StreamHouse exposes detailed metrics for S3 operations:

streamhouse_s3_requests_total{operation="put"}     # Total S3 PUT requests
streamhouse_s3_throttled_total{operation="put"}     # Throttled responses
streamhouse_s3_circuit_breaker_state                # 0=closed, 1=open, 2=half_open
streamhouse_s3_rate_limiter_queued                  # Requests waiting for tokens
streamhouse_s3_retry_total                          # Total retries

Alert if the throttle rate exceeds 1%:

- alert: S3ThrottlingHigh
  expr: rate(streamhouse_s3_throttled_total[5m]) / rate(streamhouse_s3_requests_total[5m]) > 0.01
  for: 5m
  annotations:
    summary: "S3 throttle rate above 1%"
    runbook: "Consider increasing partition count or reducing flush frequency"

Configuration

All throttle parameters are tunable:

[storage.throttle]
put_rate_limit = 3000
get_rate_limit = 5000
delete_rate_limit = 3000
circuit_breaker_threshold = 5
circuit_breaker_cooldown_secs = 30
max_retry_attempts = 10
base_retry_delay_ms = 100

For workloads with dedicated S3 prefixes (where you won't share rate limits with other applications), you can increase these limits.

Real-World Impact

In our load tests with 20 agents and 200 partitions:

  • Without throttle protection: 12% of segment uploads failed during peak hours, requiring manual intervention
  • With throttle protection: 0% upload failures, automatic recovery from all S3 throttle events, maximum 30-second delay during circuit breaker open events

The buffering provided by the WAL and in-memory segment buffers means S3 throttling never results in data loss — only temporary increased latency.

S3 is an incredible storage backend. But at scale, you need to respect its limits. StreamHouse does this automatically so you don't have to.

Tags: s3, reliability, circuit-breaker, performance, operations
