What Is Webhook Fan-Out?
Webhook fan-out is the process of delivering a single event to multiple registered external endpoints (webhooks). When a payment platform processes a charge, it may need to notify a fraud system, an accounting service, a CRM, and a customer dashboard — all registered as webhook endpoints for the “charge.succeeded” event type. The fan-out system receives the event once and creates independent, isolated delivery attempts for each matching endpoint.
The design challenges of webhook fan-out differ from general message queues: endpoints are external HTTP servers with unpredictable latency and reliability, the number of registered endpoints per event type varies widely (from 1 to thousands), and slow or failing endpoints must never delay delivery to healthy endpoints.
Event Routing
Each WebhookEndpoint registers a topic_filter — a pattern matching the event types it wants to receive. Filters can be exact matches (“charge.succeeded”), prefix matches (“charge.*”), or wildcard patterns (“*.succeeded”). When an event arrives, the routing layer queries WebhookEndpoint for all endpoints whose topic_filter matches the event type. This lookup should be cached (the set of active endpoints rarely changes) and filtered in memory after loading from the database.
For high-throughput platforms with thousands of registered endpoints per event type, the routing query should use a denormalized or pre-computed mapping: at endpoint registration time, insert rows into a TopicEndpointMap table keyed by (event_type, endpoint_id). This makes routing a simple indexed lookup rather than a pattern-matching scan.
Fan-Out Worker and Parallel Delivery
The fan-out worker receives events from the event bus (Kafka, SQS, or internal queue) and, for each matching endpoint, creates a WebhookDelivery record and enqueues a delivery task. Delivery tasks are processed by a pool of delivery workers — HTTP client threads that POST the event payload to the endpoint URL and record the result.
Fan-out and delivery are intentionally separated:
- Fan-out is fast (database writes only, no HTTP calls) and can create thousands of delivery tasks per event in milliseconds.
- Delivery is slow and unpredictable (external HTTP calls, variable latency), but isolated per endpoint.
This separation ensures that fan-out throughput is bounded by database write capacity, not by the slowest endpoint's HTTP response time.
Queue Isolation Per Endpoint
Without queue isolation, a slow endpoint can back up the shared delivery queue and delay delivery to all other endpoints. The solution: each endpoint has its own virtual delivery queue — in practice, a delivery worker pool partitioned by endpoint_id, or a priority queue where tasks are dequeued in a round-robin or least-recently-served order per endpoint.
In Celery or similar task queues, this is implemented by routing delivery tasks to per-endpoint queues (endpoint_{id}) or by using a broker that supports per-key rate limiting and fair scheduling. In Kafka, use a topic partitioned by endpoint_id — each partition is consumed by a dedicated delivery worker, and a slow partition does not block other partitions.
Per-Endpoint Rate Limiting and Concurrency Control
Sending too many concurrent requests to a slow endpoint exhausts its connection pool and causes cascade failures. Per-endpoint rate limiting caps the number of in-flight delivery attempts for a given endpoint. Implementation options:
- Semaphore-based: each delivery worker checks a Redis counter for the endpoint before starting a delivery. If the counter equals max_concurrency, the task is requeued with a short delay.
- Queue depth cap: if the endpoint's delivery queue depth exceeds a threshold, reject new fan-out tasks for that endpoint (returning a backpressure signal to the fan-out worker).
max_concurrency is stored in WebhookEndpoint.max_concurrency (e.g., 5 concurrent deliveries per endpoint) and respected by all delivery workers.
Backpressure
When an endpoint queue depth exceeds a configured threshold (e.g., 10,000 pending deliveries), the fan-out worker stops enqueuing new tasks for that endpoint. New events for that endpoint are either dropped (with a log entry), written to a backpressure buffer (a secondary queue with lower priority), or throttled at the fan-out layer. The endpoint owner is notified via an alert that their endpoint is falling behind. Backpressure prevents the delivery queue from growing unboundedly, which would cause memory exhaustion and delay recovery when the endpoint becomes healthy again.
HMAC Request Signing
Webhook receivers need to verify that requests come from the legitimate sender, not a third party. The standard approach: HMAC-SHA256 of the request body using a shared secret, sent as the X-Webhook-Signature header. The endpoint registers a secret at creation time (stored as a hash in the database, or in a secret manager). The delivery worker computes HMAC-SHA256(secret, payload_bytes) and includes it in the delivery request. The receiver recomputes the HMAC with its stored secret and compares using a constant-time comparison function (to prevent timing attacks).
Delivery Observability
Per-endpoint operational metrics are critical for diagnosing delivery issues before they escalate. Key metrics tracked per endpoint per time window:
- success_rate — fraction of deliveries that received a 2xx HTTP response
- avg_latency_ms — mean HTTP response time for successful deliveries
- queue_depth — current number of pending delivery tasks
- error_rate — fraction of deliveries that failed after all retries
- retry_rate — fraction of deliveries that required at least one retry
These metrics are aggregated into hourly DeliveryMetric rows and surfaced in a per-endpoint observability dashboard. Anomaly detection on success_rate and queue_depth can trigger automatic endpoint suspension (stop delivering to an endpoint that has been failing for 24 hours) and alert the endpoint owner.
SQL Data Model
-- Registered webhook endpoints
CREATE TABLE WebhookEndpoint (
id BIGSERIAL PRIMARY KEY,
url TEXT NOT NULL,
topic_filter VARCHAR(255) NOT NULL, -- event type pattern (e.g., "charge.*")
secret_hash TEXT NOT NULL, -- HMAC secret stored as hash or in secret manager
max_concurrency INT NOT NULL DEFAULT 5,
status VARCHAR(32) NOT NULL DEFAULT 'active', -- active, suspended, disabled
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Individual delivery attempts
CREATE TABLE WebhookDelivery (
id BIGSERIAL PRIMARY KEY,
endpoint_id BIGINT NOT NULL REFERENCES WebhookEndpoint(id),
event_id VARCHAR(255) NOT NULL,
payload JSONB NOT NULL,
status VARCHAR(32) NOT NULL DEFAULT 'pending', -- pending, delivered, failed
attempts INT NOT NULL DEFAULT 0,
next_retry_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
delivered_at TIMESTAMPTZ
);
-- Hourly delivery metrics per endpoint
CREATE TABLE DeliveryMetric (
endpoint_id BIGINT NOT NULL REFERENCES WebhookEndpoint(id),
hour TIMESTAMPTZ NOT NULL,
success_count BIGINT NOT NULL DEFAULT 0,
failure_count BIGINT NOT NULL DEFAULT 0,
avg_latency_ms NUMERIC(10,2),
PRIMARY KEY (endpoint_id, hour)
);
CREATE INDEX idx_delivery_endpoint_status ON WebhookDelivery(endpoint_id, status, next_retry_at);
CREATE INDEX idx_delivery_event ON WebhookDelivery(event_id);
CREATE INDEX idx_metric_endpoint_hour ON DeliveryMetric(endpoint_id, hour DESC);
Python Implementation Sketch
import hashlib, hmac, time, json
import urllib.request, urllib.error
from dataclasses import dataclass
from typing import Optional
@dataclass
class DeliveryResult:
success: bool
status_code: Optional[int]
latency_ms: float
error: Optional[str] = None
def fan_out_event(event_type: str, payload: dict, event_id: str,
endpoints_db: list[dict], delivery_queue: list) -> int:
"""Create delivery tasks for all matching endpoints. Returns task count."""
created = 0
for endpoint in endpoints_db:
if not _matches_topic(event_type, endpoint["topic_filter"]):
continue
if endpoint.get("status") != "active":
continue
delivery_task = {
"endpoint_id": endpoint["id"],
"event_id": event_id,
"payload": payload,
"url": endpoint["url"],
"secret": endpoint["secret_hash"],
"attempts": 0,
}
delivery_queue.append(delivery_task)
created += 1
return created
def _matches_topic(event_type: str, pattern: str) -> bool:
"""Simple glob matching: exact, prefix with *, or full wildcard."""
if pattern == "*":
return True
if pattern.endswith(".*"):
return event_type.startswith(pattern[:-2])
return event_type == pattern
def deliver_to_endpoint(delivery: dict) -> DeliveryResult:
"""HTTP POST payload to endpoint URL with HMAC signature."""
payload_bytes = json.dumps(delivery["payload"]).encode()
signature = compute_hmac(delivery["secret"], payload_bytes)
headers = {
"Content-Type": "application/json",
"X-Webhook-Signature": f"sha256={signature}",
"X-Event-ID": delivery["event_id"],
"User-Agent": "WebhookDelivery/1.0",
}
start = time.monotonic()
try:
req = urllib.request.Request(
delivery["url"], data=payload_bytes, headers=headers, method="POST"
)
with urllib.request.urlopen(req, timeout=10) as resp:
latency_ms = (time.monotonic() - start) * 1000
return DeliveryResult(
success=200 <= resp.status str:
"""Compute HMAC-SHA256 of payload with secret."""
return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
def rate_limit_check(endpoint_id: int, in_flight_counts: dict,
max_concurrency: int) -> bool:
"""Return True if delivery is allowed under rate limit."""
current = in_flight_counts.get(endpoint_id, 0)
return current dict:
"""Compute success rate, avg latency, failure count for an endpoint."""
cutoff = time.time() - window_hours * 3600
recent = [d for d in deliveries
if d["endpoint_id"] == endpoint_id and d.get("created_at", 0) >= cutoff]
if not recent:
return {"success_rate": None, "avg_latency_ms": None,
"success_count": 0, "failure_count": 0}
successes = [d for d in recent if d.get("status") == "delivered"]
failures = [d for d in recent if d.get("status") == "failed"]
latencies = [d["latency_ms"] for d in successes if "latency_ms" in d]
return {
"success_rate": len(successes) / len(recent),
"avg_latency_ms": sum(latencies) / len(latencies) if latencies else None,
"success_count": len(successes),
"failure_count": len(failures),
}
Frequently Asked Questions
Queue Isolation Design
Without isolation, a failing endpoint backs up the shared queue and delays delivery to all healthy endpoints. Per-endpoint virtual queues (or Kafka partitioning by endpoint_id) ensure each endpoint's failure is contained — its tasks back up only its own queue while other endpoints continue unaffected.
Per-Endpoint Rate Limiting
Before each delivery, atomically increment a Redis counter for the endpoint. If counter exceeds max_concurrency, requeue the task with a short delay. Decrement the counter on delivery completion. This prevents overwhelming slow receivers with concurrent requests. max_concurrency is configured per endpoint and stored in the WebhookEndpoint table.
HMAC Verification
Receivers compute HMAC-SHA256 of the raw request body bytes using the shared secret, then compare with the X-Webhook-Signature header using constant-time comparison (hmac.compare_digest). Constant-time comparison prevents timing attacks. Reject requests with timestamps more than 5 minutes old to prevent replay attacks.
Observability Metrics
Most actionable metrics: success_rate (alert below 95%), queue_depth (steady growth means endpoint cannot keep up), avg_latency_ms (slow receivers reduce worker throughput), retry_rate (high values signal intermittent failures becoming permanent). Also track fan-out lag and delivery lag at the system level.
See also: Stripe Interview Guide 2026: Process, Bug Bash Round, and Payment Systems
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering