Ingestion Flow
A webhook ingestion service receives HTTP POST requests from third-party providers (Stripe, GitHub, Twilio) and delivers the events to internal consumers. The fundamental contract: respond to the provider quickly, process the payload asynchronously.
Third-party → POST /webhooks/{provider}
→ verify HMAC signature (reject if invalid)
→ acknowledge with HTTP 200 (within 5 seconds)
→ enqueue raw payload to internal message queue
→ return to caller
Internal consumers process from queue asynchronously
If the service takes longer than the provider's timeout (typically 5–30 seconds) to respond, the provider considers the delivery failed and retries. Fast acknowledgment is the primary reliability requirement on the ingestion path.
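The fast-ack path above can be sketched as a framework-agnostic handler. In this sketch, `verify` and `enqueue` are injected helpers (illustrative names, not any particular framework's API); keeping the hot path to verify, enqueue, acknowledge is what makes it easy to respond inside provider timeouts:

```javascript
// Sketch of the ingestion hot path, assuming injected `verify` and `enqueue`
// helpers (both hypothetical). No parsing or business logic happens here.
function makeWebhookHandler({ verify, enqueue }) {
  return async (provider, rawBody, headers) => {
    // 1. Verify the HMAC signature before anything else.
    if (!verify(rawBody, headers['x-signature'], provider)) {
      return { status: 401, body: 'invalid signature' };
    }
    // 2. Enqueue the raw payload for asynchronous consumers.
    await enqueue({ provider, rawBody, receivedAt: Date.now() });
    // 3. Acknowledge immediately; consumers do the slow work later.
    return { status: 200, body: 'ok' };
  };
}
```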
HMAC Signature Verification
Providers sign their webhook payloads with a shared secret using HMAC-SHA256. Verify before processing:
const crypto = require('crypto');

function verifySignature(rawBody, signatureHeader, secret) {
  const expected = crypto
    .createHmac('sha256', secret)
    .update(rawBody) // raw bytes, not parsed JSON
    .digest('hex');
  const received = signatureHeader.replace('sha256=', '');
  // Lengths must match, or timingSafeEqual throws
  if (received.length !== expected.length) return false;
  // Constant-time comparison prevents timing attacks
  return crypto.timingSafeEqual(
    Buffer.from(expected, 'hex'),
    Buffer.from(received, 'hex')
  );
}
Critical details: sign the raw request body bytes, not the parsed JSON (JSON serialization is not canonical — key order may differ). Use constant-time comparison — a naive string equality leaks timing information that could allow signature forgery. Reject with 401 Unauthorized on signature mismatch before touching the payload.
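The "raw bytes" caveat is easy to demonstrate: parsing and re-serializing a payload need not reproduce the original bytes, so an HMAC computed over the re-serialized form can fail verification on a perfectly valid delivery. A small illustration:

```javascript
// Re-serializing parsed JSON does not reproduce the original bytes:
// whitespace (and formatting in general) is lost, so an HMAC over the
// re-serialized string would not match the provider's signature.
const raw = '{"id": "evt_1",  "amount": 100}'; // note the whitespace
const reserialized = JSON.stringify(JSON.parse(raw));
// raw !== reserialized, so always verify against the raw request bytes
```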
Deduplication
Providers retry failed deliveries. If the service acknowledged the first delivery but the provider's network timed out, the provider resends the same event. Consumers must handle duplicate deliveries. Two deduplication strategies:
- Event ID deduplication: Providers include a unique event ID in the payload or headers (X-Webhook-Event-Id). Store event IDs in Redis with a 24-hour TTL: SET webhook:{event_id} 1 EX 86400 NX. If the SET returns null (the key already exists), the event was already processed: acknowledge and discard.
- Content hash deduplication: Hash the raw payload body and use the hash as the deduplication key. Less reliable (providers may change timestamps between retries, producing a different hash for the same logical event) but works when providers do not include stable event IDs.
Acknowledge the duplicate immediately with 200 OK — do not return an error, which would cause the provider to retry again.
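The SET ... NX EX pattern above can be wrapped in a small helper. The sketch assumes a node-redis v4-style client, whose `set(key, value, { NX, EX })` returns null when NX prevents the write; the in-memory stand-in below is for illustration only:

```javascript
// Dedup check using SET ... NX EX: returns true when this event ID was
// already seen. Assumes a node-redis-v4-style client.
async function isDuplicate(redis, eventId) {
  const result = await redis.set(`webhook:${eventId}`, '1', {
    NX: true, // only set if the key does not already exist
    EX: 86400, // 24-hour TTL covers the provider's retry window
  });
  return result === null; // null → key existed → duplicate delivery
}

// In-memory stand-in for a Redis client, for illustration only.
function memoryRedis() {
  const store = new Map();
  return {
    set: async (key, value, opts) => {
      if (opts && opts.NX && store.has(key)) return null;
      store.set(key, value);
      return 'OK';
    },
  };
}
```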
Event Routing and Fan-Out
Route events by event_type to appropriate internal queues:
switch (event.type) {
  case 'payment.completed':
    await queue.publish('payments', event);
    break;
  case 'user.created':
    await queue.publish('user-onboarding', event);
    break;
  case 'subscription.cancelled':
    await queue.publish('subscription-lifecycle', event);
    break;
  default:
    // Unknown event types: route to a catch-all rather than drop silently
    await queue.publish('unrouted', event);
}
Fan-out: a single webhook event may need to trigger multiple internal handlers. For example, payment.completed might need to update the order service, trigger fulfillment, and send a receipt email. Publish to a single internal topic and let multiple consumer groups each subscribe independently — this decouples handler addition from the ingestion service.
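The single-topic fan-out can be illustrated with a minimal in-memory sketch. A real deployment would use something like Kafka consumer groups or SNS-to-SQS fan-out; the `Topic` class here is purely illustrative:

```javascript
// Every subscribed group receives every published event, so new handlers
// can be added without touching the ingestion service. Illustration only.
class Topic {
  constructor() {
    this.groups = new Map(); // group name → handler
  }
  subscribe(groupName, handler) {
    this.groups.set(groupName, handler);
  }
  async publish(event) {
    for (const handler of this.groups.values()) {
      await handler(event); // real brokers deliver to groups independently
    }
  }
}
```

In this sketch a slow handler delays the others; real brokers isolate consumer groups from one another, which is part of why the pattern scales.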
Raw Payload Storage
Store every received webhook payload before enqueuing it, including duplicate deliveries from provider retries:
CREATE TABLE webhook_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  provider TEXT NOT NULL,
  event_type TEXT,
  event_id TEXT, -- provider's event ID
  raw_headers JSONB,
  raw_payload JSONB,
  received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  processing_status TEXT DEFAULT 'pending', -- pending, processed, failed
  processed_at TIMESTAMPTZ
);
This storage enables replay: if internal processing fails and the queue event is lost, replay directly from the raw_payload table without requiring the provider to resend. It also enables debugging — inspect exactly what the provider sent versus what the consumer processed.
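Replay over the stored rows can be sketched as below. Here `rows` stands in for the result of a query like SELECT * FROM webhook_events WHERE processing_status = 'failed', and `enqueue` is the same internal-queue publisher used on the ingestion path (both hypothetical in this sketch):

```javascript
// Re-enqueue stored failed events; consumers will re-mark them as they run.
async function replayFailed(rows, enqueue) {
  let replayed = 0;
  for (const row of rows) {
    if (row.processing_status !== 'failed') continue;
    await enqueue({ provider: row.provider, rawBody: row.raw_payload });
    row.processing_status = 'pending'; // reset so consumers pick it up
    replayed++;
  }
  return replayed;
}
```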
Provider Registration and Secret Management
Each provider is registered with its endpoint configuration:
- Provider name and identifier (stripe, github).
- Signature header name (Stripe-Signature, X-Hub-Signature-256).
- Signature format (some providers include a timestamp in the signed payload to prevent replay attacks; handle provider-specific formats).
- Shared secret stored in Vault or AWS Secrets Manager — never in application config or environment variables in plaintext.
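One way to hold this registration is a registry keyed by the {provider} path segment. The header names below follow the real Stripe and GitHub conventions; the secretRef values are illustrative Vault paths, resolved at runtime rather than embedded in config:

```javascript
// Provider registry sketch. secretRef values are hypothetical Vault paths;
// the secret itself is fetched at runtime, never stored here.
const providers = {
  stripe: {
    signatureHeader: 'stripe-signature',
    secretRef: 'secret/webhooks/stripe',
  },
  github: {
    signatureHeader: 'x-hub-signature-256',
    secretRef: 'secret/webhooks/github',
  },
};

function lookupProvider(name) {
  const config = providers[name];
  if (!config) {
    throw new Error(`unknown webhook provider: ${name}`);
  }
  return config;
}
```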
Monitoring
Key metrics for the ingestion service:
- Receipt rate by provider: A spike in Stripe webhooks may indicate unusual payment activity.
- Signature failure rate: A rise in 401s may indicate a secret rotation mismatch or an attack.
- Processing lag: Time from webhook receipt to internal consumer processing completion.
- Failure rate by event_type: Identifies which event types have broken consumers.
- Duplicate rate: High duplicate rate from a provider may indicate the service is not acknowledging fast enough.
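A minimal tagged-counter sketch for these metrics (a real service would emit to StatsD or Prometheus; this in-memory version only illustrates the per-provider and per-event_type breakdowns):

```javascript
// In-memory tagged counters, illustration only.
class Metrics {
  constructor() {
    this.counts = new Map();
  }
  key(name, tags) {
    const tagStr = Object.entries(tags)
      .sort()
      .map(([k, v]) => `${k}=${v}`)
      .join(',');
    return `${name}|${tagStr}`;
  }
  increment(name, tags = {}) {
    const k = this.key(name, tags);
    this.counts.set(k, (this.counts.get(k) || 0) + 1);
  }
  get(name, tags = {}) {
    return this.counts.get(this.key(name, tags)) || 0;
  }
}
```

Usage would look like metrics.increment('webhook.signature_failure', { provider: 'stripe' }) at the 401 branch of the ingestion handler.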