JSON Webhook Implementation: HMAC-SHA256, Idempotency & Dead Letter Queues
A JSON webhook is an HTTP POST request sent by a service to your endpoint when an event occurs. The payload is a JSON body with an event type discriminator, a timestamp, and event-specific data. HMAC-SHA256 signature verification prevents forged webhooks: compute hmac-sha256(secret, raw-body) and compare it to the X-Signature-256 header using crypto.timingSafeEqual() to prevent timing attacks. Always compute the HMAC over the raw request bytes, before any JSON parsing; a body that has been parsed and re-serialized may not match the bytes the sender signed.
This guide covers webhook payload structure, HMAC-SHA256 signature verification, idempotency keys to handle duplicate deliveries, retry strategies with exponential backoff, dead letter queues, and TypeScript discriminated unions for event routing. The examples use TypeScript on Node.js, with the built-in crypto module for signing.
JSON Webhook Payload Structure and Event Types
A common JSON webhook payload follows a 4-field envelope: id (a unique identifier for the event, used as the idempotency key), type (string discriminant in noun.verb format like order.created), timestamp (ISO 8601 UTC string for event ordering and replay-attack prevention), and data (the event-specific object). Variations of this pattern are used by Stripe, GitHub, Shopify, and Twilio, though field names and placement differ by provider (Stripe, for example, sends a Unix created field instead of an ISO 8601 timestamp). The type field is the primary discriminant clients use to route events, so always check it before accessing data. GitHub is a notable exception: it places the event type in the X-GitHub-Event HTTP header rather than the JSON body, requiring header inspection for routing.
// ── Standard 4-field webhook JSON envelope ────────────────────────
{
"id": "evt_01HXYZ9ABC123DEF456GHI789", // prefixed, time-sortable ID (UUIDv7 or ULID style)
"type": "order.created", // noun.verb discriminant
"timestamp": "2026-01-15T10:00:00.000Z", // ISO 8601 UTC
"data": {
"object": "order",
"id": "ord_abc123",
"amount": 4999,
"currency": "usd",
"customer_id": "cus_xyz789",
"line_items": [
{ "sku": "WIDGET-001", "qty": 2, "price": 1999 },
{ "sku": "WIDGET-002", "qty": 1, "price": 1001 }
],
"created_at": "2026-01-15T09:59:58.000Z"
}
}
// ── Stripe event — wraps data under data.object ───────────────────
{
"id": "evt_1NxyzABC123",
"object": "event",
"type": "payment_intent.succeeded",
"created": 1716199200, // Unix timestamp (Stripe legacy)
"livemode": false,
"api_version": "2024-06-20",
"data": {
"object": { // resource nested under data.object
"id": "pi_3NxyzABC123",
"object": "payment_intent",
"amount": 4999,
"currency": "usd",
"status": "succeeded"
}
}
}
// ── GitHub webhook — type in X-GitHub-Event header, not JSON body ──
// Headers:
// X-GitHub-Event: push
// X-Hub-Signature-256: sha256=<hex>
// X-GitHub-Delivery: 72d3162e-cc78-11e3-81ab-4c9367dc0958
{
"ref": "refs/heads/main",
"repository": { "id": 123456, "full_name": "org/repo" },
"pusher": { "name": "octocat" },
"commits": [{ "id": "abc123", "message": "Fix bug" }]
}
// ── Custom webhook with schema versioning ─────────────────────────
{
"id": "evt_01HXYZ9ABC123",
"type": "user.subscription_upgraded",
"timestamp": "2026-01-15T10:00:00.000Z",
"schema_version": "2", // bump only on breaking changes
"data": {
"user_id": "usr_abc123",
"old_plan": "starter",
"new_plan": "pro",
"effective_at": "2026-01-15T10:00:00.000Z",
"previous_attributes": { "plan": "starter" } // diff for update events
}
}
// ── Event type taxonomy best practices ───────────────────────────
// Good: order.created order.updated order.cancelled
// invoice.paid invoice.payment_failed invoice.voided
// user.created user.deleted user.plan_changed
// Avoid: orderCreated (camelCase — inconsistent with JSON conventions)
// created_order (verb first — hard to filter by noun)
// order_event (too generic)
Always include the full resource object in data, so consumers do not need a follow-up GET request to understand the event. For update events, add a previous_attributes diff map showing what changed. Use UUIDv7 for event IDs: they are lexicographically sortable by creation time, enabling cursor-based pagination of event logs without an additional timestamp column. See our JSON API design guide for more envelope design patterns.
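The envelope described in this section can be checked structurally before an event is routed. The following is a minimal sketch, assuming the 4-field id/type/timestamp/data shape; isWebhookEnvelope and EVENT_TYPE_PATTERN are hypothetical names, and the regular expression is one possible reading of the noun.verb convention rather than a rule any provider enforces.

```typescript
// Hypothetical helper: structural check for the id/type/timestamp/data envelope.
interface WebhookEnvelope {
  id: string
  type: string
  timestamp: string
  data: Record<string, unknown>
}

// Assumed convention: lowercase snake_case segments joined by dots,
// e.g. "order.created" or "customer.subscription.deleted".
const EVENT_TYPE_PATTERN = /^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$/

function isWebhookEnvelope(value: unknown): value is WebhookEnvelope {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.id === 'string' && v.id.length > 0 &&
    typeof v.type === 'string' && EVENT_TYPE_PATTERN.test(v.type) &&
    // Date.parse returns NaN for strings it cannot interpret as a date
    typeof v.timestamp === 'string' && !Number.isNaN(Date.parse(v.timestamp)) &&
    typeof v.data === 'object' && v.data !== null && !Array.isArray(v.data)
  )
}
```

A check like this only confirms the envelope shape; the event-specific contents of data still need their own validation.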
HMAC-SHA256 Signature Verification
HMAC-SHA256 signature verification authenticates that a webhook was sent by the expected provider and that the JSON body was not tampered with in transit. The provider computes HMAC-SHA256(secret, rawBody) and places the hex digest in a request header; your endpoint independently computes the same hash and compares using crypto.timingSafeEqual(). The critical rule: hash the raw body bytes before any JSON parsing — JSON.stringify(JSON.parse(body)) can reorder keys, strip whitespace, or change number formatting, producing a different byte sequence and a completely different digest. See our JSON security guide for broader attack surface coverage.
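To make the raw-bytes rule concrete, here is a small sketch that signs the same logical payload twice: once as the bytes that arrived, and once after a parse and re-serialize round trip. The secret and the sample body are made-up values for illustration, and signHex is a hypothetical helper name.

```typescript
import { createHmac } from 'node:crypto'

// Illustrative values only; a real secret comes from the provider dashboard.
const demoSecret = 'whsec_example_only'
const rawBody = Buffer.from('{ "amount": 10.0, "id": "evt_1" }', 'utf8')

// Round trip through JSON.parse / JSON.stringify: whitespace is dropped
// and 10.0 is rewritten as 10, so the byte sequence changes.
const reserialized = Buffer.from(
  JSON.stringify(JSON.parse(rawBody.toString('utf8'))),
  'utf8'
)

const signHex = (body: Buffer): string =>
  createHmac('sha256', demoSecret).update(body).digest('hex')

const rawDigest = signHex(rawBody)
const reserializedDigest = signHex(reserialized)

// The two digests differ, so verifying against the re-serialized body would fail.
console.log(rawDigest === reserializedDigest)
```

The logical content of the two bodies is identical, yet only the digest over the original bytes matches what the sender computed.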
import crypto from 'node:crypto'
// ── Generic HMAC-SHA256 verification ──────────────────────────────
function verifyHmacSha256(
rawBody: Buffer,
secret: string,
receivedSignature: string // hex string from header
): boolean {
const computed = crypto
.createHmac('sha256', secret)
.update(rawBody)
.digest('hex')
const a = Buffer.from(computed, 'hex')
const b = Buffer.from(receivedSignature, 'hex')
// timingSafeEqual requires equal-length buffers
if (a.length !== b.length) return false
return crypto.timingSafeEqual(a, b) // constant-time — no timing oracle
}
// ── Stripe: signed payload = timestamp + "." + rawBody ────────────
// Header: Stripe-Signature: t=1716199200,v1=abc123def456...
function verifyStripeSignature(
rawBody: Buffer,
sigHeader: string,
webhookSecret: string,
toleranceSecs = 300 // reject events older than 5 minutes
): boolean {
const parts = Object.fromEntries(
sigHeader.split(',').map(p => p.split('=') as [string, string])
)
const timestamp = parts['t']
const v1Sig = parts['v1']
if (!timestamp || !v1Sig) return false
// Reject replayed events outside the tolerance window
const ts = Number.parseInt(timestamp, 10)
if (!Number.isFinite(ts)) return false // NaN would slip past the comparison below
const age = Math.floor(Date.now() / 1000) - ts
if (Math.abs(age) > toleranceSecs) return false
// Stripe signed payload is "<timestamp>.<raw body bytes>"; feeding the Buffer
// in directly avoids a lossy UTF-8 string round trip
const computed = crypto
.createHmac('sha256', webhookSecret)
.update(`${timestamp}.`)
.update(rawBody)
.digest('hex')
const a = Buffer.from(computed, 'hex')
const b = Buffer.from(v1Sig, 'hex')
if (a.length !== b.length) return false
return crypto.timingSafeEqual(a, b)
}
// ── GitHub: X-Hub-Signature-256: sha256=<hex> ─────────────────────
function verifyGitHubSignature(
rawBody: Buffer,
sigHeader: string, // "sha256=abc123..."
secret: string
): boolean {
const receivedHex = sigHeader.replace(/^sha256=/, '')
return verifyHmacSha256(rawBody, secret, receivedHex)
}
// ── Shopify: X-Shopify-Hmac-Sha256: <base64> ──────────────────────
function verifyShopifySignature(
rawBody: Buffer,
sigHeader: string, // base64-encoded (not hex)
secret: string
): boolean {
const computed = crypto
.createHmac('sha256', secret)
.update(rawBody)
.digest('base64')
const a = Buffer.from(computed, 'base64')
const b = Buffer.from(sigHeader, 'base64')
if (a.length !== b.length) return false
return crypto.timingSafeEqual(a, b)
}
// ── Next.js App Router endpoint — raw bytes before JSON.parse ─────
export async function POST(request: Request) {
const ab = await request.arrayBuffer()
const rawBody = Buffer.from(ab) // raw bytes for HMAC
const sig = request.headers.get('stripe-signature') ?? ''
// Step 1: verify signature on raw bytes
if (!verifyStripeSignature(rawBody, sig, process.env.STRIPE_WEBHOOK_SECRET!)) {
return new Response('Invalid signature', { status: 400 })
}
// Step 2: only parse JSON after verification passes
const event = JSON.parse(rawBody.toString('utf8'))
// ... process event
return new Response('ok', { status: 200 })
}
Never use === for signature comparison: standard string equality short-circuits at the first differing character, allowing an attacker to guess the correct signature one byte at a time by measuring response latency (a timing oracle attack). crypto.timingSafeEqual compares all bytes in constant time regardless of where the first mismatch occurs. It throws if the two buffers differ in length, so check lengths first and return false if they differ (differing lengths also indicate a mismatch). Timestamp validation, which rejects events older than 300 seconds, is the primary defense against replay attacks where a captured valid webhook is re-sent later.
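The timestamp tolerance check can be isolated into a small pure function, which makes the replay window easy to unit test. This is a sketch under the 300-second assumption used in this guide; isWithinTolerance is a hypothetical name, and the current time is passed in as a parameter so tests do not depend on the clock.

```typescript
// Hypothetical helper: true only when the signed timestamp is a plain
// integer (seconds since the epoch) inside the tolerance window.
function isWithinTolerance(
  timestampSecs: string,
  nowMs: number,
  toleranceSecs = 300
): boolean {
  // Reject empty or non-numeric values outright instead of letting NaN through
  if (!/^\d+$/.test(timestampSecs)) return false
  const ageSecs = Math.floor(nowMs / 1000) - Number(timestampSecs)
  // Math.abs also rejects timestamps too far in the future (clock skew or forgery)
  return Math.abs(ageSecs) <= toleranceSecs
}
```

The check only has value when the timestamp is part of the signed payload, as it is in the Stripe scheme described above; an unsigned timestamp could simply be rewritten by an attacker.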
Idempotency: Handling Duplicate Webhook Deliveries
Idempotent webhook processing ensures that receiving the same event multiple times produces the same outcome as receiving it once. Without idempotency, a payment webhook delivered twice charges the customer twice; a fulfillment webhook delivered twice ships the order twice. The implementation pattern is atomic-insert-before-process: attempt to record the event ID in a durable store before acting on it, and skip processing if the ID already exists. The event id field in the JSON envelope is the idempotency key — providers always use the same ID for retried deliveries of the same event.
// ── Option 1: PostgreSQL UNIQUE constraint ────────────────────────
-- Migration: create the idempotency table
CREATE TABLE processed_webhook_events (
event_id VARCHAR(255) PRIMARY KEY,
received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
event_type VARCHAR(100) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'processing'
CHECK (status IN ('processing', 'completed', 'failed'))
);
-- Nightly cleanup: DELETE FROM processed_webhook_events
-- WHERE received_at < NOW() - INTERVAL '4 days';
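// In your handler (Prisma example)
import { Prisma, PrismaClient } from '@prisma/client'
async function processWebhookIdempotent(
eventId: string,
eventType: string,
data: unknown,
db: PrismaClient
) {
// Atomic insert: throws a unique violation (Prisma error code P2002) if duplicate
try {
await db.processedWebhookEvent.create({
data: { eventId, eventType, status: 'processing' },
})
} catch (err: unknown) {
const isDuplicate =
err instanceof Prisma.PrismaClientKnownRequestError && err.code === 'P2002'
if (!isDuplicate) throw err
// Re-claim only if the earlier attempt failed; otherwise a provider retry
// after a failure would be skipped as a duplicate and never processed
const reclaimed = await db.processedWebhookEvent.updateMany({
where: { eventId, status: 'failed' },
data: { status: 'processing' },
})
if (reclaimed.count === 0) {
console.log(`Skipping duplicate webhook: ${eventId}`)
return // return 200: already processed or in progress
}
}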
try {
await handleWebhookData(eventType, data)
await db.processedWebhookEvent.update({
where: { eventId },
data: { status: 'completed' },
})
} catch (err) {
await db.processedWebhookEvent.update({
where: { eventId },
data: { status: 'failed' },
})
throw err // re-throw → return 5xx → provider retries
}
}
// ── Option 2: Redis SET NX (atomic, sub-millisecond) ──────────────
import { createClient } from 'redis'
const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect() // node-redis v4+ needs an explicit connect before commands
async function claimWebhookEvent(eventId: string): Promise<boolean> {
// SET webhook:event:{id} 1 NX EX 259200
// NX = only set if key does not exist (atomic)
// EX = expire after 259200 seconds (3 days — Stripe retry window)
const result = await redis.set(
`webhook:event:${eventId}`,
'1',
{ NX: true, EX: 259200 }
)
// 'OK' → key was newly set → this process owns the event
// null → key already existed → duplicate delivery
return result === 'OK'
}
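// Usage in webhook handler
const claimed = await claimWebhookEvent(event.id)
if (!claimed) {
return new Response('ok', { status: 200 }) // idempotent 200
}
try {
await processEvent(event)
} catch (err) {
// Release the claim so the provider's retry is not skipped as a duplicate
await redis.del(`webhook:event:${event.id}`)
throw err
}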
// ── Race condition: two concurrent deliveries ─────────────────────
// Both arrive within milliseconds — both pass "not seen" check before
// either inserts. Solution: UNIQUE constraint (Postgres) and SET NX (Redis)
// are both atomic at the database/cache level — only one writer wins.
// The loser gets a duplicate error and returns 200 safely.
The Redis SET NX approach is faster (sub-millisecond vs. multi-millisecond for a database round-trip) but requires Redis availability; the PostgreSQL UNIQUE constraint approach uses your existing database and survives cache restarts. For high-volume systems, prefer Redis. Never process-then-store: a crash or timeout between processing and storing the ID leaves no idempotency record, so the next retry re-processes an event whose side effects have already happened. Always store first, process second.
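The store-first, process-second ordering can be illustrated without any external service. The sketch below uses an in-memory map as a stand-in for Redis SET NX EX or a UNIQUE insert; InMemoryClaimStore and handleOnce are hypothetical names, and a real deployment would need a store shared by every instance of the handler.

```typescript
// In-memory stand-in for an atomic claim (illustration only).
class InMemoryClaimStore {
  private readonly expiries = new Map<string, number>()

  // Returns true if this caller newly claimed the ID, false if it is already held.
  claim(eventId: string, ttlMs: number, nowMs: number = Date.now()): boolean {
    const expiresAt = this.expiries.get(eventId)
    if (expiresAt !== undefined && expiresAt > nowMs) return false
    this.expiries.set(eventId, nowMs + ttlMs)
    return true
  }

  // Drops the claim so a provider retry can be processed after a failure.
  release(eventId: string): void {
    this.expiries.delete(eventId)
  }
}

type ClaimOutcome = 'processed' | 'duplicate'

// Store first, process second; release the claim if processing throws.
function handleOnce(
  store: InMemoryClaimStore,
  eventId: string,
  work: () => void,
  ttlMs: number = 3 * 24 * 60 * 60 * 1000 // 3 days, matching the TTL used above
): ClaimOutcome {
  if (!store.claim(eventId, ttlMs)) return 'duplicate'
  try {
    work()
    return 'processed'
  } catch (err) {
    store.release(eventId)
    throw err
  }
}
```

Releasing the claim on failure is a design choice: it lets the provider's next retry run the work again, at the cost of requiring the work itself to tolerate a partial earlier attempt.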
Async Processing: Queues and Background Jobs
Async processing decouples HTTP response latency from business logic execution: verify the signature, enqueue the JSON event, return HTTP 200 immediately, then process in a background worker. This pattern is non-negotiable for any processing that takes longer than the provider timeout — Stripe cuts off at 30 seconds, GitHub at 10 seconds, Shopify at 5 seconds. Any timeout causes the provider to mark the delivery failed and retry, so even a single slow database write that exceeds the limit produces duplicate deliveries. The queue also acts as a buffer against downstream service outages.
// ── BullMQ (Redis-backed) queue setup ────────────────────────────
import { Queue, Worker } from 'bullmq'
import { connection } from '@/lib/redis'
const webhookQueue = new Queue('webhooks', { connection })
// HTTP handler — enqueue and respond in < 1 second
export async function POST(request: Request) {
const ab = await request.arrayBuffer()
const rawBody = Buffer.from(ab)
const sig = request.headers.get('stripe-signature') ?? ''
if (!verifyStripeSignature(rawBody, sig, process.env.STRIPE_WEBHOOK_SECRET!)) {
return new Response('Invalid signature', { status: 400 })
}
const event = JSON.parse(rawBody.toString('utf8'))
await webhookQueue.add(event.type, event, {
jobId: event.id, // deduplication: BullMQ ignores an add() whose jobId already exists
attempts: 3,
backoff: { type: 'exponential', delay: 5000 },
})
return new Response('ok', { status: 200 }) // respond before processing
}
// ── Worker — runs in a separate process ───────────────────────────
const worker = new Worker('webhooks', async (job) => {
const event = job.data as WebhookEvent
switch (event.type) {
case 'payment_intent.succeeded':
await fulfillOrder(event.data)
break
case 'customer.subscription.deleted':
await cancelSubscription(event.data)
break
case 'invoice.payment_failed':
await notifyCustomerPaymentFailed(event.data)
break
default:
// Log unhandled events — do not throw; return success
console.log(`Unhandled webhook event type: ${event.type}`)
}
}, { connection })
// ── pg-boss: PostgreSQL-backed queue (no Redis dependency) ────────
import PgBoss from 'pg-boss'
const boss = new PgBoss(process.env.DATABASE_URL!)
await boss.start()
// Enqueue
await boss.send('webhooks', event, { singletonKey: event.id })
// Worker
boss.work('webhooks', async ([job]) => {
await processEvent(job.data as WebhookEvent)
})
// ── AWS SQS: serverless, no persistent workers ────────────────────
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs'
const sqs = new SQSClient({ region: 'us-east-1' })
await sqs.send(new SendMessageCommand({
QueueUrl: process.env.WEBHOOK_QUEUE_URL,
MessageBody: JSON.stringify(event),
MessageGroupId: event.type,
MessageDeduplicationId: event.id, // FIFO queue deduplication
}))
BullMQ's jobId option provides a second layer of deduplication at the queue level: if the same event is enqueued twice (e.g., due to a provider retry arriving before the first was acknowledged), BullMQ ignores the second add while a job with that ID still exists in the queue. This complements, but does not replace, the idempotency check in the worker, which handles duplicates that survive queue deduplication. For serverless deployments (Vercel, Netlify), use SQS FIFO queues or database-backed queues since there is no persistent process to run a worker.
Retry Strategies and Dead Letter Queues
Webhook providers implement at-least-once delivery: they retry failed deliveries until they receive HTTP 200-299 or the retry window expires. At-least-once delivery is not exactly-once — your endpoint must handle duplicates (covered by idempotency above). Understanding each provider's retry schedule lets you size your idempotency TTL correctly and plan DLQ alert thresholds. A dead letter queue captures events that exhaust all retries — without it, those events are silently discarded, causing data loss in billing, fulfillment, or audit systems.
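The retry-then-park behaviour of a dead letter queue can be sketched in a few lines. This is a simplified, synchronous illustration with hypothetical names (DeadLetter, runWithDeadLetter); a real queue such as BullMQ or SQS waits between attempts and persists the entries, both of which are omitted here.

```typescript
interface DeadLetter<T> {
  payload: T
  error: string
  attemptsMade: number
}

// Hypothetical runner: retries a handler up to maxAttempts times, then
// parks the payload in the dead letter list instead of discarding it.
function runWithDeadLetter<T>(
  payload: T,
  handler: (payload: T) => void,
  maxAttempts: number,
  deadLetters: DeadLetter<T>[]
): boolean {
  let lastError = 'unknown error'
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler(payload)
      return true // processed; nothing reaches the DLQ
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err)
    }
  }
  // All attempts exhausted: keep the payload for inspection and replay.
  deadLetters.push({ payload, error: lastError, attemptsMade: maxAttempts })
  return false
}
```

Storing the original payload alongside the last error is what makes a later replay possible once the underlying bug is fixed.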
// ── Provider retry schedules ──────────────────────────────────────
// Stripe: 5 attempts over 3 days (backoff: 5m → 30m → 2h → 5h → 10h)
// GitHub: up to 3 days (exponential backoff)
// Shopify: 19 attempts over 48h (exponential backoff)
// Twilio: 3 attempts over 24h
// SendGrid: 3 attempts, 30m intervals
// ── BullMQ dead letter queue pattern ─────────────────────────────
import { Queue, Worker, QueueEvents } from 'bullmq'
const webhookQueue = new Queue('webhooks', { connection })
const deadLetterQueue = new Queue('webhooks-dlq', { connection })
const worker = new Worker('webhooks', async (job) => {
await processEvent(job.data as WebhookEvent)
}, {
connection,
// Rate limit for this worker; retry attempts and backoff are set per job at enqueue time
limiter: { max: 10, duration: 1000 }, // 10 jobs/sec rate limit
})
// Move to DLQ after all attempts are exhausted
worker.on('failed', async (job, err) => {
if (!job) return
const maxAttempts = job.opts.attempts ?? 1
if (job.attemptsMade >= maxAttempts) {
await deadLetterQueue.add('failed-webhook', {
eventId: job.data.id,
eventType: job.data.type,
error: err.message,
stack: err.stack,
failedAt: new Date().toISOString(),
attemptsMade: job.attemptsMade,
rawPayload: job.data,
})
console.error(`Webhook moved to DLQ: ${job.data.id} (${err.message})`)
}
})
// ── DLQ alert: notify on depth spike ─────────────────────────────
// Run on a cron (every 5 minutes)
async function checkDlqDepth() {
const count = await deadLetterQueue.getJobCounts('wait', 'delayed', 'failed')
const total = count.wait + count.delayed
if (total > 10) {
await alertSlack(`DLQ depth: ${total} failed webhook events`)
}
}
// ── DLQ replay: reprocess after fixing the bug ───────────────────
async function replayDlqEvents(limit = 100) {
const jobs = await deadLetterQueue.getJobs(['wait'], 0, limit - 1) // end index is inclusive
for (const job of jobs) {
const { rawPayload } = job.data
await webhookQueue.add(rawPayload.type, rawPayload, {
// new jobId for replay; BullMQ custom job IDs must not contain ":"
jobId: `replay-${rawPayload.id}-${Date.now()}`,
})
await job.remove()
console.log(`Replayed: ${rawPayload.id}`)
}
}
// ── HTTP status semantics for providers ──────────────────────────
// 200-299 → received, stop retrying (even if not yet processed)
// 400 → bad request (signature invalid, malformed JSON); a retry of the same bytes will not succeed
// 422 → unprocessable; handle like 400
// 500-503 → transient failure, RETRY (database down, queue full)
// timeout → RETRY (treat same as 5xx)
// Note: many providers (Stripe, for example) retry on any non-2xx response, so a 4xx may still be redelivered
Configure your DLQ alerts to fire when depth exceeds a threshold that represents business risk: for a payment system, even 1 failed billing event warrants a page; for analytics events, 100 might be acceptable. Replay is safe only after fixing the underlying bug, because replaying against a broken handler just re-populates the DLQ. Use a new jobId for replayed events (with a replay prefix) so BullMQ does not ignore them as duplicates of the original failed jobs.
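The status semantics listed above can be centralised in one function so every code path in the handler answers consistently. The outcome labels below are hypothetical names chosen for this sketch; the status codes follow the mapping described in this section.

```typescript
// Hypothetical outcome labels for a webhook handler.
type DeliveryOutcome =
  | 'accepted'          // verified and enqueued (or processed)
  | 'duplicate'         // event ID already claimed
  | 'invalid_signature' // HMAC mismatch or stale timestamp
  | 'malformed_body'    // body is not valid JSON
  | 'transient_failure' // queue or database unavailable

function statusForOutcome(outcome: DeliveryOutcome): number {
  switch (outcome) {
    case 'accepted':
    case 'duplicate':
      return 200 // tells the provider to stop retrying
    case 'invalid_signature':
    case 'malformed_body':
      return 400 // a retry of the same bytes would fail the same way
    case 'transient_failure':
      return 503 // asks the provider to retry later
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a label is added
      const unreachable: never = outcome
      throw new Error(`Unhandled outcome: ${String(unreachable)}`)
    }
  }
}
```

Returning 200 for duplicates is deliberate: from the provider's point of view the event has been received, and any other status would only trigger further redeliveries.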
Testing Webhooks Locally with ngrok and Stripe CLI
Local webhook testing requires simulating provider deliveries — including correct HMAC signatures and raw body bytes — without deploying to a public server. Three tools cover interactive development; integration tests cover CI without external dependencies. The integration test approach is the most reliable: it generates real HMAC signatures in test code, exercises the full verification path, and runs without any external service or network access.
// ── Tool 1: ngrok — public HTTPS tunnel to localhost ─────────────
$ brew install ngrok && ngrok config add-authtoken YOUR_TOKEN
$ ngrok http 3000
// Output: Forwarding https://abc123.ngrok.io -> http://localhost:3000
// Register https://abc123.ngrok.io/api/webhooks in provider dashboard
// Inspector UI: http://localhost:4040 — shows request/response pairs
// ── Tool 2: Stripe CLI — forward + sign events ────────────────────
$ brew install stripe/stripe-cli/stripe && stripe login
$ stripe listen --forward-to localhost:3000/api/webhooks/stripe
// Output: Ready! Webhook signing secret: whsec_test_...
// Use that secret as STRIPE_WEBHOOK_SECRET in .env.local
// Trigger test events in a separate terminal:
$ stripe trigger payment_intent.succeeded
$ stripe trigger customer.subscription.deleted
$ stripe trigger invoice.payment_failed
// Stripe CLI automatically signs events — no manual HMAC needed
// ── Tool 3: GitHub smee.io proxy ──────────────────────────────────
// 1. Visit smee.io — create a new channel URL
// 2. Register https://smee.io/AbCdEfGh1234567 as GitHub webhook URL
$ npx smee -u https://smee.io/AbCdEfGh1234567 -t http://localhost:3000/api/webhooks/github
// ── Integration tests: generate real HMAC, no external services ───
import { createHmac } from 'node:crypto'
import { POST } from '@/app/api/webhooks/stripe/route'
import { NextRequest } from 'next/server'
const WEBHOOK_SECRET = 'whsec_test_secret_for_ci_only'
function buildStripeRequest(payload: object): NextRequest {
const body = JSON.stringify(payload)
const timestamp = Math.floor(Date.now() / 1000).toString()
const signed = `${timestamp}.${body}`
const sig = createHmac('sha256', WEBHOOK_SECRET).update(signed).digest('hex')
return new NextRequest('http://localhost/api/webhooks/stripe', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Stripe-Signature': `t=${timestamp},v1=${sig}`,
},
body,
})
}
test('returns 200 for valid payment_intent.succeeded webhook', async () => {
process.env.STRIPE_WEBHOOK_SECRET = WEBHOOK_SECRET
const payload = {
id: 'evt_test_001', type: 'payment_intent.succeeded',
data: { object: { id: 'pi_test_001', amount: 4999, status: 'succeeded' } },
}
const response = await POST(buildStripeRequest(payload))
expect(response.status).toBe(200)
})
test('returns 400 for invalid signature', async () => {
const request = new NextRequest('http://localhost/api/webhooks/stripe', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Stripe-Signature': 't=1234567890,v1=badhex',
},
body: JSON.stringify({ type: 'payment_intent.succeeded' }),
})
const response = await POST(request)
expect(response.status).toBe(400)
})
test('returns 400 for replayed event (expired timestamp)', async () => {
const body = JSON.stringify({ id: 'evt_old', type: 'order.created' })
const timestamp = (Math.floor(Date.now() / 1000) - 400).toString() // 400s ago
const signed = `${timestamp}.${body}`
const sig = createHmac('sha256', WEBHOOK_SECRET).update(signed).digest('hex')
const request = new NextRequest('http://localhost/api/webhooks/stripe', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Stripe-Signature': `t=${timestamp},v1=${sig}`,
},
body,
})
const response = await POST(request)
expect(response.status).toBe(400) // timestamp outside 300s tolerance
})
The integration test approach is preferable to mocking crypto, because mocking the security primitive defeats the purpose of the test. Generate real signatures in test setup code and assert real status codes. Test three scenarios minimum: valid signature and recent timestamp (200), invalid signature (400), and valid signature with expired timestamp (400, a replay attack). Use webhook.site for exploratory testing when you need to inspect the exact headers and body that a provider sends; it is invaluable for debugging signature issues on new integrations.
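The same real-signature approach carries over to GitHub-style deliveries, where the header value is sha256= followed by the hex digest. The sketch below pairs a test-side signing helper with a receiver-side check so both can be exercised together; signGitHubStyle and verifyGitHubStyle are hypothetical names, and the secret is a made-up test value.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Test-side helper: builds the "sha256=<hex>" header value for a body.
function signGitHubStyle(rawBody: Buffer, secret: string): string {
  return 'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex')
}

// Receiver-side check: recompute the header value and compare in constant time.
function verifyGitHubStyle(rawBody: Buffer, header: string, secret: string): boolean {
  if (!header.startsWith('sha256=')) return false
  const expected = Buffer.from(signGitHubStyle(rawBody, secret), 'utf8')
  const received = Buffer.from(header, 'utf8')
  // timingSafeEqual throws on unequal lengths, so check first
  if (expected.length !== received.length) return false
  return timingSafeEqual(expected, received)
}

// Example round trip with an illustrative secret and payload
const demoBody = Buffer.from(JSON.stringify({ ref: 'refs/heads/main' }), 'utf8')
const demoHeader = signGitHubStyle(demoBody, 'test_secret_for_ci_only')
console.log(verifyGitHubStyle(demoBody, demoHeader, 'test_secret_for_ci_only'))
```

Because the helper signs the exact bytes that are later verified, a test built on it exercises the full verification path rather than a mocked one.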
TypeScript Discriminated Unions for Webhook Event Routing
TypeScript discriminated unions type webhook payloads by using the type field as the discriminant: each event variant gets its own data shape, and TypeScript narrows the type automatically in switch/case branches. This removes the need for casts to any, surfaces breaking changes at compile time, and provides IDE autocomplete for event-specific fields. It is the standard tagged-union pattern in TypeScript and integrates cleanly with runtime validators like Zod. See our TypeScript JSON types guide for broader type-safe JSON patterns.
// ── Base webhook envelope ─────────────────────────────────────────
interface WebhookBase<T extends string, D> {
id: string
type: T
timestamp: string
data: D
}
// ── Event-specific data shapes ────────────────────────────────────
interface OrderCreatedData {
object: 'order'
id: string
amount: number
currency: string
customer_id: string
line_items: Array<{ sku: string; qty: number; price: number }>
}
interface PaymentFailedData {
object: 'payment_intent'
id: string
amount: number
currency: string
customer_id: string // read by the router below
failure_code: string
failure_message: string
}
interface SubscriptionCancelledData {
object: 'subscription'
id: string
customer_id: string
cancelled_at: string
reason: 'user_cancelled' | 'payment_failed' | 'admin'
}
// ── Discriminated union: type field is the discriminant ───────────
type OrderCreatedEvent = WebhookBase<'order.created', OrderCreatedData>
type PaymentFailedEvent = WebhookBase<'payment_intent.failed', PaymentFailedData>
type SubscriptionCancelledEvent = WebhookBase<'subscription.cancelled', SubscriptionCancelledData>
type WebhookEvent =
| OrderCreatedEvent
| PaymentFailedEvent
| SubscriptionCancelledEvent
// ── Router: TypeScript narrows type in each branch ────────────────
async function routeWebhookEvent(event: WebhookEvent): Promise<void> {
switch (event.type) {
case 'order.created':
// event.data is OrderCreatedData here — full autocomplete
await fulfillOrder(event.data.id, event.data.line_items)
break
case 'payment_intent.failed':
// event.data is PaymentFailedData — failure_code is typed
await notifyPaymentFailed(event.data.customer_id, event.data.failure_code)
break
case 'subscription.cancelled':
// event.data is SubscriptionCancelledData
await downgradeAccount(event.data.customer_id)
break
default:
// TypeScript exhaustiveness check: if all cases handled, 'event' is 'never'
const _exhaustive: never = event
console.log(`Unhandled event type: ${(_exhaustive as WebhookEvent).type}`)
}
}
// ── Runtime validation with Zod ───────────────────────────────────
import { z } from 'zod'
const OrderCreatedSchema = z.object({
id: z.string(),
type: z.literal('order.created'),
timestamp: z.string().datetime(),
data: z.object({
object: z.literal('order'), // required so the parsed result matches OrderCreatedData
id: z.string(),
amount: z.number().int().positive(),
currency: z.string().length(3),
customer_id: z.string(),
line_items: z.array(z.object({
sku: z.string(),
qty: z.number().int().positive(),
price: z.number().int().nonnegative(),
})),
}),
})
// Parse after signature verification — throws ZodError if schema mismatch
function parseWebhookEvent(raw: unknown): WebhookEvent {
const parsed = typeof raw === 'string' ? JSON.parse(raw) : raw // accept a JSON string or an already-parsed value
switch (parsed.type) {
case 'order.created': return OrderCreatedSchema.parse(parsed)
default: throw new Error(`Unknown webhook type: ${parsed.type}`)
}
}
// ── See also: json-error-handling for ZodError handling patterns ──
The exhaustiveness check (const _exhaustive: never = event) causes a TypeScript compile error if a new event type is added to the union but not handled in the switch, which surfaces missing handlers at build time rather than at runtime. Zod runtime validation complements TypeScript types: TypeScript provides compile-time safety, Zod validates the actual JSON payload against the expected schema at runtime, catching schema drift when the provider changes their payload format. See our JSON error handling guide for patterns on handling ZodError and reporting validation failures.
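An alternative to the switch statement is a handler map keyed by the type discriminant, which gives the same compile-time coverage: omitting a key is a type error. The sketch below uses two trimmed-down event shapes and hypothetical names (DemoEvent, HandlerMap, dispatch). Note that the lookup needs one cast, because TypeScript cannot correlate the key with the matching event variant on its own.

```typescript
// Trimmed-down event shapes for illustration
type OrderCreated = { type: 'order.created'; data: { id: string } }
type SubscriptionCancelled = {
  type: 'subscription.cancelled'
  data: { customer_id: string }
}
type DemoEvent = OrderCreated | SubscriptionCancelled

// One handler per event type; each handler receives its own narrowed variant.
type HandlerMap = { [E in DemoEvent as E['type']]: (event: E) => string }

const handlers: HandlerMap = {
  'order.created': (e) => `fulfill ${e.data.id}`,
  'subscription.cancelled': (e) => `downgrade ${e.data.customer_id}`,
}

function dispatch(event: DemoEvent): string {
  // The cast is sound here because the handler is looked up by the event's own type
  const handler = handlers[event.type] as (e: DemoEvent) => string
  return handler(event)
}
```

The map form tends to suit larger event catalogues, since handlers can live in separate modules and be assembled in one place.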
Key Terms
- webhook
- An HTTP POST request sent by a service (the sender) to a pre-registered URL (the receiver endpoint) when an event occurs. The payload is typically a JSON body describing the event — event type, a unique event ID, a timestamp, and event-specific data. Unlike polling (the receiver periodically checks for changes), webhooks are push-based: the sender notifies the receiver immediately when something happens. Webhook delivery semantics are at-least-once — the sender retries on non-2xx responses — so receivers must handle duplicate deliveries idempotently using the event ID. Major providers using JSON webhooks include Stripe, GitHub, Shopify, Twilio, SendGrid, and Slack.
- HMAC-SHA256
- Hash-based Message Authentication Code using the SHA-256 hash function. Computed as HMAC-SHA256(secret, message), it is a keyed hash that authenticates both the identity of the sender (they know the shared secret) and the integrity of the message (any change to the bytes produces a completely different digest). In webhook contexts, the provider computes HMAC-SHA256(webhookSecret, rawBodyBytes) and places the hex digest in a request header. The receiver independently computes the same hash and compares using timing-safe equality. HMAC-SHA256 does not encrypt the body: the payload is still plaintext, and the HMAC only provides authentication and integrity verification. The shared secret must be kept confidential; rotate it if compromised.
- timingSafeEqual
- A cryptographic comparison function (crypto.timingSafeEqual(a, b) in Node.js) that compares two Buffer objects in constant time: the comparison takes the same duration regardless of how many bytes match before the first mismatch. Standard string or byte comparison short-circuits at the first differing position, leaking information about the correct value through response latency. An attacker exploiting a timing oracle can guess an HMAC signature one byte at a time by sending thousands of candidate signatures and measuring which ones produce a slightly longer response time. timingSafeEqual eliminates this attack by making response time independent of the comparison result. Both buffers must be equal length before calling it; check length equality first and return false if they differ.
- idempotency key
- A unique identifier used to deduplicate operations so that performing the same operation multiple times produces the same result as performing it once. In webhook processing, the event id field serves as the idempotency key, since providers always use the same event ID for retried deliveries of the same event. The receiver stores processed event IDs in a durable store (database UNIQUE constraint or Redis SET NX) and skips processing if the ID already exists. Idempotency keys prevent duplicate charges, duplicate order fulfillment, and duplicate notifications when providers retry failed webhook deliveries. The idempotency key must be stored before processing begins: store-then-process, never process-then-store.
- at-least-once delivery
- A message delivery guarantee where the sender ensures every message is delivered at least one time, but may deliver it more than once. Webhook providers implement at-least-once delivery by retrying on non-2xx responses — the receiver may receive the same event two or more times if the first delivery timed out or returned a 5xx error. At-least-once delivery is not exactly-once delivery — receivers must handle duplicates idempotently. The alternative, exactly-once delivery, requires distributed consensus protocols that are significantly more complex and slower. At-least-once is the standard webhook guarantee because it prioritizes delivery reliability over deduplication complexity — the receiver is better positioned to deduplicate using the event ID than the sender is to implement distributed exactly-once semantics.
- dead letter queue
- A secondary queue that receives messages that failed processing after all retry attempts are exhausted, preserving them for manual review and reprocessing rather than discarding them. When a webhook event fails consistently (due to bugs, schema changes, or poison pill data that triggers exceptions), the queue worker moves it to the DLQ after the maximum attempt count is reached. Each DLQ entry stores the original event JSON, the error message, the attempt count, and the failure timestamp. DLQ depth alerts notify the engineering team of systematic failures. After fixing the underlying bug, DLQ events can be replayed against the fixed handler. Without a DLQ, events that exhaust all retries are silently lost — this causes invisible data loss in billing, fulfillment, and audit systems.
- exponential backoff
- A retry strategy where the delay between successive attempts grows with each attempt, typically doubling, for example 5 seconds, 10 seconds, 20 seconds, 40 seconds. Exponential backoff prevents a recovering downstream service from being overwhelmed by a flood of simultaneous retries from many senders. Webhook providers use exponential backoff for their own retry schedules: Stripe retries live-mode deliveries for up to 3 days with exponentially increasing delays. Internal queue workers (BullMQ, SQS) also use exponential backoff for processing retries. Jitter, a random component added to each delay, is often combined with exponential backoff to prevent synchronized retry storms when many events fail simultaneously. Configure exponential backoff in BullMQ with backoff: { type: "exponential", delay: 5000 }.
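As a rough illustration of exponential backoff with jitter, the sketch below computes the delay before a given retry. The function name backoffDelayMs, the doubling factor, the 20% jitter ratio, and the one-hour cap are all assumptions chosen for the example; they are not taken from any provider's or library's actual schedule.

```typescript
// Hypothetical helper: delay in milliseconds before retry number `attempt` (1-based).
// The random source is injectable so the function can be tested deterministically.
export function backoffDelayMs(
  attempt: number,
  baseMs: number = 5000,
  maxMs: number = 60 * 60 * 1000, // assumed cap of one hour
  random: () => number = Math.random,
  jitterRatio: number = 0.2, // assumed: add up to 20% on top of the base delay
): number {
  // Exponential part: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
  const exponential = Math.min(maxMs, baseMs * 2 ** (attempt - 1));
  // Jitter part: a random extra between 0 and jitterRatio * exponential.
  const jitter = Math.round(random() * jitterRatio * exponential);
  return exponential + jitter;
}
```

With the defaults, attempt 1 waits between 5 and 6 seconds and attempt 3 between 20 and 24 seconds, so retries from many failed events spread out instead of firing at the same instant.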
FAQ
How do I verify a JSON webhook signature?
JSON webhook signature verification requires four steps: (1) read the raw request body as bytes, using Buffer.from(await request.arrayBuffer()) in Next.js App Router, before any JSON.parse(); (2) retrieve the signature from the provider-specific header (Stripe-Signature, X-Hub-Signature-256, or X-Shopify-Hmac-Sha256); (3) compute HMAC-SHA256 over the raw body bytes using your webhook secret with crypto.createHmac('sha256', secret).update(rawBody).digest('hex'), or digest('base64') for Shopify, whose header is base64-encoded; (4) compare using crypto.timingSafeEqual(), never with ===, which leaks timing information enabling signature guessing attacks. crypto.timingSafeEqual() throws if the two buffers differ in length, so check the lengths first and treat a mismatch as a failed verification. For Stripe, reconstruct the signed payload as timestamp.rawBody before hashing, and reject events where the timestamp is older than 300 seconds to prevent replay attacks. Only parse JSON after signature verification passes.
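The four steps can be sketched with Node's crypto module as follows. This is a minimal example for a hex-encoded signature over the raw body only; the function names computeSignature and verifySignature are illustrative, and the sketch does not implement Stripe's timestamp-prefixed payload or Shopify's base64 encoding.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hex-encoded HMAC-SHA256 of the raw body bytes.
export function computeSignature(secret: string, rawBody: Buffer): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Returns true only when the header carries a valid signature for rawBody.
export function verifySignature(
  secret: string,
  rawBody: Buffer,
  headerValue: string | null,
): boolean {
  if (!headerValue) return false;
  // GitHub's X-Hub-Signature-256 value is prefixed with "sha256="; strip it if present.
  const received = headerValue.startsWith("sha256=")
    ? headerValue.slice("sha256=".length)
    : headerValue;
  const expectedBuf = Buffer.from(computeSignature(secret, rawBody), "hex");
  // Invalid hex decodes to a shorter buffer, which the length check below rejects.
  const receivedBuf = Buffer.from(received, "hex");
  // timingSafeEqual throws on buffers of different length, so guard first.
  if (expectedBuf.length !== receivedBuf.length) return false;
  return timingSafeEqual(expectedBuf, receivedBuf);
}
```

A handler would call verifySignature with the bytes read from the request before parsing, and call JSON.parse(rawBody.toString("utf8")) only after it returns true.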
What is the structure of a JSON webhook payload?
A JSON webhook payload follows a 4-field envelope used by Stripe, GitHub, Shopify, and Twilio: id (unique UUID identifying this event delivery, used as the idempotency key), type (string discriminant in noun.verb format — order.created, payment.failed), timestamp (ISO 8601 UTC string for event ordering and replay-attack prevention), and data (the event-specific object payload). The type field is the primary discriminant for routing events — always check it before accessing data. Stripe wraps the resource under data.object and adds api_version; GitHub places the event type in the X-GitHub-Event HTTP header instead of the body. When designing your own webhook schema, add schema_version for breaking changes and include the full resource in data so receivers do not need a follow-up API call.
How do I handle duplicate webhook deliveries?
Duplicate webhook deliveries are handled by idempotent processing using the event id as a deduplication key. Implementation: (1) extract the event id from the JSON envelope; (2) attempt an atomic insert into a processed_events table with a PRIMARY KEY/UNIQUE constraint, or use Redis SET webhook:{id} 1 NX EX 259200 (NX = only set if not exists, EX 259200 = 3-day TTL); (3) if the insert fails with a duplicate key error (or Redis returns null), return HTTP 200 immediately because the event was already processed; (4) if the insert succeeds, process the event and mark it completed. Both the UNIQUE constraint and Redis SET NX are atomic operations, preventing race conditions when the same event arrives concurrently. Always store the event ID before processing, never after. Keep event IDs for at least the provider's retry window (up to 3 days for Stripe), then clean them up to prevent unbounded storage growth.
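The claim-then-process flow can be sketched without a database. The SeenEvents class below is an in-memory stand-in for Redis SET ... NX EX or a UNIQUE-constrained table, and both it and handleOnce are hypothetical names. A real deployment needs a shared store, because an in-process Map does not deduplicate across multiple server instances.

```typescript
// In-memory stand-in for an atomic "set if not exists, with TTL" store.
export class SeenEvents {
  private readonly expiresAt = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 3 * 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs; // default: 3 days, matching EX 259200
  }

  // Returns true if this call claimed the id, false if it was already claimed.
  claim(eventId: string, now: number = Date.now()): boolean {
    const expiry = this.expiresAt.get(eventId);
    if (expiry !== undefined && expiry > now) return false;
    this.expiresAt.set(eventId, now + this.ttlMs);
    return true;
  }
}

// Claims the event id first, then runs the handler only for first deliveries.
export function handleOnce(
  store: SeenEvents,
  eventId: string,
  handler: () => void,
  now: number = Date.now(),
): "processed" | "duplicate" {
  if (!store.claim(eventId, now)) return "duplicate"; // caller responds 200
  handler();
  return "processed";
}
```

Either result maps to an HTTP 200 response; the only difference is whether the handler ran.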
How do I process JSON webhooks asynchronously?
Async webhook processing: verify the signature, enqueue the JSON event, return HTTP 200, then process in a background worker. With BullMQ: await webhookQueue.add(event.type, event, { jobId: event.id, attempts: 3, backoff: { type: "exponential", delay: 5000 } }), where the jobId provides queue-level deduplication. A separate Worker process consumes events from the queue with a switch on event.type. Provider timeouts are strict (Stripe cuts off at 30 seconds, GitHub at 10 seconds, Shopify at 5 seconds), so the handler should do nothing beyond verifying and enqueuing: slow work such as business-logic database writes or downstream API calls belongs in the worker, after the handler has returned 200. For serverless environments without persistent workers, use PostgreSQL-backed queues (pg-boss) or managed services (AWS SQS FIFO). Enqueue only the verified event JSON, not the raw body bytes, because the raw bytes are only needed for HMAC verification, which happens in the HTTP handler before enqueuing.
What HTTP status should my webhook endpoint return?
Return HTTP 200-299 within the provider timeout to signal successful receipt. Most providers treat any other status, or a timeout, as a failed delivery and retry it. The status code semantics differ from standard REST: return 200 even before processing the event (enqueue it, then respond). Return 400 only for genuine signature verification failures or malformed payloads. A 4xx signals that the request itself is bad and will not succeed if resent, although whether retries actually stop depends on the provider, and some retry every non-2xx response. Return 500-503 for transient failures (database unavailable, queue full) that you want retried, since 5xx signals a temporary failure. Never return 4xx for business logic failures like “user not found”: return 200 and handle the edge case in your async processing code. Return 200 for duplicate deliveries (idempotent response), and do not return an error just because you already processed the event. Providers may disable your endpoint after too many consecutive 5xx responses.
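The status code rules can be condensed into one decision function. This is a sketch under assumptions: receiveWebhook, Envelope, and the Enqueue callback are hypothetical names, signature verification is assumed to have happened already and is passed in as a boolean, and the enqueue callback is assumed to throw on transient failures such as an unavailable queue.

```typescript
interface Envelope {
  id: string;
  type: string;
}

// Assumed contract: returns normally for new and duplicate events, throws on transient failure.
type Enqueue = (event: Envelope) => "queued" | "duplicate";

export function receiveWebhook(
  rawBody: string,
  signatureValid: boolean,
  enqueue: Enqueue,
): number {
  // Forged or unsigned request: reject as a bad request.
  if (!signatureValid) return 400;

  let event: Envelope;
  try {
    const parsed: unknown = JSON.parse(rawBody);
    if (typeof parsed !== "object" || parsed === null) return 400;
    const { id, type } = parsed as Record<string, unknown>;
    if (typeof id !== "string" || typeof type !== "string") return 400;
    event = { id, type };
  } catch {
    return 400; // malformed JSON
  }

  try {
    enqueue(event); // "queued" and "duplicate" are both acknowledged
    return 200;
  } catch {
    return 503; // transient failure: ask the provider to retry
  }
}
```

Business logic failures never appear here: they surface later in the worker, after the provider has already received its 200.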
How do I test JSON webhooks locally?
Three tools cover local testing. (1) ngrok: run ngrok http 3000 to get a public HTTPS URL, register it as the webhook endpoint in the provider dashboard — ngrok proxies POSTs to your local server with a request inspector at http://localhost:4040. (2) Stripe CLI: run stripe listen --forward-to localhost:3000/api/webhooks/stripe — forwards all Stripe events with automatic HMAC signing; trigger events with stripe trigger payment_intent.succeeded. (3) GitHub smee.io: create a channel at smee.io, register it as the GitHub webhook URL, and run npx smee -u https://smee.io/channel -t http://localhost:3000/api/webhooks/github. For CI without external services, generate real HMAC signatures in test code using crypto.createHmac, build a NextRequest with the correct headers, and assert response status codes — this exercises the full verification path without any external dependency.
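For the CI approach, a small fixture builder is usually enough to produce correctly signed requests. The helper below is a sketch: signedFixture is a hypothetical name, the default header name x-signature-256 is an assumption to adjust per provider, and the returned body and headers still need to be wrapped in whatever request object your framework expects.

```typescript
import { createHmac } from "node:crypto";

interface SignedFixture {
  body: string;
  headers: Record<string, string>;
}

// Serializes the event once and signs exactly those bytes, as a real provider would.
export function signedFixture(
  secret: string,
  event: object,
  headerName: string = "x-signature-256", // assumed header name
): SignedFixture {
  const body = JSON.stringify(event);
  const signature = createHmac("sha256", secret).update(body).digest("hex");
  return {
    body,
    headers: {
      "content-type": "application/json",
      [headerName]: signature,
    },
  };
}
```

A typical test sends the fixture unchanged and expects 200, then alters one character of the body while keeping the headers and expects the endpoint to reject it.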
What is a dead letter queue for webhooks?
A dead letter queue (DLQ) is a secondary queue that captures webhook events that failed processing after all retry attempts are exhausted, preserving them for review and reprocessing instead of silently discarding them. When a webhook event fails consistently — due to bugs, schema changes, or poison pill data — the queue worker moves it to the DLQ after exhausting all attempts. Each DLQ entry stores the original event JSON, error message, attempt count, and failure timestamp. In BullMQ, listen to the worker.on('failed', ...) event and add to a dead-letter queue when job.attemptsMade >= maxAttempts. Set up alerts on DLQ depth — a spike indicates systematic failure requiring developer attention. After fixing the underlying bug, replay DLQ events against the fixed handler using a new jobId to avoid deduplication rejection.
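The retry-then-dead-letter behavior can be illustrated without a queue library. The sketch below is not BullMQ code: processWithDlq and DeadLetter are hypothetical names, the DLQ is a plain array, and retries run immediately with no backoff delay so the example stays synchronous.

```typescript
// One DLQ entry: the original event plus failure details.
interface DeadLetter<E> {
  event: E;
  error: string;
  attempts: number;
  failedAt: string; // ISO 8601 timestamp
}

// Runs the handler up to maxAttempts times; on exhaustion, records the event in the DLQ.
export function processWithDlq<E>(
  event: E,
  handler: (event: E) => void,
  maxAttempts: number,
  dlq: DeadLetter<E>[],
  now: () => Date = () => new Date(),
): boolean {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler(event);
      return true; // processed successfully
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  // All attempts failed: preserve the event instead of dropping it.
  dlq.push({
    event,
    error: lastError,
    attempts: maxAttempts,
    failedAt: now().toISOString(),
  });
  return false;
}
```

Replaying is then a matter of iterating over the stored entries and passing each entry's event to the fixed handler; alerting can key off the array length, which plays the role of DLQ depth here.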
How do I type webhook JSON payloads in TypeScript?
TypeScript discriminated unions type webhook payloads using the type field as the discriminant. Define a generic base interface WebhookBase<T extends string, D> with type: T and data: D, create specific interfaces for each event type (OrderCreatedEvent, PaymentFailedEvent), and union them into a WebhookEvent type. In a switch on event.type, TypeScript narrows event.data to the correct shape automatically — no casts needed. Add an exhaustiveness check: const _exhaustive: never = event in the default branch — TypeScript raises a compile error when a new event type is added to the union but not handled. Use Zod for runtime validation: parse the raw JSON before TypeScript typing, so types reflect actually validated data. Never use any for webhook event data — it defeats type safety in the most security-sensitive path of your application.
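The pattern can be sketched as follows. The data shapes for the two events are invented for illustration, type aliases are used in place of separate interfaces for brevity, and runtime validation with Zod is omitted to keep the example free of dependencies.

```typescript
// Generic envelope: the literal type T is the discriminant, D is the payload shape.
interface WebhookBase<T extends string, D> {
  id: string;
  type: T;
  timestamp: string;
  data: D;
}

// Illustrative payload shapes (assumptions, not a provider schema).
type OrderCreatedEvent = WebhookBase<
  "order.created",
  { id: string; amount: number; currency: string }
>;
type PaymentFailedEvent = WebhookBase<
  "payment.failed",
  { id: string; reason: string }
>;

export type WebhookEvent = OrderCreatedEvent | PaymentFailedEvent;

export function describeEvent(event: WebhookEvent): string {
  switch (event.type) {
    case "order.created":
      // event.data is narrowed to the order payload here.
      return `order ${event.data.id}: ${event.data.amount} ${event.data.currency}`;
    case "payment.failed":
      // event.data is narrowed to the payment payload here.
      return `payment ${event.data.id} failed: ${event.data.reason}`;
    default: {
      // Compile error here if a new union member is added without a case above.
      const _exhaustive: never = event;
      throw new Error(`unhandled event: ${JSON.stringify(_exhaustive)}`);
    }
  }
}
```

Adding a third member to WebhookEvent without a matching case makes the assignment to never fail at compile time, which is the exhaustiveness check described above.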
Further reading and primary sources
- Stripe: Webhook Signature Verification — Official Stripe docs for HMAC-SHA256 signature verification, Stripe-Signature header format, and replay attack prevention with timestamp tolerance
- GitHub: Validating Webhook Deliveries — GitHub documentation for X-Hub-Signature-256 HMAC verification and secure webhook delivery validation
- BullMQ: Queue and Worker Patterns — BullMQ documentation for Redis-backed job queues, retry policies, dead letter queue patterns, and rate limiting
- Node.js: crypto.timingSafeEqual — Node.js official documentation for crypto.timingSafeEqual — constant-time buffer comparison for HMAC verification
- Shopify: Webhook HMAC Verification — Shopify webhook verification using the X-Shopify-Hmac-Sha256 base64-encoded signature header