AI Webhook Processor Template: Starter Guide
Event-driven AI processing template with webhooks
"Webhook in → AI processes → action out" is the backbone shape of half of all practical AI automation: classify the new support ticket, summarize the merged PR, triage the form submission. This starter guide gives you the production-shaped template — acknowledge fast, verify signatures, process async, stay idempotent — because the naive version (do the LLM call inside the webhook handler) fails in exactly four predictable ways.
Why the naive version fails
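```text
Naive: webhook → [LLM call, 5-30s] → 200 OK
```
The four failures: the sender times out while the LLM call runs, its retries produce duplicate work, nothing proves the request came from the real sender, and a burst of events hits the model with no backpressure.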
The template
```python
# FastAPI + queue worker shape: swap the queue for your stack (SQS/BullMQ/pg-boss)
import hashlib, hmac, json, os
from fastapi import FastAPI, Request, HTTPException

app = FastAPI()
# Shared signing secret as bytes; the env var name here is an assumption.
SECRET = os.environ['WEBHOOK_SECRET'].encode()
# already_processed() and queue come from your own storage and queue clients.

@app.post('/webhooks/tickets')
async def receive(request: Request):
    raw = await request.body()
    # 1. Verify signature on the RAW body (parsing first breaks the MAC)
    sig = request.headers.get('x-signature', '')
    expected = hmac.new(SECRET, raw, hashlib.sha256).hexdigest()
    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise HTTPException(401)
    event = json.loads(raw)
    # 2. Idempotency key from the sender's event ID
    if await already_processed(event['id']):
        return {'status': 'duplicate'}  # 200, so the sender stops retrying
    # 3. Enqueue and ACK immediately: milliseconds, not seconds
    await queue.send(event)
    return {'status': 'queued'}
```
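To see why step 1 insists on the raw body, here is a minimal standalone sketch using only the standard library. The secret, the event payload, and the `sign`/`verify` helper names are made up for illustration; your sender's header name and signature encoding may differ.

```python
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical shared secret, for illustration only


def sign(raw: bytes, secret: bytes = SECRET) -> str:
    """Hex HMAC-SHA256 over the exact bytes sent on the wire."""
    return hmac.new(secret, raw, hashlib.sha256).hexdigest()


def verify(raw: bytes, signature: str, secret: bytes = SECRET) -> bool:
    """Constant-time comparison of the received signature with the expected one."""
    return hmac.compare_digest(signature.encode(), sign(raw, secret).encode())


# The sender serializes with a space after each separator, then signs those bytes.
raw_body = b'{"id": "evt_1", "payload": {"subject": "Refund"}}'
signature = sign(raw_body)

# Parsing and re-serializing with compact separators changes the bytes,
# so the same logical JSON no longer matches the MAC.
reserialized = json.dumps(json.loads(raw_body), separators=(",", ":")).encode()

print(verify(raw_body, signature))      # True
print(verify(reserialized, signature))  # False
```

The second check fails even though both byte strings decode to the same JSON object, which is why the handler verifies before it parses.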
```python
# worker.py: where the AI actually runs
async def handle(event: dict):
    if await already_processed(event['id']):  # check again at execution time
        return
    result = await classify_ticket(event['payload'])  # the LLM step, schema-validated
    await act(result)  # route / label / notify
    await mark_processed(event['id'], result)  # provenance row
```
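The worker describes the LLM step as schema-validated. One way that could look, sketched with the standard library only: the label set, the `confidence` field, and the `parse_classification` name are assumptions for illustration, not part of the template.

```python
import json

ALLOWED_LABELS = {"billing", "bug", "feature_request", "other"}  # hypothetical label set


def parse_classification(raw_output: str) -> dict:
    """Parse model output and reject anything outside the expected shape."""
    data = json.loads(raw_output)  # raises ValueError (JSONDecodeError) on non-JSON output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    confidence = data.get("confidence")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError(f"confidence must be a number, got {confidence!r}")
    if not 0 <= confidence <= 1:
        raise ValueError(f"confidence must be in [0, 1], got {confidence!r}")
    return {"label": label, "confidence": float(confidence)}


print(parse_classification('{"label": "bug", "confidence": 0.9}'))
```

A `ValueError` here can feed the queue's retry path, so a malformed model reply never reaches `act`.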
The four fixes, mapped: ack-then-process kills timeouts; event-ID idempotency (checked at receive AND execute) kills retry duplicates; HMAC on the raw body kills spoofing; the queue gives you backpressure, concurrency caps, and retry-with-backoff for free.
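Managed queues usually provide concurrency caps and retry-with-backoff as configuration. If you want to see the mechanics, or your queue lacks them, the following is a rough in-process sketch with `asyncio`. The constants and the `run_with_retry`/`drain` names are illustrative assumptions, and the delays are kept tiny so the sketch runs quickly.

```python
import asyncio
import random

MAX_CONCURRENCY = 4  # assumed cap on simultaneous LLM calls
MAX_ATTEMPTS = 3     # assumed retry budget per event
BASE_DELAY = 0.01    # seconds; use something much larger in practice


async def run_with_retry(handler, event, attempts=MAX_ATTEMPTS, base_delay=BASE_DELAY):
    """Call handler(event), retrying failures with exponential backoff plus jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return await handler(event)
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the failure to the caller
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay))


async def drain(events, handler, max_concurrency=MAX_CONCURRENCY):
    """Process events with at most max_concurrency handlers in flight."""
    gate = asyncio.Semaphore(max_concurrency)

    async def guarded(event):
        async with gate:
            return await run_with_retry(handler, event)

    # return_exceptions=True keeps one failed event from cancelling the rest
    return await asyncio.gather(*(guarded(e) for e in events), return_exceptions=True)
```

Calling `asyncio.run(drain(events, handle))` processes a batch of queued events. Events that exhaust their retries come back as exception objects in the result list, so they can be logged or parked for inspection.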
The AI step itself
Standard discipline applies, condensed:
Operational notes
FAQ
Why check idempotency twice? Receive-time check stops queue pollution; execute-time check covers the race where two retries both enqueued before the first marked done. Cheap insurance.
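The execute-time check closes the race only if "check" and "mark" cannot interleave between two workers. One way to get that is an atomic claim on a unique key. This sketch uses `sqlite3` purely for illustration; the table layout and the `try_claim` name are assumptions, and a production store would more likely be Postgres or Redis.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the idempotency store and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed (event_id TEXT PRIMARY KEY, result TEXT)"
    )
    return conn


def try_claim(conn: sqlite3.Connection, event_id: str) -> bool:
    """Return True only for the first caller to claim this event ID."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed (event_id) VALUES (?)", (event_id,)
        )
    # rowcount is 1 when the row was inserted, 0 when the primary key already existed
    return cur.rowcount == 1


store = open_store()
print(try_claim(store, "evt_1"))  # True
print(try_claim(store, "evt_1"))  # False
```

The trade-off of claiming before processing: if the worker crashes after the claim, the event looks done when it is not, so a real implementation would also record status or release the claim on failure.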
Sync response required by the sender? Some webhooks want a verdict in the response. Options: a fast-path small model with a strict timeout (and a safe default on timeout), or renegotiate to async callback — don't put frontier-model latency inside a webhook SLA.
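The fast-path option can be sketched with `asyncio.wait_for`. The fallback verdict, the two-second default, and the stand-in model functions below are hypothetical; choose a default that is safe for your own workflow.

```python
import asyncio

SAFE_DEFAULT = {"label": "needs_review", "source": "default"}  # hypothetical fallback verdict


async def verdict_with_deadline(classify, payload, timeout_s: float = 2.0):
    """Return the classifier's verdict, or SAFE_DEFAULT if it misses the deadline."""
    try:
        return await asyncio.wait_for(classify(payload), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the slow call before raising, so nothing keeps running
        return SAFE_DEFAULT


async def slow_model(payload):
    """Stand-in for a model call that is too slow for the webhook SLA."""
    await asyncio.sleep(0.5)
    return {"label": "billing", "source": "model"}


async def fast_model(payload):
    """Stand-in for a small model that answers in time."""
    return {"label": "billing", "source": "model"}


print(asyncio.run(verdict_with_deadline(slow_model, {}, timeout_s=0.05)))
print(asyncio.run(verdict_with_deadline(fast_model, {}, timeout_s=0.05)))
```

The first call returns the default and the second returns the model's verdict. Logging which path answered makes it easier to tell later whether the timeout is set too tight.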
Batch the LLM calls? If the sender bursts (100s of events/minute) and per-event latency doesn't matter, micro-batch in the worker (group 10-20 per prompt) — order-of-magnitude cost/throughput win, same batching logic at small scale.
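A micro-batching worker needs three pieces: grouping events, building one prompt per group, and mapping results back to event IDs. The sketch below assumes each event has `id` and `payload` fields as in the template. The prompt wording and the helper names are illustrative, and the model call itself is left out.

```python
import json
from typing import Iterator


def chunked(events: list[dict], size: int = 10) -> Iterator[list[dict]]:
    """Yield consecutive groups of at most `size` events."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def build_batch_prompt(batch: list[dict]) -> str:
    """One prompt per batch; each item carries its event ID so results can be matched back."""
    items = [{"id": e["id"], "text": e["payload"]} for e in batch]
    return (
        "Classify each ticket. Reply with a JSON list of "
        '{"id": ..., "label": ...} objects, one per input.\n'
        + json.dumps(items)
    )


def match_results(batch: list[dict], results: list[dict]) -> dict:
    """Map labels back by ID; events the model skipped come back as None."""
    by_id = {r["id"]: r["label"] for r in results}
    return {e["id"]: by_id.get(e["id"]) for e in batch}


events = [{"id": f"evt_{i}", "payload": f"ticket {i}"} for i in range(25)]
print([len(b) for b in chunked(events, 10)])  # [10, 10, 5]
```

Matching by ID rather than by position matters because a model may skip or reorder items in a long batch. Anything that comes back as `None` can be re-queued as a single event.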
*Last updated: June 2026.*