AI Webhook Processor Template: Starter Guide

Event-driven AI processing template with webhooks

"Webhook in → AI processes → action out" is the backbone shape of much practical AI automation: classify the new support ticket, summarize the merged PR, triage the form submission. This starter guide gives you the production-shaped template (acknowledge fast, verify signatures, process async, stay idempotent), because the naive version, which makes the LLM call inside the webhook handler, fails in four predictable ways.

Why the naive version fails

text

Naive: webhook → [LLM call, 5-30s] → 200 OK

  • Timeouts: webhook senders expect a response in seconds; an LLM call blows the budget and the sender retries → duplicate processing.
  • Retry storms: any 5xx triggers sender retries; without idempotency you process N times.
  • No signature check = anyone who finds the URL feeds garbage (or prompt injection) into your pipeline.
  • Backpressure: a burst of events (bulk import, incident flood) piles concurrent LLM calls until rate limits or memory give out.

The template

    python
    

    # FastAPI + queue worker shape: swap the queue for your stack (SQS/BullMQ/pg-boss)
    import hashlib
    import hmac
    import json
    import os

    from fastapi import FastAPI, HTTPException, Request

    app = FastAPI()

    # Shared signing secret as bytes. The env var name is an assumption; use your own.
    SECRET = os.environ['WEBHOOK_SECRET'].encode()

    # `already_processed` and `queue` come from your own datastore and queue client.

    @app.post('/webhooks/tickets')
    async def receive(request: Request):
        raw = await request.body()

        # 1. Verify signature on the RAW body (parsing first breaks the MAC)
        sig = request.headers.get('x-signature', '')
        expected = hmac.new(SECRET, raw, hashlib.sha256).hexdigest()
        # Compare as bytes: compare_digest rejects non-ASCII str input
        if not hmac.compare_digest(sig.encode(), expected.encode()):
            raise HTTPException(401)

        event = json.loads(raw)

        # 2. Idempotency key from the sender's event ID
        if await already_processed(event['id']):
            return {'status': 'duplicate'}  # 200, so the sender stops retrying

        # 3. Enqueue and ACK immediately: milliseconds, not seconds
        await queue.send(event)
        return {'status': 'queued'}
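
To make step 1 concrete, here is a minimal stdlib-only sketch of why the MAC has to be computed over the raw bytes. The helper names `sign` and `verify`, the demo secret, and the hex-encoded HMAC-SHA256 scheme are assumptions for illustration; real senders differ in header name and encoding, so check your sender's documentation.

```python
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # assumption: a shared secret agreed with the sender


def sign(raw: bytes, secret: bytes = SECRET) -> str:
    """What a sender would put in the signature header (hex HMAC-SHA256)."""
    return hmac.new(secret, raw, hashlib.sha256).hexdigest()


def verify(raw: bytes, signature: str, secret: bytes = SECRET) -> bool:
    """Constant-time check of the header value against the raw request body."""
    expected = sign(raw, secret)
    return hmac.compare_digest(signature.encode(), expected.encode())


# The sender signs the exact bytes it transmits (note the double space).
raw_body = b'{"id": "evt_1",  "payload": "printer on fire"}'
header = sign(raw_body)

# Parsing and re-serializing yields different bytes, so the MAC no longer matches.
reserialized = json.dumps(json.loads(raw_body)).encode()

print(verify(raw_body, header))      # True
print(verify(reserialized, header))  # False
```

The second check fails even though both byte strings decode to the same JSON object, which is the reason the handler reads `request.body()` before calling `json.loads`.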

    python
    

    # worker.py: where the AI actually runs
    async def handle(event: dict):
        if await already_processed(event['id']):  # check again at execution time
            return
        result = await classify_ticket(event['payload'])  # the LLM step, schema-validated
        await act(result)  # route / label / notify
        await mark_processed(event['id'], result)  # provenance row

The four fixes, mapped: ack-then-process kills timeouts; event-ID idempotency (checked at receive AND execute) kills retry duplicates; HMAC on the raw body kills spoofing; the queue gives you backpressure, concurrency caps, and retry-with-backoff for free.
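
The template leaves `already_processed` and `mark_processed` undefined. One possible shape is sketched below, assuming a SQL table keyed on the sender's event ID. The table name and the in-memory SQLite database are placeholders, and the functions are synchronous for brevity, whereas the template awaits them.

```python
import json
import sqlite3

# Assumption: an in-memory SQLite table stands in for your real datastore.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE processed_events (event_id TEXT PRIMARY KEY, result TEXT NOT NULL)"
)


def already_processed(event_id: str) -> bool:
    """True if a result row exists for this sender event ID."""
    row = db.execute(
        "SELECT 1 FROM processed_events WHERE event_id = ?", (event_id,)
    ).fetchone()
    return row is not None


def mark_processed(event_id: str, result: dict) -> bool:
    """Record the result; return False if a row for this event already exists."""
    cur = db.execute(
        "INSERT OR IGNORE INTO processed_events (event_id, result) VALUES (?, ?)",
        (event_id, json.dumps(result)),
    )
    db.commit()
    return cur.rowcount == 1
```

Calling `mark_processed` twice with the same ID stores only the first result and returns `False` the second time, so the primary-key constraint acts as the final guard if two workers race past the read check.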

The AI step itself

Standard discipline applies, condensed:

  • Structured output with a schema, validated before any action (Zod in TypeScript, Pydantic in Python): webhook payloads are exactly where malformed output must not flow into actions.
  • Treat payload content as untrusted: it is user-originated text entering a prompt, which makes it a prompt-injection surface. Keep instructions in the system role, pass the payload strictly as data, and gate consequential actions behind an approval step.
  • Use a mini-tier model with semaphore-bounded concurrency in the worker, so bursts drain at your pace, not the sender's.
  • Send events that fail N times to a dead-letter queue, with the error from the LLM step attached, so debugging starts with context.
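
The points above can be sketched together in one small worker. This is an illustration under stated assumptions, not a definitive implementation: `fake_llm` stands in for the real model call, the label set and limits are invented, a plain list plays the dead-letter queue, and a stdlib dataclass substitutes for a Pydantic model to keep the example dependency-free.

```python
import asyncio
import json
from dataclasses import dataclass

ALLOWED_LABELS = {"billing", "bug", "other"}  # assumption: example label set
MAX_ATTEMPTS = 3     # the "N" before dead-lettering (assumed value)
MAX_CONCURRENCY = 2  # semaphore cap (assumed value)


@dataclass
class Classification:
    label: str
    confidence: float


def parse_classification(raw_output: str) -> Classification:
    """Validate model output before any action; raise ValueError on bad shape."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not JSON: {exc}") from exc
    label = data.get("label") if isinstance(data, dict) else None
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"invalid label in output: {raw_output!r}")
    confidence = data.get("confidence")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence out of range")
    return Classification(label, float(confidence))


async def fake_llm(payload: str) -> str:
    """Stand-in for the real model call; returns prose for one payload."""
    await asyncio.sleep(0.01)
    if "garbage" in payload:
        return "Sure! The label is billing."
    return json.dumps({"label": "billing", "confidence": 0.9})


async def process(event: dict, sem: asyncio.Semaphore, acted: list, dead_letters: list):
    last_error = None
    for _ in range(MAX_ATTEMPTS):
        async with sem:  # at most MAX_CONCURRENCY model calls in flight
            raw_output = await fake_llm(event["payload"])
        try:
            result = parse_classification(raw_output)
        except ValueError as exc:
            last_error = str(exc)
            continue  # a real worker would back off before retrying
        acted.append((event["id"], result))  # act only on validated output
        return
    # Out of attempts: park the event with its last error attached.
    dead_letters.append({"event": event, "error": last_error})


async def drain(events: list) -> tuple:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    acted, dead_letters = [], []
    await asyncio.gather(*(process(e, sem, acted, dead_letters) for e in events))
    return acted, dead_letters


events = [
    {"id": "evt_1", "payload": "refund please"},
    {"id": "evt_2", "payload": "garbage in"},
    {"id": "evt_3", "payload": "charged twice"},
]
acted, dead_letters = asyncio.run(drain(events))
print(sorted(event_id for event_id, _ in acted))  # ['evt_1', 'evt_3']
print(dead_letters[0]["event"]["id"])             # evt_2
```

Only validated `Classification` objects reach the action step, and the event whose output never parses ends up in `dead_letters` together with the parse error.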

Operational notes

  • Replay capability: store raw events (with TTL) so you can re-run after a prompt fix — pairs with provenance discipline on results.
  • Graceful shutdown: workers must finish-and-ack or nack-for-redelivery on SIGTERM — the shutdown guide covers it.
  • Monitor the queue depth, not just errors — a growing backlog is your earliest warning of rate-limit trouble or a poison event.
  • No-code version: this exact shape is buildable in n8n (Webhook trigger → AI node → actions) for internal-tool-grade loads; graduate to the coded template when volume, latency, or compliance demand it.
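
The graceful-shutdown note above can be sketched with plain `asyncio`. Everything here is a stand-in: an `asyncio.Queue` plays the real queue, a short sleep plays the LLM call, and the stop flag is set by hand to simulate SIGTERM. With a real broker, "leave it queued" would correspond to a nack or an expired visibility timeout.

```python
import asyncio
import signal


async def worker(queue: asyncio.Queue, stop: asyncio.Event, done: list) -> None:
    """Pull jobs until asked to stop; never abandon a job mid-flight."""
    while not stop.is_set():
        try:
            job = queue.get_nowait()
        except asyncio.QueueEmpty:
            await asyncio.sleep(0.01)  # idle briefly, then re-check the stop flag
            continue
        await asyncio.sleep(0.01)  # stand-in for the LLM call and the action
        done.append(job)           # finish-and-ack
        queue.task_done()
    # Jobs still queued were never started, so they stay available for redelivery.


async def main(jobs: list) -> tuple:
    queue: asyncio.Queue = asyncio.Queue()
    for job in jobs:
        queue.put_nowait(job)
    stop = asyncio.Event()
    done: list = []

    loop = asyncio.get_running_loop()
    try:
        loop.add_signal_handler(signal.SIGTERM, stop.set)
    except (NotImplementedError, RuntimeError, ValueError):
        pass  # e.g. Windows event loops, or not running in the main thread

    task = asyncio.create_task(worker(queue, stop, done))
    await asyncio.sleep(0.025)  # let a few jobs run
    stop.set()                  # simulate SIGTERM arriving
    await task                  # the in-flight job completes before exit
    return done, queue.qsize()


jobs = [f"job_{i}" for i in range(10)]
done, remaining = asyncio.run(main(jobs))
print(f"finished {len(done)}, left queued {remaining}")
```

The worker checks the stop flag only between jobs, so a job that has started always runs to completion, and every job is either finished or still queued.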

FAQ

Why check idempotency twice? Receive-time check stops queue pollution; execute-time check covers the race where two retries both enqueued before the first marked done. Cheap insurance.

Sync response required by the sender? Some webhooks want a verdict in the response. Options: a fast-path small model with a strict timeout (and a safe default on timeout), or renegotiate to an async callback. Don't put frontier-model latency inside a webhook SLA.
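
A minimal sketch of the fast-path option, assuming `asyncio` and a hypothetical `small_model_verdict` coroutine in place of the real small-model call. The budget values and the `needs_review` default are illustrative; pick a default that is safe for your own workflow.

```python
import asyncio

SAFE_DEFAULT = {"verdict": "needs_review"}  # assumption: a conservative fallback


async def small_model_verdict(payload: str, delay: float) -> dict:
    """Stand-in for a fast small-model call; `delay` simulates its latency."""
    await asyncio.sleep(delay)
    return {"verdict": "approve"}


async def verdict_within(payload: str, delay: float, budget_s: float) -> dict:
    """Return the model's verdict if it arrives within budget, else the default."""
    try:
        return await asyncio.wait_for(
            small_model_verdict(payload, delay), timeout=budget_s
        )
    except asyncio.TimeoutError:
        return SAFE_DEFAULT


fast = asyncio.run(verdict_within("ticket text", delay=0.01, budget_s=0.5))
slow = asyncio.run(verdict_within("ticket text", delay=0.5, budget_s=0.05))
print(fast)  # {'verdict': 'approve'}
print(slow)  # {'verdict': 'needs_review'}
```

`asyncio.wait_for` cancels the slow call when the budget expires, so the handler answers on time either way.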

Batch the LLM calls? If the sender bursts (hundreds of events per minute) and per-event latency doesn't matter, micro-batch in the worker by grouping 10-20 events per prompt. Fewer, larger calls can be an order-of-magnitude cost and throughput win; it is the same batching logic, at small scale.
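
Micro-batching could look roughly like this. The prompt wording, the `chunks` and `build_batch_prompt` helpers, and the batch size of 10 are assumptions for illustration; the model call itself is left out.

```python
import json
from typing import Iterator

BATCH_SIZE = 10  # assumption: low end of the 10-20 events per prompt range


def chunks(events: list, size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive groups of at most `size` events."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def build_batch_prompt(batch: list) -> str:
    """One prompt for many events; IDs let results be matched back per event."""
    lines = [json.dumps({"id": e["id"], "text": e["payload"]}) for e in batch]
    return (
        "Classify each ticket below. Return a JSON array of "
        '{"id", "label"} objects, one per input line.\n' + "\n".join(lines)
    )


events = [{"id": f"evt_{i}", "payload": f"ticket {i}"} for i in range(25)]
batches = list(chunks(events))
print([len(b) for b in batches])  # [10, 10, 5]
print(len(batches))               # 3
```

Here 25 events become 3 model calls. Each returned item should still be validated and matched to its event by ID, and an event missing from a batch response can fall back to a single-event retry.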


*Last updated: June 2026.*
