Serverless platforms are an appealing destination for webhook consumers. You pay per invocation, you scale to zero when traffic is quiet, and you don't manage servers. For many teams, the operational simplicity justifies the trade-offs.
But serverless functions and webhook delivery interact in ways that are not obvious until you hit them in production. Cold starts add latency to your first delivery. Concurrent invocations require your handler to be strictly stateless. Function timeouts impose a hard ceiling on how long your consumer can take. And if your serverless platform throttles you, incoming webhook requests pile up on the sender's side — or get dropped entirely.
This post covers the patterns that make webhook consumers reliable on Lambda and Cloud Run, the failure modes to understand before you deploy, and the configuration knobs that matter most.
## The Cold Start Problem for Webhook Consumers
A serverless cold start happens when a new function instance spins up to handle a request. On AWS Lambda, a cold start for a Node.js or Python function typically costs 200–500 ms. For a Java or .NET function, it can easily exceed 1–2 seconds. Cloud Run cold starts depend on your container image size and startup time, but 1–3 seconds is common for non-trivial workloads.
For a webhook consumer, cold start latency shows up directly in the delivery latency your sender observes. Most webhook senders (including GetHook) have a configurable delivery timeout. If you set a 5-second destination timeout and your Lambda cold starts take 2 seconds, your effective processing window is 3 seconds. That sounds fine — until you factor in database connections, external API calls, and the actual business logic.
The concrete failure mode:

```
Sender issues POST → Lambda cold start: 1.8s
Connection pool init:                   0.3s
Processing:                             0.8s
Total:                                  2.9s

Next delivery (warm): 1.2s total. Well within timeout.
After a traffic lull (new cold start): 2.9s again.
```

The inconsistency is harder to debug than a consistent slow response. Your P99 latency looks fine because most deliveries are warm. Your P99.9 — the occasional cold start — looks terrible and may be causing silent retries from the sender.
## Provisioned Concurrency vs. Minimum Instances
The standard fix for cold starts is to keep at least one warm instance running at all times.
On AWS Lambda, this is Provisioned Concurrency. You pay to keep N function instances initialized and ready, even when idle:
```shell
# Keep 2 warm instances for the webhook handler function
aws lambda put-provisioned-concurrency-config \
  --function-name webhook-consumer \
  --qualifier production \
  --provisioned-concurrent-executions 2
```

On Cloud Run, the equivalent is minimum instances:
```yaml
# service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: webhook-consumer
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "2"
        autoscaling.knative.dev/maxScale: "50"
    spec:
      containerConcurrency: 10
      timeoutSeconds: 30
      containers:
        - image: gcr.io/myproject/webhook-consumer:latest
```

Two minimum instances is usually the right floor. One instance handles the burst while the second is still warming. It also gives you zero-downtime deploys — the old instance handles traffic while the new one starts.
The cost of 2 minimum instances on Cloud Run is roughly $15–20/month. On Lambda with provisioned concurrency, it depends on your function's memory allocation, but for a 512 MB function it's around $10–15/month. For a webhook consumer handling real traffic, this is money well spent.
## Concurrency, Connection Limits, and Database Connections
The combination of serverless scaling and database connections is one of the most common ways to break a webhook consumer under load.
When your Lambda function scales from 1 to 100 concurrent instances in response to a traffic burst, each instance tries to open a database connection. PostgreSQL has a default `max_connections` of 100. You hit the limit, and new instances start failing with "too many clients already".
| Serverless runtime | Concurrency model | Default DB connection approach |
|---|---|---|
| AWS Lambda | One connection per instance | Risk: 100+ concurrent instances open 100+ database connections |
| Cloud Run | Multiple requests per instance (containerConcurrency) | Better: shared pool across concurrent requests |
| Cloud Functions (Google) | One request per instance by default | Same risk as Lambda |
| Azure Functions | Configurable | Depends on hosting plan |
For Lambda, use RDS Proxy (AWS) or PgBouncer in transaction pooling mode. Neither requires code changes — they sit in front of your database and multiplex connections:
```shell
# Connect through RDS Proxy instead of directly to RDS
DB_HOST=my-rds-proxy.proxy-xxxx.us-east-1.rds.amazonaws.com
```

For Cloud Run, set containerConcurrency to a value greater than 1 (typically 10–50 depending on your workload) and use a connection pool within your container. This way, 10 concurrent requests share a pool of 5 connections rather than opening 10:
```go
// Initialize pool at container startup, not per-request
var db *sql.DB

func init() {
	var err error
	db, err = sql.Open("postgres", os.Getenv("DB_URL"))
	if err != nil {
		log.Fatalf("db open: %v", err)
	}
	db.SetMaxOpenConns(5)
	db.SetMaxIdleConns(2)
	db.SetConnMaxLifetime(5 * time.Minute)
}

func handleWebhook(w http.ResponseWriter, r *http.Request) {
	// Use the shared pool, not a new connection
	_, err := db.ExecContext(r.Context(), "INSERT INTO ...")
	// ...
}
```

The `init()` function in Go (and equivalent module-level initialization in other languages) runs once when the container starts, not on every invocation. The connection pool is shared across all concurrent requests in that instance.
## Timeout Alignment
Your function timeout, your destination timeout (configured on the sender), and your downstream call timeouts must all be consistent. Getting this wrong causes confusing failure modes.
The correct hierarchy:
```
Downstream call timeout < Function timeout < Sender destination timeout
```

If your function timeout is 30 seconds but your sender's destination timeout is 10 seconds, the sender will retry before your function finishes. Your function continues processing an event the sender has already given up on — potentially causing duplicate processing when the sender retries successfully to a warm instance.
A safe configuration for a typical webhook consumer:
| Parameter | Value | Notes |
|---|---|---|
| Downstream DB/API call timeout | 5s | Individual external calls |
| Function total timeout | 15s | Lambda/Cloud Run function timeout |
| Sender destination timeout | 20s | Set on the webhook route in GetHook or your gateway |
| Sender retry backoff (first retry) | 30s | Enough time for a cold start to resolve |
With this alignment, if your function takes longer than 15 seconds, it fails cleanly and the sender receives a 502 or connection closed. The sender then retries after 30 seconds — by which time the cold start that caused the slowness has resolved.
## Idempotency Is Not Optional
Serverless webhook consumers almost always receive duplicate events. The sender retries on timeout. Your function times out just after writing to the database but before returning 200. The sender marks the delivery as failed and retries. Your consumer processes the same event twice.
Idempotency keys are your defense. Extract the event ID from the incoming request and use it as a deduplication key:
```go
func handleWebhook(w http.ResponseWriter, r *http.Request) {
	// GetHook sends the event ID in the X-Webhook-Event-ID header
	eventID := r.Header.Get("X-Webhook-Event-ID")
	if eventID == "" {
		http.Error(w, "missing event ID", http.StatusBadRequest)
		return
	}

	// The INSERT returns a row only the first time we see this event ID;
	// on conflict it returns no rows, which Scan reports as sql.ErrNoRows.
	var inserted string
	err := db.QueryRowContext(r.Context(),
		`INSERT INTO processed_events (event_id, processed_at)
		 VALUES ($1, NOW())
		 ON CONFLICT (event_id) DO NOTHING
		 RETURNING event_id`,
		eventID,
	).Scan(&inserted)
	if err == sql.ErrNoRows {
		// Already processed — return 200 so sender doesn't retry
		w.WriteHeader(http.StatusOK)
		return
	}
	if err != nil {
		http.Error(w, "db error", http.StatusInternalServerError)
		return
	}

	// Process the event
	if err := processEvent(r.Context(), r.Body); err != nil {
		http.Error(w, "processing error", http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)
}
```

The `ON CONFLICT DO NOTHING` pattern is a single atomic operation — no race condition between checking and inserting. If two concurrent instances receive the same event ID (possible during a burst with retries in flight), only one processes it.
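The pattern assumes a dedup table keyed on the event ID. A minimal schema matching the column names in the example (a sketch; in practice you would also add a retention job so the table doesn't grow forever):

```sql
CREATE TABLE processed_events (
    event_id     TEXT PRIMARY KEY,                 -- backs ON CONFLICT (event_id)
    processed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```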
## Observability: What to Instrument
Serverless functions give you less visibility than a long-running server by default. Add these four measurements to every webhook handler invocation:
```go
start := time.Now()
defer func() {
	duration := time.Since(start)
	log.Printf(
		"event_id=%s event_type=%s duration_ms=%d status=%d cold_start=%v",
		eventID, eventType, duration.Milliseconds(), statusCode, isColdStart,
	)
}()
```

To detect cold starts in Go on Lambda, check a module-level boolean: it is false when a new instance starts, and the handler sets it to true after the first invocation:
```go
// warmedUp is false when a new instance starts (Go's zero value for bool),
// so the first invocation on each instance is flagged as a cold start.
var warmedUp bool

func handleWebhook(w http.ResponseWriter, r *http.Request) {
	isColdStart := !warmedUp
	warmedUp = true
	// ...
}
```

Log cold starts explicitly. When you aggregate them in CloudWatch Logs Insights or Google Cloud Logging, you can correlate cold start rate with P99 latency spikes — and use that data to justify the cost of provisioned concurrency or minimum instances.
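As a sketch of that aggregation, a CloudWatch Logs Insights query that parses the `cold_start` field out of the log line above and counts cold starts per five-minute window (the field name matches the example log format; adjust the parse pattern to yours):

```
parse @message "cold_start=*" as cold_start
| filter cold_start like "true"
| stats count(*) as cold_starts by bin(5m)
```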
## When Serverless Is the Wrong Choice
Serverless is a poor fit for webhook consumers that have:
- **Strict ordering requirements.** Concurrent function instances process events in parallel. If your consumer requires that `order.created` is always processed before `order.updated` for the same order, serverless adds significant complexity — you need distributed locks or a sequential queue per entity. A long-running worker process with a Postgres queue is simpler.
- **Very high sustained throughput.** Above roughly 500 events/second sustained, the per-invocation overhead of serverless (billing model, cold start frequency, connection pool churn) often makes a fixed worker fleet cheaper to operate.
- **Long processing times.** If your handler takes 45–60 seconds — for example, making multiple downstream API calls or running a report — you're fighting against serverless's timeout model. A queue-backed worker is a better fit.
For the majority of webhook consumers — moderate traffic, mostly stateless processing, bursty or unpredictable arrival patterns — serverless is a solid choice. The cold start tax is real but manageable with provisioned concurrency, and the operational simplicity of not running a fleet of workers is genuine.
Building a reliable webhook consumer on Lambda or Cloud Run is mostly about understanding the constraints upfront — cold starts, connection limits, timeouts, concurrency — and designing around them before you deploy, not after you hit your first incident.
If you want to control destination timeouts, configure per-route retry policies, and inspect every delivery attempt regardless of how your consumer is hosted, get started with GetHook. The delivery gateway handles the reliability layer so your serverless function can focus on processing, not on retry logic.