testing · CI/CD · webhooks · engineering · integration tests

Testing Webhook Integrations in CI/CD Pipelines

Webhooks are notoriously hard to test in automated pipelines — they're async, they require a reachable HTTP endpoint, and third-party providers can't push to localhost. Here's how to build a reliable CI/CD testing strategy that actually catches webhook bugs before production.

Dmitri Volkov
Distributed Systems Engineer
March 22, 2026
11 min read

Webhooks have a testing problem. Unlike a REST API call where you control both ends of the connection, webhooks are push-based — the external provider decides when to send events, and they need an HTTP endpoint to push to. That endpoint can't be localhost:8080 in a GitHub Actions runner.

Most teams end up with one of two bad outcomes: they skip automated webhook testing entirely ("we'll test it manually in staging"), or they write unit tests that mock so much they don't catch the integration bugs that matter. Both approaches get you paged at 2am.

This post covers a layered testing strategy that works in CI/CD: what to test at each layer, how to handle the async problem, and how to make your pipeline actually fail when webhook delivery breaks.


The Four Layers of Webhook Testing

A complete webhook test suite covers four distinct concerns. Most teams only address the first two.

Layer       | What it tests                                         | Where it runs
Unit        | Signature verification, payload parsing, retry logic | CI (fast, no network)
Integration | End-to-end flow: ingest → queue → delivery           | CI (with Postgres, mock destination)
Contract    | Your webhook schema doesn't break consumers          | CI (schema validation)
Live smoke  | Real events reach a real staging endpoint            | Deployment pipeline

Start at the bottom of the stack and work up. You can't catch delivery bugs with unit tests, but unit tests catch the signature bugs that would make your integration tests meaningless.


Layer 1: Unit Tests

Unit tests for webhook code should cover:

Signature verification — given a known payload and secret, does your verification function accept the correct signature and reject an incorrect one?

go
import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "testing"
)

func TestHMACVerification(t *testing.T) {
    secret := "whsec_test_secret"
    payload := []byte(`{"event":"order.created","id":"evt_123"}`)
    timestamp := "1711112400"

    // Construct the expected signature
    mac := hmac.New(sha256.New, []byte(secret))
    mac.Write([]byte(timestamp + "." + string(payload)))
    expected := hex.EncodeToString(mac.Sum(nil))

    sig := fmt.Sprintf("t=%s,v1=%s", timestamp, expected)

    err := VerifySignature(payload, sig, secret)
    if err != nil {
        t.Fatalf("expected valid signature to pass: %v", err)
    }

    // Tampered payload must fail
    err = VerifySignature([]byte(`{"event":"order.created","id":"evt_999"}`), sig, secret)
    if err == nil {
        t.Fatal("expected tampered payload to fail verification")
    }
}

Timestamp replay protection — signatures older than your tolerance window should be rejected even if the HMAC is correct.
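
That check is small enough to sketch inline. Everything here is illustrative: the Unix-seconds timestamp format (matching the test above) and the five-minute window are assumptions to adapt, not fixed requirements.

go
import (
    "errors"
    "fmt"
    "strconv"
    "time"
)

// checkTimestamp rejects signatures whose timestamp falls outside the
// tolerance window. Taking now as a parameter keeps the function
// deterministic in tests; the 5-minute window is an illustrative default.
func checkTimestamp(ts string, now time.Time) error {
    const maxSkew = 5 * time.Minute
    sec, err := strconv.ParseInt(ts, 10, 64)
    if err != nil {
        return fmt.Errorf("malformed timestamp: %w", err)
    }
    skew := now.Sub(time.Unix(sec, 0))
    if skew > maxSkew || skew < -maxSkew {
        return errors.New("timestamp outside tolerance window")
    }
    return nil
}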

Payload parsing — your event handler correctly deserializes known good payloads, and returns appropriate errors for malformed ones.
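
A table-driven test covers both sides cheaply. ParseEvent here is a stand-in name for whatever your deserialization entry point is, not a prescribed API:

go
func TestParseEvent(t *testing.T) {
    cases := []struct {
        name    string
        payload string
        wantErr bool
    }{
        {"valid event", `{"event":"order.created","id":"evt_123"}`, false},
        {"missing event type", `{"id":"evt_123"}`, true},
        {"not JSON at all", `<xml/>`, true},
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            _, err := ParseEvent([]byte(tc.payload))
            if (err != nil) != tc.wantErr {
                t.Fatalf("ParseEvent() error = %v, wantErr %v", err, tc.wantErr)
            }
        })
    }
}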

Retry backoff calculation — given attempt number N, does NextAttemptAt() return the right delay?
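
For that to be table-testable, keep the schedule deterministic and add jitter elsewhere. A sketch, assuming a 30-second base and a one-hour cap; both numbers are placeholders to tune:

go
import "time"

// NextAttemptAt returns when attempt N should fire: 30s, 1m, 2m, 4m, ...
// capped at one hour. Deterministic on purpose, so a table-driven test
// can assert exact delays; add jitter at enqueue time if you need it.
func NextAttemptAt(now time.Time, attempt int) time.Time {
    if attempt < 1 {
        attempt = 1
    }
    if attempt > 8 {
        attempt = 8 // past this point the one-hour cap applies anyway
    }
    delay := 30 * time.Second << uint(attempt-1)
    if delay > time.Hour {
        delay = time.Hour
    }
    return now.Add(delay)
}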

Unit tests are fast, deterministic, and run on every commit. They should catch logic errors before you even spin up a container.


Layer 2: Integration Tests

This is where most teams underinvest. Integration tests verify that the full pipeline works: an HTTP request arrives at the ingest endpoint, gets persisted, gets queued, and gets delivered to a destination.

The setup problem

You need:

  1. A real Postgres instance (not mocked)
  2. A running API server
  3. A running delivery worker
  4. A destination HTTP server you can inspect

In CI, spin these up with Docker Compose. Here's a minimal docker-compose.test.yml:

yaml
version: "3.9"
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: gethook_test
      POSTGRES_USER: gethook
      POSTGRES_PASSWORD: gethook
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U gethook"]
      interval: 2s
      retries: 10

  api:
    image: gethook/api:latest
    environment:
      DATABASE_URL: postgres://gethook:gethook@postgres:5432/gethook_test?sslmode=disable
      PORT: "8080"
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "8080:8080"

  worker:
    image: gethook/worker:latest
    environment:
      DATABASE_URL: postgres://gethook:gethook@postgres:5432/gethook_test?sslmode=disable
    depends_on:
      - api

  destination:
    image: mendhak/http-https-echo:latest
    ports:
      - "8081:8080"

The destination service is a simple HTTP echo server — it accepts any request and responds 200. You can inspect what it received via its log output.

The async problem

The core difficulty with integration testing webhooks is that delivery is asynchronous. You send an event to the ingest endpoint, and then you have to wait for the worker to pick it up and deliver it. If you poll naively with time.Sleep(5 * time.Second), your tests are slow and flaky.

The right pattern is to poll with a short interval and a hard timeout:

go
func waitForDelivery(t *testing.T, apiKey, eventID string, timeout time.Duration) {
    t.Helper()
    deadline := time.Now().Add(timeout)
    for time.Now().Before(deadline) {
        // getEvent is the same API helper the integration test below uses.
        event := getEvent(t, apiKey, eventID)
        if event.Status == "delivered" {
            return
        }
        if event.Status == "dead_letter" {
            t.Fatalf("event entered dead letter: %s", eventID)
        }
        time.Sleep(200 * time.Millisecond)
    }
    t.Fatalf("event %s not delivered within %s", eventID, timeout)
}

This polls every 200ms with a maximum wait time. In a local Postgres-backed system, delivery typically happens within 1–2 seconds. Set the timeout to 15 seconds to handle CI variance without making the test suite slow.

What to assert in an integration test

go
func TestIngestAndDeliver(t *testing.T) {
    // 1. Create account, source, destination, route
    account := createTestAccount(t)
    source := createSource(t, account.APIKey, "test-source")
    dest := createDestination(t, account.APIKey, "http://destination:8080/hook")
    createRoute(t, account.APIKey, source.ID, dest.ID)

    // 2. POST to ingest endpoint
    payload := `{"event":"order.created","order_id":"ord_abc"}`
    resp := postToIngest(t, source.PathToken, payload)
    assert.Equal(t, 200, resp.StatusCode)

    var ingestResp struct{ Data struct{ ID string } }
    if err := json.NewDecoder(resp.Body).Decode(&ingestResp); err != nil {
        t.Fatalf("decoding ingest response: %v", err)
    }
    eventID := ingestResp.Data.ID

    // 3. Wait for delivery
    waitForDelivery(t, account.APIKey, eventID, 15*time.Second)

    // 4. Verify delivery attempt recorded
    event := getEvent(t, account.APIKey, eventID)
    assert.Equal(t, "delivered", event.Status)
    assert.Equal(t, 1, event.AttemptsCount)

    // 5. Verify destination received correct payload
    // (inspect destination echo server logs or a side channel)
}
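
For step 5, the echo-server route means shelling out to the compose logs. A side channel (like the smoke-test receiver in Layer 4) is sturdier, but the log search is only a few lines. assertDestinationReceived is a hypothetical helper under this compose setup:

go
import (
    "os/exec"
    "strings"
    "testing"
)

// assertDestinationReceived searches the echo container's logs for a
// marker string from the payload (e.g. the order ID).
func assertDestinationReceived(t *testing.T, needle string) {
    t.Helper()
    out, err := exec.Command(
        "docker", "compose", "-f", "docker-compose.test.yml",
        "logs", "destination",
    ).CombinedOutput()
    if err != nil {
        t.Fatalf("fetching destination logs: %v", err)
    }
    if !strings.Contains(string(out), needle) {
        t.Fatalf("destination never received a payload containing %q", needle)
    }
}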

Write at least these integration scenarios:

  • Happy path: ingest → immediate delivery
  • Retry path: destination returns 500, then 200 on retry (see the flaky-destination sketch after this list)
  • Dead letter: destination returns 500 on all 5 attempts
  • Replay: manually replay a dead-letter event, verify redelivery
  • Wrong signature: ingest with invalid HMAC, verify rejection
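
The retry and dead-letter scenarios need a destination you can program to fail. One way to build it, assuming the delivery worker can reach the test process (via host networking or host.docker.internal; that routing is environment-specific):

go
import (
    "net/http"
    "sync"
)

// flakyHandler fails the first N requests with a 500, then returns 200,
// which exercises the retry path end to end.
type flakyHandler struct {
    mu       sync.Mutex
    failures int // requests to fail before succeeding
    seen     int
}

func (h *flakyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    h.mu.Lock()
    h.seen++
    fail := h.seen <= h.failures
    h.mu.Unlock()
    if fail {
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    w.WriteHeader(http.StatusOK)
}

Serve it with httptest.NewServer(&flakyHandler{failures: 2}) from net/http/httptest and register its URL as a destination; set failures at or above your retry limit to drive the dead-letter scenario instead.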

Layer 3: Contract Testing

If you send webhooks to customers, your event schema is a public API. Breaking it breaks customer integrations silently — they don't get an error, they just stop processing events correctly.

Contract tests verify your event payloads match a documented schema. Define your event schemas as JSON Schema:

json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "order.created",
  "type": "object",
  "required": ["event", "id", "created_at", "data"],
  "properties": {
    "event": { "type": "string", "const": "order.created" },
    "id": { "type": "string", "pattern": "^evt_" },
    "created_at": { "type": "string", "format": "date-time" },
    "data": {
      "type": "object",
      "required": ["order_id", "amount", "currency"],
      "properties": {
        "order_id": { "type": "string" },
        "amount": { "type": "integer" },
        "currency": { "type": "string", "pattern": "^[A-Z]{3}$" }
      }
    }
  }
}

In your CI pipeline, validate every event your system can produce against its schema:

bash
# Install ajv-cli, plus ajv-formats so "format": "date-time" validates
npm install -g ajv-cli ajv-formats

# Validate a sample payload against the schema
ajv validate \
  -c ajv-formats \
  -s schemas/order.created.json \
  -d test/fixtures/order.created.sample.json

Run this in CI on every PR. When an engineer adds a new field without marking it optional, or renames an existing field, the contract test fails before it ships.


Layer 4: Live Smoke Tests in the Deployment Pipeline

After a staging deployment, run a smoke test that sends a real event through the real stack and verifies it reaches a controlled endpoint.

The trick is the "controlled endpoint" — you need an HTTPS URL that's reachable from your staging environment and that you can query to confirm receipt. Options:

A webhook.site-style endpoint you control — spin up a minimal HTTP service in staging that accepts requests and stores them in Redis or Postgres. After the smoke test, query that service to confirm the event arrived.

A dedicated smoke-test destination — a URL like https://smoke.staging.yoursaas.com/hook that writes received events to a table. Your smoke test queries that table after a delay.
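
That second option is a few dozen lines of Go. A minimal sketch, assuming Postgres via lib/pq and an illustrative smoke_events table; none of the names here are prescribed:

go
package main

import (
	"database/sql"
	"io"
	"log"
	"net/http"
	"os"

	_ "github.com/lib/pq"
)

func main() {
	// DATABASE_URL and the smoke_events table are illustrative assumptions.
	db, err := sql.Open("postgres", os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}

	http.HandleFunc("/hook", func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "read error", http.StatusBadRequest)
			return
		}
		// Store the raw payload so the smoke test can query for it later.
		if _, err := db.Exec(`INSERT INTO smoke_events (body) VALUES ($1)`, body); err != nil {
			http.Error(w, "store error", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}

With a receiver like that in place, the smoke test itself is a small script: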

bash
#!/bin/bash
set -euo pipefail

API_KEY="${STAGING_API_KEY}"
BASE_URL="https://api.staging.yoursaas.com"
SMOKE_DEST="https://smoke.staging.yoursaas.com/hook"

# Create a source for this smoke test run
SOURCE=$(curl -sf -X POST "$BASE_URL/v1/sources" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"smoke-test-'$(date +%s)'"}')

SOURCE_TOKEN=$(echo "$SOURCE" | jq -r '.data.path_token')
SOURCE_ID=$(echo "$SOURCE" | jq -r '.data.id')

# Send an event
EVENT=$(curl -sf -X POST "$BASE_URL/ingest/$SOURCE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"event":"smoke.test","ts":'$(date +%s)'}')

EVENT_ID=$(echo "$EVENT" | jq -r '.data.id')

# Wait and poll for delivery (max 30s)
for _ in $(seq 1 15); do
  # "|| true" keeps a transient API blip from killing the poll under set -e
  STATUS=$(curl -sf -H "Authorization: Bearer $API_KEY" \
    "$BASE_URL/v1/events/$EVENT_ID" | jq -r '.data.status' || true)
  if [ "$STATUS" = "delivered" ]; then
    echo "Smoke test passed: event $EVENT_ID delivered"
    exit 0
  fi
  sleep 2
done

echo "Smoke test FAILED: event $EVENT_ID not delivered after 30s"
exit 1

Run this script at the end of your staging deployment job. If it fails, block the production promotion.


CI Pipeline Structure

Here's how these layers fit together in a GitHub Actions workflow:

yaml
on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test          # fast, no containers

  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker compose -f docker-compose.test.yml up -d --wait
      - run: make integration-test
      - if: always()
        run: docker compose -f docker-compose.test.yml down

  contract-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm install -g ajv-cli ajv-formats
      - run: make validate-schemas

  deploy-staging:
    needs: [unit-tests, integration-tests, contract-tests]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy-staging.sh

  smoke-test:
    needs: [deploy-staging]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/smoke-test.sh

Total CI time for a well-structured pipeline: unit tests in under 30 seconds, integration tests in 2–3 minutes, smoke tests in under 60 seconds. The full pipeline takes under 5 minutes — fast enough to run on every PR.


Common Pitfalls

Mocking too much in integration tests. If your integration test replaces the delivery worker with a mock, you're not testing delivery. Use real workers.

Not testing the retry path. The happy path works because it always worked. Retry paths break because nobody tests them. Deliberately inject a destination that fails twice before succeeding.

Flaky async assertions. time.Sleep(3 * time.Second) is the most common source of flaky webhook tests. Use polling with a deadline instead.

Not cleaning up test data. Integration tests that leave orphaned events and sources accumulate over time and slow down subsequent runs. Clean up in t.Cleanup() or use a test-specific database schema that gets dropped after each run.
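
The t.Cleanup version looks like this; apiCreateAccount and apiDeleteAccount are stand-in helper names:

go
// createTestAccount provisions an account and schedules its teardown,
// so every test run leaves the database as it found it.
func createTestAccount(t *testing.T) Account {
    t.Helper()
    account := apiCreateAccount(t)
    t.Cleanup(func() { apiDeleteAccount(t, account.ID) })
    return account
}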

Ignoring contract tests until you break a customer. Schema changes feel small from the inside. To a customer whose code expects amount in cents and suddenly gets it in dollars — or doesn't get it at all — it's an outage.


Webhook reliability starts before you deploy. A pipeline that exercises ingest, delivery, retry, and dead-letter paths on every PR catches the bugs that would otherwise become customer incidents.

GetHook exposes event status, attempt count, and delivery outcomes through the API — making the polling-based assertion pattern above straightforward to implement against a real webhook gateway. See the GetHook API reference →

Stop losing webhook events.

GetHook gives you reliable delivery, automatic retry, and full observability — in minutes.