
Writing a Webhook Consumer in Go: Idiomatic Patterns for Reliability

Most webhook consumer bugs share the same root causes: synchronous processing, missing idempotency, skipped signature verification. Here's how to write a production-grade Go webhook handler that avoids all three.

Jordan Okafor
Senior Backend Engineer
April 21, 2026
11 min read

Writing a webhook consumer looks deceptively simple: expose an HTTP endpoint, parse the JSON, do something with it. The naive version works fine in development and starts causing problems in production — usually right after a high-volume event lands during a traffic spike, or when a provider retries an event your handler already processed, or when your database is slow and the provider's 30-second timeout fires.

This post walks through the patterns that separate a webhook handler that works from one that's production-ready: signature verification you can actually trust, async processing that never blocks the HTTP response, idempotency that survives retries, and structured logging that makes debugging tractable.

The examples are in Go, using only the standard library. None of this requires a framework.


Start with Signature Verification

Every inbound webhook handler must verify the request signature before processing the payload. Signature verification is not optional polish — it's the mechanism that prevents an attacker from POSTing fabricated events to your endpoint.

Most providers use HMAC-SHA256 with a Stripe-compatible format: a header containing a Unix timestamp and a hex-encoded signature, like t=1714561200,v1=abc123.... The timestamp is included to prevent replay attacks.

go
package webhook

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strconv"
	"strings"
	"time"
)

const (
	signatureHeader    = "X-Webhook-Signature"
	maxBodyBytes       = 1 << 20 // 1 MB
	replayWindowSecs   = 300     // reject timestamps older than 5 minutes
)

var (
	ErrMissingSignature = errors.New("missing signature header")
	ErrInvalidSignature = errors.New("invalid signature")
	ErrTimestampTooOld  = errors.New("timestamp outside replay window")
)

// VerifySignature reads the body, verifies the HMAC, and returns the raw body
// bytes so callers don't need to re-read the (already consumed) request body.
func VerifySignature(r *http.Request, secret string) ([]byte, error) {
	sigHeader := r.Header.Get(signatureHeader)
	if sigHeader == "" {
		return nil, ErrMissingSignature
	}

	var ts, v1 string
	for _, part := range strings.Split(sigHeader, ",") {
		if strings.HasPrefix(part, "t=") {
			ts = strings.TrimPrefix(part, "t=")
		}
		if strings.HasPrefix(part, "v1=") {
			v1 = strings.TrimPrefix(part, "v1=")
		}
	}
	if ts == "" || v1 == "" {
		return nil, ErrInvalidSignature
	}

	unix, err := strconv.ParseInt(ts, 10, 64)
	if err != nil {
		return nil, ErrInvalidSignature
	}

	// Reject events outside the replay window.
	age := time.Now().Unix() - unix
	if age < 0 || age > replayWindowSecs {
		return nil, ErrTimestampTooOld
	}

	body, err := io.ReadAll(io.LimitReader(r.Body, maxBodyBytes))
	if err != nil {
		return nil, fmt.Errorf("reading body: %w", err)
	}

	// Reconstruct the signed payload: "<timestamp>.<body>"
	signed := ts + "." + string(body)
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(signed))
	expected := mac.Sum(nil)

	// Decode the received signature and compare with hmac.Equal, which is
	// constant-time and therefore safe against timing attacks.
	got, err := hex.DecodeString(v1)
	if err != nil || !hmac.Equal(expected, got) {
		return nil, ErrInvalidSignature
	}

	return body, nil
}

Three details that matter here:

  1. io.LimitReader caps the body at 1 MB. Without this, a malicious sender can POST a 500 MB body and exhaust your process's memory.
  2. hmac.Equal does a constant-time comparison. A naive expected == got string comparison is vulnerable to timing attacks.
  3. The replay window check rejects events with timestamps older than 5 minutes. This prevents an attacker who captured a valid request from replaying it later.
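It's worth pinning this behavior down with a test. Here's a minimal sketch that lives in the same package (so it can reuse signatureHeader) and constructs a signed request the way a provider would:

go
func TestVerifySignature(t *testing.T) {
	secret := "test-secret"
	body := `{"event_id":"evt_1","type":"order.created"}`
	ts := strconv.FormatInt(time.Now().Unix(), 10)

	// Sign "<timestamp>.<body>" exactly as the provider would.
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(ts + "." + body))
	sig := hex.EncodeToString(mac.Sum(nil))

	req := httptest.NewRequest(http.MethodPost, "/webhook", strings.NewReader(body))
	req.Header.Set(signatureHeader, "t="+ts+",v1="+sig)

	got, err := VerifySignature(req, secret)
	if err != nil {
		t.Fatalf("expected signature to verify, got %v", err)
	}
	if string(got) != body {
		t.Fatalf("returned body does not match input")
	}
}

From here it's easy to add cases for a tampered body, a stale timestamp, and a missing header, the three failure modes the verifier is supposed to catch.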

Never Process Synchronously

The most common mistake in webhook consumer design is doing real work inside the HTTP handler. Real work means: database writes, calls to downstream services, email sending, inventory updates. Any of these can be slow or fail.

When your handler does real work synchronously and takes more than the provider's timeout (commonly 10–30 seconds), the provider receives no response, marks the delivery as failed, and retries. Now your handler processes the same event again — and again. You've manufactured a retry loop not because of a transient failure, but because your handler was too slow.

The correct pattern: acknowledge immediately, process asynchronously.

go
type OrderHandler struct {
	queue  chan<- orderEvent
	secret string
	log    *slog.Logger
}

type orderEvent struct {
	EventID string          `json:"event_id"`
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload"`
}

func (h *OrderHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	body, err := VerifySignature(r, h.secret)
	if err != nil {
		h.log.Warn("signature verification failed",
			slog.String("error", err.Error()),
			slog.String("remote_addr", r.RemoteAddr),
		)
		http.Error(w, "unauthorized", http.StatusUnauthorized)
		return
	}

	var evt orderEvent
	if err := json.Unmarshal(body, &evt); err != nil {
		h.log.Error("malformed payload", slog.String("error", err.Error()))
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	// Non-blocking send: if the queue is full, return 503 so the provider retries.
	select {
	case h.queue <- evt:
		h.log.Info("event enqueued",
			slog.String("event_id", evt.EventID),
			slog.String("type", evt.Type),
		)
		w.WriteHeader(http.StatusOK)
	default:
		h.log.Warn("queue full, returning 503", slog.String("event_id", evt.EventID))
		http.Error(w, "service unavailable", http.StatusServiceUnavailable)
	}
}

The handler's job is exactly: verify, parse, enqueue, respond. The select with a default branch is intentional — if the in-memory queue is at capacity, returning a 503 is better than blocking the HTTP goroutine (which would exhaust your server's goroutine pool under sustained load). The provider will retry the event after backoff.

For production, replace the chan orderEvent with a durable queue backed by Postgres or a message broker. In-memory channels don't survive restarts. Any event in the channel when your process dies is lost.
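As a rough illustration of the Postgres-backed variant, here's a sketch assuming a hypothetical webhook_events table and an enqueueDurable helper; the handler writes the raw event to the table instead of a channel, and a separate worker polls for pending rows:

go
// Hypothetical schema for a durable queue:
// CREATE TABLE webhook_events (
//     event_id    TEXT PRIMARY KEY,
//     payload     JSONB NOT NULL,
//     status      TEXT NOT NULL DEFAULT 'pending',
//     received_at TIMESTAMPTZ NOT NULL DEFAULT now()
// );
func enqueueDurable(ctx context.Context, db *sql.DB, evt orderEvent, body []byte) error {
	// ON CONFLICT also gives de-duplication at enqueue time: a retried
	// delivery with the same event_id becomes a no-op insert.
	_, err := db.ExecContext(ctx,
		`INSERT INTO webhook_events (event_id, payload) VALUES ($1, $2)
		 ON CONFLICT (event_id) DO NOTHING`,
		evt.EventID, body,
	)
	return err
}

A worker then selects pending rows (FOR UPDATE SKIP LOCKED works well here) and marks them processed. That survives restarts and lets you run more than one worker.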


Idempotency Is Not Optional

Providers retry on any non-2xx response — including timeouts, network errors, and 5xx from your handler. Your consumer will receive the same event more than once. Design for it from the start.

The standard pattern is to track processed event IDs in a database table and skip duplicates:

go
// processed_events table:
// CREATE TABLE processed_events (
//     event_id     TEXT PRIMARY KEY,
//     processed_at TIMESTAMPTZ NOT NULL DEFAULT now()
// );

func processEvent(ctx context.Context, db *sql.DB, evt orderEvent) error {
	// Attempt to claim the event ID. With ON CONFLICT DO NOTHING, inserting a
	// duplicate succeeds but affects zero rows, so RowsAffected distinguishes
	// "inserted" from "already existed".
	res, err := db.ExecContext(ctx,
		`INSERT INTO processed_events (event_id) VALUES ($1)
		 ON CONFLICT (event_id) DO NOTHING`,
		evt.EventID,
	)
	if err != nil {
		return fmt.Errorf("claiming event: %w", err)
	}

	rows, err := res.RowsAffected()
	if err != nil {
		return fmt.Errorf("checking claim: %w", err)
	}
	if rows == 0 {
		// Duplicate delivery: already processed, skip without erroring.
		return nil
	}

	// Use the transactional variant below if the downstream operation must be
	// atomic with the claim.
	return handleOrderEvent(ctx, db, evt)
}

The ON CONFLICT DO NOTHING approach works for simple cases. For operations that must be atomic — claim the event AND update the order record — wrap both in a transaction:

go
func processEventTx(ctx context.Context, db *sql.DB, evt orderEvent) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	var inserted bool
	err = tx.QueryRowContext(ctx,
		`INSERT INTO processed_events (event_id) VALUES ($1)
		 ON CONFLICT (event_id) DO UPDATE SET event_id = EXCLUDED.event_id
		 RETURNING (xmax = 0) AS inserted`,
		evt.EventID,
	).Scan(&inserted)
	if err != nil {
		return fmt.Errorf("idempotency check: %w", err)
	}
	if !inserted {
		// Already processed; commit the no-op claim and return.
		return tx.Commit()
	}

	if err := updateOrderInTx(ctx, tx, evt); err != nil {
		return err
	}

	return tx.Commit()
}

The xmax = 0 trick returns true when the row was freshly inserted (not conflicted), letting you distinguish new events from duplicates in a single round trip.


Structured Logging for Debuggability

When a webhook fails to process correctly — wrong payload shape, a downstream service error, a duplicate you didn't expect — you need to be able to reconstruct what happened from logs. Structured logging with consistent fields makes this tractable.

Every log line from your webhook handler should include:

Field            Why
event_id         Correlate multiple log lines for the same event
event_type       Filter by what happened (order.created vs. order.cancelled)
source           Which provider or source sent this event
attempt_number   Distinguish first deliveries from retries
latency_ms       How long processing took; surfaces slow handlers before they time out
outcome          success, duplicate, processing_error, invalid_signature

Using Go's slog package (available since 1.21):

go
func (w *worker) process(ctx context.Context, evt orderEvent) {
	start := time.Now()
	logger := w.log.With(
		slog.String("event_id", evt.EventID),
		slog.String("event_type", evt.Type),
	)

	err := processEventTx(ctx, w.db, evt)
	latency := time.Since(start).Milliseconds()

	if err != nil {
		logger.Error("processing failed",
			slog.String("outcome", "processing_error"),
			slog.Int64("latency_ms", latency),
			slog.String("error", err.Error()),
		)
		return
	}

	logger.Info("event processed",
		slog.String("outcome", "success"),
		slog.Int64("latency_ms", latency),
	)
}

With this structure, finding all processing errors for a specific event type over the last hour is a single log query — no grep-and-parse required.
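That single log query assumes the logger emits structured output rather than slog's default text format. A minimal JSON setup sketch, with the handler options left to your taste; pass the result to OrderHandler and the worker instead of relying on slog.Default():

go
func newLogger() *slog.Logger {
	// JSON output makes event_id, outcome, latency_ms, and friends queryable
	// keys instead of fragments of a free-text message.
	return slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
		Level: slog.LevelInfo,
	}))
}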


Graceful Shutdown

When your process receives a SIGTERM (during a deploy or a scale-down), any events currently in your in-memory queue need to be flushed before the process exits. Without graceful shutdown, those events are dropped and the provider will eventually retry them — but with a gap in your processing timeline.

go
func main() {
	// db (*sql.DB) is assumed to be opened during startup; setup elided.
	queue := make(chan orderEvent, 1000)
	handler := &OrderHandler{
		queue:  queue,
		secret: os.Getenv("WEBHOOK_SECRET"),
		log:    slog.Default(),
	}

	srv := &http.Server{
		Addr:    ":8080",
		Handler: handler,
	}

	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Detach the worker from the signal context: runWorker exits when the
		// queue channel is closed, and cancellation must not abort events that
		// are still being drained.
		runWorker(context.WithoutCancel(ctx), queue, db)
	}()

	go func() {
		if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			log.Fatal(err)
		}
	}()

	<-ctx.Done()

	// Stop accepting new requests.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	srv.Shutdown(shutdownCtx)

	// Close the queue channel and wait for the worker to drain it.
	close(queue)
	wg.Wait()
}

The worker's runWorker loop should range over the channel — when the channel is closed, the range exits after processing remaining items, giving you a clean drain.
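A minimal runWorker along those lines, assuming a worker struct that carries the db and log fields the process method above uses:

go
func runWorker(ctx context.Context, queue <-chan orderEvent, db *sql.DB) {
	w := &worker{db: db, log: slog.Default()}
	// range returns only after the channel has been closed and drained, so
	// every event accepted before shutdown still gets processed.
	for evt := range queue {
		w.process(ctx, evt)
	}
}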


Error Response Strategy

What status code to return from your webhook handler matters more than most engineers realize, because it directly controls provider retry behavior:

Your response            Provider behavior
2xx                      Delivery considered successful; no retry
4xx (except 429)         Delivery considered permanently failed; no retry (most providers)
429                      Retry after backoff; respect Retry-After if present
5xx                      Retry with backoff
Timeout (no response)    Retry with backoff

Return 400 for payloads that are structurally invalid — wrong schema, missing required fields. Retrying a malformed event won't fix it. Return 500 for transient failures — database connection errors, downstream service unavailable. These are worth retrying. Return 200 as soon as you've enqueued the event (not after processing), so the provider doesn't time out waiting for your processing to complete.

The one subtle case: if your idempotency check detects a duplicate, return 200, not 409. From the provider's perspective, the event was delivered successfully on the first attempt. A 4xx on a duplicate often confuses retry logic and can trigger alerts on the provider side. In the async design above this falls out naturally: the handler returns 200 at enqueue time, and the worker skips the duplicate silently.


Putting It Together

A production-grade webhook consumer in Go has five components working together:

  1. Signature verification — before any parsing or processing, reject unauthenticated requests
  2. Body limit — cap at 1 MB (or your provider's documented max) to prevent memory exhaustion
  3. Async processing — acknowledge immediately, process out-of-band
  4. Idempotency — track processed event IDs; skip duplicates without erroring
  5. Graceful shutdown — drain the in-memory queue before the process exits

None of these are difficult to implement individually. The challenge is that they interact: async processing requires durable queuing to be reliable; idempotency requires a persistent store; graceful shutdown requires coordinating the HTTP server and the worker goroutine. Getting all five right together, on the first implementation, is the part that takes experience.

If you're building the sending side — exposing webhooks to your customers — GetHook handles delivery retries, dead-letter queuing, signing, and replay for you. The consumer patterns above apply to any webhook endpoint your team writes, regardless of which gateway sends the events.

See how GetHook handles outbound delivery →

Stop losing webhook events.

GetHook gives you reliable delivery, automatic retry, and full observability — in minutes.