reliability · deduplication · idempotency · architecture · best practices

Webhook Deduplication: Identifying and Handling Duplicate Events at Scale

Duplicate webhook events are inevitable — providers retry, networks glitch, and load balancers replay requests. Here's how to detect and discard duplicates before they reach your services.

Sofia Andreou
Product Manager
March 29, 2026
9 min read

Duplicate webhook events are not a provider bug — they're a feature of how reliable delivery works. Every webhook system that guarantees at-least-once delivery will occasionally send the same event more than once. Stripe says so explicitly in their docs. GitHub does too. When your retry policy overlaps with a slow destination, you'll create duplicates yourself.

The question isn't whether you'll receive duplicates. It's whether you've built the infrastructure to handle them correctly before they reach your business logic.

This post focuses on deduplication at the gateway layer — detecting and discarding duplicate events before they're enqueued for processing. This is distinct from idempotency in your handlers (which is still necessary, but is your last line of defense, not your first).


Why Duplicates Happen

Understanding the source of duplicates helps you choose the right deduplication strategy.

1. Provider retry on 5xx or timeout

If your ingest endpoint is slow (>10s) or returns a 500, the provider retries. If the original delivery actually succeeded on your side but the connection dropped before your response reached the provider, you now have the same event queued twice.

2. Provider-side retry storms

During a provider incident, queued retries can flood in all at once. An event that failed to deliver three hours ago may arrive at the same moment as a fresh copy of itself when the provider flushes its retry queue.

3. Your own gateway retry on ingest

If you run multiple ingest nodes behind a load balancer and one node fails mid-write, a retry might land on a different node — creating a second record.

4. Replay by the event creator

Developers and operators manually replay events ("resend this payment.succeeded") not realizing the original was already processed.

5. Multi-region fanout

Providers that send events from multiple regions don't always deduplicate across those origins. The same event can arrive from two different source IPs within milliseconds.


The Two Classes of Duplicates

Before building deduplication logic, you need to know what you're deduplicating on.

| Class | Description | Detection method |
| --- | --- | --- |
| Exact duplicates | Same event ID, same payload, identical request | Event ID lookup |
| Semantic duplicates | Same payload content, different event IDs | Content hash |
| Ambiguous near-duplicates | Slightly different payloads for the same logical event | Business logic (hard) |

Ambiguous near-duplicates are not a gateway problem — they require domain knowledge. Focus your deduplication infrastructure on exact and semantic duplicates.


Strategy 1: Event ID Deduplication (The Right Default)

Most well-behaved webhook providers include a stable, unique event identifier in their payload or headers. Stripe uses id in the payload body. GitHub uses X-GitHub-Delivery. Shopify uses X-Shopify-Webhook-Id.

The deduplication logic is simple: record every event ID you've seen. Reject events whose IDs you've already processed.

sql
-- PostgreSQL schema
CREATE TABLE webhook_event_ids (
    event_id     TEXT        NOT NULL,
    source_id    UUID        NOT NULL,
    received_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (source_id, event_id)
);

CREATE INDEX ON webhook_event_ids (received_at);
go
func (s *IngestStore) IsDuplicate(ctx context.Context, sourceID, eventID string) (bool, error) {
    res, err := s.db.ExecContext(ctx, `
        INSERT INTO webhook_event_ids (event_id, source_id)
        VALUES ($1, $2)
        ON CONFLICT (source_id, event_id) DO NOTHING
    `, eventID, sourceID)
    if err != nil {
        return false, err
    }

    // rowsAffected == 1 means the INSERT succeeded: this is a new event.
    // rowsAffected == 0 means the row already existed: it's a duplicate.
    rowsAffected, err := res.RowsAffected()
    if err != nil {
        return false, err
    }
    return rowsAffected == 0, nil
}

The ON CONFLICT DO NOTHING pattern is an atomic check-and-insert. There's no race condition between checking and inserting — even under concurrent ingest workers, only one INSERT will succeed per (source_id, event_id) pair.
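For context, here's a minimal sketch of how an ingest handler might sit on top of this check. The IngestServer type, the resolveSource and enqueueForDelivery helpers, and the header used for the event ID are illustrative assumptions, not a prescribed API:

go
import (
    "io"
    "net/http"
)

// Illustrative handler wiring around IsDuplicate; helper names are hypothetical.
func (s *IngestServer) handleWebhook(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "read error", http.StatusBadRequest)
        return
    }

    sourceID := s.resolveSource(r)               // map the request path to a configured source
    eventID := r.Header.Get("X-GitHub-Delivery") // or the provider's equivalent identifier

    dup, err := s.store.IsDuplicate(r.Context(), sourceID, eventID)
    if err != nil {
        http.Error(w, "storage error", http.StatusInternalServerError)
        return
    }
    if dup {
        // Acknowledge duplicates with a 2xx so the provider stops retrying.
        w.WriteHeader(http.StatusOK)
        return
    }

    s.enqueueForDelivery(sourceID, eventID, body)
    w.WriteHeader(http.StatusAccepted)
}

Returning a 2xx for a recognized duplicate matters: a 4xx or 5xx would only prompt the provider to retry the same event yet again.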

Retention window: You don't need to keep event IDs forever. A 30-day window covers the vast majority of retries. Prune old records with a scheduled job:

sql
DELETE FROM webhook_event_ids
WHERE received_at < NOW() - INTERVAL '30 days';
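If you run the pg_cron extension, the prune can live inside the database itself; a sketch, with an arbitrary job name and schedule:

sql
-- Requires pg_cron; runs the prune every day at 03:00.
SELECT cron.schedule(
    'prune-webhook-event-ids',
    '0 3 * * *',
    $$DELETE FROM webhook_event_ids WHERE received_at < NOW() - INTERVAL '30 days'$$
);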

Strategy 2: Content Hash Deduplication (For Providers Without Event IDs)

Some webhook providers — particularly older enterprise systems — don't include a stable event identifier. If you can't use an event ID, hash the payload.

go
import (
    "crypto/sha256"
    "encoding/hex"
)

func payloadHash(body []byte) string {
    h := sha256.Sum256(body)
    return hex.EncodeToString(h[:])
}

Store the hash alongside the event record and check for collisions before inserting:

go
hash := payloadHash(rawBody)

_, err := db.ExecContext(ctx, `
    INSERT INTO events (id, source_id, payload_hash, raw_body, ...)
    VALUES ($1, $2, $3, $4, ...)
    ON CONFLICT (source_id, payload_hash) DO NOTHING
`, newUUID(), sourceID, hash, rawBody)

Add a unique index so the ON CONFLICT check has a constraint to hit:

sql
CREATE UNIQUE INDEX events_source_payload_hash_idx
    ON events (source_id, payload_hash);

One PostgreSQL caveat: index predicates must be immutable, so a sliding-window clause like WHERE received_at > NOW() - INTERVAL '24 hours' is rejected at CREATE INDEX time. If you want the content-hash window to be time-bounded (so a legitimately identical payload arriving months later isn't discarded), keep the hashes in a small dedicated table that mirrors webhook_event_ids and prune it on the same schedule, rather than indexing the long-lived events table.

Caveats for content hashing:

  • Some providers include a timestamp in the payload that changes on every delivery attempt. You'll need to strip that field before hashing, which requires payload-specific normalization (see the sketch below this list).
  • Hash collisions are astronomically unlikely with SHA-256 but not impossible. If a false match would be costly for your business logic, add a secondary comparison before discarding the event: the stored raw body, or the provider's event ID where one exists.
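A minimal sketch of that normalization, assuming the volatile field is a top-level JSON key (the delivery_timestamp name is hypothetical):

go
import (
    "crypto/sha256"
    "encoding/hex"
    "encoding/json"
)

// normalizedPayloadHash strips volatile top-level fields before hashing.
// "delivery_timestamp" is a placeholder; drop whatever your provider varies per attempt.
func normalizedPayloadHash(body []byte) (string, error) {
    var payload map[string]json.RawMessage
    if err := json.Unmarshal(body, &payload); err != nil {
        return "", err
    }
    delete(payload, "delivery_timestamp")

    // json.Marshal sorts map keys, so the re-encoded form is stable across deliveries.
    canonical, err := json.Marshal(payload)
    if err != nil {
        return "", err
    }
    h := sha256.Sum256(canonical)
    return hex.EncodeToString(h[:]), nil
}

Nested volatile fields need deeper surgery; at that point it is often simpler to hash only the handful of fields that actually identify the event.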

Strategy 3: Time-Windowed Deduplication for High-Throughput Ingest

At high volume (millions of events per day), the webhook_event_ids table can become a write bottleneck. A few optimizations help:

Keep the dedup table small:

A sliding-window partial index (WHERE received_at > NOW() - INTERVAL '1 hour') looks like the obvious fix, but PostgreSQL rejects volatile functions such as NOW() in index predicates, so the moving window can't live in the index itself. What does work is keeping the table, and therefore the unique index behind the ON CONFLICT check, small: for high-volume sources, shorten the retention window and prune far more aggressively than the 30-day job above.

sql
-- Run frequently (e.g. every 10 minutes) for high-volume sources.
DELETE FROM webhook_event_ids
WHERE received_at < NOW() - INTERVAL '1 hour';

If you need both a long dedup window and high write throughput, partition webhook_event_ids by received_at so old data can be dropped as whole partitions instead of deleted row by row (note that PostgreSQL then requires the unique constraint to include the partition key).

Batch the dedup check:

If you're processing events in micro-batches, check the entire batch against the dedup table in one query rather than one query per event:

go
import (
    "context"
    "database/sql"

    "github.com/lib/pq" // provides pq.Array for the ANY($2) parameter
)

// filterDuplicates assumes every event in the batch comes from the same source
// (a common micro-batch shape); group by source first if yours don't.
func filterDuplicates(ctx context.Context, db *sql.DB, events []InboundEvent) ([]InboundEvent, error) {
    if len(events) == 0 {
        return nil, nil
    }

    ids := make([]string, len(events))
    for i, e := range events {
        ids[i] = e.ProviderEventID
    }

    rows, err := db.QueryContext(ctx, `
        SELECT event_id FROM webhook_event_ids
        WHERE source_id = $1 AND event_id = ANY($2)
    `, events[0].SourceID, pq.Array(ids))
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    seen := map[string]bool{}
    for rows.Next() {
        var id string
        if err := rows.Scan(&id); err != nil {
            return nil, err
        }
        seen[id] = true
    }
    if err := rows.Err(); err != nil {
        return nil, err
    }

    var fresh []InboundEvent
    for _, e := range events {
        if !seen[e.ProviderEventID] {
            fresh = append(fresh, e)
        }
    }

    // Another worker can insert between this read and your write, so the
    // follow-up INSERT of the fresh events should still use ON CONFLICT DO NOTHING.
    return fresh, nil
}

The Deduplication Window Tradeoff

Every deduplication strategy involves a window — the period during which you'll recognize and discard a duplicate. Choosing the window requires balancing two failure modes:

| Window too short | Window too long |
| --- | --- |
| Legitimate retries arriving hours later are no longer recognized as duplicates and get processed twice | Table grows large; index scans slow down |
| A provider delivering after a 48-hour outage causes double processing | Storage cost increases |

A 7-day window is a good default. It covers the longest retry windows of major providers (Stripe retries for 3 days; GitHub retries for 72 hours). Beyond 7 days, most providers have given up.

If you're building for a provider with unusual retry behavior — some enterprise ERP systems retry for 30 days — extend the window accordingly. The storage cost is linear and manageable.


What Deduplication Doesn't Replace

Gateway-level deduplication reduces duplicate processing significantly but cannot eliminate it entirely. Here's why:

  1. The dedup check and the event insert are not atomic across services. If your ingest worker dies after passing the dedup check but before persisting the event, a retry of the same event will pass the check again (it was never recorded as seen).

  2. Replay operations are intentionally re-processing past events. Your dedup layer should not block replays initiated by operators — those are distinct from provider retries (see the sketch after this list).

  3. Your handlers still need idempotency. A bug in your dedup layer, a table migration that clears the event IDs, or a network partition can all let a duplicate through. Your downstream handlers must be safe to call twice.
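A minimal sketch of the replay distinction from point 2, assuming a hypothetical IsReplay flag on the inbound event and an enqueueForDelivery helper; only operator-initiated replays would set the flag:

go
// ingestOne decides whether an inbound event is enqueued. event.IsReplay and
// enqueueForDelivery are illustrative names, not a prescribed API.
func ingestOne(ctx context.Context, store *IngestStore, event InboundEvent) error {
    if !event.IsReplay {
        dup, err := store.IsDuplicate(ctx, event.SourceID, event.ProviderEventID)
        if err != nil {
            return err
        }
        if dup {
            return nil // provider retry: quietly discard
        }
    }
    // Operator replays skip the check entirely and are always enqueued.
    return enqueueForDelivery(ctx, event)
}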

Think of gateway deduplication as a throughput optimization and user experience improvement — it prevents duplicate events from appearing in your event log, cluttering your observability dashboards, and triggering redundant downstream effects. It's not a substitute for idempotent handler design.

GetHook tracks a provider_event_id field on every inbound event and applies the ON CONFLICT DO NOTHING pattern at ingest time, so duplicates from the same provider are quietly discarded before they're queued for delivery.


Observability for Deduplication

Add a counter for discarded duplicates. This metric will tell you which providers are noisiest and whether your dedup window needs adjusting.

go
// Prometheus counter example (promauto from client_golang; metric name is illustrative).
var duplicatesDiscarded = promauto.NewCounterVec(prometheus.CounterOpts{
    Name: "webhook_duplicates_discarded_total",
    Help: "Inbound webhook events dropped by the dedup check.",
}, []string{"source_id", "provider"})

// At the point where the dedup check rejects an event:
duplicatesDiscarded.With(prometheus.Labels{
    "source_id": sourceID,
    "provider":  source.ProviderType,
}).Inc()

Alert if the duplicate rate for any source exceeds 5% over a 1-hour window. A spike usually means either the provider is stuck in a retry loop or your ingest endpoint is reporting failures for events it actually processed (responding 5xx after a successful write), prompting the provider to resend them.


Summary

| Scenario | Recommended approach |
| --- | --- |
| Provider includes a stable event ID | ON CONFLICT DO NOTHING on the event ID |
| Provider has no stable event ID | SHA-256 hash of the normalized payload |
| High throughput (>1M events/day) | Short retention window with aggressive pruning + batch dedup check |
| Replay events from operators | Skip the dedup check (use an explicit replay flag) |
| Near-duplicate business events | Handle in application logic, not the gateway |

Deduplication is plumbing — invisible when it works, painful when it doesn't. The patterns above are straightforward to implement and will handle the vast majority of real-world duplicate scenarios without adding meaningful latency to your ingest path.

See how GetHook handles ingest reliability →

Stop losing webhook events.

GetHook gives you reliable delivery, automatic retry, and full observability — in minutes.