Tags: webhooks, architecture, migration, reliability, engineering

Migrating from Polling to Webhooks: A Step-by-Step Guide

Polling is the training wheels of API integrations. Here's how to replace it with a production-grade webhook integration — without dropping events during the cutover.

Aleksa Vukovic
Developer Relations
March 23, 2026
11 min read

Polling works until it doesn't. You start with a cron job that calls /api/orders?since=... every minute, it's fine for a while, and then three things happen: your provider starts rate-limiting you, your data freshness requirements tighten, and your server bill grows because you're making 1,440 API calls a day to retrieve an average of 12 new events.

The fix is obvious — switch to webhooks. The provider pushes events to you the moment they happen. No polling, no wasted requests, sub-second freshness.

The migration, though, is where teams get tripped up. You can't just "turn off the poll and turn on the webhook." There's a transition window where events can fall through the gap, a new set of reliability concerns to handle on your side, and a cutover sequence that has to be executed carefully. This guide walks through it end to end.


Why Polling Breaks Down

Before committing to the migration, it's worth being precise about the failure modes. Polling fails in four specific ways:

| Failure mode | Symptom | Root cause |
| --- | --- | --- |
| Rate limiting | 429 Too Many Requests from provider | Too many calls per minute/hour |
| Event gaps | Events missed between poll cycles | Events created and resolved within one poll interval |
| Thundering herd | Spike in load after downtime | Catching up on missed polls simultaneously |
| Stale data | Users see outdated state | Long poll intervals (5m, 15m) to reduce API load |

The event gap problem is the most insidious. If you poll every 60 seconds and a payment is created and marked as failed within that window, your poll might never see the intermediate payment.created state — only the payment.failed. Or depending on your query, it might miss the event entirely if your timestamp filter is off by milliseconds.
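To make the gap concrete, here is a minimal simulation sketch. The `Transition` type and `observedByPolling` function are hypothetical names for illustration, and the model assumes the poll reads only the record's current status and that the history is sorted by time.

```go
package main

// Transition is one status change on a record, with the second it happened.
type Transition struct {
    AtSecond int
    Status   string
}

// observedByPolling returns the distinct statuses a snapshot poll would see
// when it reads the record's current status every intervalSeconds, up to and
// including untilSecond. history must be sorted by AtSecond.
func observedByPolling(history []Transition, intervalSeconds, untilSecond int) []string {
    var seen []string
    for t := intervalSeconds; t <= untilSecond; t += intervalSeconds {
        current := ""
        for _, tr := range history {
            if tr.AtSecond <= t {
                current = tr.Status // latest transition at or before this poll
            }
        }
        if current != "" && (len(seen) == 0 || seen[len(seen)-1] != current) {
            seen = append(seen, current)
        }
    }
    return seen
}
```

With a payment created at second 10 and failed at second 40, a 60-second poll observes only "failed", while a 30-second poll observes both states. Shortening the interval narrows the gap but never closes it.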

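Webhooks address all four at the source. Events are pushed as they happen, so there is no polling budget to exhaust, no intermediate state lost between poll cycles, and no staleness from long intervals.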


Step 1: Map What You're Polling

Before writing any code, produce a complete inventory of what your polling job does. This is the most commonly skipped step, and it causes incomplete webhook configurations later.

For each polling loop, document:

  • Which endpoint is being polled
  • Which fields from the response are being used
  • What action is taken on each record (write to DB, trigger workflow, send email)
  • How deduplication is handled — do you check if you've already processed a record?
  • What the polling interval is and what freshness SLA that represents

Example inventory table for a payment integration:

| Poll target | Interval | Action | Dedup mechanism |
| --- | --- | --- | --- |
| GET /payments?status=pending | 60s | Charge card, update order status | Check payment_id in processed_payments table |
| GET /refunds?created_after=... | 5m | Issue credit to customer balance | Check refund_id in ledger_entries |
| GET /disputes?status=open | 15m | Alert support team via Slack | Check dispute_id in dispute_alerts |

Every row in this table maps to one or more webhook event types you'll need to subscribe to.


Step 2: Identify the Equivalent Webhook Events

Most providers document their webhook event catalog alongside their REST API. Map each poll target to its webhook equivalent:

| Current poll | Equivalent webhook event(s) |
| --- | --- |
| GET /payments?status=pending | payment.created, payment.updated |
| GET /refunds?created_after=... | refund.created |
| GET /disputes?status=open | dispute.created, dispute.updated |

Watch for mismatches. A webhook may fire for states you don't care about (e.g., payment.updated fires for every status change, not just the ones your poll was filtering for). You'll need to add filtering logic on your side that your polling query previously handled implicitly through its query parameters, such as status=pending.
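A minimal sketch of that filter, assuming a simplified payload: the `PaymentEvent` fields and the `wantsEvent` name are hypothetical, and a real provider's payload will be shaped differently.

```go
package main

// PaymentEvent is a hypothetical, simplified webhook payload.
type PaymentEvent struct {
    Type   string // e.g. "payment.created", "payment.updated"
    ID     string
    Status string // e.g. "pending", "succeeded", "failed"
}

// wantsEvent reproduces the filter that GET /payments?status=pending applied
// implicitly: only payment events whose current status is "pending" pass.
func wantsEvent(e PaymentEvent) bool {
    switch e.Type {
    case "payment.created", "payment.updated":
        return e.Status == "pending"
    default:
        return false
    }
}
```

Events that fail the filter should still be acknowledged with a 2xx response. Otherwise the provider will keep retrying something you never intend to process.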

Also check whether the webhook payload contains everything your handler needs. Some providers send "thin" events — just { "type": "payment.updated", "id": "pay_123" } — requiring you to make a follow-up API call to fetch the full record. That's a fetch-on-webhook pattern, and it's worth knowing upfront because it changes your handler design.


Step 3: Build the Webhook Handler (Before Cutting Over)

The critical rule: build and validate your webhook handler while the polling job is still running. Don't cut over until you've confirmed the handler works in production.

A production-ready webhook handler needs four things:

1. Signature verification

Validate the HMAC signature on every request before doing anything else. This prevents spoofed events.

```go
import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "errors"
    "strconv"
    "strings"
    "time"
)

func verifySignature(body []byte, sigHeader, secret string) error {
    // Stripe-style: "t=<unix>,v1=<hex>"
    parts := strings.SplitN(sigHeader, ",", 2)
    if len(parts) != 2 {
        return errors.New("malformed signature header")
    }
    timestamp := strings.TrimPrefix(parts[0], "t=")
    signature := strings.TrimPrefix(parts[1], "v1=")

    // The signed payload is "<timestamp>.<raw body>".
    mac := hmac.New(sha256.New, []byte(secret))
    mac.Write([]byte(timestamp + "." + string(body)))
    expected := hex.EncodeToString(mac.Sum(nil))

    // hmac.Equal compares in constant time to avoid timing leaks.
    if !hmac.Equal([]byte(signature), []byte(expected)) {
        return errors.New("signature mismatch")
    }

    // Reject events older than 5 minutes (replay attack prevention)
    ts, err := strconv.ParseInt(timestamp, 10, 64)
    if err != nil || time.Now().Unix()-ts > 300 {
        return errors.New("event timestamp out of tolerance")
    }
    return nil
}
```
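The parsing above assumes the header carries exactly one `t=` and one `v1=` entry. Stripe-style headers can carry additional entries (another scheme such as `v0`, or a second `v1` while a secret is being rotated), so a more tolerant parser may be worth having. The sketch below uses a hypothetical helper named `parseSignatureHeader`. It only splits the header, and each returned candidate would still need to be compared against the expected HMAC.

```go
package main

import (
    "errors"
    "strings"
)

// parseSignatureHeader splits a "t=<unix>,v1=<hex>[,v1=<hex>...]" header into
// its timestamp and every v1 signature. Entries for other schemes are skipped.
func parseSignatureHeader(header string) (timestamp string, v1 []string, err error) {
    for _, field := range strings.Split(header, ",") {
        key, value, ok := strings.Cut(strings.TrimSpace(field), "=")
        if !ok {
            return "", nil, errors.New("malformed signature header")
        }
        switch key {
        case "t":
            timestamp = value
        case "v1":
            v1 = append(v1, value)
        }
    }
    if timestamp == "" || len(v1) == 0 {
        return "", nil, errors.New("signature header missing t or v1")
    }
    return timestamp, v1, nil
}
```

A verifier built on this would accept the request if any one of the `v1` candidates matches the expected signature, and reject it otherwise.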

2. Idempotent processing

Your polling job probably had implicit idempotency — you checked whether you'd seen a record before acting. Make that explicit in your webhook handler:

```go
import (
    "context"
    "database/sql"
)

func handlePaymentUpdated(ctx context.Context, db *sql.DB, event PaymentEvent) error {
    // Idempotency check: have we already processed this event?
    // NOTE: this check and the insert below are not atomic. Two concurrent
    // deliveries of the same event can both pass the check. A UNIQUE
    // constraint covering event_id, plus one transaction around all three
    // steps, closes that window.
    var exists bool
    err := db.QueryRowContext(ctx,
        `SELECT EXISTS(SELECT 1 FROM processed_events WHERE event_id = $1)`,
        event.ID,
    ).Scan(&exists)
    if err != nil {
        return err
    }
    if exists {
        return nil // already processed, return 200 to ack
    }

    // Process the event
    if err := applyPaymentUpdate(ctx, db, event); err != nil {
        return err
    }

    // Mark as processed
    _, err = db.ExecContext(ctx,
        `INSERT INTO processed_events (event_id, processed_at) VALUES ($1, NOW())`,
        event.ID,
    )
    return err
}
```

3. Fast acknowledgement

Return 200 OK within 5 seconds. Provider timeouts vary, so check the documented limit for yours and treat 5 seconds as an upper bound rather than a target. Do the heavy lifting asynchronously:

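```go
import (
    "io"
    "net/http"
)

// secret and queue are assumed to be package-level values set up at startup.
func webhookHandler(w http.ResponseWriter, r *http.Request) {
    // Cap the body at 1 MiB and surface read errors instead of ignoring them.
    body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
    if err != nil {
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }

    if err := verifySignature(body, r.Header.Get("X-Webhook-Signature"), secret); err != nil {
        http.Error(w, "unauthorized", http.StatusUnauthorized)
        return
    }

    // Enqueue for async processing, ack immediately
    if err := queue.Enqueue(body); err != nil {
        http.Error(w, "internal error", http.StatusInternalServerError)
        return
    }

    w.WriteHeader(http.StatusOK)
}
```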
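The handler above leaves `queue` undefined. As an illustration only, here is a bounded in-memory queue with a matching `Enqueue` method. `MemQueue` and `ErrQueueFull` are hypothetical names. An in-memory queue loses its contents on restart, so a production setup would more likely use a durable queue or a database table.

```go
package main

import "errors"

// ErrQueueFull tells the HTTP handler to answer 5xx so the provider retries.
var ErrQueueFull = errors.New("queue full")

// MemQueue is a bounded in-memory queue (hypothetical; not durable).
type MemQueue struct {
    ch chan []byte
}

// NewMemQueue creates a queue that buffers up to capacity payloads.
func NewMemQueue(capacity int) *MemQueue {
    return &MemQueue{ch: make(chan []byte, capacity)}
}

// Enqueue never blocks: it either buffers the payload or reports ErrQueueFull.
func (q *MemQueue) Enqueue(body []byte) error {
    select {
    case q.ch <- body:
        return nil
    default:
        return ErrQueueFull
    }
}

// Run processes payloads until the queue is closed; call it in a goroutine.
func (q *MemQueue) Run(process func([]byte)) {
    for body := range q.ch {
        process(body)
    }
}

// Close stops the queue and lets Run drain what is left.
// Enqueue must not be called after Close.
func (q *MemQueue) Close() { close(q.ch) }
```

Returning an error when the buffer is full, rather than blocking, keeps the acknowledgement fast and hands the retry back to the provider.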

4. Structured logging

Log every event with enough context to reconstruct the sequence later:

```json
{
  "event": "webhook.received",
  "event_type": "payment.updated",
  "event_id": "evt_1NxPQ2LkdIw...",
  "source": "stripe",
  "payload_bytes": 1243,
  "signature_valid": true,
  "timestamp": "2026-03-23T10:14:22.003Z"
}
```

Step 4: Run Both Systems in Parallel

This is the overlap period — the most important phase of the migration.

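Register your webhook endpoint with the provider and leave your polling job running. For the next 48–72 hours, both systems are active. Your webhook handler should process events and tag each row it writes with its source. It should write that row even when the poller has already processed the same event; otherwise the coverage queries cannot tell a skipped duplicate from a missed delivery: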

```sql
ALTER TABLE processed_events ADD COLUMN source TEXT DEFAULT 'poll';
-- 'poll' or 'webhook'
```

Run queries to compare coverage:

```sql
-- Events processed by webhook but not poll (webhook is ahead)
SELECT event_id FROM processed_events WHERE source = 'webhook'
EXCEPT
SELECT event_id FROM processed_events WHERE source = 'poll';

-- Events processed by poll but not webhook (webhook missed something)
-- processed_at is the only timestamp column on processed_events shown so far
SELECT event_id FROM processed_events WHERE source = 'poll'
  AND processed_at > '<webhook_registration_time>'
EXCEPT
SELECT event_id FROM processed_events WHERE source = 'webhook';
```

The second query is the important one. Any event that the poll caught but the webhook missed is a gap you need to investigate before cutting over.

Common causes of gaps during parallel mode:

  • The webhook event type doesn't cover all the states your poll was filtering for
  • The webhook subscription was created after some events had already fired
  • Events are being delivered to a different environment (staging vs. production)

Resolve gaps before proceeding. Don't rush this step.


Step 5: Handle the Backfill Window

When you register a webhook, the provider typically only sends events going forward. It does not replay historical events. This means any events that occurred between your last successful poll and your webhook registration time will never arrive via webhook.

Explicitly backfill that window:

  1. Note the timestamp of your last successful poll (T_last_poll)
  2. Note the timestamp of your webhook registration (T_webhook_start)
  3. Run a one-time script that polls the API for events between T_last_poll and T_webhook_start and processes them through your handler
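
```bash
# Example: fetch all events in the backfill window.
# T_LAST_POLL and T_WEBHOOK_START hold the two timestamps noted above.
# Pagination is provider-specific and not handled here.
curl -sS "https://api.provider.com/events?created_after=${T_LAST_POLL}&created_before=${T_WEBHOOK_START}" \
  -H "Authorization: Bearer $API_KEY" \
  | jq -c '.data[]' \
  | while IFS= read -r event; do
      ./process-historical-event "$event"
    done
```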

Mark these events with source = 'backfill' so you can distinguish them in audits.


Step 6: Cut Over

Once parallel mode has run cleanly for 48–72 hours with no gaps, you're ready to cut over.

The sequence:

  1. Disable the polling job — comment out the cron entry or set a feature flag
  2. Keep the polling code deployed for one more release cycle (easy rollback)
  3. Monitor your webhook handler error rate and processing volume for 24 hours
  4. Set alerts on webhook delivery failures via your gateway (GetHook, or your provider's dashboard)
  5. After 7 days of clean operation, delete the polling code

Do not delete the polling code at cutover. You want rollback to take seconds, not a redeploy.
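If you go the feature-flag route, the guard can be as small as the sketch below. `POLLING_ENABLED` is a hypothetical variable name, and `getenv` is passed in so that `os.Getenv` can be supplied in production and a stub in tests.

```go
package main

import "strings"

// pollingEnabled reports whether the poller should run. POLLING_ENABLED is a
// hypothetical flag name; polling stays on unless it is explicitly disabled.
func pollingEnabled(getenv func(string) string) bool {
    switch strings.ToLower(strings.TrimSpace(getenv("POLLING_ENABLED"))) {
    case "0", "false", "off", "no":
        return false
    default:
        return true
    }
}

// runPollCycle runs one poll only when the flag allows it and reports
// whether the poll actually ran.
func runPollCycle(getenv func(string) string, poll func() error) (ran bool, err error) {
    if !pollingEnabled(getenv) {
        return false, nil
    }
    return true, poll()
}
```

Rollback then means changing one environment value back, with no code change involved.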


Step 7: Manage Reliability Post-Cutover

Webhooks shift reliability responsibility to your side. The provider will retry if you return a non-2xx response, but you need to handle:

| Concern | Polling equivalent | Webhook equivalent |
| --- | --- | --- |
| Provider outage | Poll fails; retry next cycle | No events received; gaps when provider recovers |
| Your outage | Poll resumes from last timestamp | Events retried by provider; check retry window |
| Event ordering | Query ordered by timestamp | Not guaranteed; use event timestamps, not arrival order |
| Duplicate events | Idempotency check on ID | Same: idempotency check is still required |

For outages on your side, the most important thing to know is your provider's retry window. Stripe, for example, retries webhook delivery for 72 hours. If your outage exceeds that window, or the provider itself was down and never sent the events, you'll need to backfill from the API, using the same pattern you used during the migration.

GetHook helps on both ends: it absorbs inbound webhooks from providers and retries delivery to your services independently of the provider's retry logic, giving you a wider recovery window and full event history for backfill queries.


Checklist Before Cutover

Use this before disabling any polling job:

  • Webhook handler deployed to production with signature verification
  • Idempotency check in place for every event type
  • Events queued asynchronously — handler returns 200 in under 2 seconds
  • Parallel mode ran for at least 48 hours with no unexplained gaps
  • Backfill window between last poll and webhook registration was processed
  • Alerts configured on webhook delivery failure rate
  • Rollback plan documented (re-enable polling cron, timeline for rollback decision)

The migration from polling to webhooks is one of the higher-leverage infrastructure improvements you can make. Fewer wasted API calls, better data freshness, and a more honest model of how event-driven systems should work. Done carefully, the cutover is low-risk. Done hastily, you drop events in production.

Take the parallel period seriously, validate your idempotency logic, and the rest follows.
