infrastructure · performance · architecture · global delivery

Webhook Delivery at the Edge: Reducing Latency with Regional Forwarding

When your webhook gateway lives in us-east-1 and your customer's endpoint is in Tokyo, every delivery pays a 200ms tax. Here's how to architect regional forwarding to cut delivery latency without duplicating your entire infrastructure stack.

Lena Hartmann
Infrastructure Engineer
April 3, 2026
9 min read

Most webhook infrastructure is built in a single region. That decision makes complete sense at the start — your app is in us-east-1, your Postgres is in us-east-1, and your webhook worker is right there alongside them. Latency to your own database is sub-millisecond, and everything works.

Then your SaaS starts acquiring customers in Europe and Asia-Pacific. A customer in Frankfurt asks why their order confirmation webhooks take 350ms to arrive. A customer in Singapore sees 400ms delivery times for events their downstream system is waiting on synchronously. Your single-region webhook worker is making a round trip across two ocean cables for every delivery attempt.

This post walks through how to architect regional webhook forwarding: where the latency comes from, the tradeoffs involved, and a deployment pattern that reduces cross-region delivery latency significantly without requiring you to replicate your entire data plane.


Where the Latency Actually Lives

Before optimizing, it helps to decompose webhook delivery latency into its components.

For a typical delivery from us-east-1 to a destination in ap-southeast-1 (Singapore):

| Component | Typical latency |
| --- | --- |
| Event enqueue (producer → queue) | 2–5 ms |
| Worker poll interval | 0–500 ms |
| TCP connect to destination (cross-region) | 170–200 ms |
| TLS handshake (cross-region) | 170–200 ms (1 RTT) |
| HTTP request send + response | 20–50 ms |
| Worker write delivery result to DB | 2–5 ms |
| Total (worst case) | ~960 ms |

The single biggest cost is the TCP+TLS handshake: roughly two round trips across a cross-region path, each at ~180ms, add up to ~360ms before a single byte of your payload is transmitted. This isn't a software problem; it's physics. Light takes time to cross an ocean.

The worker poll interval is the other high-variance component. If you're using a Postgres job queue with a 500ms sleep between polls, you're adding up to 500ms of idle wait before delivery even starts.


The Naive Approach: Multi-Region Workers

The first instinct is to run webhook workers in multiple regions and let each worker deliver to nearby destinations. This seems straightforward but has an important complication: your event queue is probably tied to a single Postgres instance, and running workers in multiple regions means cross-region reads on every job poll.

us-east-1 Postgres ─────→ ap-southeast-1 Worker ─────→ Customer (Singapore)
     (event queue)              (200ms poll RTT)              (10ms)

You've moved the cross-region hop from the delivery step to the queue polling step. You haven't saved much, and you've added database connection complexity.

If you go this route, you need to be careful about worker assignment. Allowing any worker in any region to claim any job means a Singapore worker might pick up a job destined for a London endpoint, delivering it inefficiently. You need destination-aware job routing.


The Better Approach: Regional Forwarders with a Central Control Plane

The architecture that actually reduces latency keeps a single control plane (queue, event storage, delivery logs) in one primary region and deploys lightweight forwarding agents closer to destination clusters.

                     ┌─────────────────────────────────────┐
                     │         Primary Region (us-east-1)  │
                     │                                      │
  Ingest ───────────→│  Event Queue + Postgres              │
                     │  Management API                      │
                     │  Delivery result store               │
                     └──────────┬────────────┬─────────────┘
                                │            │
                         HTTPS  │            │  HTTPS
                    (encrypted  │            │  (encrypted
                     internal)  │            │   internal)
                                ▼            ▼
                    ┌───────────────┐  ┌───────────────┐
                    │  eu-west-1    │  │ ap-southeast-1│
                    │  Forwarder    │  │  Forwarder    │
                    └───────┬───────┘  └───────┬───────┘
                            │                  │
                            ▼                  ▼
                    EU customer endpoints  APAC customer endpoints

The forwarder is a small stateless service. It receives delivery jobs from the primary region over a persistent connection (or via a pull-based API), delivers the webhook to the destination using a local TCP connection, and reports the outcome back to the primary region.

The key win: the forwarding agent in ap-southeast-1 makes a local TCP connection to the Singapore endpoint (~5ms RTT) rather than one initiated from us-east-1 (~190ms RTT). The TLS handshake cost drops from ~360ms to ~10ms.


Implementing the Forwarder

The forwarder itself is intentionally thin. It handles:

  1. Pulling pending delivery jobs routed to its region
  2. Executing the HTTP delivery with HMAC signing
  3. Reporting outcomes back to the primary

Here's the core delivery loop in Go:

```go
type RegionalForwarder struct {
    region     string
    controlAPI string
    httpClient *http.Client
    apiKey     string
}

func (f *RegionalForwarder) Run(ctx context.Context) error {
    for {
        jobs, err := f.fetchJobs(ctx)
        if err != nil {
            log.Printf("fetch error: %v", err)
            time.Sleep(1 * time.Second)
            continue
        }

        for _, job := range jobs {
            go f.deliver(ctx, job)
        }

        if len(jobs) == 0 {
            // Adaptive sleep: back off when the queue is empty
            select {
            case <-time.After(250 * time.Millisecond):
            case <-ctx.Done():
                return ctx.Err()
            }
        }
    }
}

func (f *RegionalForwarder) deliver(ctx context.Context, job DeliveryJob) {
    req, err := http.NewRequestWithContext(ctx, http.MethodPost, job.DestinationURL, bytes.NewReader(job.Payload))
    if err != nil {
        // Malformed destination URL: report it rather than silently dropping the job
        f.reportOutcome(ctx, DeliveryOutcome{JobID: job.ID, Result: "invalid_request", Error: err.Error()})
        return
    }
    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("Webhook-Signature", job.Signature)

    resp, err := f.httpClient.Do(req)

    outcome := DeliveryOutcome{JobID: job.ID}
    if err != nil {
        outcome.Result = "network_error"
        outcome.Error = err.Error()
    } else {
        defer resp.Body.Close()
        body, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
        outcome.Result = classifyStatus(resp.StatusCode)
        outcome.StatusCode = resp.StatusCode
        outcome.ResponseExcerpt = string(body)
    }

    f.reportOutcome(ctx, outcome)
}
```

The httpClient is configured with connection pooling tuned for the local network conditions — higher MaxIdleConnsPerHost than you'd use for cross-region, lower dial timeout since latency is predictable:

```go
httpClient := &http.Client{
    Timeout: 30 * time.Second,
    Transport: &http.Transport{
        MaxIdleConns:        200,
        MaxIdleConnsPerHost: 20,
        IdleConnTimeout:     90 * time.Second,
        DialContext: (&net.Dialer{
            Timeout:   5 * time.Second,  // Local dial, short timeout
            KeepAlive: 30 * time.Second,
        }).DialContext,
        TLSHandshakeTimeout: 5 * time.Second,
    },
}
```
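Step 2 of the forwarder's responsibilities is HMAC signing. In the loop above the signature arrives precomputed on the job, but if forwarders sign at delivery time instead, the core is a few lines of crypto/hmac. The timestamp-dot-payload format below is illustrative, not a specific provider's scheme:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign computes a hex-encoded HMAC-SHA256 over "timestamp.payload".
// Binding the timestamp into the signed string lets the receiver
// reject replayed deliveries outside a tolerance window.
func sign(secret []byte, timestamp string, payload []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(timestamp))
	mac.Write([]byte("."))
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	sig := sign([]byte("whsec_test"), "1717000000", []byte(`{"event":"order.paid"}`))
	fmt.Println(sig)
}
```

On the receiving side, recompute the same string and compare with hmac.Equal, never `==`, to avoid timing side channels.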

Routing Jobs to the Right Region

For regional forwarding to work, you need to know which region a destination is in. You have two options:

Option 1: Customer declares their region. When a customer configures a destination URL, they select their region (or it's inferred from a form asking where their service is hosted). This is simple but relies on user input.

Option 2: Latency probe at destination creation time. When a destination URL is registered, probe it from all forwarder regions and record the observed latency. Route future deliveries to the lowest-latency region.

The probe approach is more accurate, but adds complexity and requires all regional forwarders to be available at destination registration time. For most systems, the customer-declared approach is sufficient.

Store the routing decision in your destinations table:

```sql
ALTER TABLE destinations
    ADD COLUMN preferred_region TEXT NOT NULL DEFAULT 'us-east-1';

CREATE INDEX idx_destinations_region
    ON destinations (preferred_region)
    WHERE status = 'active';
```

The primary region worker skips jobs whose preferred_region doesn't match its own; regional forwarders claim only jobs matching their region.


Tradeoffs to Understand Before You Build This

Regional forwarding is not free. Before committing to this architecture, weigh the following:

| Factor | Regional forwarding cost | What helps |
| --- | --- | --- |
| Operational complexity | More services to deploy, monitor, and upgrade | Container orchestration (ECS, GKE), GitOps |
| Cross-region control traffic | Forwarder ↔ primary API adds ~200ms per job fetch | Batch job fetch (pull 10–50 jobs per request) |
| Outcome reporting latency | Results reach primary DB 200ms later | Acceptable; delivery happened, result is eventual |
| HMAC signing key distribution | Forwarder needs destination signing secrets | Pull secrets at job time or cache with short TTL |
| Retry coordination | Failed jobs must re-enter the central queue | Forwarder reports failure; primary reschedules |

The cross-region control traffic deserves special attention. If your forwarder fetches one job at a time over a 200ms cross-region path, you've just moved the latency problem from delivery to job acquisition. Always batch: fetch 10–50 jobs per API call, deliver them in parallel, batch-report outcomes.


What This Looks Like in Practice

With regional forwarders deployed in eu-west-1 and ap-southeast-1, a delivery from the primary region to a Singapore customer changes:

Before:

us-east-1 worker → TCP+TLS to Singapore (360ms) → HTTP send/receive (50ms)
Total: ~410ms delivery latency

After:

us-east-1 → ap-southeast-1 forwarder (200ms, batched)
ap-southeast-1 forwarder → TCP+TLS to Singapore (10ms) → HTTP send/receive (10ms)
Total: ~220ms delivery latency, with the cross-region hop amortized across a batch

For individual deliveries, the reduction is meaningful but not dramatic — roughly halved. The more significant gain is reliability: a local TCP connection to a Singapore destination fails fast and predictably, rather than timing out after 30 seconds when the cross-region link degrades.


When You Actually Need This

Regional forwarding is an optimization worth building when:

  • You have customers on multiple continents with latency-sensitive downstream systems
  • Your webhook SLA specifies delivery within N seconds, and cross-region physics makes that impossible from a single region
  • You're seeing destination timeout rates spike for specific geographic clusters (a good signal that distance, not destination health, is the failure mode)

For most teams at under 1M events/day with globally distributed customers, the simpler approach is to run workers in your primary region and accept 200–400ms cross-region delivery latency. That's often invisible to end users.

If you're building on GetHook and have customers across regions, start by reviewing your delivery attempt logs for patterns in timeout outcomes grouped by destination geography. If you see a concentration of timeouts for destinations in a specific region, that's your signal to invest in regional forwarding before it becomes a customer escalation.
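That review can start as a simple aggregation: group attempt outcomes by destination region and compare timeout rates. A sketch over exported attempt records (the Attempt shape is illustrative, not GetHook's log schema):

```go
package main

import "fmt"

// Attempt is one delivery attempt record, as you might export it from
// your delivery logs (shape is illustrative).
type Attempt struct {
	DestinationRegion string
	Outcome           string // e.g. "success", "timeout", "http_error"
}

// timeoutRateByRegion returns the fraction of attempts per region that
// ended in a timeout. A rate that is high in one region but normal
// elsewhere points at distance, not destination health.
func timeoutRateByRegion(attempts []Attempt) map[string]float64 {
	total := map[string]int{}
	timeouts := map[string]int{}
	for _, a := range attempts {
		total[a.DestinationRegion]++
		if a.Outcome == "timeout" {
			timeouts[a.DestinationRegion]++
		}
	}
	rates := map[string]float64{}
	for region, n := range total {
		rates[region] = float64(timeouts[region]) / float64(n)
	}
	return rates
}

func main() {
	attempts := []Attempt{
		{"us-east-1", "success"}, {"us-east-1", "success"},
		{"ap-southeast-1", "timeout"}, {"ap-southeast-1", "success"},
	}
	fmt.Println(timeoutRateByRegion(attempts)["ap-southeast-1"]) // 0.5
}
```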


Start instrumenting your delivery latency by destination region today so you have the data to make this decision when you need it. Set up GetHook →

Stop losing webhook events.

GetHook gives you reliable delivery, automatic retry, and full observability — in minutes.