Most webhook infrastructure is built in a single region. That decision makes complete sense at the start — your app is in us-east-1, your Postgres is in us-east-1, and your webhook worker is right there alongside them. Latency to your own database is sub-millisecond, and everything works.
Then your SaaS starts acquiring customers in Europe and Asia-Pacific. A customer in Frankfurt asks why their order confirmation webhooks take 350ms to arrive. A customer in Singapore sees 400ms delivery times for events their downstream system is waiting on synchronously. Your single-region webhook worker is making a round trip across two ocean cables for every delivery attempt.
This post walks through how to architect regional webhook forwarding: where the latency comes from, the tradeoffs involved, and a deployment pattern that reduces cross-region delivery latency significantly without requiring you to replicate your entire data plane.
Where the Latency Actually Lives
Before optimizing, it helps to decompose webhook delivery latency into its components.
For a typical delivery from us-east-1 to a destination in ap-southeast-1 (Singapore):
| Component | Typical latency |
|---|---|
| Event enqueue (producer → queue) | 2–5 ms |
| Worker poll interval | 0–500 ms |
| TCP connect to destination (cross-region) | 170–200 ms |
| TLS handshake (cross-region) | 170–200 ms (1 RTT) |
| HTTP request send + response | 20–50 ms |
| Worker write delivery result to DB | 2–5 ms |
| Total (worst case) | ~960 ms |
The single biggest cost is the TCP+TLS handshake: roughly two round trips across a cross-region path at ~180ms each, which adds up to ~360ms before a single byte of your payload is transmitted. This isn't a software problem; it's physics. Light takes time to cross an ocean.
The worker poll interval is the other high-variance component. If you're using a Postgres job queue with a 500ms sleep between polls, you're adding up to 500ms of idle wait before delivery even starts.
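If you want to see the handshake cost in your own numbers, Go's net/http/httptrace package can time each phase of a delivery. A minimal, self-contained sketch that runs against a local TLS test server standing in for a webhook destination (in production you would attach the same trace to real delivery requests):

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// measureHandshake times the TCP connect and TLS handshake phases of a
// single request against a local TLS test server.
func measureHandshake() (connect, handshake time.Duration) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	var connStart, connDone, tlsStart, tlsDone time.Time
	trace := &httptrace.ClientTrace{
		ConnectStart:      func(network, addr string) { connStart = time.Now() },
		ConnectDone:       func(network, addr string, err error) { connDone = time.Now() },
		TLSHandshakeStart: func() { tlsStart = time.Now() },
		TLSHandshakeDone:  func(cs tls.ConnectionState, err error) { tlsDone = time.Now() },
	}

	req, _ := http.NewRequest("POST", srv.URL, nil)
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := srv.Client().Do(req)
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
	return connDone.Sub(connStart), tlsDone.Sub(tlsStart)
}

func main() {
	connect, handshake := measureHandshake()
	fmt.Printf("tcp connect: %v, tls handshake: %v\n", connect, handshake)
}
```

Against a local server both phases are microseconds; pointed at a cross-region destination, the same trace makes the two-RTT handshake tax visible per delivery.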
The Naive Approach: Multi-Region Workers
The first instinct is to run webhook workers in multiple regions and let each worker deliver to nearby destinations. This seems straightforward but has an important complication: your event queue is probably tied to a single Postgres instance, and running workers in multiple regions means cross-region reads on every job poll.
us-east-1 Postgres ─────→ ap-southeast-1 Worker ─────→ Customer (Singapore)
 (event queue)             (200ms poll RTT)             (10ms)

You've moved the cross-region hop from the delivery step to the queue polling step. You haven't saved much, and you've added database connection complexity.
If you go this route, you need to be careful about worker assignment. Allowing any worker in any region to claim any job means a Singapore worker might pick up a job destined for a London endpoint, delivering it inefficiently. You need destination-aware job routing.
The Better Approach: Regional Forwarders with a Central Control Plane
The architecture that actually reduces latency keeps a single control plane (queue, event storage, delivery logs) in one primary region and deploys lightweight forwarding agents closer to destination clusters.
┌─────────────────────────────────────┐
│ Primary Region (us-east-1) │
│ │
Ingest ───────────→│ Event Queue + Postgres │
│ Management API │
│ Delivery result store │
└──────────┬────────────┬─────────────┘
│ │
HTTPS │ │ HTTPS
(encrypted │ │ (encrypted
internal) │ │ internal)
▼ ▼
┌───────────────┐ ┌───────────────┐
│ eu-west-1 │ │ ap-southeast-1│
│ Forwarder │ │ Forwarder │
└───────┬───────┘ └───────┬───────┘
│ │
▼ ▼
EU customer endpoints     APAC customer endpoints

The forwarder is a small stateless service. It receives delivery jobs from the primary region over a persistent connection (or via a pull-based API), delivers the webhook to the destination using a local TCP connection, and reports the outcome back to the primary region.
The key win: the forwarding agent in ap-southeast-1 makes a local TCP connection to the Singapore endpoint (~5ms RTT) rather than one initiated from us-east-1 (~190ms RTT). The TLS handshake cost drops from ~360ms to ~10ms.
Implementing the Forwarder
The forwarder itself is intentionally thin. It handles:
- Pulling pending delivery jobs routed to its region
- Executing the HTTP delivery with HMAC signing
- Reporting outcomes back to the primary
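The signing step is small enough to show inline. A minimal sketch of HMAC-SHA256 payload signing; the function name, and the choice to sign the payload alone rather than a timestamp plus payload, are illustrative rather than any specific provider's scheme:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signPayload computes a hex-encoded HMAC-SHA256 over the raw payload
// bytes. Real schemes often sign a timestamp plus the payload to
// prevent replay; this sketch signs the payload alone.
func signPayload(secret, payload []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	sig := signPayload([]byte("whsec_example"), []byte(`{"event":"order.created"}`))
	fmt.Println("Webhook-Signature:", sig)
}
```

The receiving side recomputes the same HMAC over the raw request body and compares with hmac.Equal, which is constant-time.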
Here's the core delivery loop in Go:
```go
type RegionalForwarder struct {
	region     string
	controlAPI string
	httpClient *http.Client
	apiKey     string
}

func (f *RegionalForwarder) Run(ctx context.Context) error {
	for {
		jobs, err := f.fetchJobs(ctx)
		if err != nil {
			log.Printf("fetch error: %v", err)
			time.Sleep(1 * time.Second)
			continue
		}
		// Deliver the batch in parallel. In production you'd bound the
		// number of in-flight goroutines.
		for _, job := range jobs {
			go f.deliver(ctx, job)
		}
		if len(jobs) == 0 {
			// Adaptive sleep: back off when the queue is empty
			select {
			case <-time.After(250 * time.Millisecond):
			case <-ctx.Done():
				return ctx.Err()
			}
		}
	}
}

func (f *RegionalForwarder) deliver(ctx context.Context, job DeliveryJob) {
	req, err := http.NewRequestWithContext(ctx, "POST", job.DestinationURL, bytes.NewReader(job.Payload))
	if err != nil {
		f.reportOutcome(ctx, DeliveryOutcome{JobID: job.ID, Result: "invalid_request", Error: err.Error()})
		return
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Webhook-Signature", job.Signature)

	resp, err := f.httpClient.Do(req)
	outcome := DeliveryOutcome{JobID: job.ID}
	if err != nil {
		outcome.Result = "network_error"
		outcome.Error = err.Error()
	} else {
		defer resp.Body.Close()
		// Cap the stored excerpt so a misbehaving endpoint can't bloat logs
		body, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		outcome.Result = classifyStatus(resp.StatusCode)
		outcome.StatusCode = resp.StatusCode
		outcome.ResponseExcerpt = string(body)
	}
	f.reportOutcome(ctx, outcome)
}
```

The httpClient is configured with connection pooling tuned for the local network conditions — higher MaxIdleConnsPerHost than you'd use for cross-region, lower dial timeout since latency is predictable:
```go
httpClient := &http.Client{
	Timeout: 30 * time.Second,
	Transport: &http.Transport{
		MaxIdleConns:        200,
		MaxIdleConnsPerHost: 20,
		IdleConnTimeout:     90 * time.Second,
		DialContext: (&net.Dialer{
			Timeout:   5 * time.Second, // Local dial, short timeout
			KeepAlive: 30 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout: 5 * time.Second,
	},
}
```

Routing Jobs to the Right Region
For regional forwarding to work, you need to know which region a destination is in. You have two options:
Option 1: Customer declares their region. When a customer configures a destination URL, they select their region (or it's inferred from a form asking where their service is hosted). This is simple but relies on user input.
Option 2: Latency probe at destination creation time. When a destination URL is registered, probe it from all forwarder regions and record the observed latency. Route future deliveries to the lowest-latency region.
The probe approach is more accurate, but adds complexity and requires all regional forwarders to be available at destination registration time. For most systems, the customer-declared approach is sufficient.
Store the routing decision in your destinations table:
```sql
ALTER TABLE destinations
  ADD COLUMN preferred_region TEXT NOT NULL DEFAULT 'us-east-1';

CREATE INDEX idx_destinations_region
  ON destinations (preferred_region)
  WHERE status = 'active';
```

The primary region worker skips jobs whose preferred_region doesn't match its own; regional forwarders claim only jobs matching their region.
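The region filter can live directly in the claim query. A sketch, assuming a delivery_jobs table with status, preferred_region, and scheduled_at columns (hypothetical names); FOR UPDATE SKIP LOCKED lets many claimers poll the same table without blocking each other:

```sql
-- Claim a batch of jobs for this forwarder's region.
-- $1 = claimer id, $2 = region, $3 = batch size
UPDATE delivery_jobs
SET status = 'claimed', claimed_by = $1, claimed_at = now()
WHERE id IN (
  SELECT id FROM delivery_jobs
  WHERE status = 'pending'
    AND preferred_region = $2
  ORDER BY scheduled_at
  LIMIT $3
  FOR UPDATE SKIP LOCKED
)
RETURNING id, destination_url, payload;
```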
Tradeoffs to Understand Before You Build This
Regional forwarding is not free. Before committing to this architecture, weigh the following:
| Factor | Regional forwarding cost | What helps |
|---|---|---|
| Operational complexity | More services to deploy, monitor, and upgrade | Container orchestration (ECS, GKE), GitOps |
| Cross-region control traffic | Forwarder ↔ primary API adds ~200ms per job fetch | Batch job fetch (pull 10–50 jobs per request) |
| Outcome reporting latency | Results reach primary DB 200ms later | Acceptable; delivery happened, result is eventual |
| HMAC signing key distribution | Forwarder needs destination signing secrets | Pull secrets at job time or cache with short TTL |
| Retry coordination | Failed jobs must re-enter the central queue | Forwarder reports failure; primary reschedules |
The cross-region control traffic deserves special attention. If your forwarder fetches one job at a time over a 200ms cross-region path, you've just moved the latency problem from delivery to job acquisition. Always batch: fetch 10–50 jobs per API call, deliver them in parallel, batch-report outcomes.
What This Looks Like in Practice
With regional forwarders deployed in eu-west-1 and ap-southeast-1, a delivery from the primary region to a Singapore customer changes:
Before:
us-east-1 worker → TCP+TLS to Singapore (360ms) → HTTP send/receive (50ms)
Total: ~410ms delivery latency

After:

us-east-1 → ap-southeast-1 forwarder (200ms, batched)
ap-southeast-1 forwarder → TCP+TLS to Singapore (10ms) → HTTP send/receive (10ms)
Total: ~220ms delivery latency, with the cross-region hop amortized across a batch

For individual deliveries, the reduction is meaningful but not dramatic — roughly halved. The more significant gain is reliability: a local TCP connection to a Singapore destination fails fast and predictably, rather than timing out after 30 seconds when the cross-region link degrades.
When You Actually Need This
Regional forwarding is an optimization worth building when:
- You have customers on multiple continents with latency-sensitive downstream systems
- Your webhook SLA specifies delivery within N seconds, and cross-region physics makes that impossible from a single region
- You're seeing destination timeout rates spike for specific geographic clusters (a good signal that distance, not destination health, is the failure mode)
For most teams at under 1M events/day with globally distributed customers, the simpler approach is to run workers in your primary region and accept 200–400ms cross-region delivery latency. That's often invisible to end users.
If you're building on GetHook and have customers across regions, start by reviewing your delivery attempt logs for patterns in timeout outcomes grouped by destination geography. If you see a concentration of timeouts for destinations in a specific region, that's your signal to invest in regional forwarding before it becomes a customer escalation.
Start instrumenting your delivery latency by destination region today so you have the data to make this decision when you need it. Set up GetHook →