Most webhook tutorials assume you control both sides of the delivery: you own the sender and the receiver, or you're integrating one external provider into one app. The multi-tenant case is different. You're building a platform where dozens — or thousands — of tenants each connect their own Stripe account, their own Shopify store, their own PagerDuty org. All of those providers need somewhere to send events. You're operating the receiving infrastructure.
This post covers the architectural and security decisions that separate a reliable multi-tenant ingest layer from one that leaks tenant data, gets overwhelmed by a single noisy tenant, or fails signature verification because secrets are mixed up.
The Core Problem: Shared Infrastructure, Isolated State
When you run a shared ingest endpoint (https://hooks.yourplatform.com/ingest), every external provider is delivering to the same infrastructure. The challenge is threefold:
- Routing — which tenant does this webhook belong to?
- Verification — each tenant has their own signing secret with their provider; verification must use the correct secret.
- Isolation — a burst from one tenant's Shopify store must not delay delivery for another tenant's Stripe events.
Getting routing wrong means events land in the wrong tenant's queue. Getting verification wrong means you either reject valid events or accept spoofed ones. Getting isolation wrong means one tenant's Black Friday traffic becomes every tenant's problem.
Step 1: Per-Tenant Ingest Tokens, Not a Shared Path
The first architectural decision is whether to use a single shared endpoint or per-tenant paths. The answer is always per-tenant paths.
```text
# Wrong — single endpoint, routing by payload content
POST /ingest

# Right — per-tenant path token, routing by URL
POST /ingest/src_7k2mxP9nJqR3bTvL
```

A path token approach (where each tenant gets a unique opaque token as part of the URL) gives you:
- Unambiguous routing at the HTTP layer, before you parse the body
- Revocation without coordination — if a tenant's token is compromised, generate a new one; other tenants are unaffected
- Audit granularity — you can log ingest traffic per token without joining on payload contents
Tokens should be cryptographically random and unpredictable, with at least 20 bytes of entropy. A src_ prefix helps ops distinguish them from other credentials in logs. Store only the token itself — there's no need to hash it, since it isn't a secret on your side: it authenticates the sender's knowledge of the URL, not the tenant's identity.
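As a concrete sketch, generation might look like this in Go (the function name and encoding choice are illustrative, not from any particular library):

```go
import (
	"crypto/rand"
	"encoding/base32"
	"strings"
)

// newPathToken returns a "src_"-prefixed token with 20 bytes of
// entropy from crypto/rand. The base32 alphabet (lowercased, no
// padding) is an arbitrary choice; any URL-safe encoding works.
func newPathToken() (string, error) {
	raw := make([]byte, 20)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	enc := base32.StdEncoding.WithPadding(base32.NoPadding)
	return "src_" + strings.ToLower(enc.EncodeToString(raw)), nil
}
```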
The one exception: if you need a human-memorable URL for a specific enterprise integration, you can layer a slug on top (/ingest/acme-corp/stripe) — but always validate with a lookup, never trust the slug directly as an authorization mechanism.
Step 2: Per-Tenant, Per-Provider Signing Secret Storage
Every tenant configures their own webhook secret with each external provider. Stripe gives tenant A a different secret than tenant B. You need to store all of these, verify inbound payloads against the correct one, and never mix them up.
The storage schema:
```sql
CREATE TABLE webhook_sources (
    id             TEXT PRIMARY KEY,          -- "src_7k2mxP9..."
    tenant_id      UUID NOT NULL REFERENCES tenants(id),
    provider       TEXT NOT NULL,             -- "stripe", "shopify", etc.
    path_token     TEXT NOT NULL UNIQUE,      -- the ingest URL segment; UNIQUE gives us the lookup index
    signing_secret BYTEA NOT NULL,            -- AES-256-GCM encrypted
    status         TEXT NOT NULL DEFAULT 'active',
    created_at     TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX webhook_sources_tenant_id ON webhook_sources (tenant_id);
```

The signing_secret column stores the provider's webhook secret, encrypted at rest with AES-256-GCM. The encryption key lives in your secrets manager, not in the database. Decryption happens at verification time, in memory, and the plaintext secret is never logged or written to disk.
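A minimal sketch of that decrypt path, assuming the column stores nonce || ciphertext and a 32-byte key loaded from your secrets manager at startup (the Encryptor type and its method name are illustrative):

```go
import (
	"crypto/aes"
	"crypto/cipher"
	"errors"
)

// Encryptor holds the 32-byte AES-256 key fetched from the secrets
// manager at startup. Illustrative shape, not a library type.
type Encryptor struct{ key []byte }

// Decrypt expects nonce || ciphertext, authenticates the blob via
// GCM, and returns the plaintext secret in memory only.
func (e *Encryptor) Decrypt(blob []byte) ([]byte, error) {
	block, err := aes.NewCipher(e.key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil) // authenticates, then decrypts
}
```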
The lookup at ingest time is a single indexed read:
```go
func (s *IngestHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	token := pathSegment(r.URL.Path, 2) // extract "src_7k2mxP9..." from URL
	source, err := s.store.GetSourceByToken(r.Context(), token)
	if err != nil || source.Status != "active" {
		// Return 404, not 401 — don't confirm that any token exists
		http.NotFound(w, r)
		return
	}

	secret, err := s.enc.Decrypt(source.SigningSecret)
	if err != nil {
		s.log.Error("failed to decrypt signing secret", "source_id", source.ID)
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}

	body, err := io.ReadAll(io.LimitReader(r.Body, 10<<20)) // 10 MB limit
	if err != nil {
		http.Error(w, "read error", http.StatusBadRequest)
		return
	}

	if !verifySignature(source.Provider, body, r.Header, secret) {
		http.Error(w, "invalid signature", http.StatusUnauthorized)
		return
	}

	// Enqueue for this tenant, tagged with source and tenant IDs
	if err := s.queue.Enqueue(r.Context(), QueueEntry{
		TenantID: source.TenantID,
		SourceID: source.ID,
		Provider: source.Provider,
		Payload:  body,
		Headers:  extractRelevantHeaders(r.Header, source.Provider),
	}); err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}

	w.WriteHeader(http.StatusOK)
}
```

The 404 on unknown tokens is deliberate. Returning 401 would confirm that a token exists but is invalid — useful to an attacker enumerating tokens. 404 leaks nothing.
Step 3: Provider-Specific Signature Verification
Different providers use different signing schemes. You need a dispatch table, not a single verification function:
| Provider | Signature Header | Algorithm | Timestamp Field |
|---|---|---|---|
| Stripe | Stripe-Signature | HMAC-SHA256, t=<ts>,v1=<hex> | Embedded in header |
| Shopify | X-Shopify-Hmac-Sha256 | HMAC-SHA256, base64 over body | None (replay window via X-Shopify-Webhook-Id) |
| GitHub | X-Hub-Signature-256 | sha256=<hex> over body | None |
| PagerDuty | X-PagerDuty-Signature | HMAC-SHA256, v1=<hex>, multiple sigs possible | None |
| SendGrid | X-Twilio-Email-Event-Webhook-Signature | ECDSA P-256 | X-Twilio-Email-Event-Webhook-Timestamp |
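To make one of these schemes concrete, here is a hedged sketch of Stripe-style verification: HMAC-SHA256 over "<timestamp>.<body>", a constant-time comparison against each v1 value, and a timestamp tolerance check. The 5-minute tolerance and the helper's shape are our choices; check the provider's current documentation before relying on it.

```go
import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"strconv"
	"strings"
	"time"
)

// verifyStripe checks a "t=<ts>,v1=<hex>[,v1=<hex>...]" header value.
// Sketch only; the 5-minute tolerance is our choice.
func verifyStripe(body []byte, header string, secret []byte) bool {
	var ts string
	var sigs [][]byte
	for _, part := range strings.Split(header, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(part), "=")
		if !ok {
			continue
		}
		switch k {
		case "t":
			ts = v
		case "v1":
			if sig, err := hex.DecodeString(v); err == nil {
				sigs = append(sigs, sig)
			}
		}
	}

	// Reject missing or stale timestamps: the replay defense for this scheme.
	sec, err := strconv.ParseInt(ts, 10, 64)
	if err != nil || time.Since(time.Unix(sec, 0)) > 5*time.Minute {
		return false
	}

	// HMAC-SHA256 over "<timestamp>.<body>", compared in constant time.
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(ts + "."))
	mac.Write(body)
	expected := mac.Sum(nil)
	for _, sig := range sigs {
		if hmac.Equal(sig, expected) {
			return true
		}
	}
	return false
}
```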
Each has its own replay-attack defense (or lack thereof). The safest approach is a dedicated verifier per supported provider, with a hard reject when no verifier matches the declared provider:
```go
func verifySignature(provider string, body []byte, headers http.Header, secret []byte) bool {
	switch provider {
	case "stripe":
		return verifyStripe(body, headers.Get("Stripe-Signature"), secret)
	case "shopify":
		return verifyShopify(body, headers.Get("X-Shopify-Hmac-Sha256"), secret)
	case "github":
		return verifyGitHub(body, headers.Get("X-Hub-Signature-256"), secret)
	case "pagerduty":
		return verifyPagerDuty(body, headers.Get("X-PagerDuty-Signature"), secret)
	default:
		return false // unknown provider, reject
	}
}
```

For providers without a timestamp field (GitHub, Shopify), you need a different replay defense. Track recently seen delivery IDs in a short-lived cache (Redis or a Postgres table with a TTL index) and reject any event whose delivery ID you've seen in the past 5 minutes.
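A sketch of that dedup check backed by Redis, assuming the go-redis v9 client (the key layout and 5-minute window are our choices):

```go
import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// isReplay records a delivery ID with a TTL and reports whether it
// was already seen. Keying by source ID keeps different tenants'
// provider delivery IDs from colliding.
func isReplay(ctx context.Context, rdb *redis.Client, sourceID, deliveryID string) (bool, error) {
	key := fmt.Sprintf("dedup:%s:%s", sourceID, deliveryID)
	// SETNX with a TTL is atomic: the first delivery claims the key;
	// any repeat within the window sees fresh == false.
	fresh, err := rdb.SetNX(ctx, key, 1, 5*time.Minute).Result()
	if err != nil {
		return false, err // decide fail-open vs fail-closed per your risk model
	}
	return !fresh, nil
}
```

Run this after signature verification, not before, so unauthenticated traffic can't pollute the cache.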
Step 4: Per-Tenant Queue Isolation
Enqueueing all tenant events into a single queue table without partitioning is asking for head-of-line blocking. Tenant A sends 50,000 events during a promotion. Tenant B's time-sensitive payment webhooks queue up behind them.
Three approaches, in increasing operational complexity:
Option A: Tenant-tagged rows with weighted worker polling
The simplest approach: a single queue table with a tenant_id column and a priority column. Workers poll using SELECT ... FOR UPDATE SKIP LOCKED ORDER BY priority DESC, created_at ASC. Assign higher priority to tenants on higher-tier plans.
```sql
SELECT id, tenant_id, payload
FROM webhook_queue
WHERE status = 'pending'
  AND next_attempt_at <= NOW()
ORDER BY priority DESC, created_at ASC
LIMIT 10
FOR UPDATE SKIP LOCKED;
```

This is simple but doesn't prevent starvation. If 50,000 high-priority rows exist, lower-priority tenants can still wait.
Option B: Round-robin worker with per-tenant cursor
Workers maintain a cursor per tenant and rotate through tenants in round-robin order. Each pass through a tenant processes at most N events before moving to the next. This gives every tenant a fair share of worker capacity regardless of queue depth.
```go
func (w *Worker) pollRoundRobin(ctx context.Context) {
	tenants, err := w.store.ActiveTenants(ctx)
	if err != nil {
		w.log.Error("listing active tenants", "err", err)
		return
	}
	for _, tenantID := range tenants {
		// At most batchSize events per tenant per pass, so no single
		// tenant's backlog can monopolize the worker.
		events, err := w.store.DequeueForTenant(ctx, tenantID, w.batchSize)
		if err != nil {
			w.log.Error("dequeue failed", "tenant_id", tenantID, "err", err)
			continue
		}
		for _, e := range events {
			w.deliver(ctx, e)
		}
	}
}
```

Option C: Separate queue partition per tenant (for large platforms)
At scale (thousands of tenants, millions of events per day), per-tenant Postgres partitions or separate queue tables give the strongest isolation. You can vacuum, index, and allocate workers independently per tenant. The operational cost is proportionally higher.
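A sketch of the Postgres flavor, with illustrative table and partition names:

```sql
-- Parent table list-partitioned by tenant; the partition key must be
-- part of the primary key.
CREATE TABLE webhook_queue (
    id              BIGSERIAL,
    tenant_id       UUID NOT NULL,
    status          TEXT NOT NULL DEFAULT 'pending',
    priority        INT NOT NULL DEFAULT 0,
    next_attempt_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    payload         BYTEA NOT NULL,
    PRIMARY KEY (tenant_id, id)
) PARTITION BY LIST (tenant_id);

-- Dedicated partition for a large tenant: vacuum, index, and assign
-- workers to it independently of everyone else.
CREATE TABLE webhook_queue_acme PARTITION OF webhook_queue
    FOR VALUES IN ('00000000-0000-0000-0000-000000000001');

-- Everyone else shares a default partition until their volume
-- justifies a dedicated one.
CREATE TABLE webhook_queue_default PARTITION OF webhook_queue DEFAULT;
```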
For most SaaS platforms, Option B covers the fair-scheduling problem with minimal complexity. Option C is for when you're running dedicated infrastructure per enterprise customer.
Step 5: Delivery Isolation Downstream
Ingest isolation is only half the picture. Downstream delivery — your worker calling the tenant's registered webhook endpoint — introduces a different isolation risk: a single slow tenant endpoint can exhaust worker goroutines.
Enforce a per-tenant concurrency limit:
```go
type DeliveryWorker struct {
	semaphores   map[string]chan struct{} // keyed by tenant_id
	mu           sync.Mutex
	maxPerTenant int
}

func (w *DeliveryWorker) semaphoreFor(tenantID string) chan struct{} {
	w.mu.Lock()
	defer w.mu.Unlock()
	if _, ok := w.semaphores[tenantID]; !ok {
		w.semaphores[tenantID] = make(chan struct{}, w.maxPerTenant)
	}
	return w.semaphores[tenantID]
}

func (w *DeliveryWorker) deliver(ctx context.Context, event Event) {
	sem := w.semaphoreFor(event.TenantID)
	select {
	case sem <- struct{}{}:
		defer func() { <-sem }()
		w.forwardEvent(ctx, event)
	default:
		// Tenant at concurrency limit — requeue with backoff
		w.store.RequeueWithDelay(ctx, event.ID, 5*time.Second)
	}
}
```

The per-tenant semaphore ensures that a tenant whose endpoint is slow or returning errors can't consume all delivery goroutines. Their events back up in the queue; everyone else's deliveries continue unaffected.
GetHook implements this isolation model natively — each source and destination has independent retry state, and delivery concurrency is bounded per destination rather than globally, which achieves the same effect without custom semaphore management.
Step 6: What to Audit Log (and What Not To)
Multi-tenant ingest creates compliance obligations. You're processing webhook payloads that may contain PII — customer names, email addresses, order details — belonging to your tenants' end customers.
Log at the envelope level, not the payload level:
```json
{
  "event": "ingest.received",
  "source_id": "src_7k2mxP9nJqR3bTvL",
  "tenant_id": "ten_a1b2c3d4",
  "provider": "stripe",
  "provider_event_id": "evt_1OqTH2LkdIwHu7ixGdCN4j3y",
  "payload_bytes": 1842,
  "signature_valid": true,
  "timestamp": "2026-04-16T09:14:22Z"
}
```

The payload_bytes field lets you track volume without logging the body. The provider_event_id lets you correlate with the provider's own logs for debugging. Payload content stays out of your audit log — your tenants' customers' PII doesn't accumulate in your logging infrastructure.
For GDPR purposes: if a tenant requests deletion, you need to delete the queued and delivered event payloads for that tenant. Store payloads in a tenant-scoped table (or a separate payload store keyed by tenant_id) so deletion is a targeted DELETE WHERE tenant_id = $1 rather than a scan of a shared events table.
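One way to structure that, with illustrative names:

```sql
-- Payloads live in their own tenant-scoped table, separate from the
-- queue's bookkeeping rows.
CREATE TABLE webhook_payloads (
    event_id  BIGINT PRIMARY KEY,
    tenant_id UUID   NOT NULL REFERENCES tenants(id),
    body      BYTEA  NOT NULL
);
CREATE INDEX webhook_payloads_tenant_id ON webhook_payloads (tenant_id);

-- A tenant deletion request becomes one targeted statement:
DELETE FROM webhook_payloads WHERE tenant_id = $1;
```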
Checklist: Multi-Tenant Ingest in Production
| Concern | Decision |
|---|---|
| Routing | Per-tenant path token; indexed lookup, no payload parsing for routing |
| Secret storage | AES-256-GCM encrypted per source; decrypted in memory only |
| Verification failure response | 401 Unauthorized; provider will retry |
| Unknown token response | 404 Not Found; leaks no information |
| Replay defense | Provider-specific (timestamp check or delivery ID dedup) |
| Queue isolation | Round-robin polling or per-tenant partition |
| Delivery isolation | Per-tenant concurrency semaphore |
| Audit logging | Envelope metadata only; no payload content |
| Tenant deletion | Payload table partitioned by tenant_id for clean wipe |
Multi-tenant webhook ingest is one of those infrastructure problems where the first version looks fine and the second version avoids three different production incidents. Getting the security boundaries right early — isolated secrets, 404 on unknown tokens, per-tenant queue isolation — is far cheaper than retrofitting them after you've had a cross-tenant data leak or a retry storm from one customer that took down everyone else's delivery pipeline.
If you want a solid foundation to build on rather than implementing this from scratch, GetHook handles per-tenant source isolation, encrypted secret storage, and independent delivery queues out of the box. Start here.