Most engineering teams treat webhook infrastructure as an undifferentiated line item — compute, database, and egress costs lumped into "platform overhead." That works until you need to answer one of these questions:
- What does it actually cost to deliver a webhook to one of your customers?
- Which customer segments generate the most delivery retries, and what is that costing you?
- If you add ten new enterprise customers with aggressive webhook usage, what incremental infrastructure spend should you plan for?
Without cost attribution at the event level, you're flying blind on webhook unit economics. This post walks through how to model the cost per delivered event, what drives that cost up, and the levers you have to pull it back down.
Why Webhook Costs Are Non-Linear
A naive cost model says: more events, proportionally more cost. Reality is more complicated, because webhook infrastructure has several non-linear cost drivers.
Retry amplification. A delivery that takes five attempts generates five times the delivery cost of one that succeeds on the first attempt. If your p50 delivery is one attempt but your p99 is four attempts, your average cost per event may be double what a simple throughput calculation suggests. A destination with a consistently unhealthy endpoint can generate 80% of your retry costs while accounting for 5% of your events.
Dead letter accumulation. Events that exhaust their retry policy end up in dead letter queues. If you store those events indefinitely, storage costs grow monotonically even when delivery volume is flat. Many teams never purge DLQ entries after a customer-side issue is resolved.
Egress amplification. Each retry is a network egress call. At cloud pricing — typically $0.08–$0.12 per GB for inter-region or internet egress — a 50 KB payload retried five times across 1,000 events costs $0.02–$0.03 in egress alone. Small individually, but at scale these numbers compound.
Fan-out multipliers. One inbound event routed to ten destinations generates ten delivery attempts, ten sets of retry state, and ten audit log entries. Cost-per-event attribution must account for route multiplicity, not just raw event count.
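The interaction between retry amplification and fan-out can be sketched as a simple multiplication. The numbers below are illustrative, not measurements from any real deployment:

```python
# Sketch of attempt amplification: retries and fan-out multiply raw
# event count into billable delivery attempts. Illustrative numbers only.

def total_attempts(events: int, destinations: int,
                   avg_attempts_per_delivery: float) -> float:
    """Each event fans out to every destination; each delivery may retry."""
    return events * destinations * avg_attempts_per_delivery

# 1M events, one healthy destination (1.1 attempts per delivery)
baseline = total_attempts(1_000_000, 1, 1.1)

# Same events fanned out to 10 destinations, some flaky (avg 1.8 attempts)
fanned_out = total_attempts(1_000_000, 10, 1.8)

print(f"baseline: {baseline:,.0f} attempts")
print(f"fan-out:  {fanned_out:,.0f} attempts ({fanned_out / baseline:.1f}x)")
```

The same throughput of inbound events produces wildly different attempt counts depending on route multiplicity and destination health, which is why attempt count, not event count, is the right costing base.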
Building a Cost Model
Start with these six components. Measure each separately before combining them.
| Component | What drives it | How to measure |
|---|---|---|
| Compute (ingest) | Events received per second | CPU core-hours ÷ ingest throughput |
| Compute (delivery worker) | Delivery attempts per second (not events) | CPU core-hours ÷ attempt throughput |
| Database (storage) | Events stored × retention period | Storage GB × hourly rate |
| Database (queue ops) | FOR UPDATE SKIP LOCKED poll frequency | Query count × avg query duration |
| Network egress | Delivery attempt payload size × attempt count | GB egressed × egress rate |
| DLQ storage | Failed events × retention period | Storage GB × hourly rate |
The key distinction: cost the delivery worker on attempts, not events. An event with three delivery attempts to two destinations generates six attempts total. If your worker costs $0.002 of compute per attempt, an event that generates six attempts costs $0.012 in compute before you factor in storage or egress.
Here is a simplified SQL query you can run against your events and delivery_attempts tables to get attempt-to-event ratios per customer:

```sql
SELECT
    e.account_id,
    COUNT(DISTINCT e.id) AS total_events,
    COUNT(da.id) AS total_attempts,
    ROUND(COUNT(da.id)::numeric / NULLIF(COUNT(DISTINCT e.id), 0), 2)
        AS attempts_per_event,
    COUNT(da.id) FILTER (WHERE da.outcome = 'success') AS successful_attempts,
    COUNT(da.id) FILTER (WHERE da.outcome IN ('timeout', 'network_error', 'http_5xx'))
        AS failed_attempts
FROM events e
LEFT JOIN delivery_attempts da ON da.event_id = e.id
WHERE e.created_at >= NOW() - INTERVAL '30 days'
GROUP BY e.account_id
ORDER BY total_attempts DESC;
```

Run this monthly. Accounts with attempts_per_event above 2.0 are your cost outliers — either their destinations are unhealthy, their event volume is unusually bursty, or your retry policy is misconfigured for their usage pattern.
The Retry Tax
Retry behavior is the largest variable cost in webhook delivery. Your retry schedule determines how many attempts a failing event accumulates before reaching dead letter status.
A typical exponential backoff schedule — 0s, 30s, 2m, 10m, 1h — means an event that never delivers generates five attempts spread over roughly 72 minutes. For a destination that is down for six hours, every event ingested during that window exhausts the full retry schedule. If you receive 10,000 events during a six-hour outage and your cost per attempt is $0.001, that's $50 in retry cost from one destination outage — before accounting for storage and egress.
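The outage arithmetic above can be packaged as a small calculator. The schedule and per-attempt rate mirror the example figures; substitute your own:

```python
# Back-of-envelope retry-tax calculator for a destination outage.
# Schedule and cost-per-attempt are the illustrative figures from the
# text, not universal constants.

RETRY_SCHEDULE_SECONDS = [0, 30, 120, 600, 3600]  # 0s, 30s, 2m, 10m, 1h
COST_PER_ATTEMPT = 0.001                          # dollars, illustrative

def outage_retry_cost(events_during_outage: int) -> float:
    """Every event ingested during the outage burns the full schedule."""
    attempts_per_event = len(RETRY_SCHEDULE_SECONDS)
    return events_during_outage * attempts_per_event * COST_PER_ATTEMPT

schedule_span_min = sum(RETRY_SCHEDULE_SECONDS) / 60   # ~72.5 minutes
print(f"schedule spans ~{schedule_span_min:.1f} minutes")
print(f"6h outage, 10,000 events: ${outage_retry_cost(10_000):.2f}")
```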
Three levers reduce retry tax:
Circuit breaking. Stop retrying to a destination that has failed consistently for the past N attempts or the past M minutes. A destination in an open-circuit state accumulates zero retry cost. The tradeoff is that events may reach DLQ faster than they would with continued retrying — acceptable if you offer event replay.
Per-destination retry limits. Rather than a global retry policy, let customers configure retry aggressiveness. A customer who processes high-volume, low-value events (analytics pings) may prefer fast DLQ to prolonged retry expense. A customer processing payment confirmations wants aggressive retry.
Delivery attempt caps per time window. Instead of allowing five retries regardless of destination state, cap the total attempts a single destination can generate in a rolling hour. This prevents one unhealthy destination from consuming disproportionate worker capacity.
Egress: The Cost That Sneaks Up on You
Egress billing varies significantly by provider and architecture:
| Traffic path | Typical cost |
|---|---|
| Same region, same VPC | Free or near-free |
| Same region, different VPC | $0.01–$0.02/GB |
| Cross-region (same cloud) | $0.02–$0.08/GB |
| Internet egress | $0.08–$0.12/GB |
| Internet egress (committed use) | $0.04–$0.06/GB |
If your webhook gateway delivers to customer endpoints on the public internet, every delivery attempt carries egress cost. For a 10 KB payload with a 3× retry multiplier, the egress per delivered event is roughly 30 KB. At $0.09/GB that's $0.0000027 — negligible per event but $2.70 per million events. At 100 million events per month, egress alone is $270 — enough to be worth optimizing.
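The per-event egress arithmetic generalizes to a one-line formula (decimal KB-to-GB conversion, matching how cloud providers bill):

```python
# Egress cost per delivered event, using the rates from the table above.
# All inputs are adjustable assumptions.

def egress_cost_per_event(payload_kb: float, retry_multiplier: float,
                          rate_per_gb: float) -> float:
    gb_per_event = payload_kb * retry_multiplier / 1_000_000  # KB -> GB
    return gb_per_event * rate_per_gb

per_event = egress_cost_per_event(payload_kb=10, retry_multiplier=3,
                                  rate_per_gb=0.09)
print(f"per event:          ${per_event:.7f}")
print(f"per million events: ${per_event * 1_000_000:.2f}")
```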
The most effective egress reduction is payload compression. Most webhook payloads are JSON, which compresses well. Adding Content-Encoding: gzip to outbound delivery attempts reduces typical payload size by 60–80%, cutting egress costs proportionally. Verify that the destination can handle compressed bodies before enabling — most modern frameworks do, but some legacy systems do not.
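A quick way to estimate your own savings is to gzip a representative payload. The payload below is synthetic; real ratios depend on your payload shape, but repetitive JSON typically lands in the 60–80% range:

```python
# Measure gzip savings on a synthetic JSON webhook payload.
# Field names and values are made up for illustration.

import gzip
import json

payload = json.dumps({
    "event": "order.updated",
    "items": [
        {"sku": f"SKU-{i:05d}", "status": "shipped", "qty": 1}
        for i in range(50)
    ],
}).encode("utf-8")

compressed = gzip.compress(payload)
savings = 1 - len(compressed) / len(payload)
print(f"raw: {len(payload)} B, gzip: {len(compressed)} B "
      f"({savings:.0%} smaller)")
```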
Storage Cost Attribution
Webhook event storage costs are driven by three variables: event count, payload size, and retention period.
The retention period is the lever most teams underutilize. If you store all events indefinitely, storage costs grow forever. Define a retention policy and enforce it:
```sql
-- Purge delivered events older than 90 days
DELETE FROM events
WHERE status = 'delivered'
  AND created_at < NOW() - INTERVAL '90 days';

-- Purge dead-letter events older than 180 days
DELETE FROM events
WHERE status = 'dead_letter'
  AND created_at < NOW() - INTERVAL '180 days';

-- Purge delivery attempts for purged events (cascade may handle this)
DELETE FROM delivery_attempts
WHERE event_id NOT IN (SELECT id FROM events);
```

Run these as scheduled jobs, not bulk deletes. Deleting millions of rows in a single transaction locks the table. Delete in batches of 1,000–10,000 rows with a short sleep between batches to avoid I/O saturation.
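A batched purge driver can be sketched as below. The SQL string assumes the events schema from this post, and `delete_batch` is injected so the loop itself can run without a live database:

```python
# Batched purge driver: delete in small chunks with a pause so the job
# never holds a long table lock or saturates I/O. PURGE_BATCH_SQL is
# illustrative; delete_batch is whatever executes it and returns rowcount.

import time

PURGE_BATCH_SQL = """
DELETE FROM events
WHERE id IN (
    SELECT id FROM events
    WHERE status = 'delivered'
      AND created_at < NOW() - INTERVAL '90 days'
    LIMIT {limit}
)
"""

def purge_in_batches(delete_batch, batch_size: int = 5_000,
                     pause_seconds: float = 0.5) -> int:
    """Run delete_batch(batch_size) until a batch comes back short."""
    total = 0
    while True:
        deleted = delete_batch(batch_size)
        total += deleted
        if deleted < batch_size:
            return total  # last batch was partial: purge complete
        time.sleep(pause_seconds)
```

Committing after each batch keeps transactions short; the sleep gives replication and vacuum a chance to keep up.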
DLQ events deserve separate retention logic. A dead-letter event has value for debugging — it represents a delivery failure that the customer may want to investigate and replay. Purging them too aggressively destroys that value. Purging them too conservatively inflates storage costs. A 180-day DLQ retention with optional customer-triggered purge is a reasonable default.
Surfacing Cost Attribution to Customers
If you operate a multi-tenant webhook platform — whether as a SaaS product or internal platform team serving multiple engineering teams — cost attribution enables better conversations.
Instead of absorbing all webhook infrastructure costs as platform overhead, surface per-account usage in terms that correlate to actual cost drivers:
- Total events received (30 days)
- Total delivery attempts (30 days)
- Delivery success rate
- Average attempts per event
- Total payload bytes delivered
You do not need to expose raw dollar figures. Surfacing these metrics lets customers understand the delivery health of their own destinations — and creates natural incentives to fix unhealthy endpoints that are generating retry cost for everyone.
GetHook's events dashboard surfaces delivery attempt counts alongside success rates, so customers can see at a glance whether a destination's retry count is unusually high. That visibility alone typically prompts customers to investigate and fix unhealthy endpoints before they generate significant retry accumulation.
A Practical Cost Per Event Calculation
Putting it together: here is a simplified model for a mid-scale deployment.
Assumptions:
- 5 million events per month
- 1.4 average attempts per event = 7 million total attempts
- Average payload: 8 KB
- Compute: 2 vCPUs at $0.048/vCPU-hour = $69/month
- Database: 100 GB at $0.115/GB-month = $11.50/month
- Egress: 7M × 8 KB = 56 GB × $0.09 = $5.04/month
- Storage: 5M events × 2 KB metadata = 10 GB × $0.115 = $1.15/month
Total: ~$86.69/month for 5 million events ≈ $0.0000173 per event
The 1.4× retry multiplier is the swing factor. If your retry multiplier climbs to 2.0 (from destination instability), attempts double to 10 million, compute rises to ~$98/month, egress to ~$7.20/month, and your cost-per-event increases by roughly 40%.
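The worked model above fits in a short function, which makes the retry-multiplier sensitivity easy to explore. The rates are the assumptions listed above; compute is treated as scaling linearly with attempts, which is a simplification:

```python
# The cost model from the text as a function of retry multiplier.
# Rates mirror the stated assumptions; compute is assumed to scale
# linearly with attempt count.

def monthly_cost(events: int, attempts_per_event: float,
                 payload_kb: float = 8.0) -> dict[str, float]:
    attempts = events * attempts_per_event
    compute = 69.0 * (attempts / 7_000_000)            # scales with attempts
    database = 11.50                                    # 100 GB working set
    egress = attempts * payload_kb / 1_000_000 * 0.09   # KB -> GB at $0.09/GB
    storage = events * 2 / 1_000_000 * 0.115            # 2 KB metadata/event
    total = compute + database + egress + storage
    return {"total": total, "per_event": total / events}

base = monthly_cost(5_000_000, 1.4)   # the 1.4x baseline above
worse = monthly_cost(5_000_000, 2.0)  # retry multiplier degrades to 2.0
print(f"1.4x: ${base['total']:.2f}/mo, ${base['per_event']:.7f}/event")
print(f"2.0x: ${worse['total']:.2f}/mo "
      f"(+{worse['total'] / base['total'] - 1:.0%})")
```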
Keeping retry multipliers low — through circuit breaking, destination health monitoring, and customer-side endpoint reliability — is where the leverage is.
Webhook infrastructure is predictable to cost if you measure the right things. The teams that get surprised by infrastructure bills are the ones treating event count as their only metric. Track attempts-per-event, watch egress, enforce retention policies, and circuit-break unhealthy destinations. Those four habits keep webhook unit economics stable as you scale.
If you want event-level delivery telemetry and per-destination attempt counts without building the instrumentation yourself, start with GetHook — the data you need to run this model is available out of the box.