Most webhook delivery infrastructure is designed around one goal: get the event to the destination as fast as possible. That works for many use cases. It doesn't work well for customers running legacy on-premise systems that can't handle sudden spikes, or enterprise customers whose operations teams have defined maintenance windows, or businesses serving end-users in a single time zone who find 3 AM delivery of non-urgent events wasteful and noisy.
Delivery scheduling — the ability to control when an event lands, not just whether it arrives — is an underengineered part of webhook infrastructure. This post covers the three problems in this space: quiet-hour windows, burst shaping, and scheduled delivery. Each has distinct mechanics and trade-offs.
The Three Problems
Before jumping to implementation, it helps to be precise about what you're actually solving:
| Problem | Customer asks | What you build |
|---|---|---|
| Quiet hours | "Don't send webhooks between midnight and 6 AM our time" | Delivery hold with automatic release |
| Burst shaping | "Never deliver more than 100 events per minute to us" | Rate-limited delivery queue per destination |
| Scheduled delivery | "Deliver this event at 9 AM tomorrow" | Deferred delivery with explicit timestamp |
These are related but distinct. A customer might want burst shaping at all hours (their system can't handle spikes) but not quiet hours. Another might want quiet hours on non-critical event types but immediate delivery on payment events regardless of time. Design the controls independently so they can be composed.
Quiet Hours: Holding Events for a Delivery Window
A quiet-hour window means: if an event would be delivered during a blocked time range, hold it and release it when the window opens. The event is not dropped — it's held durably and delivered as soon as the window ends.
The data model
You need to store the quiet-hour configuration per destination:
```sql
CREATE TABLE destination_delivery_schedule (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    destination_id UUID NOT NULL REFERENCES destinations(id) ON DELETE CASCADE,
    timezone       TEXT NOT NULL,                -- IANA timezone, e.g. 'America/New_York'
    quiet_start    TIME NOT NULL,                -- e.g. '00:00:00'
    quiet_end      TIME NOT NULL,                -- e.g. '06:00:00'
    applies_to     TEXT[] NOT NULL DEFAULT '{}', -- empty = all event types
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

The timezone field is critical. Storing 00:00 to 06:00 without a time zone is ambiguous — when a customer says "midnight to 6 AM," they mean their local midnight, not UTC. Use IANA zone names (America/New_York, Europe/Berlin) rather than UTC offsets, because offsets shift with DST and break twice a year.
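To see what goes wrong with fixed offsets, compare how the same local 06:00 release time maps to UTC on either side of a DST transition. A quick demonstration using Go's standard time package (the dates are arbitrary):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	loc, _ := time.LoadLocation("America/New_York")

	// The same local quiet_end (06:00) on a winter date and a summer date.
	winter := time.Date(2026, time.January, 15, 6, 0, 0, 0, loc)
	summer := time.Date(2026, time.July, 15, 6, 0, 0, 0, loc)

	fmt.Println(winter.UTC()) // 11:00 UTC (EST, UTC-5)
	fmt.Println(summer.UTC()) // 10:00 UTC (EDT, UTC-4)
}
```

A destination pinned to a fixed -05:00 offset would release the summer backlog an hour late; the IANA zone name tracks the transition automatically.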
Computing the next delivery window
When your delivery worker picks up an event, it needs to determine whether the current time falls inside a quiet window and, if so, when to reschedule:
```go
import (
	"strconv"
	"strings"
	"time"
)

func NextDeliveryTime(now time.Time, schedule *DeliverySchedule) time.Time {
	loc, err := time.LoadLocation(schedule.Timezone)
	if err != nil {
		// Unknown timezone: deliver immediately rather than blocking indefinitely.
		return now
	}

	localNow := now.In(loc)
	localTime := localNow.Format("15:04:05")
	quietStart := schedule.QuietStart // e.g. "00:00:00"
	quietEnd := schedule.QuietEnd     // e.g. "06:00:00"

	// Zero-padded "HH:MM:SS" strings compare correctly as plain strings.
	if localTime >= quietStart && localTime < quietEnd {
		// Schedule delivery at quiet_end today (or tomorrow if quiet_end
		// has already passed).
		endParts := strings.Split(quietEnd, ":")
		h, _ := strconv.Atoi(endParts[0])
		m, _ := strconv.Atoi(endParts[1])
		candidate := time.Date(
			localNow.Year(), localNow.Month(), localNow.Day(),
			h, m, 0, 0, loc,
		)
		if candidate.Before(localNow) {
			// Recompute with time.Date rather than Add(24*time.Hour) so a
			// DST transition can't shift the local release time by an hour.
			candidate = time.Date(
				localNow.Year(), localNow.Month(), localNow.Day()+1,
				h, m, 0, 0, loc,
			)
		}
		return candidate.UTC()
	}

	return now
}
```

This function returns now if the event should be delivered immediately, or a future timestamp if it falls inside a quiet window. Your delivery worker sets next_attempt_at to this value before parking the job.
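Parking is then a single update. A minimal sketch, assuming the status, hold_reason, and next_attempt_at columns that the rest of this post uses:

```sql
-- Park the event until the quiet window opens.
-- $2 is the timestamp returned by NextDeliveryTime.
UPDATE events
SET status          = 'retry_scheduled',
    hold_reason     = 'quiet_hours',
    next_attempt_at = $2
WHERE id = $1;
```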
Cross-midnight windows
The logic above handles windows like 00:00–06:00 cleanly. Windows that cross midnight (23:00–05:00) need an adjustment: if quiet_end < quiet_start, the window spans two calendar days. Adjust the comparison:
```go
if quietEnd < quietStart {
	// Cross-midnight window: quiet if time >= start OR time < end.
	inQuiet = localTime >= quietStart || localTime < quietEnd
} else {
	inQuiet = localTime >= quietStart && localTime < quietEnd
}
```

Burst Shaping: Rate-Limiting Per Destination
Burst shaping is the outbound equivalent of rate limiting on your ingest API. Instead of constraining how fast events come in, you constrain how fast they go out to a specific destination.
The use case: a customer's endpoint can handle 100 req/min sustainably but not the 3,000-event spike that happens when you process a batch job at 2 AM. Without burst shaping, your worker delivers all 3,000 events as fast as possible, the destination's web server queues collapse, and you get 503s followed by a retry storm.
Token bucket per destination
The right data structure is a token bucket: a counter that refills at a fixed rate, capped at a maximum burst size. Each delivery consumes one token. If no tokens are available, the event is rescheduled.
```sql
CREATE TABLE destination_rate_state (
    destination_id  UUID PRIMARY KEY REFERENCES destinations(id) ON DELETE CASCADE,
    tokens          NUMERIC NOT NULL,
    last_refill_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    rate_per_minute INT NOT NULL DEFAULT 0, -- 0 = unlimited
    burst_capacity  INT NOT NULL DEFAULT 0  -- max tokens at any one time
);
```

Refill and consume atomically to avoid race conditions between concurrent workers:
```sql
-- Refill and attempt to consume one token. Refill and consume are folded
-- into a single UPDATE: modifying the same row twice in one statement
-- (e.g. via a data-modifying CTE) is not supported by PostgreSQL.
UPDATE destination_rate_state
SET tokens = LEAST(
        burst_capacity,
        tokens + rate_per_minute * EXTRACT(EPOCH FROM (now() - last_refill_at)) / 60.0
    ) - 1,
    last_refill_at = now()
WHERE destination_id = $1
  AND LEAST(
        burst_capacity,
        tokens + rate_per_minute * EXTRACT(EPOCH FROM (now() - last_refill_at)) / 60.0
      ) >= 1
RETURNING tokens;
```

If the update returns no rows (tokens < 1), the event gets rescheduled by 1.0 / rate_per_minute minutes — the time it takes to accumulate one new token.
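On the worker side, the no-token path is a short reschedule branch. A sketch, where tryConsumeToken wraps the atomic UPDATE above and park and deliver are hypothetical helpers:

```go
// deliverOrReschedule attempts a rate-limited delivery. tryConsumeToken,
// park, and deliver are assumed helpers, not part of any specific library.
func deliverOrReschedule(ctx context.Context, ev *Event, dest *Destination) error {
	if dest.RatePerMinute == 0 {
		// 0 = unlimited: skip the token bucket entirely.
		return deliver(ctx, ev, dest)
	}
	ok, err := tryConsumeToken(ctx, dest.ID)
	if err != nil {
		return err
	}
	if !ok {
		// No token available: wait exactly one token's worth of refill time.
		delay := time.Minute / time.Duration(dest.RatePerMinute)
		return park(ctx, ev.ID, "rate_limited", time.Now().Add(delay))
	}
	return deliver(ctx, ev, dest)
}
```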
What to expose to customers
Rate limit configuration should be per destination, not global. A reasonable API:
```
PATCH /v1/destinations/{id}

{
  "rate_limit": {
    "max_per_minute": 100,
    "burst_capacity": 250
  }
}
```

The burst_capacity field lets customers absorb short spikes (up to 250 events in a burst) while keeping sustained throughput at 100/min. Setting burst_capacity equal to max_per_minute gives strict rate limiting with no burst tolerance.
When a destination's queue is being shaped, surface this in your delivery logs: "status": "rate_limited", "retry_at": "2026-04-23T14:32:00Z". Customers debugging why events are slow to arrive need to know whether they're looking at a destination problem or self-imposed shaping.
Combining Quiet Hours and Burst Shaping
These two controls compose naturally. An event scheduled for delivery during quiet hours gets its next_attempt_at pushed to the window open time. When the window opens, the burst shaping layer controls how fast the backlog drains.
This means: if a customer has a 6-hour quiet window and received 10,000 events during it, those 10,000 events don't all land at 06:00:01. They land at the rate their burst shaping config allows. With max_per_minute: 100 and burst_capacity: 250, the first 250 events land in the first second, then the remaining 9,750 drain at 100/min over roughly 98 minutes.
This is the right behavior. A single flood event at window open time defeats the purpose of having rate controls.
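In the worker, composing the two controls is just evaluation order: the quiet-hour gate runs first, and the token bucket only sees events that are already eligible. A sketch reusing NextDeliveryTime and the hypothetical helpers from earlier:

```go
func scheduleDelivery(ctx context.Context, ev *Event, dest *Destination) error {
	now := time.Now()
	if next := NextDeliveryTime(now, dest.Schedule); next.After(now) {
		// Inside a quiet window: park until it opens. The token bucket
		// shapes the backlog drain once the window reopens.
		return park(ctx, ev.ID, "quiet_hours", next)
	}
	return deliverOrReschedule(ctx, ev, dest)
}
```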
Scheduled drain order
When a quiet window opens and multiple events are queued, deliver them in event creation order (FIFO), not retry-attempt order. The customer's system should see events in the order they occurred, with a delivery timestamp shift — not scrambled because retried events get priority.
```sql
SELECT id FROM events
WHERE destination_id = $1
  AND status = 'retry_scheduled'
  AND next_attempt_at <= now()
ORDER BY occurred_at ASC, id ASC
LIMIT 100
FOR UPDATE SKIP LOCKED;
```

The occurred_at ordering preserves causality. If an order.created event and an order.fulfilled event both land in the quiet window, the customer's handler should see them in the right sequence.
Event Priority: Bypassing Schedules for Critical Events
Not all events should be subject to quiet-hour holds. A payment failure or a security alert at 2 AM should probably land immediately, regardless of what the delivery schedule says.
Implement this with a priority flag on event types:
```json
{
  "delivery_schedule": {
    "quiet_start": "00:00",
    "quiet_end": "06:00",
    "timezone": "America/Chicago",
    "bypass_event_types": ["payment.failed", "account.suspended", "fraud.alert"]
  }
}
```

Events whose type is in bypass_event_types skip the quiet-hour check entirely. Your delivery worker checks this list before computing the next delivery time:
```go
// ShouldHold reports whether the event should be parked and, if so, until when.
func ShouldHold(event *Event, schedule *DeliverySchedule, now time.Time) (bool, time.Time) {
	for _, bypassType := range schedule.BypassEventTypes {
		if event.EventType == bypassType {
			// Critical event types skip the quiet-hour check entirely.
			return false, now
		}
	}
	next := NextDeliveryTime(now, schedule)
	return next.After(now), next
}
```

This gives customers fine-grained control: "hold most things, but let critical alerts through immediately."
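In the composed worker sketched earlier, ShouldHold simply replaces the bare NextDeliveryTime call, so bypass types flow straight into the rate-limited path (park and deliverOrReschedule are the same hypothetical helpers as before):

```go
if hold, next := ShouldHold(ev, dest.Schedule, time.Now()); hold {
	return park(ctx, ev.ID, "quiet_hours", next)
}
return deliverOrReschedule(ctx, ev, dest)
```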
Observability for Scheduled Delivery
When events are being held or shaped, you need visibility into the backlog. Three metrics matter:
| Metric | Query anchor | Alert condition |
|---|---|---|
| Quiet-hold backlog | COUNT(*) WHERE status = 'retry_scheduled' AND hold_reason = 'quiet_hours' | Backlog > 24h of normal volume |
| Rate-limit queue depth | COUNT(*) WHERE status = 'retry_scheduled' AND hold_reason = 'rate_limited' | Queue not draining; growing over 30 min |
| Oldest held event age | MIN(occurred_at) WHERE status = 'retry_scheduled' per destination | Oldest event > 2× expected hold duration |
The third metric catches edge cases: if a customer's quiet window configuration is wrong (e.g., overlapping windows, invalid time zone) and events are being held indefinitely, the oldest-event-age alert fires before the customer notices their integration is backed up.
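A sketch of the oldest-held-event query, assuming the same events columns as the earlier examples:

```sql
-- Oldest held event per destination, with how long it has been waiting.
SELECT destination_id,
       MIN(occurred_at)         AS oldest_held,
       now() - MIN(occurred_at) AS held_for
FROM events
WHERE status = 'retry_scheduled'
GROUP BY destination_id
ORDER BY oldest_held ASC;
```

Alert when held_for exceeds roughly twice the destination's expected hold duration.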
GetHook surfaces per-destination queue depth and hold state in the delivery dashboard so you can see at a glance which destinations are in a quiet window and how many events are queued behind them.
What to Build vs. What to Configure
If you're building this from scratch, prioritize in this order:
- Burst shaping first — it prevents the retry storms that turn customer outages into infrastructure incidents. A destination that can't handle your delivery rate is a reliability problem for everyone.
- Quiet hours second — valuable for enterprise customers with on-call schedules and SLA windows. Not needed for every deployment, but high-signal when it is needed.
- Priority bypass third — without it, quiet hours become a liability the first time a critical alert gets held until morning.
Expose all three as per-destination configuration. Global defaults are useful for setting a sensible baseline, but the destination level is where real control happens.
If you're building outbound webhook infrastructure for your own customers and want delivery scheduling without reimplementing all of this yourself, start with GetHook — per-destination rate controls and delivery scheduling are available out of the box.