Most webhook delivery infrastructure is designed around one goal: get the event to the destination as fast as possible. That works for many use cases. It doesn't work well for customers running legacy on-premise systems that can't handle sudden spikes, or enterprise customers whose operations teams have defined maintenance windows, or businesses serving end-users in a single time zone who find 3 AM delivery of non-urgent events wasteful and noisy.
Delivery scheduling — the ability to control when an event lands, not just whether it arrives — is an underengineered part of webhook infrastructure. This post covers the three problems in this space: quiet-hour windows, burst shaping, and scheduled delivery. Each has distinct mechanics and trade-offs.
The Three Problems
Before jumping to implementation, it helps to be precise about what you're actually solving:
| Problem | Customer asks | What you build |
|---|---|---|
| Quiet hours | "Don't send webhooks between midnight and 6 AM our time" | Delivery hold with automatic release |
| Burst shaping | "Never deliver more than 100 events per minute to us" | Rate-limited delivery queue per destination |
| Scheduled delivery | "Deliver this event at 9 AM tomorrow" | Deferred delivery with explicit timestamp |
These are related but distinct. A customer might want burst shaping at all hours (their system can't handle spikes) but not quiet hours. Another might want quiet hours on non-critical event types but immediate delivery on payment events regardless of time. Design the controls independently so they can be composed.
Quiet Hours: Holding Events for a Delivery Window
A quiet-hour window means: if an event would be delivered during a blocked time range, hold it and release it when the window opens. The event is not dropped — it's held durably and delivered as soon as the window ends.
The data model
You need to store the quiet-hour configuration per destination:
```sql
CREATE TABLE destination_delivery_schedule (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    destination_id UUID NOT NULL REFERENCES destinations(id) ON DELETE CASCADE,
    timezone       TEXT NOT NULL,                -- IANA timezone, e.g. 'America/New_York'
    quiet_start    TIME NOT NULL,                -- e.g. '00:00:00'
    quiet_end      TIME NOT NULL,                -- e.g. '06:00:00'
    applies_to     TEXT[] NOT NULL DEFAULT '{}', -- empty = all event types
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

The timezone field is critical. Storing 00:00 to 06:00 without a time zone is ambiguous — when a customer says "midnight to 6 AM," they mean their local midnight, not UTC. Use IANA zone names (America/New_York, Europe/Berlin) rather than UTC offsets, because offsets shift with DST and break twice a year.
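To see what goes wrong with fixed offsets, compare how the same local 06:00 release time maps to UTC on either side of a DST transition. A quick demonstration using Go's standard time package (the dates are arbitrary):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	loc, _ := time.LoadLocation("America/New_York")

	// The same local quiet_end (06:00) on a winter date and a summer date.
	winter := time.Date(2026, time.January, 15, 6, 0, 0, 0, loc)
	summer := time.Date(2026, time.July, 15, 6, 0, 0, 0, loc)

	fmt.Println(winter.UTC()) // 11:00 UTC (EST, UTC-5)
	fmt.Println(summer.UTC()) // 10:00 UTC (EDT, UTC-4)
}
```

A destination pinned to a fixed -05:00 offset would release the summer backlog an hour late; the IANA zone name tracks the transition automatically.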
Computing the next delivery window
When your delivery worker picks up an event, it needs to determine whether the current time falls inside a quiet window and, if so, when to reschedule:
```go
import (
	"strconv"
	"strings"
	"time"
)

func NextDeliveryTime(now time.Time, schedule *DeliverySchedule) time.Time {
	loc, err := time.LoadLocation(schedule.Timezone)
	if err != nil {
		// Unknown timezone: deliver immediately rather than blocking indefinitely.
		return now
	}

	localNow := now.In(loc)
	localTime := localNow.Format("15:04:05")
	quietStart := schedule.QuietStart // e.g. "00:00:00"
	quietEnd := schedule.QuietEnd     // e.g. "06:00:00"

	// Zero-padded "HH:MM:SS" strings compare correctly as plain strings.
	if localTime >= quietStart && localTime < quietEnd {
		// Schedule delivery at quiet_end today (or tomorrow if quiet_end
		// has already passed).
		endParts := strings.Split(quietEnd, ":")
		h, _ := strconv.Atoi(endParts[0])
		m, _ := strconv.Atoi(endParts[1])
		candidate := time.Date(
			localNow.Year(), localNow.Month(), localNow.Day(),
			h, m, 0, 0, loc,
		)
		if candidate.Before(localNow) {
			// Recompute with time.Date rather than Add(24*time.Hour) so a
			// DST transition can't shift the local release time by an hour.
			candidate = time.Date(
				localNow.Year(), localNow.Month(), localNow.Day()+1,
				h, m, 0, 0, loc,
			)
		}
		return candidate.UTC()
	}

	return now
}
```

This function returns now if the event should be delivered immediately, or a future timestamp if it falls inside a quiet window. Your delivery worker sets next_attempt_at to this value before parking the job.
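Parking is then a single update. A minimal sketch, assuming the status, hold_reason, and next_attempt_at columns that the rest of this post uses:

```sql
-- Park the event until the quiet window opens.
-- $2 is the timestamp returned by NextDeliveryTime.
UPDATE events
SET status          = 'retry_scheduled',
    hold_reason     = 'quiet_hours',
    next_attempt_at = $2
WHERE id = $1;
```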
Cross-midnight windows
The logic above handles windows like 00:00–06:00 cleanly. Windows that cross midnight (23:00–05:00) need an adjustment: if quiet_end < quiet_start, the window spans two calendar days. Adjust the comparison:
```go
if quietEnd < quietStart {
	// Cross-midnight window: quiet if time >= start OR time < end.
	inQuiet = localTime >= quietStart || localTime < quietEnd
} else {
	inQuiet = localTime >= quietStart && localTime < quietEnd
}
```

Burst Shaping: Rate-Limiting Per Destination
Burst shaping is the outbound equivalent of rate limiting on your ingest API. Instead of constraining how fast events come in, you constrain how fast they go out to a specific destination.
The use case: a customer's endpoint can handle 100 req/min sustainably but not the 3,000-event spike that happens when you process a batch job at 2 AM. Without burst shaping, your worker delivers all 3,000 events as fast as possible, the destination's web server queues collapse, and you get 503s followed by a retry storm.
Token bucket per destination
The right data structure is a token bucket: a counter that refills at a fixed rate, capped at a maximum burst size. Each delivery consumes one token. If no tokens are available, the event is rescheduled.
```sql
CREATE TABLE destination_rate_state (
    destination_id  UUID PRIMARY KEY REFERENCES destinations(id) ON DELETE CASCADE,
    tokens          NUMERIC NOT NULL,
    last_refill_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    rate_per_minute INT NOT NULL DEFAULT 0, -- 0 = unlimited
    burst_capacity  INT NOT NULL DEFAULT 0  -- max tokens at any one time
);
```

Refill and consume atomically to avoid race conditions between concurrent workers:
```sql
-- Refill and attempt to consume one token. Refill and consume are folded
-- into a single UPDATE: modifying the same row twice in one statement
-- (e.g. via a data-modifying CTE) is not supported by PostgreSQL.
UPDATE destination_rate_state
SET tokens = LEAST(
        burst_capacity,
        tokens + rate_per_minute * EXTRACT(EPOCH FROM (now() - last_refill_at)) / 60.0
    ) - 1,
    last_refill_at = now()
WHERE destination_id = $1
  AND LEAST(
        burst_capacity,
        tokens + rate_per_minute * EXTRACT(EPOCH FROM (now() - last_refill_at)) / 60.0
      ) >= 1
RETURNING tokens;
```

If the update returns no rows (tokens < 1), the event gets rescheduled by 1.0 / rate_per_minute minutes — the time it takes to accumulate one new token.
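On the worker side, the no-token path is a short reschedule branch. A sketch, where tryConsumeToken wraps the atomic UPDATE above and park and deliver are hypothetical helpers:

```go
// deliverOrReschedule attempts a rate-limited delivery. tryConsumeToken,
// park, and deliver are assumed helpers, not part of any specific library.
func deliverOrReschedule(ctx context.Context, ev *Event, dest *Destination) error {
	if dest.RatePerMinute == 0 {
		// 0 = unlimited: skip the token bucket entirely.
		return deliver(ctx, ev, dest)
	}
	ok, err := tryConsumeToken(ctx, dest.ID)
	if err != nil {
		return err
	}
	if !ok {
		// No token available: wait exactly one token's worth of refill time.
		delay := time.Minute / time.Duration(dest.RatePerMinute)
		return park(ctx, ev.ID, "rate_limited", time.Now().Add(delay))
	}
	return deliver(ctx, ev, dest)
}
```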
What to expose to customers
Rate limit configuration should be per destination, not global. A reasonable API:
```
PATCH /v1/destinations/{id}

{
  "rate_limit": {
    "max_per_minute": 100,
    "burst_capacity": 250
  }
}
```

The burst_capacity field lets customers absorb short spikes (up to 250 events in a burst) while keeping sustained throughput at 100/min. Setting burst_capacity equal to max_per_minute gives strict rate limiting with no burst tolerance.
When a destination's queue is being shaped, surface this in your delivery logs: "status": "rate_limited", "retry_at": "2026-04-23T14:32:00Z". Customers debugging why events are slow to arrive need to know whether they're looking at a destination problem or self-imposed shaping.
Combining Quiet Hours and Burst Shaping
These two controls compose naturally. An event scheduled for delivery during quiet hours gets its next_attempt_at pushed to the window open time. When the window opens, the burst shaping layer controls how fast the backlog drains.
This means: if a customer has a 6-hour quiet window and received 10,000 events during it, those 10,000 events don't all land at 06:00:01. They land at the rate their burst shaping config allows. With max_per_minute: 100 and burst_capacity: 250, the first 250 events land in the first second, then the remaining 9,750 drain at 100/min over roughly 98 minutes.
This is the right behavior. A single flood event at window open time defeats the purpose of having rate controls.
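In the worker, composing the two controls is just evaluation order: the quiet-hour gate runs first, and the token bucket only sees events that are already eligible. A sketch reusing NextDeliveryTime and the hypothetical helpers from earlier:

```go
func scheduleDelivery(ctx context.Context, ev *Event, dest *Destination) error {
	now := time.Now()
	if next := NextDeliveryTime(now, dest.Schedule); next.After(now) {
		// Inside a quiet window: park until it opens. The token bucket
		// shapes the backlog drain once the window reopens.
		return park(ctx, ev.ID, "quiet_hours", next)
	}
	return deliverOrReschedule(ctx, ev, dest)
}
```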
Scheduled drain order
When a quiet window opens and multiple events are queued, deliver them in event creation order (FIFO), not retry-attempt order. The customer's system should see events in the order they occurred, with a delivery timestamp shift — not scrambled because retried events get priority.
```sql
SELECT id FROM events
WHERE destination_id = $1
  AND status = 'retry_scheduled'
  AND next_attempt_at <= now()
ORDER BY occurred_at ASC, id ASC
LIMIT 100
FOR UPDATE SKIP LOCKED;
```

The occurred_at ordering preserves causality. If an order.created event and an order.fulfilled event both land in the quiet window, the customer's handler should see them in the right sequence.
Event Priority: Bypassing Schedules for Critical Events
Not all events should be subject to quiet-hour holds. A payment failure or a security alert at 2 AM should probably land immediately, regardless of what the delivery schedule says.
Implement this with a priority flag on event types:
```json
{
  "delivery_schedule": {
    "quiet_start": "00:00",
    "quiet_end": "06:00",
    "timezone": "America/Chicago",
    "bypass_event_types": ["payment.failed", "account.suspended", "fraud.alert"]
  }
}
```

Events whose type is in bypass_event_types skip the quiet-hour check entirely. Your delivery worker checks this list before computing the next delivery time:
```go
// ShouldHold reports whether the event should be parked and, if so, until when.
func ShouldHold(event *Event, schedule *DeliverySchedule, now time.Time) (bool, time.Time) {
	for _, bypassType := range schedule.BypassEventTypes {
		if event.EventType == bypassType {
			// Critical event types skip the quiet-hour check entirely.
			return false, now
		}
	}
	next := NextDeliveryTime(now, schedule)
	return next.After(now), next
}
```

This gives customers fine-grained control: "hold most things, but let critical alerts through immediately."
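In the composed worker sketched earlier, ShouldHold simply replaces the bare NextDeliveryTime call, so bypass types flow straight into the rate-limited path (park and deliverOrReschedule are the same hypothetical helpers as before):

```go
if hold, next := ShouldHold(ev, dest.Schedule, time.Now()); hold {
	return park(ctx, ev.ID, "quiet_hours", next)
}
return deliverOrReschedule(ctx, ev, dest)
```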
Observability for Scheduled Delivery
When events are being held or shaped, you need visibility into the backlog. Three metrics matter:
| Metric | Query anchor | Alert condition |
|---|---|---|
| Quiet-hold backlog | COUNT(*) WHERE status = 'retry_scheduled' AND hold_reason = 'quiet_hours' | Backlog > 24h of normal volume |
| Rate-limit queue depth | COUNT(*) WHERE status = 'retry_scheduled' AND hold_reason = 'rate_limited' | Queue not draining; growing over 30 min |
| Oldest held event age | MIN(occurred_at) WHERE status = 'retry_scheduled' per destination | Oldest event > 2× expected hold duration |
The third metric catches edge cases: if a customer's quiet window configuration is wrong (e.g., overlapping windows, invalid time zone) and events are being held indefinitely, the oldest-event-age alert fires before the customer notices their integration is backed up.
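A sketch of the oldest-held-event query, assuming the same events columns as the earlier examples:

```sql
-- Oldest held event per destination, with how long it has been waiting.
SELECT destination_id,
       MIN(occurred_at)         AS oldest_held,
       now() - MIN(occurred_at) AS held_for
FROM events
WHERE status = 'retry_scheduled'
GROUP BY destination_id
ORDER BY oldest_held ASC;
```

Alert when held_for exceeds roughly twice the destination's expected hold duration.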
GetHook surfaces per-destination queue depth and hold state in the delivery dashboard so you can see at a glance which destinations are in a quiet window and how many events are queued behind them.
What to Build vs. What to Configure
If you're building this from scratch, prioritize in this order:
- Burst shaping first — it prevents the retry storms that turn customer outages into infrastructure incidents. A destination that can't handle your delivery rate is a reliability problem for everyone.
- Quiet hours second — valuable for enterprise customers with on-call schedules and SLA windows. Not needed for every deployment, but high-signal when it is needed.
- Priority bypass third — without it, quiet hours become a liability the first time a critical alert gets held until morning.
Expose all three as per-destination configuration. Global defaults are useful for setting a sensible baseline, but the destination level is where real control happens.
If you're building outbound webhook infrastructure for your own customers and want delivery scheduling without reimplementing all of this yourself, start with GetHook — per-destination rate controls and delivery scheduling are available out of the box.