Most webhook pipelines validate payloads too late. The event arrives at the ingest endpoint, passes a basic content-type check, gets persisted to the queue, travels through the delivery worker, and only fails when the consumer's application code tries to dereference a field that isn't there — or worse, silently processes a malformed event and corrupts state downstream.
Shifting validation left — into the gateway layer, before the event enters your queue — eliminates an entire class of failures. It also gives you a place to surface clear errors to the sender rather than silent delivery failures that nobody notices until a customer files a support ticket.
This post covers how to implement JSON Schema validation at the ingest layer, the trade-offs involved, and how to handle the operational realities of schema evolution without breaking existing producers.
## Why the Ingest Layer Is the Right Place
You have three places to validate a webhook payload: at the sender, at the gateway, or at the consumer. Each has a different failure mode.
| Validation point | Who sees the error | Failure mode when skipped |
|---|---|---|
| Sender-side | Sender only (before sending) | Malformed events silently enter the pipeline |
| Gateway (ingest) | Sender gets HTTP 4xx immediately | None — this is the goal of this post |
| Consumer-side | Shows up as a processing error, often hours later | Events in DLQ with no actionable context |
Consumer-side validation is necessary — you should validate before acting on any data — but it is no substitute for gateway validation. By the time an event reaches a consumer, you've already consumed queue capacity, triggered retries, and potentially routed the event to multiple destinations. A rejection at the consumer level tells you something went wrong; it doesn't tell the sender what to fix.
Gateway validation returns a 400 Bad Request synchronously to the sender, with a machine-readable description of exactly which fields are wrong. The sender can fix the payload and resend. No events enter the queue, no downstream systems are affected.
## Defining Schemas Per Source
The unit of validation is a source schema: a JSON Schema document attached to a source endpoint. Different sources can have different schemas — a Stripe-shaped ingest endpoint should not accept the same payload shape as a GitHub-shaped one.
A minimal but useful JSON Schema for a webhook event:
```json
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"required": ["id", "type", "created_at", "data"],
"additionalProperties": true,
"properties": {
"id": {
"type": "string",
"minLength": 1,
"description": "Unique event identifier"
},
"type": {
"type": "string",
"pattern": "^[a-z][a-z0-9_]*\\.[a-z][a-z0-9_]*$",
"description": "Event type in dot-notation, e.g. order.created"
},
"created_at": {
"type": "string",
"format": "date-time",
"description": "ISO 8601 timestamp"
},
"data": {
"type": "object",
"description": "Event-specific payload"
}
}
}
```

The key decisions here:
- `additionalProperties: true` — allow unknown fields at the envelope level. Providers add fields over time; rejecting unknown fields breaks producers on schema changes they didn't know were coming.
- Pattern constraint on `type` — the dot-notation pattern (`order.created`, `payment.failed`) enforces a consistent event taxonomy without being overly restrictive.
- `format: date-time` on `created_at` — format validation is typically opt-in in JSON Schema validators. Enable it. A malformed timestamp that slips through becomes a parsing error in every downstream consumer.
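For reference, here is a hypothetical payload that satisfies this envelope schema; the `trace_id` field illustrates an unknown field tolerated by `additionalProperties: true`:

```json
{
  "id": "evt_01HZX4",
  "type": "order.created",
  "created_at": "2026-04-19T09:00:00Z",
  "data": {
    "order_id": "ord_123",
    "customer_id": "cus_456"
  },
  "trace_id": "4bf92f35"
}
```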
Store the schema document as a JSONB column on your sources table:
```sql
ALTER TABLE sources
ADD COLUMN payload_schema JSONB,
ADD COLUMN schema_mode TEXT NOT NULL DEFAULT 'disabled';
-- disabled | warn | enforce
```

The `schema_mode` column is important. Don't hard-launch schema enforcement — roll it out in stages.
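In the handler below, these columns surface as fields on the source record. A minimal sketch of that struct; only `ID`, `PayloadSchema`, `SchemaMode`, and `SchemaVersion` appear in this post, and the exact field layout is an assumption:

```go
// Source mirrors the columns the ingest path reads from the sources table.
type Source struct {
	ID            uuid.UUID
	PayloadSchema json.RawMessage // nil when no schema is configured
	SchemaMode    string          // "disabled", "warn", or "enforce"
	SchemaVersion int             // incremented on every schema update
}
```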
## Validation in the Ingest Handler
Inside the ingest handler, schema validation runs after signature verification and before the event is persisted:
```go
func (h *IngestHandler) Handle(w http.ResponseWriter, r *http.Request) {
source, err := h.sources.GetByToken(r.Context(), pathToken(r))
if err != nil || source == nil {
httpx.NotFound(w, "source not found")
return
}
body, err := io.ReadAll(io.LimitReader(r.Body, maxBodyBytes))
if err != nil {
httpx.BadRequest(w, "failed to read body")
return
}
// 1. Verify signature (fail fast before any other work)
if err := verifySignature(source, r.Header, body); err != nil {
httpx.Unauthorized(w, "signature verification failed")
return
}
// 2. Validate payload schema if configured
if source.PayloadSchema != nil && source.SchemaMode != "disabled" {
violations, err := h.validator.Validate(source.ID, source.PayloadSchema, body)
if err != nil {
httpx.InternalError(w, err)
return
}
if len(violations) > 0 {
if source.SchemaMode == "enforce" {
httpx.BadRequest(w, formatViolations(violations))
return
}
// warn mode: log violations but accept the event
log.Printf("schema violations on source %s: %v", source.ID, violations)
}
}
// 3. Persist and queue
event, err := h.events.Create(r.Context(), source, body)
if err != nil {
httpx.InternalError(w, err)
return
}
httpx.Created(w, map[string]string{"event_id": event.ID.String()})
}
```

The `formatViolations` function serializes violations into a JSON response the sender can parse:
```go
// ValidationViolation mirrors the fields in the error response shown below.
type ValidationViolation struct {
	Path    string `json:"path"`
	Message string `json:"message"`
	Value   any    `json:"value,omitempty"`
}

func formatViolations(violations []ValidationViolation) string {
type errResponse struct {
Error string `json:"error"`
Violations []ValidationViolation `json:"violations"`
}
b, _ := json.Marshal(errResponse{
Error: "payload does not match source schema",
Violations: violations,
})
return string(b)
}
```

A well-formed error response looks like:
```json
{
"error": "payload does not match source schema",
"violations": [
{
"path": "/type",
"message": "does not match pattern '^[a-z][a-z0-9_]*\\.[a-z][a-z0-9_]*$'",
"value": "OrderCreated"
},
{
"path": "/created_at",
"message": "invalid date-time format",
"value": "2026-04-19 09:00:00"
}
]
}
```

This tells the sender exactly what to fix. "OrderCreated" should be "order.created". The timestamp is missing the T separator and timezone offset.
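For comparison, the corrected values the sender would resend, with the rest of the payload unchanged:

```json
{
  "type": "order.created",
  "created_at": "2026-04-19T09:00:00Z"
}
```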
## Choosing a JSON Schema Validator
In Go, the mature options are:
| Library | Draft support | Format validation | Performance |
|---|---|---|---|
| santhosh-tekuri/jsonschema | Draft 4–2020-12 | Built-in, configurable | Fast, low allocation |
| xeipuuv/gojsonschema | Drafts 4, 6, 7 | Partial | Moderate |
| qri-io/jsonschema | Draft 7 | Partial | Slower at high volume |
santhosh-tekuri/jsonschema is the strongest choice for production use: full 2020-12 support, explicit format validators, and it's safe to call concurrently. Compile the schema once at startup and reuse the compiled form — schema compilation is expensive; validation against a compiled schema is not.
```go
import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"

	"github.com/google/uuid" // assumed; any UUID package with a comparable UUID type works
	"github.com/santhosh-tekuri/jsonschema/v6"
)
type Validator struct {
// compiled schemas keyed by source ID
cache map[uuid.UUID]*jsonschema.Schema
mu sync.RWMutex
}

// NewValidator initializes the compiled-schema cache.
func NewValidator() *Validator {
	return &Validator{cache: make(map[uuid.UUID]*jsonschema.Schema)}
}
func (v *Validator) Validate(sourceID uuid.UUID, schemaDoc json.RawMessage, payload []byte) ([]ValidationViolation, error) {
v.mu.RLock()
compiled, ok := v.cache[sourceID]
v.mu.RUnlock()
if !ok {
var err error
compiled, err = compileSchema(schemaDoc)
if err != nil {
return nil, fmt.Errorf("compile schema: %w", err)
}
v.mu.Lock()
v.cache[sourceID] = compiled
v.mu.Unlock()
}
var doc any
if err := json.Unmarshal(payload, &doc); err != nil {
return []ValidationViolation{{Message: "body is not valid JSON"}}, nil
}
if err := compiled.Validate(doc); err != nil {
var ve *jsonschema.ValidationError
if errors.As(err, &ve) {
return extractViolations(ve), nil
}
return nil, err
}
return nil, nil
}
```

Invalidate a source's compiled schema from the cache whenever the schema document is updated via the management API. A simple approach: add a `schema_version` integer column to `sources` and evict the cache entry on any increment.
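The `compileSchema` helper referenced above is left undefined here; a sketch of it, assuming the `jsonschema/v6` compiler API (`UnmarshalJSON`, `AddResource`, `AssertFormat`, `Compile`) and a placeholder resource name:

```go
import (
	"bytes"
	"encoding/json"
	"fmt"

	"github.com/santhosh-tekuri/jsonschema/v6"
)

// compileSchema parses a schema document and compiles it with format
// assertions enabled, so "format": "date-time" is actually checked.
func compileSchema(schemaDoc json.RawMessage) (*jsonschema.Schema, error) {
	doc, err := jsonschema.UnmarshalJSON(bytes.NewReader(schemaDoc))
	if err != nil {
		return nil, fmt.Errorf("parse schema document: %w", err)
	}
	c := jsonschema.NewCompiler()
	c.AssertFormat() // format validation is opt-in; enable it explicitly
	if err := c.AddResource("source_schema.json", doc); err != nil {
		return nil, err
	}
	return c.Compile("source_schema.json")
}
```

An alternative to explicit eviction is to key the compiled-schema cache by source ID plus `schema_version`, so an updated schema simply misses the cache and gets recompiled on the next event.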
## Rolling Out Without Breaking Producers
Do not flip schema_mode from disabled to enforce on a production source. Run through this sequence:
### Step 1: Warn mode (1–2 weeks)
Set schema_mode = 'warn'. Log violations. Do not reject. This gives you a real-world sample of what your producers are actually sending. You will discover fields you missed in the schema definition and format assumptions that don't match reality.
### Step 2: Alert on violations
While in warn mode, emit a metric for each violation by source. Set an alert if the violation rate exceeds 1% of events. This tells you whether there's active drift between your schema and producer behavior before enforcement bites anyone.
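A sketch of that counter using `prometheus/client_golang`; the metric and label names are assumptions chosen to line up with the metrics table later in this post:

```go
import "github.com/prometheus/client_golang/prometheus"

// Counts schema violations per source so drift is visible before enforcement.
var schemaViolations = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "ingest_validation_violations_total",
		Help: "JSON Schema violations observed at the ingest layer.",
	},
	[]string{"source_id", "mode"},
)

func init() {
	prometheus.MustRegister(schemaViolations)
}

// In the warn-mode branch of the ingest handler:
//   schemaViolations.WithLabelValues(source.ID.String(), source.SchemaMode).
//       Add(float64(len(violations)))
```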
### Step 3: Schema review with producers
Share the violation log with whoever owns the sender. In practice, the violations often fall into two categories: genuine bugs in the producer (wrong field name, wrong format) and intentional extensions (fields you didn't know about). Update the schema to reflect the extensions; work with the producer to fix the bugs.
### Step 4: Enforce on a test source first
Create a shadow source with identical configuration but schema_mode = 'enforce'. Route a copy of production traffic to it (if your ingest supports mirroring) and confirm the rejection rate is zero before enabling enforcement on the live source.
### Step 5: Enforce on production
Flip schema_mode = 'enforce'. Monitor the 400 rate on the ingest endpoint for the next 30 minutes.
This rollout takes longer than just flipping a switch, but it means you don't break a production integration by enforcing a schema that was never tested against real traffic.
## Event Type-Specific Schemas
The envelope schema catches missing required fields, but the data object is where most domain-specific bugs live. You can extend validation to cover the contents of data based on the type field using JSON Schema's if/then or oneOf constructs:
```json
{
"type": "object",
"required": ["id", "type", "created_at", "data"],
"if": {
"properties": { "type": { "const": "order.created" } }
},
"then": {
"properties": {
"data": {
"type": "object",
"required": ["order_id", "customer_id", "total_cents", "currency"],
"properties": {
"order_id": { "type": "string" },
"customer_id": { "type": "string" },
"total_cents": { "type": "integer", "minimum": 0 },
"currency": { "type": "string", "pattern": "^[A-Z]{3}$" }
}
}
}
}
}
```

This approach works well when you have a bounded set of event types. For sources with many event types, maintain event-type schemas as separate documents and compose them at runtime. The compilation cost is paid once; all subsequent validations are fast.
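One way to compose them, assuming each event type's `data` schema lives in its own document (the file names here are hypothetical): wrap each one in an `if`/`then` clause and join them under `allOf` before compiling.

```json
{
  "allOf": [
    {
      "if": { "properties": { "type": { "const": "order.created" } } },
      "then": { "properties": { "data": { "$ref": "order_created.json" } } }
    },
    {
      "if": { "properties": { "type": { "const": "payment.failed" } } },
      "then": { "properties": { "data": { "$ref": "payment_failed.json" } } }
    }
  ]
}
```

Each referenced document would be registered with the compiler (for example via `AddResource`) before `Compile` is called.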
## Operational Metrics to Track
Once enforcement is live, track these four signals:
| Metric | Meaning | Alert if |
|---|---|---|
| `ingest.validation.violations_total` | Count of validation violations (warn mode) | Rate spikes above baseline |
| `ingest.rejected_total` | Count of 400s from schema enforcement | Any value above zero in steady state |
| `ingest.schema_compile_errors_total` | Failed schema compilations (bad schema doc) | Any |
| `ingest.validation_duration_p99` | Latency added by validation | Above 5 ms |
The last metric matters more than you'd expect. JSON Schema validation against a compiled schema is fast — typically under 500 µs for a moderately sized payload. But if your schema uses deeply nested `oneOf` branches or large `enum` arrays, validation latency can creep up. Profile before enabling enforcement on high-throughput sources.
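A quick way to profile is a Go benchmark against the compiled schema. This sketch assumes the `compileSchema` helper from earlier; the inline schema and payload are trimmed-down stand-ins for your real ones:

```go
import (
	"encoding/json"
	"testing"
)

const (
	// Stand-ins for your real source schema and a representative event.
	benchSchema  = `{"type":"object","required":["id","type"],"properties":{"id":{"type":"string"},"type":{"type":"string"}}}`
	benchPayload = `{"id":"evt_1","type":"order.created","created_at":"2026-04-19T09:00:00Z","data":{}}`
)

// BenchmarkValidate measures validation latency against a compiled schema.
func BenchmarkValidate(b *testing.B) {
	compiled, err := compileSchema(json.RawMessage(benchSchema))
	if err != nil {
		b.Fatal(err)
	}
	var doc any
	if err := json.Unmarshal([]byte(benchPayload), &doc); err != nil {
		b.Fatal(err)
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := compiled.Validate(doc); err != nil {
			b.Fatal(err)
		}
	}
}
```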
Schema enforcement at the ingest layer is one of those investments that pays off invisibly: fewer production incidents, cleaner event logs, and faster debugging when something does go wrong because the event that entered the pipeline is guaranteed to be structurally sound.
If you want to add payload schema validation to your ingest pipeline without building the enforcement layer yourself, GetHook supports per-source JSON Schema configuration — you upload the schema, set the mode, and the gateway handles rejection, logging, and violation metrics automatically.