How Fast Should a Visitor-ID Webhook Fire?

Webhook latency is the difference between a useful signal and a dead alert. The engineering behind sub-second delivery: hot path, retries, slow endpoints.

Nicolas Canal · 10 min read

A webhook that fires five minutes after the visit is not a real-time signal. It is a notification that your rep will see tomorrow morning if they check Slack, by which point it will be one more dead alert in a feed of dead alerts.

I am Nicolas, head of partnerships at Leadpipe. When we ship an integration with a partner, webhook latency is the first number anyone asks about. This post is the engineering walkthrough for how to think about the pixel-to-endpoint path, the failure modes that matter, the retry behaviors any serious delivery system has to design around, and where customer endpoints typically slow things down.


What “real-time” actually means

The category uses “real-time” casually. It is worth being precise about what the buyer actually needs.

| Use case | Required latency | Why that latency |
| --- | --- | --- |
| AI SDR autonomous outreach | Under a few seconds | Agent needs signal to craft email before visitor leaves site |
| Live chat pop-up for high-intent visitors | Under a second | The visitor is gone if you take longer |
| Slack alert for sales rep | Under 10 seconds | Rep wants to call while the visitor is still on the page |
| CRM record creation | Under a minute | Downstream workflows will not fire until the record exists |
| Daily batch enrichment | Under 24 hours | Most batch jobs run nightly anyway |

A single webhook has to hit the tightest of these, because the same delivery feeds all of them. Our internal target is sub-second from pixel fire to webhook dispatch, with retry handling built in for the tail.


The path from pixel to endpoint

A visitor loads a page. The pixel fires. Somewhere on the other side, your endpoint receives a POST. Between those two events, a lot happens.

1. Pixel fire
   └── browser POSTs event to ingestion endpoint
2. Ingest
   └── event validated, normalized, placed on match queue
3. Match
   └── graph lookup, suppression check, identity resolution
4. Dispatch decision
   └── first-match or every-update rule evaluated
5. Payload assembly
   └── person, company, visit, HEMs packed into JSON
6. Webhook dispatch
   └── HTTPS POST to customer endpoint with retry-on-failure
7. Customer endpoint
   └── your server returns 2xx (or fails and we retry)

Each of those steps has a latency budget. If any one step overruns, the webhook is late. The engineering is about holding each step to its budget even under load.
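The steps above can be sketched as a staged pipeline where each stage is timed against its budget. This is a toy illustration, not Leadpipe's actual code; the stage functions are placeholders standing in for ingest, match, and payload assembly:

```python
import time
from typing import Callable

def timed_pipeline(event: dict, stages: list[tuple[str, Callable]]) -> dict:
    """Run each stage in order and record its latency in milliseconds.

    A sketch of per-stage budget accounting: if any one stage's timing
    overruns its budget, the whole delivery is late.
    """
    timings: dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        event = fn(event)
        timings[name] = (time.perf_counter() - start) * 1000.0  # ms
    event["_timings_ms"] = timings
    return event

# Toy stages standing in for the real ingest / match / assemble steps.
stages = [
    ("ingest", lambda e: {**e, "valid": True}),
    ("match", lambda e: {**e, "person_id": "p_123"}),
    ("assemble", lambda e: {**e, "payload": {"person": e["person_id"]}}),
]
result = timed_pipeline({"pixel": "fired"}, stages)
```

In a real system each recorded timing would be emitted as a metric, so a stage drifting past its budget shows up before customers notice late webhooks.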


Where the latency budget goes

A breakdown of where time gets spent on a successful webhook delivery.

| Stage | Notional budget | Can it be reduced? |
| --- | --- | --- |
| Pixel POST to ingest | ~50-150ms | Network, mostly fixed |
| Ingest validation and enqueue | Small | Already lean |
| Match queue wait | Variable | Depends on backpressure |
| Match lookup (graph + suppression) | Small, cached | Index warmth is the lever |
| Payload assembly | Small | Mostly JSON serialization |
| Webhook dispatch | ~50-300ms | Customer endpoint location dominates |
| Customer endpoint processing | Your code | Not on our side |

The two stages most worth optimizing are the match queue wait (which can spike during business-hour peaks) and the match lookup (where index warmth determines whether you are hitting RAM or falling through to a slower tier).

For the broader architecture context, see scaling the identity graph to 100M+ matches a day.


First Match versus Every Update

Two dispatch modes, each with different latency characteristics.

| Mode | Fires when | Latency priority |
| --- | --- | --- |
| First Match | A visitor is identified for the first time | Highest priority, sub-second target |
| Every Update | A known visitor returns or appends new data | Lower priority, still fast but may batch |

First Match is the one that drives the “hot lead just landed” alert. Every Update feeds the behavioral layer: return visits, new pages, engagement scoring. Both matter. They have different latency budgets.

The reason for the distinction: First Match is an acquisition event, and acquisition is time-sensitive. Every Update is a behavioral event, and behavior aggregates over a session. Treating them identically would cost throughput on the behavioral side without buying anything on the acquisition side.
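The distinction reduces to a small routing decision at dispatch time. A minimal sketch, assuming a boolean "previously seen" flag from the match step; the mode names and priority values are illustrative, not Leadpipe's actual API:

```python
def classify(visitor_id: str, previously_seen: bool) -> tuple[str, int]:
    """Return (dispatch_mode, priority); lower priority number dispatches first.

    First Match is an acquisition event and takes the fast lane;
    Every Update is behavioral and can tolerate batching.
    """
    if not previously_seen:
        return ("first_match", 0)   # acquisition: sub-second target
    return ("every_update", 1)      # behavioral: lower priority, may batch
```

A priority queue on the dispatch side then drains priority 0 ahead of priority 1, which is how the behavioral traffic avoids starving the acquisition path.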

For the payload structure on either mode, see the webhook payload reference.


Retry and failure handling

Customer endpoints fail. Their servers go down, their rate limits get hit, their code throws exceptions. A webhook delivery system that does not handle failure is a webhook delivery system that silently drops signal.

The behaviors any serious system designs around.

Exponential backoff

A failed delivery gets retried on an exponential schedule. The pattern is straightforward: retry quickly at first, then back off exponentially as failures continue. If your endpoint is down for an hour, the delivery is held and retried. If it is down for a day, the delivery terminates and the failure is logged in the webhook monitoring dashboard.

The exact schedule is a tuning parameter, not a public commitment. The principle is what matters. A serious system has a deterministic retry schedule, a defined terminal threshold, and a way to surface the terminal state so customers can investigate.
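For intuition, here is a minimal sketch of such a schedule. The constants are illustrative placeholders, not Leadpipe's actual values, since the real schedule is a tuning knob:

```python
import random

# Illustrative parameters only — the real schedule is a tuning
# parameter, not a public commitment.
BASE_DELAY_S = 5.0        # first retry ~5 seconds after the initial failure
MAX_DELAY_S = 4 * 3600.0  # cap any single wait at 4 hours
MAX_ATTEMPTS = 12         # terminal threshold: log the failure and stop

def retry_delay(attempt: int) -> float:
    """Seconds to wait before retry number `attempt` (0-indexed)."""
    cap = min(BASE_DELAY_S * (2 ** attempt), MAX_DELAY_S)
    # Full jitter spreads retries out so a recovering endpoint is not
    # hit by a synchronized thundering herd.
    return random.uniform(0, cap)

schedule = [retry_delay(n) for n in range(MAX_ATTEMPTS)]
```

The three properties the prose calls for are all visible here: the schedule is deterministic in shape (exponential with a cap), the terminal threshold is explicit, and the terminal state is a loggable event rather than a silent drop.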

Delivery isolation

A broken customer endpoint does not cascade. Failed deliveries go to a dedicated retry queue with its own compute, so other customers’ webhooks keep flowing. This is the single most important decision in a multi-tenant delivery system. Without it, one bad endpoint can slow everyone down.
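The isolation idea can be sketched as one retry queue per customer endpoint, each drained independently. A toy in-process version, assuming customer IDs as queue keys (a production system would use separate queue infrastructure, not a dict):

```python
from collections import defaultdict
import queue

# One retry queue per customer endpoint. A backlog for one broken
# endpoint accumulates in its own queue and never blocks another
# tenant's deliveries.
retry_queues: dict[str, queue.Queue] = defaultdict(queue.Queue)

def enqueue_retry(customer_id: str, delivery: dict) -> None:
    retry_queues[customer_id].put(delivery)

enqueue_retry("cust_a", {"event_id": "e1"})  # cust_a's endpoint is down
enqueue_retry("cust_b", {"event_id": "e2"})  # cust_b is unaffected
```

The production version of this pattern also gives the retry queues their own compute, so a retry storm cannot steal cycles from first-attempt deliveries.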

Idempotency keys

Every webhook payload includes an event ID. If your endpoint receives the same event twice (because your 2xx response was lost or timed out after you had already processed the event, for example), you can deduplicate on the event ID. We guarantee at-least-once delivery, not exactly-once. The idempotency key is how you handle the difference cleanly on your side.
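Consumer-side deduplication is a few lines. A minimal sketch, assuming the event ID arrives in an `event_id` field (the field name is illustrative) and using an in-memory set where production code would use a persistent store with a TTL:

```python
# In production: a persistent store (Redis, a DB table) with a TTL,
# since an in-memory set does not survive restarts.
processed: set = set()

def handle_webhook(payload: dict) -> str:
    event_id = payload["event_id"]  # field name is illustrative
    if event_id in processed:
        return "duplicate"  # already handled; safe under at-least-once delivery
    processed.add(event_id)
    # ... do the actual work exactly once ...
    return "processed"
```

With this in place, a redelivered event is a cheap no-op rather than a duplicate CRM record or a double outreach email.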

Observability

A webhook monitoring dashboard should show per-endpoint delivery rate, failure rate, median and tail latency, retry counts, and last successful timestamp. Customers who are serious about integration watch it. Customers who are not, do not, and they are the ones who miss signal when something breaks.


What a slow customer endpoint looks like

Customer endpoints are often the latency bottleneck. Typical patterns:

  1. Cold serverless function. Your first webhook wakes up a Lambda that cold-starts in 800ms. Subsequent ones are fast.
  2. Synchronous enrichment. You receive the webhook and synchronously call Clay, Clearbit, or another enrichment API before returning 2xx. Now your webhook processing time is the sum of all upstream APIs.
  3. CRM write inline. You write the contact to HubSpot or Salesforce inside the webhook handler. Those APIs are not always fast.

The fix is the same in each case. Receive the webhook, enqueue the work internally, return 2xx quickly, then process asynchronously. Your webhook endpoint should do almost nothing synchronously. The pattern is in the developer guide and the self-serve API post.
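The receive-enqueue-acknowledge pattern looks like this. A minimal in-process sketch using a standard-library queue; a production handler would back this with a durable queue (SQS, Redis, etc.) rather than process memory:

```python
import queue

work_queue: queue.Queue = queue.Queue()

def webhook_handler(payload: dict) -> int:
    """Enqueue and acknowledge immediately.

    Returns the HTTP status before any enrichment or CRM call runs,
    so the sender's latency measurement never includes your slow work.
    """
    work_queue.put(payload)
    return 200

def drain_one() -> dict:
    """Worker side: pull one job and do the slow work here."""
    job = work_queue.get()
    # ... call enrichment APIs, write to HubSpot/Salesforce, etc. ...
    work_queue.task_done()
    return job
```

This also fixes the synchronous-enrichment failure mode: an upstream API being slow delays your worker, not your 2xx.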


The sub-second target

The internal target is sub-second from pixel fire to webhook dispatch on First Match. Hitting it consistently means:

  • The match index has to be in-memory for the common cases. Falling through to a slower tier costs the budget.
  • The suppression check has to be a cached lookup, not a database query. See the suppression lists post.
  • The dispatch queue has to have capacity headroom. A queue running at 90% saturation is a queue that spikes latency on traffic bursts.
  • The webhook HTTP client has to have connection pooling, keepalive, and short timeouts.
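The last bullet, sketched with the `requests` library: a pooled session with keepalive and short, explicit timeouts. The pool sizes and timeout values are illustrative, not Leadpipe's actual configuration:

```python
import requests
from requests.adapters import HTTPAdapter

# Pooled, keepalive-enabled client. Reusing connections avoids paying
# TCP + TLS handshake cost on every dispatch; numbers are illustrative.
session = requests.Session()
adapter = HTTPAdapter(pool_connections=100, pool_maxsize=100)
session.mount("https://", adapter)

def dispatch(url: str, payload: dict) -> bool:
    """One delivery attempt: short connect/read timeouts, no blocking on
    a slow customer endpoint. Failures hand off to the retry queue."""
    try:
        resp = session.post(url, json=payload, timeout=(0.25, 2.0))
        return 200 <= resp.status_code < 300
    except requests.RequestException:
        return False  # retry queue takes it from here
```

Short timeouts matter as much as pooling: a dispatch worker stuck for 30 seconds on one slow endpoint is capacity stolen from everyone else's sub-second budget.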

Sub-second is not a free choice. It constrains every layer of the architecture, from how the graph is sharded to how the suppression list is cached. It is the constraint that drives most of the engineering decisions on the serve path.

The pricing-page rule is a useful reframe. A returning buyer on your pricing page is in the room for roughly seven minutes on average across B2B. If your webhook takes ten minutes to fire, the visitor is gone before the alert lands. Any latency budget below the visit duration is a usable budget. Any latency budget above it is theater.


When sub-second is not the right answer

Not every integration needs sub-second. A few cases where slower is fine or even better.

Daily batch enrichment

If you are syncing to a data warehouse overnight, a daily CSV export is simpler and cheaper than a webhook receiver. The CSV export path exists for exactly this.

High-volume aggregation

If you are tracking aggregate behavior (large session counts rolled up into reports), webhook-per-event is overkill. Pull the data in batches via the REST API.

Integrations that write back

If your workflow is “receive webhook, do work, write back to Leadpipe,” and the write-back touches the next webhook, you can create feedback loops. Sometimes async-with-queue is a safer design than sub-second synchronous.

The right question is never “how fast can we go” but “how fast does the use case actually need.” Sub-second is the target for First Match on the acquisition path. Other paths have other targets.


What I would prioritize today

Two things on the improvement list.

  1. Per-customer latency budgets. Today the latency target is global. Enterprise customers with high-value pipelines would arguably benefit from higher-priority routing. Differentiated dispatch priority by plan tier is a design decision worth making explicit, with the tradeoffs documented for customers.
  2. Delivery receipts as a first-class surface. The monitoring dashboard shows aggregate delivery health. An event-level delivery receipt (per webhook, per retry, per response code) would give customers a much cleaner debugging surface. Much of that data exists in logs. Exposing it in the product is the work.

The engineering principles, summarized

Five principles that any serious webhook delivery system follows. Use them as a checklist when evaluating vendors.

| Principle | What to look for |
| --- | --- |
| Ordered delivery within a session | Events for the same person arrive in the order they happened |
| Retry-aware with exponential backoff | Defined retry schedule and terminal threshold, both surfaced to customers |
| Idempotent | Every payload carries an event ID, deduplication is possible on the customer side |
| Isolated failure domains | One slow customer endpoint does not slow others |
| Observable | Per-endpoint delivery health visible in a dashboard, not buried in logs |

A vendor that can answer all five cleanly is a vendor whose webhook delivery you can build a workflow on. A vendor that cannot is one where you will eventually find a Slack channel full of dead alerts and ask why.


What this means for customers

If you are wiring Leadpipe into an AI SDR, a live-chat pop-up, or any time-sensitive workflow, the webhook path is what you are depending on. The design targets sub-second dispatch on First Match, with exponential-backoff retry for the tail, delivery isolation so one bad endpoint does not slow others, and a monitoring surface you can actually use.

You do not need to think about any of this if your endpoint is well-designed. You definitely will think about it if it is not.


Every plan ships with the same identity graph, 23 REST endpoints, webhooks, and a 27-tool MCP server. Start in 5 minutes →