How Fast Should a Visitor-ID Webhook Fire?

Webhook latency is the difference between a useful signal and a dead alert. The engineering behind sub-second delivery: hot path, retries, slow endpoints.

Nicolas Canal · 10 min read

A webhook that fires five minutes after the visit is not a real-time signal. It is a notification that your rep will see tomorrow morning if they check Slack, by which point it will be one more dead alert in a feed of dead alerts.

I am Nicolas, head of partnerships at Leadpipe. When we ship an integration with a partner, webhook latency is the first number anyone asks about. This post is the engineering walkthrough for how to think about the pixel-to-endpoint path, the failure modes that matter, the retry behaviors any serious delivery system has to design around, and where customer endpoints typically slow things down.


What “real-time” actually means

The category uses “real-time” casually. It is worth being precise about what the buyer actually needs.

| Use case | Required latency | Why that latency |
| --- | --- | --- |
| AI SDR autonomous outreach | Under a few seconds | Agent needs signal to craft email before visitor leaves site |
| Live chat pop-up for high-intent visitors | Under a second | The visitor is gone if you take longer |
| Slack alert for sales rep | Under 10 seconds | Rep wants to call while the visitor is still on the page |
| CRM record creation | Under a minute | Downstream workflows will not fire until the record exists |
| Daily batch enrichment | Under 24 hours | Most batch jobs run nightly anyway |

A single webhook has to hit the tightest of these, because the same delivery feeds all of them. Our internal target is sub-second from pixel fire to webhook dispatch, with retry handling built in for the tail.


The path from pixel to endpoint

A visitor loads a page. The pixel fires. Somewhere on the other side, your endpoint receives a POST. Between those two events, a lot happens.

1. Pixel fire
   └── browser POSTs event to ingestion endpoint
2. Ingest
   └── event validated, normalized, placed on match queue
3. Match
   └── graph lookup, suppression check, identity resolution
4. Dispatch decision
   └── first-match or every-update rule evaluated
5. Payload assembly
   └── person, company, visit, HEMs packed into JSON
6. Webhook dispatch
   └── HTTPS POST to customer endpoint with retry-on-failure
7. Customer endpoint
   └── your server returns 2xx (or fails and we retry)

Each of those steps has a latency budget. If any one step overruns, the webhook is late. The engineering is about holding each step to its budget even under load.
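The steps above can be sketched as a staged pipeline where each stage is timed against its budget. This is a toy illustration, not Leadpipe's actual code; the stage functions are placeholders standing in for ingest, match, and payload assembly:

```python
import time
from typing import Callable

def timed_pipeline(event: dict, stages: list[tuple[str, Callable]]) -> dict:
    """Run each stage in order and record its latency in milliseconds.

    A sketch of per-stage budget accounting: if any one stage's timing
    overruns its budget, the whole delivery is late.
    """
    timings: dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        event = fn(event)
        timings[name] = (time.perf_counter() - start) * 1000.0  # ms
    event["_timings_ms"] = timings
    return event

# Toy stages standing in for the real ingest / match / assemble steps.
stages = [
    ("ingest", lambda e: {**e, "valid": True}),
    ("match", lambda e: {**e, "person_id": "p_123"}),
    ("assemble", lambda e: {**e, "payload": {"person": e["person_id"]}}),
]
result = timed_pipeline({"pixel": "fired"}, stages)
```

In a real system each recorded timing would be emitted as a metric, so a stage drifting past its budget shows up before customers notice late webhooks.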


Where the latency budget goes

A breakdown of where time gets spent on a successful webhook delivery.

| Stage | Notional budget | Can it be reduced? |
| --- | --- | --- |
| Pixel POST to ingest | ~50-150ms | Network, mostly fixed |
| Ingest validation and enqueue | Small | Already lean |
| Match queue wait | Variable | Depends on backpressure |
| Match lookup (graph + suppression) | Small, cached | Index warmth is the lever |
| Payload assembly | Small | Mostly JSON serialization |
| Webhook dispatch | ~50-300ms | Customer endpoint location dominates |
| Customer endpoint processing | Your code | Not on our side |

The two stages most worth optimizing are the match queue wait (which can spike during business-hour peaks) and the match lookup (where index warmth determines whether you are hitting RAM or falling through to a slower tier).

For the broader architecture context, see scaling the identity graph to 100M+ matches a day.


First Match versus Every Update

Two dispatch modes, each with different latency characteristics.

| Mode | Fires when | Latency priority |
| --- | --- | --- |
| First Match | A visitor is identified for the first time | Highest priority, sub-second target |
| Every Update | A known visitor returns or appends new data | Lower priority, still fast but may batch |

First Match is the one that drives the “hot lead just landed” alert. Every Update feeds the behavioral layer: return visits, new pages, engagement scoring. Both matter. They have different latency budgets.

The reason for the distinction: First Match is an acquisition event, and acquisition is time-sensitive. Every Update is a behavioral event, and behavior aggregates over a session. Treating them identically would cost throughput on the behavioral side without buying anything on the acquisition side.
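The distinction reduces to a small routing decision at dispatch time. A minimal sketch, assuming a boolean "previously seen" flag from the match step; the mode names and priority values are illustrative, not Leadpipe's actual API:

```python
def classify(visitor_id: str, previously_seen: bool) -> tuple[str, int]:
    """Return (dispatch_mode, priority); lower priority number dispatches first.

    First Match is an acquisition event and takes the fast lane;
    Every Update is behavioral and can tolerate batching.
    """
    if not previously_seen:
        return ("first_match", 0)   # acquisition: sub-second target
    return ("every_update", 1)      # behavioral: lower priority, may batch
```

A priority queue on the dispatch side then drains priority 0 ahead of priority 1, which is how the behavioral traffic avoids starving the acquisition path.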

For the payload structure on either mode, see the webhook payload reference.


Retry and failure handling

Customer endpoints fail. Their servers go down, their rate limits get hit, their code throws exceptions. A webhook delivery system that does not handle failure is a webhook delivery system that silently drops signal.

The behaviors any serious system designs around.

Exponential backoff

A failed delivery gets retried on an exponential schedule. The pattern is straightforward: retry quickly at first, then back off exponentially as failures continue. If your endpoint is down for an hour, the delivery is held and retried. If it is down for a day, the delivery terminates and the failure is logged in the webhook monitoring dashboard.

The exact schedule is a tuning parameter, not a public commitment. The principle is what matters. A serious system has a deterministic retry schedule, a defined terminal threshold, and a way to surface the terminal state so customers can investigate.
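For intuition, here is a minimal sketch of such a schedule. The constants are illustrative placeholders, not Leadpipe's actual values, since the real schedule is a tuning knob:

```python
import random

# Illustrative parameters only — the real schedule is a tuning
# parameter, not a public commitment.
BASE_DELAY_S = 5.0        # first retry ~5 seconds after the initial failure
MAX_DELAY_S = 4 * 3600.0  # cap any single wait at 4 hours
MAX_ATTEMPTS = 12         # terminal threshold: log the failure and stop

def retry_delay(attempt: int) -> float:
    """Seconds to wait before retry number `attempt` (0-indexed)."""
    cap = min(BASE_DELAY_S * (2 ** attempt), MAX_DELAY_S)
    # Full jitter spreads retries out so a recovering endpoint is not
    # hit by a synchronized thundering herd.
    return random.uniform(0, cap)

schedule = [retry_delay(n) for n in range(MAX_ATTEMPTS)]
```

The three properties the prose calls for are all visible here: the schedule is deterministic in shape (exponential with a cap), the terminal threshold is explicit, and the terminal state is a loggable event rather than a silent drop.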

Delivery isolation

A broken customer endpoint does not cascade. Failed deliveries go to a dedicated retry queue with its own compute, so other customers’ webhooks keep flowing. This is the single most important decision in a multi-tenant delivery system. Without it, one bad endpoint can slow everyone down.
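The isolation idea can be sketched as one retry queue per customer endpoint, each drained independently. A toy in-process version, assuming customer IDs as queue keys (a production system would use separate queue infrastructure, not a dict):

```python
from collections import defaultdict
import queue

# One retry queue per customer endpoint. A backlog for one broken
# endpoint accumulates in its own queue and never blocks another
# tenant's deliveries.
retry_queues: dict[str, queue.Queue] = defaultdict(queue.Queue)

def enqueue_retry(customer_id: str, delivery: dict) -> None:
    retry_queues[customer_id].put(delivery)

enqueue_retry("cust_a", {"event_id": "e1"})  # cust_a's endpoint is down
enqueue_retry("cust_b", {"event_id": "e2"})  # cust_b is unaffected
```

The production version of this pattern also gives the retry queues their own compute, so a retry storm cannot steal cycles from first-attempt deliveries.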

Idempotency keys

Every webhook payload includes an event ID. If your endpoint receives the same event twice (because your 2xx response was lost or timed out after you had already processed the event, for example), you can deduplicate on the event ID. We guarantee at-least-once delivery, not exactly-once. The idempotency key is how you handle the difference cleanly on your side.
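Consumer-side deduplication is a few lines. A minimal sketch, assuming the event ID arrives in an `event_id` field (the field name is illustrative) and using an in-memory set where production code would use a persistent store with a TTL:

```python
# In production: a persistent store (Redis, a DB table) with a TTL,
# since an in-memory set does not survive restarts.
processed: set = set()

def handle_webhook(payload: dict) -> str:
    event_id = payload["event_id"]  # field name is illustrative
    if event_id in processed:
        return "duplicate"  # already handled; safe under at-least-once delivery
    processed.add(event_id)
    # ... do the actual work exactly once ...
    return "processed"
```

With this in place, a redelivered event is a cheap no-op rather than a duplicate CRM record or a double outreach email.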

Observability

A webhook monitoring dashboard should show per-endpoint delivery rate, failure rate, median and tail latency, retry counts, and last successful timestamp. Customers who are serious about integration watch it. Customers who are not, do not, and they are the ones who miss signal when something breaks.


What a slow customer endpoint looks like

Customer endpoints are often the latency bottleneck. Typical patterns:

  1. Cold serverless function. Your first webhook wakes up a Lambda that cold-starts in 800ms. Subsequent ones are fast.
  2. Synchronous enrichment. You receive the webhook and synchronously call Clay, Clearbit, or another enrichment API before returning 2xx. Now your webhook processing time is the sum of all upstream APIs.
  3. CRM write inline. You write the contact to HubSpot or Salesforce inside the webhook handler. Those APIs are not always fast.

The fix is the same in each case. Receive the webhook, enqueue the work internally, return 2xx quickly, then process asynchronously. Your webhook endpoint should do almost nothing synchronously. The pattern is in the developer guide and the self-serve API post.
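The receive-enqueue-acknowledge pattern looks like this. A minimal in-process sketch using a standard-library queue; a production handler would back this with a durable queue (SQS, Redis, etc.) rather than process memory:

```python
import queue

work_queue: queue.Queue = queue.Queue()

def webhook_handler(payload: dict) -> int:
    """Enqueue and acknowledge immediately.

    Returns the HTTP status before any enrichment or CRM call runs,
    so the sender's latency measurement never includes your slow work.
    """
    work_queue.put(payload)
    return 200

def drain_one() -> dict:
    """Worker side: pull one job and do the slow work here."""
    job = work_queue.get()
    # ... call enrichment APIs, write to HubSpot/Salesforce, etc. ...
    work_queue.task_done()
    return job
```

This also fixes the synchronous-enrichment failure mode: an upstream API being slow delays your worker, not your 2xx.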


The sub-second target

The internal target is sub-second from pixel fire to webhook dispatch on First Match. Hitting it consistently means:

  • The match index has to be in-memory for the common cases. Falling through to a slower tier costs the budget.
  • The suppression check has to be a cached lookup, not a database query. See the suppression lists post.
  • The dispatch queue has to have capacity headroom. A queue running at 90% saturation is a queue that spikes latency on traffic bursts.
  • The webhook HTTP client has to have connection pooling, keepalive, and short timeouts.
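The last bullet, sketched with the `requests` library: a pooled session with keepalive and short, explicit timeouts. The pool sizes and timeout values are illustrative, not Leadpipe's actual configuration:

```python
import requests
from requests.adapters import HTTPAdapter

# Pooled, keepalive-enabled client. Reusing connections avoids paying
# TCP + TLS handshake cost on every dispatch; numbers are illustrative.
session = requests.Session()
adapter = HTTPAdapter(pool_connections=100, pool_maxsize=100)
session.mount("https://", adapter)

def dispatch(url: str, payload: dict) -> bool:
    """One delivery attempt: short connect/read timeouts, no blocking on
    a slow customer endpoint. Failures hand off to the retry queue."""
    try:
        resp = session.post(url, json=payload, timeout=(0.25, 2.0))
        return 200 <= resp.status_code < 300
    except requests.RequestException:
        return False  # retry queue takes it from here
```

Short timeouts matter as much as pooling: a dispatch worker stuck for 30 seconds on one slow endpoint is capacity stolen from everyone else's sub-second budget.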

Sub-second is not a free choice. It constrains every layer of the architecture, from how the graph is sharded to how the suppression list is cached. It is the constraint that drives most of the engineering decisions on the serve path.

The pricing-page rule is a useful reframe. A returning buyer on your pricing page is in the room for roughly seven minutes on average across B2B. If your webhook takes ten minutes to fire, the visitor is gone before the alert lands. Any latency budget below the visit duration is a usable budget. Any latency budget above it is theater.


When sub-second is not the right answer

Not every integration needs sub-second. A few cases where slower is fine or even better.

Daily batch enrichment

If you are syncing to a data warehouse overnight, a daily CSV export is simpler and cheaper than a webhook receiver. The CSV export path exists for exactly this.

High-volume aggregation

If you are tracking aggregate behavior (large session counts rolled up into reports), webhook-per-event is overkill. Pull the data in batches via the REST API.

Integrations that write back

If your workflow is “receive webhook, do work, write back to Leadpipe,” and the write-back touches the next webhook, you can create feedback loops. Sometimes async-with-queue is a safer design than sub-second synchronous.

The right question is never “how fast can we go” but “how fast does the use case actually need.” Sub-second is the target for First Match on the acquisition path. Other paths have other targets.


What I would prioritize today

Two things on the improvement list.

  1. Per-customer latency budgets. Today the latency target is global. Enterprise customers with high-value pipelines would arguably benefit from higher-priority routing. Differentiated dispatch priority by plan tier is a design decision worth making explicit, with the tradeoffs documented for customers.
  2. Delivery receipts as a first-class surface. The monitoring dashboard shows aggregate delivery health. An event-level delivery receipt (per webhook, per retry, per response code) would give customers a much cleaner debugging surface. Much of that data exists in logs. Exposing it in the product is the work.

The engineering principles, summarized

Five principles that any serious webhook delivery system follows. Use them as a checklist when evaluating vendors.

| Principle | What to look for |
| --- | --- |
| Ordered delivery within a session | Events for the same person arrive in the order they happened |
| Retry-aware with exponential backoff | Defined retry schedule and terminal threshold, both surfaced to customers |
| Idempotent | Every payload carries an event ID, deduplication is possible on the customer side |
| Isolated failure domains | One slow customer endpoint does not slow others |
| Observable | Per-endpoint delivery health visible in a dashboard, not buried in logs |

A vendor that can answer all five cleanly is a vendor whose webhook delivery you can build a workflow on. A vendor that cannot is one where you will eventually find a Slack channel full of dead alerts and ask why.


What this means for customers

If you are wiring Leadpipe into an AI SDR, a live-chat pop-up, or any time-sensitive workflow, the webhook path is what you are depending on. The design targets sub-second dispatch on First Match, with exponential-backoff retry for the tail, delivery isolation so one bad endpoint does not slow others, and a monitoring surface you can actually use.

You do not need to think about any of this if your endpoint is well-designed. You definitely will think about it if it is not.


Every plan ships with the same identity graph, 23 REST endpoints, webhooks, and a 27-tool MCP server. Start in 5 minutes →