A webhook that fires five minutes after the visit is not a real-time signal. It is a notification that your rep will see tomorrow morning if they check Slack, by which point it will be one more dead alert in a feed of dead alerts.
I am Nicolas, head of partnerships at Leadpipe. When we ship an integration with a partner, webhook latency is the first number anyone asks about. This post is the engineering walkthrough for how to think about the pixel-to-endpoint path, the failure modes that matter, the retry behaviors any serious delivery system has to design around, and where customer endpoints typically slow things down.
What “real-time” actually means
The category uses “real-time” casually. It is worth being precise about what the buyer actually needs.
| Use case | Required latency | What breaks at this latency |
|---|---|---|
| AI SDR autonomous outreach | Under a few seconds | Agent needs signal to craft email before visitor leaves site |
| Live chat pop-up for high-intent visitors | Under a second | The visitor is gone if you take longer |
| Slack alert for sales rep | Under 10 seconds | Rep wants to call while the visitor is still on the page |
| CRM record creation | Under a minute | Downstream workflows will not fire until the record exists |
| Daily batch enrichment | Under 24 hours | Nothing; most batch jobs run nightly anyway |
A single webhook has to hit the tightest of these, because the same delivery feeds all of them. Our internal target is sub-second from pixel fire to webhook dispatch, with retry handling built in for the tail.
The path from pixel to endpoint
A visitor loads a page. The pixel fires. Somewhere on the other side, your endpoint receives a POST. Between those two events, a lot happens.
1. Pixel fire
└── browser POSTs event to ingestion endpoint
2. Ingest
└── event validated, normalized, placed on match queue
3. Match
└── graph lookup, suppression check, identity resolution
4. Dispatch decision
└── first-match or every-update rule evaluated
5. Payload assembly
└── person, company, visit, HEMs packed into JSON
6. Webhook dispatch
└── HTTPS POST to customer endpoint with retry-on-failure
7. Customer endpoint
└── your server returns 2xx (or fails and we retry)
Each of those steps has a latency budget. If any one step overruns, the webhook is late. The engineering is about holding each step to its budget even under load.
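As a mental model, the serve path compresses into a short pipeline. The sketch below is a deliberately simplified, in-memory version: the queue objects, data shapes, and lookups are stand-ins for the real infrastructure, not Leadpipe's actual internals.

```python
# A toy, in-memory sketch of steps 1-6. Every name and data shape here is
# illustrative; real queues, the identity graph, and suppression are separate services.
import json, queue, time

match_queue: queue.Queue = queue.Queue()
dispatch_queue: queue.Queue = queue.Queue()
identity_graph = {"a1b2c3": {"company": "acme.com", "person": "jane@acme.com"}}  # fake index
suppression = {"competitor.com"}   # fake suppression list
seen_before: set[str] = set()      # visitors already matched once

def ingest(raw_event: dict) -> None:
    """Steps 1-2: validate, normalize, enqueue for matching."""
    event = {"visitor_id": raw_event["visitor_id"], "url": raw_event["url"], "ts": time.time()}
    match_queue.put(event)

def match_and_dispatch() -> None:
    """Steps 3-6: graph lookup, suppression check, dispatch decision, payload assembly."""
    event = match_queue.get()
    person = identity_graph.get(event["visitor_id"])
    if person is None or person["company"] in suppression:
        return  # no match or suppressed: nothing to deliver
    mode = "every_update" if event["visitor_id"] in seen_before else "first_match"
    seen_before.add(event["visitor_id"])
    payload = json.dumps({"mode": mode, "person": person, "visit": event})
    dispatch_queue.put(payload)  # a dispatcher then POSTs this to the customer endpoint

ingest({"visitor_id": "a1b2c3", "url": "https://example.com/pricing"})
match_and_dispatch()
print(dispatch_queue.get())
```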
Where the latency budget goes
A breakdown of where time gets spent on a successful webhook delivery.
| Stage | Notional budget | Can it be reduced? |
|---|---|---|
| Pixel POST to ingest | ~50-150ms | Network, mostly fixed |
| Ingest validation and enqueue | Small | Already lean |
| Match queue wait | Variable | Depends on backpressure |
| Match lookup (graph + suppression) | Small, cached | Index warmth is the lever |
| Payload assembly | Small | Mostly JSON serialization |
| Webhook dispatch | ~50-300ms | Customer endpoint location dominates |
| Customer endpoint processing | Your code | Not on our side |
The two stages most worth optimizing are the match queue wait (which can spike during business-hour peaks) and the match lookup (where index warmth determines whether you are hitting RAM or falling through to a slower tier).
For the broader architecture context, see scaling the identity graph to 100M+ matches a day.
First Match versus Every Update
Two dispatch modes, each with different latency characteristics.
| Mode | Fires when | Latency priority |
|---|---|---|
| First Match | A visitor is identified for the first time | Highest priority, sub-second target |
| Every Update | A known visitor returns or appends new data | Lower priority, still fast but may batch |
First Match is the one that drives the “hot lead just landed” alert. Every Update feeds the behavioral layer: return visits, new pages, engagement scoring. Both matter. They have different latency budgets.
The reason for the distinction: First Match is an acquisition event, and acquisition is time-sensitive. Every Update is a behavioral event, and behavior aggregates over a session. Treating them identically would cost throughput on the behavioral side without buying anything on the acquisition side.
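One way to make the distinction concrete is at the dispatch queue: First Match payloads jump the line, Every Update payloads wait behind them. A minimal sketch, with priority values invented for illustration:

```python
# Illustrative priority ordering for the two dispatch modes.
# The priority values and heap layout are made up for the sketch.
import heapq, itertools

_order = itertools.count()          # tiebreaker so equal priorities stay FIFO
dispatch_heap: list = []

def enqueue_webhook(payload: dict, first_match: bool) -> None:
    priority = 0 if first_match else 1   # acquisition beats behavioral on the serve path
    heapq.heappush(dispatch_heap, (priority, next(_order), payload))

enqueue_webhook({"mode": "every_update", "visitor": "a1b2c3"}, first_match=False)
enqueue_webhook({"mode": "first_match", "visitor": "x9y8z7"}, first_match=True)
print(heapq.heappop(dispatch_heap)[2])   # the First Match payload dispatches first
```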
For the payload structure on either mode, see the webhook payload reference.
Retry and failure handling
Customer endpoints fail. Their servers go down, their rate limits get hit, their code throws exceptions. A webhook delivery system that does not handle failure is a webhook delivery system that silently drops signal.
The behaviors any serious system designs around.
Exponential backoff
A failed delivery gets retried on an exponential schedule. The pattern is straightforward: retry quickly at first, then back off exponentially as failures continue. If your endpoint is down for an hour, the delivery is held and retried. If it is down for a day, retries stop, the delivery is marked terminal, and the failure is logged in the webhook monitoring dashboard.
The exact schedule is a tuning parameter, not a public commitment. The principle is what matters. A serious system has a deterministic retry schedule, a defined terminal threshold, and a way to surface the terminal state so customers can investigate.
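To make the shape concrete, here is a minimal sketch of an exponential schedule with jitter and a terminal threshold. The base delay, multiplier, and attempt cap are invented for the example, not the published schedule.

```python
# Illustrative backoff schedule: the numbers are not a public commitment.
import random

BASE_DELAY_S = 5       # first retry after roughly 5 seconds
MULTIPLIER = 4         # each retry waits 4x longer than the last
MAX_ATTEMPTS = 8       # cumulative retry window on the order of a day

def next_retry_delay(attempt: int) -> float | None:
    """Seconds to wait before retry `attempt` (1-indexed), or None once terminal."""
    if attempt > MAX_ATTEMPTS:
        return None    # terminal: log it and surface it in the monitoring dashboard
    delay = BASE_DELAY_S * (MULTIPLIER ** (attempt - 1))
    return delay * random.uniform(0.8, 1.2)   # jitter avoids synchronized retry storms

for attempt in range(1, MAX_ATTEMPTS + 2):
    print(attempt, next_retry_delay(attempt))
```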
Delivery isolation
A broken customer endpoint does not cascade. Failed deliveries go to a dedicated retry queue with its own compute, so other customers’ webhooks keep flowing. This is the single most important decision in a multi-tenant delivery system. Without it, one bad endpoint can slow everyone down.
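In sketch form, isolation means failures never re-enter the shared path; they land in a lane owned by the failing customer. The queue layout below is illustrative, not the actual topology.

```python
# Illustrative failure isolation: one retry queue per customer, each drained by its
# own worker, so a broken endpoint only backs up its own lane.
from collections import defaultdict
from queue import Queue

live_queue: Queue = Queue()                           # healthy deliveries, all customers
retry_queues: dict[str, Queue] = defaultdict(Queue)   # one retry lane per customer

def on_delivery_result(customer_id: str, payload: dict, ok: bool) -> None:
    if not ok:
        retry_queues[customer_id].put(payload)        # failure stays in this customer's lane
    # on success there is nothing to do; the live queue keeps draining for everyone
```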
Idempotency keys
Every webhook payload includes an event ID. If your endpoint receives the same event twice (because you processed the event but the 2xx response was lost or timed out before we received it, for example), you can deduplicate on the event ID. We guarantee at-least-once delivery, not exactly-once. The idempotency key is how you handle the difference cleanly on your side.
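On the receiving side, deduplication is a few lines. The sketch below assumes the event ID arrives in an `event_id` field (check the payload reference for the exact name) and uses an in-memory set where production code would use a persistent store with a TTL.

```python
# Customer-side dedup on the event ID. Field name and storage are illustrative.
processed_event_ids: set[str] = set()   # production: a durable store with a TTL

def handle_webhook(payload: dict) -> None:
    event_id = payload["event_id"]       # assumed field name for this sketch
    if event_id in processed_event_ids:
        return                           # duplicate: at-least-once delivery means this happens
    processed_event_ids.add(event_id)
    process(payload)

def process(payload: dict) -> None:
    print("processing", payload["event_id"])   # your actual work goes here
```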
Observability
A webhook monitoring dashboard should show per-endpoint delivery rate, failure rate, median and tail latency, retry counts, and last successful timestamp. Customers who are serious about integration watch it. Customers who are not, do not, and they are the ones who miss signal when something breaks.
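As a sketch, the per-endpoint record behind that dashboard looks something like the following; the field names are illustrative, the point is what should be visible for each endpoint.

```python
# Illustrative per-endpoint delivery health record.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class EndpointHealth:
    endpoint_url: str
    delivered: int                  # successful deliveries in the window
    failed_terminal: int            # deliveries that exhausted retries
    retries: int                    # total retry attempts
    p50_latency_ms: float           # median dispatch-to-2xx latency
    p99_latency_ms: float           # tail latency
    last_success_at: datetime | None
```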
What a slow customer endpoint looks like
Customer endpoints are often the latency bottleneck. Typical patterns:
- Cold serverless function. Your first webhook wakes up a Lambda that cold-starts in 800ms. Subsequent ones are fast.
- Synchronous enrichment. You receive the webhook and synchronously call Clay, Clearbit, or another enrichment API before returning 2xx. Now your webhook processing time is the sum of all upstream APIs.
- CRM write inline. You write the contact to HubSpot or Salesforce inside the webhook handler. Those APIs are not always fast.
The fix is the same in each case. Receive the webhook, enqueue the work internally, return 2xx quickly, then process asynchronously. Your webhook endpoint should do almost nothing synchronously. The pattern is in the developer guide and the self-serve API post.
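A minimal sketch of that shape, using Flask and an in-memory queue as stand-ins for whatever framework and worker infrastructure you actually run:

```python
# "Ack fast, work later": return 2xx immediately, do slow work off the request path.
# Flask, the route name, and the in-memory queue are illustrative stand-ins; an
# in-memory queue also drops work on restart, so use a durable queue in production.
import queue
import threading
from flask import Flask, request

app = Flask(__name__)
work_queue: queue.Queue = queue.Queue()

@app.route("/leadpipe-webhook", methods=["POST"])
def receive_webhook():
    work_queue.put(request.get_json(force=True))   # enqueue only; no upstream calls here
    return "", 202                                  # acknowledge in milliseconds

def worker() -> None:
    while True:
        payload = work_queue.get()
        # Slow, failure-prone work happens here: enrichment calls, CRM writes, alerts.
        handle_lead(payload)

def handle_lead(payload: dict) -> None:
    print("processing", payload.get("event_id"))    # placeholder for your pipeline

threading.Thread(target=worker, daemon=True).start()
# app.run(port=8080)  # run the server however you normally do
```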
The sub-second target
The internal target is sub-second from pixel fire to webhook dispatch on First Match. Hitting it consistently means:
- The match index has to be in-memory for the common cases. Falling through to a slower tier costs the budget.
- The suppression check has to be a cached lookup, not a database query. See the suppression lists post.
- The dispatch queue has to have capacity headroom. A queue running at 90% saturation is a queue that spikes latency on traffic bursts.
- The webhook HTTP client has to have connection pooling, keepalive, and short timeouts (a sketch follows this list).
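On that last point, the dispatch-side client is mostly configuration. A sketch using `requests` for illustration; the pool sizes and timeouts are invented, the principle is reusing connections and failing fast so a slow endpoint hits the retry path instead of blocking the dispatcher.

```python
# Illustrative dispatch client: pooled keep-alive connections, short timeouts.
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
session.mount("https://", HTTPAdapter(pool_connections=100, pool_maxsize=100))

def dispatch(url: str, payload: dict) -> bool:
    try:
        resp = session.post(url, json=payload, timeout=(1.0, 3.0))  # (connect, read) seconds
        return 200 <= resp.status_code < 300
    except requests.RequestException:
        return False   # hand off to the retry queue instead of blocking the dispatcher
```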
Sub-second is not a free choice. It constrains every layer of the architecture, from how the graph is sharded to how the suppression list is cached. It is the constraint that drives most of the engineering decisions on the serve path.
The pricing-page rule is a useful reframe. A returning buyer on your pricing page is in the room for roughly seven minutes on average across B2B. If your webhook takes ten minutes to fire, the visitor is gone before the alert lands. Any latency budget below the visit duration is a usable budget. Any latency budget above it is theater.
When sub-second is not the right answer
Not every integration needs sub-second. A few cases where slower is fine or even better.
Daily batch enrichment
If you are syncing to a data warehouse overnight, a daily CSV export is simpler and cheaper than a webhook receiver. The CSV export path exists for exactly this.
High-volume aggregation
If you are tracking aggregate behavior (large session counts rolled up into reports), webhook-per-event is overkill. Pull the data in batches via the REST API.
Integrations that write back
If your workflow is "receive webhook, do work, write back to Leadpipe," and the write-back itself can trigger the next webhook, you can create feedback loops. Sometimes async-with-queue is a safer design than sub-second synchronous.
The right question is never "how fast can we go" but "how fast does the use case actually need to be." Sub-second is the target for First Match on the acquisition path. Other paths have other targets.
What I would prioritize today
Two things on the improvement list.
- Per-customer latency budgets. Today the latency target is global. Enterprise customers with high-value pipelines would arguably benefit from higher-priority routing. Differentiated dispatch priority by plan tier is a design decision worth making explicit, with the tradeoffs documented for customers.
- Delivery receipts as a first-class surface. The monitoring dashboard shows aggregate delivery health. An event-level delivery receipt (per webhook, per retry, per response code) would give customers a much cleaner debugging surface. Much of that data exists in logs. Exposing it in the product is the work.
The engineering principles, summarized
Five principles that any serious webhook delivery system follows. Use them as a checklist when evaluating vendors.
| Principle | What to look for |
|---|---|
| Ordered delivery within a session | Events for the same person arrive in the order they happened |
| Retry-aware with exponential backoff | Defined retry schedule and terminal threshold, both surfaced to customers |
| Idempotent | Every payload carries an event ID so deduplication is possible on the customer side |
| Isolated failure domains | One slow customer endpoint does not slow others |
| Observable | Per-endpoint delivery health visible in a dashboard, not buried in logs |
A vendor that can answer all five cleanly is a vendor whose webhook delivery you can build a workflow on. A vendor that cannot is one where you will eventually find a Slack channel full of dead alerts and ask why.
What this means for customers
If you are wiring Leadpipe into an AI SDR, a live-chat pop-up, or any time-sensitive workflow, the webhook path is what you are depending on. The design targets sub-second dispatch on First Match, with exponential-backoff retry for the tail, delivery isolation so one bad endpoint does not slow others, and a monitoring surface you can actually use.
You do not need to think about any of this if your endpoint is well-designed. You definitely will think about it if it is not.
Every plan ships with the same identity graph, 23 REST endpoints, webhooks, and a 27-tool MCP server. Start in 5 minutes →