Your enrichment provider says it covers 70% of your contacts. So you feed it a list, run the waterfall, and sure enough - most records come back filled. Emails, phone numbers, firmographics. Feels productive.
But step back for a second.
Where did that list come from? Form fills? A scraped LinkedIn export? A conference badge scan from six months ago? Whatever the source, you’re enriching contacts you already know about. The 97% of website visitors who never filled out a form? Your enrichment stack has no idea they exist.
That’s the gap. And it’s enormous.
Single-provider enrichment covers 50-70% of records you give it. Waterfall enrichment - cascading through multiple providers - pushes that to 85-95%. Impressive numbers. But both approaches share the same blind spot: they only work on known contacts.
What about the thousands of anonymous visitors hitting your pricing page, reading your case studies, and bouncing every single day? Those are your highest-intent prospects, and your enrichment stack can’t touch them because it doesn’t know they’re there.
This guide shows you how to fix that by building the complete signal stack - from anonymous visitor to fully enriched, scored, researched lead - using visitor identification as the foundation layer.
Table of Contents
- Why Single-Provider Enrichment Fails
- The Waterfall Model Explained
- The Missing Layer: Visitor Identification as Layer 0
- The Complete Signal Stack
- Layer-by-Layer Breakdown
- Coverage Improvements: The Math
- Cost Per Fully Enriched Lead
- Implementation Options
- Pro Tips for Credit Efficiency
- FAQ
Why Single-Provider Enrichment Fails
No single data provider has everything. Each one maintains a different database, collects data through different methods, and covers different segments of the market. The result: predictable coverage gaps.
Here’s a rough picture of how the major providers stack up:
| Provider | Strong On | Weak On | Typical Fill Rate |
|---|---|---|---|
| Apollo | Email addresses, B2B contacts | Phone numbers, SMB coverage | 55-65% |
| Lusha | Direct dials, phone numbers | Smaller overall database | 50-60% |
| Clearbit (now Breeze) | Company data, firmographics | Person-level contact info | 45-55% |
| ZoomInfo | Enterprise contacts, org charts | SMB, international | 60-70% |
| PeopleDataLabs | Breadth, developer-friendly API | Data freshness, accuracy | 50-60% |
See the pattern? Apollo might nail 60% of your email lookups but whiff on half the phone numbers. Lusha fills in the phones but has a smaller total database. Clearbit gives you beautiful company profiles but won’t reliably produce direct emails for the people at those companies.
When you rely on a single provider, you’re accepting a 30-50% gap across your records. That’s not a rounding error. That’s up to half your pipeline walking around with missing phone numbers, outdated titles, or no email at all.
And the data decay problem makes it worse. People change jobs. Companies get acquired. Phone numbers go stale. Even a provider with 70% initial coverage starts degrading the moment you pull the data.
The industry figured out a solution: don’t pick one provider. Pick all of them.
The Waterfall Model Explained
Waterfall enrichment is simple in concept. Instead of querying one provider and accepting whatever comes back, you query Provider A first. For any fields that come back empty, you query Provider B. Still missing data? Try Provider C. And so on.
Record: jane@acme.com
     │
     ▼
┌──────────┐
│  Apollo  │ → Found email ✓, phone ✗, title ✗
└────┬─────┘
     │ (phone + title still missing)
     ▼
┌──────────┐
│  Lusha   │ → Found phone ✓, title ✗
└────┬─────┘
     │ (title still missing)
     ▼
┌──────────┐
│   PDL    │ → Found title ✓
└──────────┘
Result: email ✓ phone ✓ title ✓ (all three fields filled)
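The cascade in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: the three `*_stub` functions are hypothetical stand-ins for real provider clients, and the field names (`email`, `phone`, `title`) are assumed for the example.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]
Provider = Callable[[str], Record]  # takes a lookup key, returns the fields it found


def waterfall_enrich(key: str, providers: List[Provider], fields: List[str]) -> Record:
    """Query providers in order, filling only the fields that are still empty."""
    record: Record = {field: None for field in fields}
    for lookup in providers:
        missing = [f for f in fields if record[f] is None]
        if not missing:
            break  # record is complete: stop early and save credits
        found = lookup(key)
        for f in missing:
            if found.get(f):
                record[f] = found[f]
    return record


# Hypothetical stubs standing in for real provider clients.
def apollo_stub(key: str) -> Record:
    return {"email": key}


def lusha_stub(key: str) -> Record:
    return {"phone": "+1-555-0100"}


def pdl_stub(key: str) -> Record:
    return {"title": "VP Marketing", "phone": "+1-555-0199"}


record = waterfall_enrich(
    "jane@acme.com",
    [apollo_stub, lusha_stub, pdl_stub],
    ["email", "phone", "title"],
)
print(record)
```

Note that a later provider never overwrites a field an earlier one already filled, so provider order doubles as a trust ranking.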
The math stacks up fast:
- Provider A alone: 60% fill rate
- Provider A + B: B fills about half of the 40% A missed (+20 points) = ~80% fill rate
- Provider A + B + C: C fills about half of the remaining 20% (+10 points) = ~90% fill rate
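The 60% → 80% → 90% progression works out if each additional provider fills roughly half of whatever gap is left. Here is a small sketch of that arithmetic; the 50% "catch rate" per extra provider is an assumption chosen to reproduce the figures above, not a measured number.

```python
from typing import List


def cumulative_fill(first_fill: float, catch_rates: List[float]) -> List[float]:
    """Return the overall fill rate after each provider in the cascade.

    first_fill:  share of records the first provider fills.
    catch_rates: for each later provider, the share of the remaining gap it fills.
    """
    fills = [first_fill]
    for rate in catch_rates:
        gap = 1.0 - fills[-1]
        fills.append(fills[-1] + gap * rate)
    return fills


fills = cumulative_fill(0.60, [0.5, 0.5])
print([f"{f:.0%}" for f in fills])
```

The same function shows why returns diminish: each new provider works on a smaller gap, so a fourth or fifth provider adds only a few points.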
This is why tools like Clay have exploded in popularity. Clay automates the waterfall across 150+ data providers so you don’t have to manually orchestrate API calls. You set up the cascade logic once, and every record gets run through the chain automatically.
The results are real. OpenAI reportedly used Clay’s waterfall model to double their enrichment rates from 40% to 80%. Across Clay’s customer base, teams regularly hit 85-95% fill rates when waterfalling through three or more providers.
But there’s a catch that almost nobody talks about.
Waterfall enrichment makes your known contacts more complete. It does nothing for contacts you don’t know about.
You still need a starting input. A name. An email. A domain. The waterfall enriches what you feed it. If your input is a list of 300 form fills from last month, you’ll get 270 beautifully enriched records. Meanwhile, the 9,700 anonymous visitors who browsed your site and left? The waterfall never saw them.
That’s where the entire model breaks.
The Missing Layer: Visitor Identification as Layer 0
Waterfall enrichment is a downstream layer. It takes known contacts and makes them more complete. But the real leverage is in what happens before the waterfall - identifying who those anonymous visitors are in the first place.
This is what we call Layer 0: Identification.
Think about where your highest-intent buyers actually are right now. They’re not sitting in your CRM. They’re not on your email list. They’re on your website, right now, reading your pricing page, browsing your integrations, comparing you against alternatives. And 97% of them will leave without ever telling you who they are.
Visitor identification solves this by resolving anonymous website sessions into real contact records - person-level data including name, email, company, job title, LinkedIn URL, and behavioral signals like pages visited and session duration.
Here’s why Layer 0 changes the economics of everything downstream:
- Without Layer 0: Your waterfall only enriches form fills (3% of traffic). You’re waterfalling cold lists.
- With Layer 0: Your waterfall enriches identified visitors (30-40% of traffic). You’re waterfalling warm, high-intent contacts.
That’s the difference between enriching 300 contacts from form fills vs. 3,000-4,000 contacts from visitor identification. Same traffic. Same waterfall. 10-13x more leads entering the enrichment pipeline.
And because these contacts were actively browsing your site when they were identified, they’re inherently higher intent than any scraped list or purchased database. The enrichment data you add downstream gets applied to people who already showed buying signals.
The Complete Signal Stack
Here’s the full architecture. Seven layers, numbered 0 through 6. Each one transforms the data from the layer above and feeds it to the layer below.
┌─────────────────────────────────────────────────────────┐
│                THE COMPLETE SIGNAL STACK                │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Layer 0: IDENTIFY   Leadpipe → Who is this visitor?    │
│      ↓               Name, email, company, pages        │
│                                                         │
│  Layer 1: VALIDATE   ZeroBounce / NeverBounce           │
│      ↓               Is the email deliverable?          │
│                                                         │
│  Layer 2: ENRICH     Apollo → Lusha → PeopleDataLabs    │
│      ↓               Phone, LinkedIn, firmographics     │
│                                                         │
│  Layer 3: INTENT     Leadpipe Orbit API                 │
│      ↓               What topics are they researching?  │
│                                                         │
│  Layer 4: RESEARCH   Claygent / Perplexity              │
│      ↓               Company news, challenges, fit      │
│                                                         │
│  Layer 5: SCORE      ICP formula                        │
│      ↓               Qualified? Priority tier?          │
│                                                         │
│  Layer 6: ACT        CRM / AI SDR / Slack               │
│                      Outreach, sequence, notify         │
└─────────────────────────────────────────────────────────┘
Each layer multiplies the value of the one before it. Identification without enrichment gives you partial records. Enrichment without validation wastes credits on bad emails. Validation without intent scoring treats every lead equally. Intent without research produces generic outreach.
The stack only works when all layers are connected. Miss one, and you lose the compounding effect.
Most teams have some of these layers. Almost nobody has all of them wired together. The good news: the tooling has matured to the point where you can build this entire stack in an afternoon for under $650/month.
Layer-by-Layer Breakdown
Let’s walk through each layer - what it does, the recommended tool, the cost, and what it adds to your lead record.
Layer 0: Identify
Purpose: Turn anonymous website visitors into known contacts.
A JavaScript pixel on your site resolves anonymous visitors using deterministic matching against a proprietary identity graph. When someone visits your pricing page, you get their name, email (personal and professional), company, job title, LinkedIn URL, and full behavioral data - pages visited, session duration, return visits.
Why it matters: Everything downstream depends on having a contact to work with. Without this layer, you’re limited to the 3% who fill out forms.
Layer 1: Validate
Purpose: Confirm email deliverability before spending enrichment credits.
Tools like ZeroBounce or NeverBounce check whether the identified email addresses are actually deliverable. This filters out invalid, disposable, and catch-all addresses before you waste money enriching them.
Why it matters: Enriching an invalid email is burning money. Validation typically costs $0.005-0.01 per check and saves you from wasting $0.10-0.50 in enrichment credits on dead addresses. It also protects your sender reputation if you plan to email these contacts.
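To see when the validation step pays for itself, here is a rough per-record calculation. It uses the cost ranges quoted above and assumes an ~85% validation pass rate (the figure used in the coverage math later in this guide); your own rates will differ.

```python
def savings_per_record(validation_cost: float, enrichment_cost: float, pass_rate: float) -> float:
    """Expected saving per record from validating before enriching.

    Without validation every record is enriched:   cost = enrichment_cost
    With validation only passing records are:      cost = validation_cost + pass_rate * enrichment_cost
    """
    without_validation = enrichment_cost
    with_validation = validation_cost + pass_rate * enrichment_cost
    return without_validation - with_validation


def break_even_pass_rate(validation_cost: float, enrichment_cost: float) -> float:
    """Validation saves money whenever the real pass rate is below this value."""
    return 1.0 - validation_cost / enrichment_cost


# Worst case from the ranges above: expensive check, cheap enrichment.
print(round(savings_per_record(0.01, 0.10, 0.85), 4), round(break_even_pass_rate(0.01, 0.10), 2))
# Best case: cheap check, expensive enrichment.
print(round(savings_per_record(0.005, 0.50, 0.85), 4), round(break_even_pass_rate(0.005, 0.50), 2))
```

Even in the worst case the gate breaks even at a 90% pass rate, so at ~85% it still comes out ahead, before counting the sender-reputation benefit.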
Layer 2: Enrich
Purpose: Fill in the gaps - phone numbers, LinkedIn profiles, firmographics, tech stack data.
This is where the waterfall lives. Your identified contact goes through Apollo, then Lusha, then PeopleDataLabs (or whatever provider cascade you’ve configured). Each provider fills in what the previous one missed.
Why it matters: Leadpipe identifies the person and gives you an email. The enrichment waterfall adds the phone number, verifies the LinkedIn profile, appends company size, industry, revenue, tech stack, and job seniority. Your SDR (human or AI) now has a complete picture.
Layer 3: Intent
Purpose: Layer buying intent signals on top of contact data.
Leadpipe’s Orbit API tracks cross-site research behavior and assigns intent scores from 1-100. This tells you not just WHO the visitor is, but WHAT they’re actively researching across the web. Someone with an intent score of 85 for “visitor identification software” is a very different lead than someone who stumbled onto your blog from a random Google search.
Why it matters: Intent scoring lets you prioritize which leads get immediate attention vs. which go into a nurture sequence. Without it, every identified visitor looks the same.
Layer 4: Research
Purpose: Generate personalized context for outreach.
AI research tools like Claygent or Perplexity analyze the enriched company data and produce summaries: recent funding rounds, product launches, hiring patterns, competitive landscape, challenges the company is likely facing. This becomes the raw material for personalized emails and talk tracks.
Why it matters: “Hi Jane, I saw you visited our pricing page” is lazy outreach. “Hi Jane, I noticed Acme Corp just expanded into EMEA and you’re scaling the marketing team - here’s how companies in similar growth stages handle visitor identification” is the kind of message that gets replies.
Layer 5: Score
Purpose: Qualify leads against your ICP and assign priority tiers.
Using the enriched data (company size, industry, revenue, job title) plus intent signals, you score each lead against your ideal customer profile. Tier 1 leads go straight to your AI SDR or sales team. Tier 2 enters a nurture sequence. Tier 3 gets dropped.
Why it matters: Your sales team’s time is finite. Scoring ensures they spend it on the leads most likely to convert - not on every person who happened to visit your site.
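A scoring formula along these lines might look like the sketch below. Everything specific in it is hypothetical: the target industries, the employee range, the seniority keywords, the point weights, and the tier cutoffs are placeholders you would replace with your own ICP definition.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    employees: int
    industry: str
    title: str
    intent_score: int  # 1-100, as produced by the intent layer


# Hypothetical ICP definition; replace with your own.
TARGET_INDUSTRIES = {"software", "fintech"}
SENIOR_KEYWORDS = ("vp", "head", "director", "chief")


def icp_score(lead: Lead) -> int:
    """Score a lead from 0 to 100 using firmographic fit plus intent."""
    score = 0
    if 50 <= lead.employees <= 1000:
        score += 30
    if lead.industry.lower() in TARGET_INDUSTRIES:
        score += 20
    if any(keyword in lead.title.lower() for keyword in SENIOR_KEYWORDS):
        score += 20
    score += lead.intent_score * 30 // 100  # intent contributes up to 30 points
    return score


def tier(score: int) -> int:
    """Map a score to a priority tier (cutoffs are illustrative)."""
    if score >= 70:
        return 1
    if score >= 40:
        return 2
    return 3


hot = Lead(employees=200, industry="Software", title="VP Marketing", intent_score=85)
print(icp_score(hot), tier(icp_score(hot)))
```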
Layer 6: Act
Purpose: Trigger outreach, create CRM records, notify sales.
The scored, enriched, researched lead gets routed to the right destination. That could be a CRM deal, an AI SDR sequence, a Slack alert, or a manual task for your sales team. The routing depends on the lead’s score and tier.
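Routing can be as simple as a lookup from tier to a list of actions. In this sketch the action names are hypothetical labels; in practice each one would map to a call into your CRM, sequencer, or Slack.

```python
from typing import Dict, List

# Hypothetical action names per tier; tier 3 triggers nothing.
ROUTES: Dict[int, List[str]] = {
    1: ["crm_deal", "slack_alert", "sdr_sequence"],
    2: ["nurture_sequence"],
    3: [],
}


def route_lead(tier: int) -> List[str]:
    """Return the actions to trigger for a lead in the given tier."""
    if tier not in ROUTES:
        raise ValueError(f"unknown tier: {tier}")
    return list(ROUTES[tier])  # copy so callers cannot mutate the table


print(route_lead(1))
```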
Here’s what each layer adds to the total cost and coverage:
| Layer | Tool | Monthly Cost | What It Adds | Cumulative Coverage |
|---|---|---|---|---|
| 0: Identify | Leadpipe | $299 | Name, email, company from anonymous traffic | 30-40% of visitors |
| 1: Validate | ZeroBounce | ~$50 | Email deliverability check | Filters to valid emails |
| 2: Enrich | Clay waterfall | $185 | Phone, LinkedIn, firmographics | 85-95% fill on identified |
| 3: Intent | Leadpipe Orbit | Included | Topic research signals, score 1-100 | Adds buying intent layer |
| 4: Research | Claygent | ~$100 | Company analysis, personalization fuel | Adds context for outreach |
| 5: Score | Clay formulas | Included | ICP qualification, priority tiers | Filters to qualified leads |
| Total | | ~$634/mo | Anonymous to fully enriched, scored, researched | |
Coverage Improvements: The Math
This is where the signal stack earns its keep. Let’s run the numbers for a site with 10,000 monthly visitors.
Without Layer 0 (No Visitor Identification)
Your enrichment waterfall only processes form fills:
| Metric | Count |
|---|---|
| Monthly visitors | 10,000 |
| Form fill rate | 3% |
| Form fills (known contacts) | 300 |
| Enrichment fill rate (waterfall) | 85% |
| Fully enriched leads | 255 |
| ICP qualification rate | 20-30% |
| Qualified, enriched leads | 50-80 |
Fifty to eighty qualified leads from 10,000 visitors. That’s a 0.5-0.8% yield on your traffic.
With Layer 0 (Visitor Identification Added)
Now add Leadpipe as Layer 0 before the enrichment waterfall:
| Metric | Count |
|---|---|
| Monthly visitors | 10,000 |
| Leadpipe identification rate | 30-40% |
| Identified visitors | 3,000-4,000 |
| Email validation pass rate | ~85% |
| Valid, identified contacts | 2,550-3,400 |
| Enrichment fill rate (waterfall) | 85% |
| Fully enriched leads | 2,170-2,890 |
| ICP qualification rate | 20-30% |
| Qualified, enriched leads | 430-870 |
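That’s roughly 8.5-11x more qualified leads from the exact same traffic when you compare at the same ICP qualification rate (430 vs. 50 at the low end, 870 vs. 80 at the high end). No additional ad spend. No new content. No new campaigns. Just a Layer 0 that captures the intent signals your enrichment stack was blind to.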
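The two tables reduce to one multiplication per scenario, which makes it easy to plug in your own traffic and rates. The sketch below uses the numbers from the tables and assumes form-fill emails skip the validation step (pass rate 1.0), as the first table does; it compares both scenarios at the same ICP qualification rate.

```python
def qualified_leads(visitors: int, capture_rate: float, validation_pass: float,
                    enrich_fill: float, icp_rate: float) -> float:
    """Multiply traffic through each funnel stage to get qualified, enriched leads."""
    return visitors * capture_rate * validation_pass * enrich_fill * icp_rate


VISITORS = 10_000
for icp_rate, id_rate in ((0.20, 0.30), (0.30, 0.40)):
    # Form fills only: 3% capture, no validation step in the article's table.
    forms_only = qualified_leads(VISITORS, 0.03, 1.0, 0.85, icp_rate)
    # With Layer 0: 30-40% identification, ~85% validation pass.
    with_layer0 = qualified_leads(VISITORS, id_rate, 0.85, 0.85, icp_rate)
    print(f"ICP {icp_rate:.0%}, ID rate {id_rate:.0%}: "
          f"{forms_only:.0f} vs {with_layer0:.0f} leads "
          f"({with_layer0 / forms_only:.1f}x)")
```

Because the enrichment fill rate and ICP rate appear on both sides, the multiplier depends only on the capture rate and the validation pass rate.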
The takeaway: Waterfall enrichment is an optimization. Visitor identification is a category shift. Waterfalling gets you from 60% to 90% fill rates on known contacts. Adding Layer 0 gets you from 300 contacts to 3,000+ contacts. The leverage is in the identification, not the enrichment.
And it compounds. Those 430-870 qualified leads didn’t come from a purchased list or a scraped database. They came from your own website traffic - people who were actively researching your product. The conversion rates downstream are dramatically higher because the intent signal is already baked in.
Cost Per Fully Enriched Lead
One of the most common questions: how does the cost stack up compared to other approaches?
| Stack Configuration | Monthly Cost | Leads/Month | Cost per Lead |
|---|---|---|---|
| Leadpipe + Clay + Orbit | ~$484 | 500-1,500 | $0.32-0.97 |
| Clay only (from lists) | $185 | Depends on input | $0.50-2.00 |
| ZoomInfo + Clay | $1,400+ | Depends on input | $2.00-5.00 |
| Manual enrichment | $2,000+ (time cost) | 50-100 | $20-40 |
The Leadpipe + Clay combination hits a sweet spot because Leadpipe handles the highest-cost step - identification - at a flat monthly rate. You’re not paying per API call for the identity resolution. That means your per-lead cost actually decreases as your traffic grows. More visitors means more identifications at the same price, spreading the fixed cost across more leads.
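The flat-rate effect is easy to check: divide the fixed monthly cost by the lead count. This sketch uses the ~$484/month figure and the 500-1,500 lead range from the table above.

```python
def cost_per_lead(monthly_cost: float, leads: int) -> float:
    """Fixed monthly cost spread across the leads produced that month."""
    if leads <= 0:
        raise ValueError("leads must be positive")
    return monthly_cost / leads


for leads in (500, 1000, 1500):
    print(leads, round(cost_per_lead(484, leads), 2))
```

This only holds while you stay inside a single pricing tier; usage-based enrichment credits would add a per-lead component on top.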
Compare that to ZoomInfo, where you’re paying $15,000-25,000/year for seat licenses before you even start enriching. Or to manual research, where a junior SDR spending 15 minutes per lead at $25/hour is burning $6.25 per record and still missing half the data.
For teams evaluating data providers for AI SDRs, the cost-per-enriched-lead metric is especially critical because AI agents chew through leads at high volume. A $5/lead input cost destroys your unit economics when the AI is processing thousands of contacts per month. RevOps teams that want to feed this enriched data directly into their warehouse or CDP should look at “Leadpipe for RevOps: Programmatic Data for Your Stack” for the integration patterns.
Implementation Options
There are three paths to building this stack. Pick the one that matches your team’s technical capacity and existing tools.
Option 1: Clay-Based (Recommended)
Best for: Teams already using or evaluating Clay. Most automated path.
Leadpipe webhook → Clay webhook table → Clay waterfall → CRM export
Leadpipe fires a webhook every time it identifies a visitor. Clay receives the webhook into a table, then automatically runs the enrichment waterfall, validation, scoring, and research steps. Qualified leads get pushed to your CRM or AI SDR.
The full setup is covered step-by-step in our Clay waterfall integration guide. If you’re also using HubSpot as your CRM, the Leadpipe + Clay + HubSpot guide covers the end-to-end pipeline.
Setup time: 30-60 minutes.
Option 2: Custom Pipeline
Best for: Engineering teams that want full control. Most flexible.
Leadpipe webhook → Your backend → API calls to enrichment providers → Database → CRM
You receive the Leadpipe webhook in your own backend, then orchestrate API calls to Apollo, Lusha, PeopleDataLabs, or any other provider directly. This gives you complete control over the waterfall logic, deduplication, error handling, and data storage.
This is the path that platforms and agencies take when building visitor identification into their own products via the Leadpipe API. It’s also the right choice if you need to keep all data in your own infrastructure for compliance reasons.
Setup time: 2-5 days depending on complexity.
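As a starting point for the custom path, here is a sketch of the per-visitor logic your backend might run after receiving a webhook. The payload shape is an assumption (this guide does not specify it, so the `email` key is hypothetical), and `is_deliverable` and `enrich` are placeholders for your own validation and waterfall clients.

```python
from typing import Callable, Dict, Optional, Set

Payload = Dict[str, object]


def process_visitor(
    payload: Payload,
    is_deliverable: Callable[[str], bool],
    enrich: Callable[[str], Dict[str, object]],
    seen_emails: Set[str],
) -> Optional[Dict[str, object]]:
    """Run one identified visitor through dedupe -> validate -> enrich."""
    # "email" is an assumed field name; check the real webhook payload.
    email = str(payload.get("email", "")).strip().lower()
    if not email or email in seen_emails:
        return None  # nothing to process, or already handled (deduplication)
    seen_emails.add(email)
    if not is_deliverable(email):
        return None  # skip enrichment spend on dead addresses
    try:
        extra = enrich(email)
    except Exception:
        extra = {}  # a provider outage should not lose the identified contact
    return {**payload, **extra, "email": email}


seen: Set[str] = set()
lead = process_visitor(
    {"email": " Jane@Acme.com ", "page": "/pricing"},
    is_deliverable=lambda e: True,                 # stub validator
    enrich=lambda e: {"phone": "+1-555-0100"},     # stub waterfall
    seen_emails=seen,
)
print(lead)
```

In production the in-memory `seen_emails` set would be replaced by a database or cache lookup, and rate limiting would wrap the `enrich` call.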
Option 3: Zapier/Make
Best for: Non-technical teams that want something working today.
Leadpipe webhook → Zapier/Make → Enrichment steps → CRM
Leadpipe’s native integrations work with Zapier and Make out of the box. You can build a Zap that receives identified visitors, runs them through enrichment steps (many providers have native Zapier integrations), and pushes qualified leads to your CRM.
It’s not as powerful or cost-efficient as the Clay-based approach, but it works without touching a single API or writing a line of code. For teams under 5,000 monthly visitors, the identity-data-as-a-service approach with Zapier may be the fastest path to value.
Setup time: 1-2 hours.
Pro Tips for Credit Efficiency
Building the stack is one thing. Running it efficiently is another. Here are the tactics that separate teams burning money from teams printing pipeline.
1. Only enrich high-intent visitors.
Not every identified visitor is worth enriching. Someone who bounced off your homepage after 5 seconds is not the same as someone who spent 4 minutes on your pricing page. Use Leadpipe’s page-level filtering to only trigger enrichment webhooks for high-intent pages: pricing, demo, case studies, comparison pages, and integrations.
2. Use excluded paths to skip low-value traffic.
Leadpipe’s exclusion list feature lets you block identification on pages where visitors are unlikely to be buyers - support docs, blog posts about general topics, careers pages. This conserves your monthly identification credits for the traffic that actually matters.
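If you run a custom pipeline, the same two rules can also be enforced in your own code before any enrichment call. This sketch is independent of Leadpipe's built-in filtering; the path prefixes are hypothetical examples, so substitute your site's actual URL structure.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes; adjust to your own site.
HIGH_INTENT_PREFIXES = ("/pricing", "/demo", "/case-studies", "/compare", "/integrations")
EXCLUDED_PREFIXES = ("/docs", "/careers")


def should_enrich(page_url: str) -> bool:
    """Enrich only visits to high-intent pages, never excluded ones."""
    path = urlparse(page_url).path.lower()
    if any(path.startswith(prefix) for prefix in EXCLUDED_PREFIXES):
        return False
    return any(path.startswith(prefix) for prefix in HIGH_INTENT_PREFIXES)


print(should_enrich("https://example.com/pricing?plan=pro"))
print(should_enrich("https://example.com/docs/api"))
```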
3. Gate phone enrichment behind email validation.
Phone number lookups are typically the most expensive enrichment step. Don’t waste them on contacts with invalid email addresses. Run email validation (Layer 1) first, and only send validated contacts into the phone number enrichment step. This alone can cut enrichment costs by 15-20%.
4. Use intent scores to tier your enrichment.
Not every lead deserves the full-stack treatment. Use Leadpipe Orbit’s intent scores to create tiers:
- Intent score 70-100: Full enrichment + AI research + immediate SDR outreach
- Intent score 40-69: Basic enrichment + nurture sequence
- Intent score 1-39: Identification only, no enrichment spend
This approach can reduce your enrichment costs by 40-60% while focusing budget on the leads most likely to convert.
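The tiers above translate directly into a small function. The plan names (`full`, `basic`, `none`) are labels made up for this sketch, and the sample scores are invented to show how a batch splits.

```python
from collections import Counter
from typing import Iterable


def enrichment_plan(intent_score: int) -> str:
    """Map an intent score to an enrichment tier using the cutoffs above."""
    if not 0 <= intent_score <= 100:
        raise ValueError(f"intent score out of range: {intent_score}")
    if intent_score >= 70:
        return "full"   # full enrichment + AI research + immediate outreach
    if intent_score >= 40:
        return "basic"  # basic enrichment + nurture sequence
    return "none"       # identification only, no enrichment spend


def plan_counts(scores: Iterable[int]) -> Counter:
    """Count how many leads fall into each plan."""
    return Counter(enrichment_plan(score) for score in scores)


sample_scores = [85, 72, 55, 41, 38, 12, 90, 40, 69, 5]  # made-up batch
print(plan_counts(sample_scores))
```

Multiplying each count by your per-lead cost for that plan gives a quick estimate of how much tiering would save against enriching everyone fully.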
5. Run lightweight enrichment on all, full enrichment on qualified.
There’s a two-pass approach that works well at scale: run a cheap enrichment step (just company data + email validation) on all identified visitors. Then score them against your ICP. Only the leads that pass ICP qualification get the full waterfall treatment with phone numbers, AI research, and personalization.
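A sketch of that two-pass flow, assuming you supply three callables of your own: a cheap first-pass enrichment, an ICP check, and the full waterfall. All three are hypothetical placeholders here, stubbed so the example runs.

```python
from typing import Callable, Dict, Iterable, List

Lead = Dict[str, object]


def two_pass_enrich(
    visitors: Iterable[Lead],
    cheap_enrich: Callable[[Lead], Lead],
    passes_icp: Callable[[Lead], bool],
    full_enrich: Callable[[Lead], Lead],
) -> List[Lead]:
    """Cheap pass on everyone, expensive pass only on ICP-qualified leads."""
    qualified: List[Lead] = []
    for visitor in visitors:
        light = cheap_enrich(visitor)      # e.g. company data + email validation
        if passes_icp(light):
            qualified.append(full_enrich(light))  # phones, research, personalization
    return qualified


# Stubs for illustration only.
full_calls = []

def cheap_stub(lead: Lead) -> Lead:
    return {**lead, "employees": 200 if lead["email"].endswith("@acme.com") else 3}

def icp_stub(lead: Lead) -> bool:
    return lead["employees"] >= 50

def full_stub(lead: Lead) -> Lead:
    full_calls.append(lead["email"])
    return {**lead, "phone": "+1-555-0100"}

result = two_pass_enrich(
    [{"email": "jane@acme.com"}, {"email": "bob@tiny.io"}],
    cheap_stub, icp_stub, full_stub,
)
print(len(result), full_calls)
```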
Try Leadpipe free with 500 leads to test how many of your anonymous visitors are identifiable before committing to the full stack.
FAQ
How is visitor identification different from enrichment?
Enrichment takes a contact you already know and adds more data to their record. Visitor identification discovers contacts you didn’t know existed by resolving anonymous website sessions into real people. They’re complementary - identification creates the contacts, enrichment completes them. For a deeper comparison, see our guide on how these categories relate.
Does Leadpipe replace Clay?
No. Leadpipe and Clay do fundamentally different things. Leadpipe identifies anonymous website visitors (Layer 0). Clay enriches known contacts through a multi-provider waterfall (Layer 2). They work together in the same stack. Leadpipe creates the leads. Clay makes them complete. Most teams using both report 5-10x more enriched leads than Clay alone because Clay finally has a source of high-intent contacts to work with.
What match rate should I expect from the identification layer?
Leadpipe’s deterministic matching typically achieves 30-40% match rates depending on traffic quality. B2B-heavy sites with US traffic tend to land at the higher end. International traffic and B2C-heavy sites will be lower. The key differentiator is that Leadpipe uses its own proprietary identity graph - not a resold third-party graph - and identifies visitors even without LinkedIn profiles, which is a limitation of tools like RB2B.
Can I build the waterfall myself without Clay?
Yes. If you have engineering resources, you can orchestrate the waterfall directly via API calls to enrichment providers. The Leadpipe developer guide covers webhook payloads and API integration in detail. You’ll need to handle the cascade logic, error handling, deduplication, and rate limiting yourself. Clay handles all of that out of the box, which is why it’s the recommended path for most teams.
Related Articles
- Add Visitor ID to Your Clay Waterfall (2026 Guide) - Step-by-step setup for the Leadpipe + Clay pipeline
- Best Contact Enrichment APIs in 2026 (Compared) - Deep comparison of 12 enrichment providers and waterfall strategies
- The AI SDR Data Stack: Visitor to Booked Meeting - Full pipeline from anonymous visitor to booked meeting
- Visitor Identification API: Complete Developer Guide - API reference for building custom integrations
- Person-Level Intent Data: How It Works - Deep dive on Leadpipe Orbit and cross-site intent signals
- The Data Layer AI Sales Agents Are Missing - Why AI SDRs fail without proper identification data
- Leadpipe + Clay + HubSpot Integration - End-to-end pipeline guide for HubSpot users
The gap in most B2B data stacks isn’t enrichment quality. It’s enrichment coverage. You can waterfall through every provider on the planet and still miss 97% of your website visitors because you never identified them in the first place.
Layer 0 fixes that. Add visitor identification before your enrichment waterfall, and you go from enriching 300 form fills to enriching 3,000+ identified, high-intent contacts from the same traffic.
The complete signal stack - identify, validate, enrich, score intent, research, qualify, and act - costs under $650/month, and its Leadpipe + Clay + Orbit core (~$484/month) produces fully enriched leads at $0.32-0.97 each. That’s cheaper than a single ZoomInfo seat. And the leads are warmer because they were on your site, showing real intent, when you identified them.
Start with 500 free identified leads and see how many of your anonymous visitors are already identifiable. No credit card. No sales call. Just data.