Your CMO bought visitor identification. Your CRO approved the Salesforce integration. You, marketing ops, are the one who has to keep the data clean for the next 18 months while 3 different teams use it for 9 different things.
This is the hygiene playbook. Not the integration tutorial. Not the vendor comparison. The weekly and monthly rituals a marketing ops lead uses to keep visitor data from becoming landfill inside your CRM.
Who this post is for
You are a marketing ops manager, director of marketing operations, or senior MOps analyst at a B2B company running HubSpot, Salesforce, or Marketo. Your team is 1 to 4 people. You have at least 1 visitor identification tool feeding records into your CRM. You are responsible for data quality metrics.
The answer up front: visitor data goes bad in 5 specific ways. Duplicates, stale records, wrong-segment tagging, orphaned activity, and broken attribution. Each has a ritual that prevents it. If you run the 5 rituals on their weekly or monthly schedule, the data stays usable. If you don’t, expect to spend 40+ hours per quarter fixing it retroactively.
The 5 failure modes
| Failure mode | How it shows up | Frequency to check |
|---|---|---|
| 1. Duplicates | Same person on 3 Lead records | Weekly |
| 2. Stale records | Leads with 0 activity in 180+ days | Monthly |
| 3. Wrong-segment tagging | ICP filter drifting | Monthly |
| 4. Orphaned activity | Website visits on Contacts with no Account | Weekly |
| 5. Broken attribution | Pipeline report misses visitor touches | Monthly |
Failure 1: duplicates
Visitor identification tools can send multiple records per person: the same visitor comes back the next week, an email bounces once and is resent, or a title updates. Without strong dedup, each event creates a new Lead or Contact.
Weekly ritual:
- Run the “Leads created in last 7 days” report, grouped by email.
- Flag any email with >1 record.
- Check if the duplicate is a visitor ID false positive (same email, different IP) or a CRM dedup failure (same email, wrong matching rules).
- Merge manually or via DemandTools, whichever you already use.
Quarterly cleanup:
Run a retroactive dedup across all Leads created in the past 90 days. Use email as the primary key, domain + name as the fallback. If the duplicate rate is over 5% of records, review dedup rules with RevOps.
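A minimal sketch of that sweep, assuming you have exported the 90-day Leads into Python objects. The `Lead` shape, the field names, and the way the duplicate rate is counted (surplus records divided by total records) are assumptions for illustration, not anything your CRM or vendor defines.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Lead:
    # Hypothetical export shape; map these to your own CRM field names.
    record_id: str
    email: str
    first_name: str
    last_name: str
    domain: str


def match_key(lead: Lead) -> Tuple[str, ...]:
    """Email is the primary key; domain + name is the fallback."""
    email = lead.email.strip().lower()
    if email:
        return ("email", email)
    domain = lead.domain.strip().lower()
    name = " ".join(
        part.strip().lower()
        for part in (lead.first_name, lead.last_name)
        if part.strip()
    )
    if domain and name:
        return ("domain+name", domain, name)
    # Nothing to match on: keep the record in a group of its own.
    return ("unmatched", lead.record_id)


def find_duplicates(leads: List[Lead]) -> Dict[Tuple[str, ...], List[str]]:
    """Return only the match keys shared by more than one record."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for lead in leads:
        groups[match_key(lead)].append(lead.record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def duplicate_rate(leads: List[Lead]) -> float:
    """Surplus records (those a merge would remove) over total records."""
    if not leads:
        return 0.0
    surplus = sum(len(ids) - 1 for ids in find_duplicates(leads).values())
    return surplus / len(leads)
```

If `duplicate_rate` comes back above 0.05, that is the signal to take the matching rules to RevOps rather than keep merging by hand.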
Prevention:
- Enable Salesforce or HubSpot dedup rules at the object level.
- Add a LinkedIn URL custom field and include it in matching where available.
- Use email normalization (lowercase, strip + aliases) in the sync layer before insertion.
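The normalization step in the last bullet can be sketched as a small function in the sync layer. This is an illustration under assumptions: `normalize_email` is a hypothetical helper, and it treats everything after the first `+` in the local part as an alias, which holds for common providers but is not guaranteed for every mail system.

```python
def normalize_email(raw: str) -> str:
    """Lowercase, trim, and strip a '+alias' suffix from the local part."""
    email = raw.strip().lower()
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        # Not a usable address; return the trimmed form and let the caller decide.
        return email
    # Keep the original local part if stripping the alias would leave it empty.
    base = local.split("+", 1)[0] or local
    return f"{base}@{domain}"
```

Run it before insertion so the CRM's matching rules compare one canonical form per address.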
See the RevOps post on merging visitor data into Salesforce for the deeper CRM-side dedup configuration.
Failure 2: stale records
A Lead identified 8 months ago who never responded, never visited again, and never engaged with a campaign is not a warm lead. It is a cold email list polluting your database.
Monthly ritual:
- Pull every Lead with Leadpipe_First_Seen__c older than 180 days.
- Filter to those with 0 marketing or sales activity in the last 90 days.
- Filter again to those with no email engagement (open or click) in the last 90 days.
- For the subset with 0 engagement on all axes, move to an “Archived” status or delete.
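The monthly filter above can be sketched like this, assuming the Lead data has been exported with three dates per record. The `LeadActivity` shape and its field names are hypothetical; `first_seen` stands in for `Leadpipe_First_Seen__c`.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class LeadActivity:
    # Hypothetical export shape; None means "never happened".
    record_id: str
    first_seen: date
    last_activity: Optional[date]
    last_email_engagement: Optional[date]


def is_stale(
    lead: LeadActivity,
    today: date,
    first_seen_days: int = 180,
    quiet_days: int = 90,
) -> bool:
    """True when the lead is old AND quiet on both engagement axes."""
    if lead.first_seen >= today - timedelta(days=first_seen_days):
        return False  # first seen within the last 180 days: too young to archive

    quiet_cutoff = today - timedelta(days=quiet_days)

    def quiet(last_event: Optional[date]) -> bool:
        return last_event is None or last_event < quiet_cutoff

    return quiet(lead.last_activity) and quiet(lead.last_email_engagement)
```

Passing `today` in explicitly keeps the check repeatable, so you can rerun last month's purge list and get the same answer.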
Why it matters:
Stale records inflate your lead count for vanity metrics, degrade your email deliverability (bounce and spam complaints), and break ICP-fit reporting because the firmographic data is 9 months old.
The honest benchmark. At a typical B2B SaaS, 60-70% of identified visitors never re-engage after the first visit. That is normal. The mistake is treating them as warm leads 6 months later.
Failure 3: wrong-segment tagging
Your ICP filter was set up on day 1. 6 months later, the product line has expanded, the target company size has shifted, and the tagging in the CRM doesn’t reflect it.
Monthly ritual:
- Export 200 random Leads tagged as ICP = True in the last 30 days.
- Manually score 50 of them against the current ICP definition.
- Measure: what percentage are actually ICP-fit today?
| Current ICP-fit rate | Action |
|---|---|
| >85% | Tagging is healthy |
| 70-85% | Tune the filter |
| <70% | Filter is broken, full rebuild |
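A short sketch of the audit arithmetic, assuming you draw the 50 records to review at random from the 200 exported. The function names are hypothetical, and the thresholds are the ones in the table above, with exactly 85% falling into the "tune" band.

```python
import random
from typing import List, Optional, Sequence


def sample_for_review(
    lead_ids: Sequence[str], sample_size: int = 50, seed: Optional[int] = None
) -> List[str]:
    """Pick the records to score by hand; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(list(lead_ids), min(sample_size, len(lead_ids)))


def icp_action(fit_count: int, reviewed: int) -> str:
    """Map the manually scored ICP-fit rate to the action in the table."""
    if reviewed <= 0 or not 0 <= fit_count <= reviewed:
        raise ValueError("fit_count must be between 0 and reviewed, reviewed > 0")
    # Integer comparison avoids floating-point surprises at the band edges.
    if fit_count * 100 > 85 * reviewed:
        return "Tagging is healthy"
    if fit_count * 100 >= 70 * reviewed:
        return "Tune the filter"
    return "Filter is broken, full rebuild"
```

For example, 40 fits out of 50 reviewed is 80%, which lands in the "Tune the filter" band.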
Prevention:
Review the ICP filter every 90 days with sales leadership. Business changes faster than your integration’s filter config.
For how to define ICP clearly, see the glossary on ICP.
Failure 4: orphaned activity
Visitor identification tools log website visits as activities. Sometimes those activities land on Contact records without Accounts, Leads without Campaigns, or Accounts without Owners. Each orphan is a silent data gap.
Weekly ritual:
- Run a report for “Activities of type Website Visit in last 7 days.”
- Filter to activities where the parent record is missing required fields (no Account, no Owner, no ICP tag).
- For each orphan, either enrich the parent record or delete the orphan.
Automation:
Set up a Salesforce flow or HubSpot workflow that rejects Website Visit activities if the parent record fails validation. Better to lose an event than pollute reporting.
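The validation logic behind that flow or workflow is simple enough to sketch outside the CRM. This is an illustration only: the `ParentRecord` shape is hypothetical, and the three required fields are the ones named in the weekly ritual (Account, Owner, ICP tag).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParentRecord:
    # Hypothetical shape for the Lead or Contact an activity attaches to.
    record_id: str
    account_id: Optional[str]
    owner_id: Optional[str]
    icp_tag: Optional[bool]  # None means the record was never tagged


def missing_fields(parent: ParentRecord) -> List[str]:
    """List the required fields the parent record lacks."""
    missing = []
    if not parent.account_id:
        missing.append("Account")
    if not parent.owner_id:
        missing.append("Owner")
    if parent.icp_tag is None:
        missing.append("ICP tag")
    return missing


def accept_activity(parent: ParentRecord) -> bool:
    """Accept a Website Visit activity only when the parent passes validation."""
    return not missing_fields(parent)
```

The same `missing_fields` output doubles as the worklist for the weekly scan: it tells you what to enrich before deciding to delete.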
Why it matters:
Orphaned activity is how your “website-influenced pipeline” report starts showing phantom influence. The activity exists but the parent record isn’t real, so every downstream report inherits the noise.
Failure 5: broken attribution
You built the attribution report on day 1. 3 months later it shows different numbers than the raw visitor identification dashboard. Your CMO asks why.
Monthly ritual:
- Compare 3 totals: visitor identification dashboard, CRM Lead count, CRM activity count.
- Accept up to 5% drift. Investigate anything more.
- Common root causes:
  - Integration silently dropped events after a vendor-side update.
  - A custom field changed format and stopped mapping.
  - Dedup is merging records that should be separate (e.g., two different people at same domain).
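The 5% drift rule can be sketched as a relative difference. Two assumptions are baked in here that the ritual itself does not specify: the vendor dashboard total is used as the reference, and you only compare like-for-like counts (for example, identified people against Leads created by the integration, or sessions against activity records).

```python
def drift(reference: int, observed: int) -> float:
    """Relative difference between a reference total and an observed total."""
    if reference <= 0:
        raise ValueError("reference total must be positive")
    return abs(observed - reference) / reference


def needs_investigation(reference: int, observed: int, tolerance: float = 0.05) -> bool:
    """Drift up to and including the tolerance is accepted; anything above is flagged."""
    return drift(reference, observed) > tolerance
```

A dashboard total of 1,000 against 930 CRM Leads is 7% drift, so it gets investigated; 1,000 against 960 is 4% and passes.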
Quarterly attribution check:
Pull 20 closed-won Opportunities from the last quarter. For each, manually verify:
- Are all identified visitor touches logged on the Opportunity’s Contacts?
- Is the Leadpipe_Influenced__c flag set correctly?
- Does the source tier field reflect the top page?
If any of the 3 fails, your attribution report is underreporting. Fix the mapping and rerun the 20-Opp check.
See the CRO’s pipeline source audit for the full attribution overlay.
The weekly and monthly rhythm
| Day | Task | Time |
|---|---|---|
| Mon | Duplicate check (last 7 days) | 15 min |
| Mon | Orphan activity scan | 10 min |
| Wed | Campaign-to-visit mapping spot-check | 10 min |
| Fri | Owner assignment anomalies | 10 min |
| Monthly (1st of month) | Stale record purge | 60 min |
| Monthly | ICP filter audit | 45 min |
| Monthly | Attribution drift check | 45 min |
| Quarterly | Full dedup sweep | 3 hours |
| Quarterly | 20-Opp attribution check | 2 hours |
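Total time: the scheduled checks add up to about 45 minutes per week, 2.5 hours per month, and 5 hours per quarter. Once you include investigating and fixing what they surface, budget roughly 2 hours per week plus 8 hours at month-end. One MOps person can hold this.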
Field governance
Visitor identification tools ship more fields than most CRMs need. Every field you sync is a field you have to maintain.
The minimum set to sync:
| Object | Field | Purpose |
|---|---|---|
| Lead / Contact | Email, First Name, Last Name, Title, Company | Basic |
| Lead / Contact | LinkedIn URL | Dedup aid |
| Lead / Contact | Leadpipe_First_Seen, Leadpipe_Last_Seen, Leadpipe_Visit_Count | Attribution |
| Lead / Contact | Leadpipe_Intent_Score, Leadpipe_Top_Page | Routing |
| Account | Domain, Industry, Employees, Revenue | Firmographics |
| Activity | Session timestamp, Pages, Duration | Engagement |
Fields to NOT sync unless you have a use case:
- Age range, gender, income, net worth, homeowner status, marital status. These exist in the identity graph but don’t belong in a B2B CRM.
- Hashed emails (HEMs). Useful for ad platform match, not for CRM.
- Device IDs. Technical, not operational.
Adding every available field to Salesforce creates schema bloat, report confusion, and privacy exposure without operational benefit.
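One way to enforce the minimum set is an allowlist in the sync layer, so new vendor fields are dropped unless someone adds them on purpose. A minimal sketch, assuming the payload arrives as a dictionary keyed by the field labels in the table above; the `filter_payload` helper and the payload shape are hypothetical.

```python
from typing import Any, Dict, Set

PERSON_FIELDS: Set[str] = {
    "Email", "First Name", "Last Name", "Title", "Company",
    "LinkedIn URL",
    "Leadpipe_First_Seen", "Leadpipe_Last_Seen", "Leadpipe_Visit_Count",
    "Leadpipe_Intent_Score", "Leadpipe_Top_Page",
}

ALLOWED_FIELDS: Dict[str, Set[str]] = {
    "Lead": PERSON_FIELDS,
    "Contact": PERSON_FIELDS,
    "Account": {"Domain", "Industry", "Employees", "Revenue"},
    "Activity": {"Session timestamp", "Pages", "Duration"},
}


def filter_payload(object_type: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only allowlisted fields; unknown object types sync nothing."""
    allowed = ALLOWED_FIELDS.get(object_type, set())
    return {field: value for field, value in payload.items() if field in allowed}
```

An allowlist fails closed: a field like gender or a device ID never reaches the CRM by accident, which is the behavior you want for the privacy-sensitive attributes listed above.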
Privacy and compliance hygiene
Your ritual needs a compliance leg.
- Monthly. Check the opt-out and do-not-contact list for new entries and propagate them into suppression lists used by sequences and ads.
- Quarterly. Audit GDPR handling: any visitor identified as EU/UK should be company-level by default unless you have affirmative consent. Leadpipe enforces this at the pixel level but your downstream workflows should respect it too.
- Annually. Review the subprocessor and DPA list with legal.
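The monthly and quarterly checks above can be backed by two small guards in downstream workflows. This is a sketch under stated assumptions: the region codes, the function names, and the idea that consent arrives as a simple boolean are all hypothetical, and none of this replaces review by legal.

```python
from typing import Iterable, List, Set

RESTRICTED_REGIONS: Set[str] = {"EU", "UK"}  # hypothetical region codes


def suppress(audience: Iterable[str], opt_outs: Iterable[str]) -> List[str]:
    """Drop every opted-out address from a sequence or ad audience."""
    blocked = {email.strip().lower() for email in opt_outs}
    return [email for email in audience if email.strip().lower() not in blocked]


def person_level_allowed(region: str, has_consent: bool) -> bool:
    """EU/UK visitors stay company-level unless there is affirmative consent."""
    if region.strip().upper() in RESTRICTED_REGIONS:
        return has_consent
    return True
```

Run `suppress` against every list a sequence or ad platform consumes, not only the CRM's master list, since the propagation step is where opt-outs usually get lost.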
For the compliance foundation, see GDPR compliant visitor identification.
What NOT to do
- Don’t let marketing import CSVs of visitor data outside the integration. Every side-door import breaks dedup.
- Don’t run one “big cleanup” per year. The problem compounds. Weekly and monthly ritual wins.
- Don’t write Salesforce formulas that calculate attribution inside the CRM. Do it in your warehouse and push summary fields back. CRMs are not analytics engines.
- Don’t mix test and production records. Visitor data from dev or QA should never flow into production CRM.
- Don’t treat the visitor ID vendor’s dashboard as the source of truth. The CRM is the source of truth. The vendor is upstream data.
The health dashboard a MOps lead should maintain
| Metric | Target | Check |
|---|---|---|
| Dedup rate (last 30 days) | <2% | Weekly |
| Stale record rate | <15% | Monthly |
| ICP-fit accuracy on tagged | >85% | Monthly |
| Attribution drift | <5% | Monthly |
| Orphaned activity rate | <1% | Weekly |
Share this dashboard with your CMO and RevOps lead monthly. When any metric trends wrong, you know where to spend the week.
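If the metrics already land in a warehouse snapshot, the targets in the table can be checked automatically. A minimal sketch: the metric keys are hypothetical names for the five rows above, rates are expressed as fractions, and a missing metric is treated as a breach on the assumption that an unmeasured metric is not a healthy one.

```python
from typing import Dict, List, Tuple

# (direction, limit): "max" means the value must stay below the limit,
# "min" means it must stay above it. Limits mirror the table above.
TARGETS: Dict[str, Tuple[str, float]] = {
    "dedup_rate": ("max", 0.02),
    "stale_record_rate": ("max", 0.15),
    "icp_fit_accuracy": ("min", 0.85),
    "attribution_drift": ("max", 0.05),
    "orphaned_activity_rate": ("max", 0.01),
}


def breaches(metrics: Dict[str, float]) -> List[str]:
    """Return the metrics that miss their target, in table order."""
    failed = []
    for name, (direction, limit) in TARGETS.items():
        value = metrics.get(name)
        if value is None:
            failed.append(name)  # not measured counts as not healthy
        elif direction == "max" and not value < limit:
            failed.append(name)
        elif direction == "min" and not value > limit:
            failed.append(name)
    return failed
```

An empty list means every target is met; anything else is the short list of where to spend the week.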
Tools and workflows
| Function | What to use |
|---|---|
| Visitor identification source | Leadpipe Pro $147/mo, Growth $299/mo, Scale $599/mo |
| CRM | Salesforce, HubSpot, or equivalent |
| Dedup tooling | Native dedup rules + DemandTools, Apsona, or HubSpot’s duplicate manager |
| Monitoring | Salesforce reports + a warehouse snapshot (Snowflake, BigQuery) |
| Suppression | Built into the sync layer, maintained by CS and legal |
What good looks like
A marketing ops lead who runs this playbook can answer 3 questions at any moment: how many identified visitors hit the CRM last week, what percentage were ICP-fit, and how many converted to a meeting. If the answers are clean, the data is healthy. If any answer takes more than 10 minutes to produce, the hygiene has slipped.
The cost of bad hygiene isn’t a bad report. It is a CMO who stops trusting the report. And once trust breaks, the tool gets cancelled.
Leadpipe identifies 30-40%+ of your US B2B visitors with full contact data on the Pro plan at $147/mo. No credit card to start the 500-lead trial.