Every B2B marketer has at some point opened their Google Analytics acquisition report, seen “Direct / None” as the second or third largest source, and felt a small pang of denial. Direct is not a source. Direct is where attribution goes to die. Whatever that traffic really was (a paid click that lost its UTM, an email link shared in Slack, a referral from a platform that strips query strings, a return visit from a bookmarked tab), your analytics tool shrugged and logged it as direct.
I am George, founder of Leadpipe. We run an identity graph behind 280M verified profiles, 5M websites monitored, and 60B intent signals refreshed every 24 hours. The whole point of identity-graph-based visitor identification is that we can stitch a session back to a person even when the UTMs are missing. That gives us a second lens on where traffic actually comes from, and the gap between that lens and the GA4 / HubSpot / Salesforce attribution view is the subject of this post.
This is not a benchmark study. It is an honest framework: what UTM-based attribution is structurally bad at, why, and how to fix it without throwing your analytics stack out.
The thesis in one paragraph
If your channel report shows 20-30% direct traffic, you do not have a strong brand. You have an attribution failure. Most of that “direct” is leaked attribution from organic search, dark social, and paid clicks with broken UTMs. Identity-based attribution recovers most of it. UTM-only reporting cannot.
Three structural mechanisms break UTM-based attribution. They compound across multi-session journeys. By the time a buyer converts, the chain is usually broken at least once.
Mechanism one: UTM hygiene is worse than teams realize
Even when traffic is paid and should have a working UTM, a meaningful share does not. The reasons, in order of frequency:
| Reason | Where it breaks |
|---|---|
| Campaign template did not inherit the UTM | Bulk campaign setup, ad-platform automations |
| Destination URL pasted with a # fragment ahead of the query string, so the UTMs are read as part of the fragment | Manual copy-paste workflows |
| Mobile SDK stripped the parameter | App-to-web traffic |
| Click went through a redirect that did not preserve state | Affiliate links, click-trackers, branded shorteners |
| UTM keys collided with site routing | Custom site frameworks that consume query params |
Every paid channel is structurally under-credited in default attribution, even before you get to multi-touch and journey effects. A team running a quarterly UTM audit on its own paid traffic typically finds a non-trivial share of clicks arriving without working UTMs.
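The fragment row in the table above is the easiest of these failures to reproduce. A minimal sketch, using Python's standard `urllib.parse` and two made-up URLs (the domain, path, and UTM values are illustrative assumptions): when `#plans` is pasted ahead of the query string, everything after the `#` is parsed as fragment, so the UTMs never appear in the query.

```python
from urllib.parse import urlsplit, parse_qs


def utm_params(url: str) -> dict[str, str]:
    """Return the utm_* parameters that are actually in the URL's query string."""
    query = urlsplit(url).query
    return {k: v[0] for k, v in parse_qs(query).items() if k.startswith("utm_")}


# Hypothetical URLs for illustration only.
good = "https://example.com/pricing?utm_source=linkedin&utm_medium=paid#plans"
broken = "https://example.com/pricing#plans?utm_source=linkedin&utm_medium=paid"

print(utm_params(good))    # {'utm_source': 'linkedin', 'utm_medium': 'paid'}
print(utm_params(broken))  # {} because the UTMs ended up inside the fragment
```

The same check can be pointed at a list of destination URLs exported from an ad platform to catch this class of error before a campaign launches.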
The audit you can run on your own traffic in an afternoon:
UTM hygiene audit (run quarterly):
1. Pull last 30 days of paid clicks from each ad platform.
2. Cross-reference against landing-page sessions in your analytics.
3. Compute the share of platform-confirmed paid clicks that arrive on your site without a usable utm_source.
4. Flag anything above 5% as a hygiene problem.
5. Find the broken templates / redirects / SDK paths and fix them.
This single audit, repeated quarterly, recovers more attribution accuracy than most teams get from upgrading their analytics tool.
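Steps 2 through 4 of the audit can be sketched in a few lines. This is an illustration under stated assumptions, not a drop-in tool: it assumes you can export click IDs from the ad platform and landing-page URLs from your analytics, and that a session counts as "platform-confirmed" when its click ID appears in that export. The function name, the data shapes, and the sample values are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

CLICK_ID_KEYS = ("gclid", "msclkid")  # common click-ID parameters; extend per platform
THRESHOLD = 0.05                      # step 4: flag anything above 5%


def audit_utm_hygiene(platform_click_ids: set[str], landing_urls: list[str]) -> dict:
    """Share of platform-confirmed paid clicks that landed without a usable utm_source."""
    confirmed = 0
    missing = 0
    for url in landing_urls:
        params = parse_qs(urlsplit(url).query)  # blank values are dropped by default
        click_id = next((params[k][0] for k in CLICK_ID_KEYS if k in params), None)
        if click_id not in platform_click_ids:
            continue  # not a click the ad platform confirms
        confirmed += 1
        if not params.get("utm_source"):
            missing += 1
    share = missing / confirmed if confirmed else 0.0
    return {"confirmed": confirmed, "missing": missing,
            "share": share, "flag": share > THRESHOLD}


# Hypothetical sample data.
clicks = {"a1", "a2", "a3", "a4"}
urls = [
    "https://example.com/?gclid=a1&utm_source=google",
    "https://example.com/?gclid=a2",                   # UTM never attached
    "https://example.com/?gclid=a3&utm_source=google",
    "https://example.com/?gclid=a4&utm_source=",       # empty UTM is not usable
    "https://example.com/?utm_source=newsletter",      # not a paid click
]
report = audit_utm_hygiene(clicks, urls)
print(report)
```

One limit worth knowing: this only catches clicks whose click ID survived the trip. Clicks that lost every parameter will not join at all, so also compare the platform's total click count against the number of confirmed sessions.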
Mechanism two: the multi-session cascade
A B2B buyer takes multiple sessions before they convert. The return-visit curve study walks through the typical shape: first-touch is rarely the converting touch. By the time a buyer fills your form, they have visited 4-15 times across days or weeks.
If even one of those sessions arrives without a usable source, the most recent resolvable source becomes the “attributed” channel. The chain breaks, and the break propagates downstream into your reporting.
| Visits before conversion | Probability that at least one session is “Direct / None” |
|---|---|
| 1 | Low |
| 3 | Moderate |
| 5 | High |
| 10+ | Near certain |
The math compounds. Every session is a chance for the chain to break. By the fifth or sixth session, almost every multi-session buyer has at least one session attributed to “direct” simply because of cookie expiry, browser privacy defaults, or a referrer that got stripped.
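To make the compounding concrete, here is a small sketch under two loud assumptions: every session loses its source independently, and the per-session loss rate is 25%. That rate is an illustrative number picked from the 20-30% "direct" range mentioned earlier, not a measurement. Under those assumptions the chance of at least one sourceless session across n visits is 1 - (1 - p)^n.

```python
def p_any_direct(sessions: int, p_direct: float) -> float:
    """Probability that at least one of `sessions` visits loses its source,
    assuming each visit independently loses it with probability `p_direct`."""
    return 1.0 - (1.0 - p_direct) ** sessions


P_DIRECT = 0.25  # illustrative assumption, not a measured rate

for n in (1, 3, 5, 10):
    print(f"{n:>2} sessions: {p_any_direct(n, P_DIRECT):.0%}")
```

With that assumed rate, the four rows come out at roughly 25%, 58%, 76%, and 94%, which is the low / moderate / high / near-certain shape of the table. Real sessions are not independent, so treat the curve as a shape, not a forecast.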
Mechanism three: privacy defaults and referrer policies
Modern browsers strip referrers more aggressively than they did three years ago. Cross-site cookies are functionally dead. Click-IDs survive better than referrers, which means paid attribution holds up better than organic attribution. The result:
| Channel | Default GA4 attribution behavior in 2026 |
|---|---|
| Paid search (with click ID) | Holds up reasonably; click ID survives most policies |
| Paid social (with click ID) | Holds up; platform-specific parameters survive |
| Organic search | Increasingly stripped by referrer policy; falls into “direct” |
| Email | Holds up if UTMs survive paste-share; otherwise falls into “direct” |
| Dark social (Slack, LinkedIn DMs, Notion) | Almost always falls into “direct” |
| Referral (other sites) | Variable; depends on the source site’s referrer policy |
The structural pattern is that organic and dark-social attribution decay faster than paid attribution, which means most teams systematically over-credit paid channels and under-credit the rest. Your CFO is making investment decisions on a report that flattens the differences.
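The asymmetry is easier to see as a fallback order. The sketch below is a deliberately simplified classifier, not GA4's actual channel logic: it checks for a click ID first, then a UTM, then a referrer, and calls everything else direct. The function name and the output labels are hypothetical; `gclid` and `msclkid` are the click-ID parameters used by Google Ads and Microsoft Advertising.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs

# Click-ID parameter -> label (labels are made up for this sketch).
CLICK_ID_KEYS = {"gclid": "google_ads", "msclkid": "microsoft_ads"}


def classify_session(landing_url: str, referrer: Optional[str]) -> str:
    """Simplified source fallback: click ID, then UTM, then referrer, then direct."""
    params = parse_qs(urlsplit(landing_url).query)
    for key, label in CLICK_ID_KEYS.items():
        if key in params:
            return f"paid:{label}"
    if "utm_source" in params:
        return f"utm:{params['utm_source'][0]}"
    if referrer:
        host = urlsplit(referrer).hostname
        if host:
            return f"referrer:{host}"
    return "direct"


# A paid click survives a stripped referrer because the ID rides in the URL.
print(classify_session("https://example.com/?gclid=abc", None))
# An organic visit has nothing in the URL, so a stripped referrer means direct.
print(classify_session("https://example.com/", None))
```

Paid clicks carry their evidence in the landing URL, while organic and dark-social visits depend entirely on a referrer the browser may not send.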
What “Direct / None” really is
If you take a typical “Direct / None” bucket from a B2B GA4 property and reattribute the visitors using a deterministic identity layer with a 30-day lookback, the bucket breaks down structurally as follows.
| True source | Why it ended up in “Direct” |
|---|---|
| Returning visitor from a prior known session | Cookie expired, but the person had visited before from a real source |
| Organic search with stripped referrer | Browser referrer policy removed the source |
| Dark social (Slack, LinkedIn DM, email paste) | The link arrived without UTMs because someone copy-pasted it |
| Paid ads with broken UTM | Hygiene failure on the ad-side |
| Truly direct (typed URL, bookmark) | A small minority |
Most teams find that “truly direct” is a small fraction of what GA4 calls direct. The rest is attribution loss, and the largest single bucket is usually returning visitors whose original source was real but invisible to current-session reporting.
This is the same thesis we lay out in “Google Analytics is lying about pipeline.” The mechanism is identity discontinuity. GA4 cannot tell that the “direct” visitor today is the same human who arrived from organic search 12 days ago.
How identity-based attribution closes the gap
The fix is not “kill UTMs.” It is “stop trusting them as the only attribution layer.” Add a person-level identity layer alongside your existing analytics, and you can reconstruct the multi-session journey for the share of visitors you can identify.
Concrete architecture:
| Layer | What it does | What it cannot do |
|---|---|---|
| UTMs / click-IDs | Tag the click moment | Survive copy-paste, browser policy, multi-session journeys |
| Session-based analytics (GA4) | Channel reporting on hygiene-clean sessions | Stitch the same buyer across sessions and devices |
| Visitor identification (deterministic) | Resolve anonymous sessions to a persistent person | Help on visitors outside the identity graph |
| CRM | Tie people to opportunities | Tell you what they did before the form fill |
| Off-site intent (Orbit) | Surface buyers researching off your site | Identify your actual on-site visitors |
When you combine the layers, you can answer:
- Which channel actually drove this opportunity, across the full multi-session journey, not just the last click.
- How much of “direct” is actually returning organic / dark social / broken paid.
- Which content and pages consistently appear earlier in the journey than your last-click report credits them for.
- Which buyer-account is currently active even when they have not filled a form.
For US B2B traffic, the deterministic identity layer matches 30-40%+ of visitors at the person level. That is not “all visitors,” but it is enough volume to reconstruct the channel mix on the share you do match. The pattern of what “direct” really is generalizes from there.
A working framework you can run
Six steps. None of this requires throwing out your analytics stack.
Step 1: Audit UTM hygiene quarterly
The audit described earlier in this post. Run it every quarter. Fix the templates, redirects, and SDKs that strip parameters. This is the cheapest accuracy upgrade available.
Step 2: Add an identity layer
Deploy a deterministic visitor identification pixel alongside GA4. We built Leadpipe for this; the website visitor tracking pillar walks through the broader category.
Step 3: Reattribute “Direct / None” with a lookback
For every “direct” session, check whether the same identified person has a prior session within 30 days from a known source. If yes, reattribute. If no, leave as truly direct. This single step changes the shape of your channel report.
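A minimal sketch of that rule, assuming you already have sessions joined to a person ID from your identity layer. The `Session` shape, the source labels, and the choice to credit the most recent known source (not the earliest) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

LOOKBACK = timedelta(days=30)


@dataclass
class Session:
    person_id: Optional[str]  # None when the visitor was not identified
    started_at: datetime
    source: str               # e.g. "organic", "paid", or "direct"


def reattribute_direct(sessions: list[Session]) -> list[str]:
    """Return one source per session, in input order, with direct sessions
    replaced by the person's most recent known source inside the lookback."""
    result = [s.source for s in sessions]
    last_known: dict[str, Session] = {}
    for i in sorted(range(len(sessions)), key=lambda j: sessions[j].started_at):
        s = sessions[i]
        if s.person_id is None:
            continue  # unidentified visitors stay as reported
        if s.source != "direct":
            last_known[s.person_id] = s
            continue
        prior = last_known.get(s.person_id)
        if prior and s.started_at - prior.started_at <= LOOKBACK:
            result[i] = prior.source
    return result


# Hypothetical sessions.
sessions = [
    Session("p1", datetime(2026, 1, 1), "organic"),
    Session("p1", datetime(2026, 1, 13), "direct"),   # 12 days later: reattributed
    Session("p1", datetime(2026, 3, 1), "direct"),    # outside 30 days: stays direct
    Session(None, datetime(2026, 1, 5), "direct"),    # not identified: stays direct
]
print(reattribute_direct(sessions))
```

Only sessions from identified visitors can move, so the reattributed report covers the matched share of traffic and nothing more.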
Step 4: Report on 90-day windows, not last-click
The first-touch channel for most B2B buyers is invisible in last-click reporting. Journey-based attribution, or at minimum first-session attribution within a 90-day window, gives a more honest picture. The midbound thesis walks through how this reshapes the operating model.
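Here is one way to sketch first-session attribution inside a 90-day window. The assumptions: touches for a single buyer are available as (timestamp, source) pairs, the earliest session with a known source inside the window gets the credit, and the result falls back to "direct" when no known source exists. The function name and the sample journey are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=90)


def first_session_source(touches: list[tuple[datetime, str]],
                         converted_at: datetime) -> str:
    """Earliest known (non-direct) source within WINDOW before conversion."""
    in_window = sorted(
        (ts, src) for ts, src in touches
        if converted_at - WINDOW <= ts <= converted_at and src != "direct"
    )
    return in_window[0][1] if in_window else "direct"


# Hypothetical journey for one buyer.
touches = [
    (datetime(2025, 10, 1), "referral"),  # older than 90 days: ignored
    (datetime(2026, 1, 5), "organic"),
    (datetime(2026, 2, 1), "direct"),
    (datetime(2026, 3, 10), "paid"),
]
converted = datetime(2026, 3, 12)

print(first_session_source(touches, converted))  # organic
print(touches[-1][1])                            # paid, which is what last-click would credit
```

The same journey gets credited to organic under the window view and to paid under last-click, which is the gap this step is meant to expose.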
Step 5: Treat “Direct / None” as a diagnostic, not a channel
If “direct” is more than 10-15% of your traffic, that is a warning light, not a brand strength signal. Dig into what is actually in the bucket.
Step 6: Audit your CRM data quality
If your CRM is the source of truth for downstream attribution reporting, bad CRM data compounds the upstream attribution errors. See “Salesforce is full of bad data” for the hygiene teardown.
What this changes operationally
Three implications that come out of running this framework consistently.
Stop treating “Direct / None” as a channel. It is a bucket of failed attribution. Most of what is in it is leaked organic, dark social, and broken paid clicks. Every slide that attributed 20% of pipeline to “direct” was probably right about a small fraction of that and wrong about the rest.
Audit UTM hygiene quarterly. Every paid campaign should be checked for UTM inheritance, landing-page parameter preservation, and redirect safety. The cost of that work is an order of magnitude smaller than the budget you are misallocating because of a broken report.
Layer identity-based attribution on top of session-based attribution. A deterministic identity graph stitches a buyer’s multi-session journey into one story, which is what session-based attribution cannot do once cookies expire or referrers get stripped. This is why teams pair visitor identification with their existing analytics rather than using it as a replacement for GA4.
What this does not solve
Worth being honest about the limits.
- Identity coverage is not 100%. Deterministic matching on US B2B traffic resolves a meaningful share, not all visitors. International traffic and non-B2B audiences resolve at lower rates. The reattribution applies only to the share you match.
- 30-day lookback is a choice, not a law. A 90-day lookback would reattribute more but introduce its own attribution noise. Pick a window and stick to it.
- Platforms change. Ad platforms and browsers change referrer behavior frequently. Quarterly audits matter because the baseline keeps moving.
- Multi-touch vs first-touch. This framework is first-session-within-window attribution. Multi-touch models are a separate question. For most B2B teams, the first-session view is more useful than a multi-touch model run on broken inputs.
The point of the framework is not to produce a perfect attribution model. Perfect attribution does not exist. The point is to produce a model honest enough that the team can make budget decisions on it.
What 2026 changes
Three trends that make UTM-only attribution worse over the next 18 months:
| Trend | Effect |
|---|---|
| AI agent traffic (Perplexity, ChatGPT, agent browsers) | Adds traffic that arrives without conventional referrer or UTM patterns |
| Tighter browser privacy defaults | Referrer policies will keep stripping more sources into “direct” |
| Cross-app linking on mobile | App-to-web traffic increasingly drops parameters on the way |
The trend runs in one direction. The “direct” bucket grows every year. UTM-only attribution gets less useful every year. The teams that win the next two years will be the ones that built an identity-based attribution backbone before the bucket got too big to ignore.
Leadpipe identifies 30-40%+ of your US B2B visitors with full contact data on the Pro plan at $147/mo. No credit card to start the 500-lead trial.