The Measurement Problem Isn’t AI — It’s Physics
There’s a rush happening right now to fix advertising measurement with AI. Every vendor is adding machine learning. Every platform is announcing smarter attribution. Ad Age wrote that “AI won’t fix measurement if the foundation is broken.”
They’re right. But I don’t think enough people are asking what’s actually broken about the foundation.
It’s not the data. It’s not the algorithms. It’s the assumptions the models are built on — assumptions about how human beings respond to advertising that violate basic physics.
The assumption nobody questions
Here’s the core problem, and once you see it, you can’t unsee it.
The most widely used model for weighting advertising response over time is exponential decay. It’s the default in Google Analytics and Adobe Analytics for digital multi-touch attribution. It’s the standard in marketing mix models — Meta’s Robyn, Google’s Meridian, PyMC-Marketing — all use geometric (exponential) decay as their default adstock function.
The model says: maximum response at time zero. Then it decays from there.
Think about what that means for a TV ad. An ad airs at 8:15 PM. According to exponential decay, the single highest-probability moment for someone to visit a website, search on Google, or pick up a phone is 8:15:00 PM. The literal instant the ad hits the screen.
No human can do that. The viewer has to notice the ad. Process the message. Decide to act. Pick up their phone. Unlock it. Open a browser or search app. Type something. Wait for the page to load. That’s a physical sequence performed by a human being with finite reaction time. It doesn’t happen in zero seconds. It can’t happen in zero seconds.
And yet the model that the entire industry uses as its default assumes exactly that.
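To make the assumption concrete, here is a minimal sketch of geometric (exponential-decay) adstock weighting, the form those tools default to. The decay rate is illustrative, not any vendor's actual setting.

```python
import numpy as np

# Geometric (exponential-decay) adstock: weight at lag t is decay_rate ** t.
# The decay rate here is illustrative; real models fit it per channel.
decay_rate = 0.7
lags = np.arange(10)                 # lags in whatever unit the model uses
weights = decay_rate ** lags
weights /= weights.sum()             # normalize so the weights sum to 1

for lag, w in zip(lags, weights):
    print(f"lag {lag}: weight {w:.3f}")

# The largest weight always lands at lag 0, the instant the ad airs,
# and every later moment gets strictly less credit.
print("peak lag:", lags[np.argmax(weights)])   # always 0 for geometric decay
```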
What actually happens
I’ve spent 26 years observing what happens after TV ads air. Not modeling it. Not estimating it. Watching it in the data.
We calibrated against 983,921 paid traffic web sessions matched to TV ad airings — one of the largest known calibration datasets for TV-to-web attribution. The pattern is consistent and reproducible across thousands of campaigns, dozens of markets, and multiple years:
Zero response at the instant the ad airs. Nobody acts in the first few seconds. They’re still watching.
A ramp-up as viewers register the stimulus. Response builds as people pick up devices, open browsers, type searches. The ramp is steep — from near-zero to significant activity in under two minutes.
A peak at approximately 150 seconds. Two and a half minutes after the ad. This is where the bulk of genuine ad-driven response concentrates. Not at time zero. Not at thirty seconds. At 150 seconds — the time it takes for a human to process an ad and act on it.
The decay tail and the shape
A decay tail. Response trails off after the peak. By five minutes, roughly 90% of the total response has occurred. By ten minutes, virtually all of it. A small sustained lift — approximately 17% above the pre-airing baseline — persists beyond the primary window.
This shape follows a gamma distribution. Not because we chose it from a statistics textbook. Because we measured nearly a million real web sessions and that’s what the data looked like.
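For readers who want to see the shape, here is a minimal sketch of a gamma curve with those properties. The shape and scale parameters below are illustrative, chosen so the mode lands at 150 seconds and roughly 90% of the area falls within five minutes; they are not the fitted values from the calibration dataset.

```python
import numpy as np
from scipy.stats import gamma

# Illustrative gamma parameters (not the fitted values from the calibration data):
# shape k and scale theta give a mode of (k - 1) * theta = 150 seconds.
k, theta = 5.0, 37.5
t = np.array([0, 30, 60, 150, 300, 600])   # seconds after the ad airs

pdf = gamma.pdf(t, a=k, scale=theta)       # instantaneous response weight
cdf = gamma.cdf(t, a=k, scale=theta)       # share of total response so far

for ti, p, c in zip(t, pdf, cdf):
    print(f"t={ti:>3}s  weight={p:.5f}  cumulative={c:.1%}")

# Output shape: weight is 0 at t=0, peaks at t=150s,
# and roughly 90% of the response has occurred by t=300s (5 minutes).
```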
The academic evidence
Here’s what convinced me this wasn’t just our data telling our story. Five published studies — different teams, different countries, different product categories, different methodologies — all found the same thing: TV-driven response doesn’t start at maximum. It builds, peaks, and decays.
Veverka & Holy (2024) in Applied Stochastic Models in Business and Industry fitted a Weibull distribution to TV-driven e-commerce response and found a peak at 57 seconds. Faster than our 150-second peak, but the product was single-brand FMCG e-commerce — one click to purchase versus the multi-step journeys we typically trace (search → website → call → job). Different latency, same shape.
Liaukonyte, Teixeira & Wilbur (2015) in Marketing Science demonstrated significant web traffic lift within a two-minute post-ad window across 20 brands. The concentrated window aligns with the peak region of the gamma curve.
Search behavior and sales effects
Joo, Wilbur, Cowgill & Zhu (2014) in Management Science measured TV advertising’s effect on branded search queries and found a 0.17 elasticity — quantitative proof that TV drives specific, measurable search behavior. Not vague awareness. Actual typed queries.
Lewis & Reiley (2013) at EC ‘13 detected search spikes within 15 seconds of ad conclusion, with effects persisting up to an hour. Rapid onset, extended tail — the gamma shape again.
Shapiro, Hitsch & Tuchman (2021) in Econometrica — arguably the most rigorous large-scale study of TV advertising effectiveness — demonstrated measurable sales effects across 288 brands.
Same shape across every study
No single study replicates our exact conditions. But across all five — spanning nearly a decade of research, multiple journals, multiple continents — the observed shape is consistent. Response doesn’t start at maximum. It starts at zero, builds, peaks, and decays.
The exponential decay model gets this backwards. And it’s the default everywhere.
Why this matters for your budget
This isn’t an academic argument. It determines where your money goes.
When your attribution model assigns maximum weight at time zero, it credits whatever web traffic happened to be there anyway. Someone was already on your website when the ad aired? The model says the ad drove that visit. Someone searched your brand name five seconds before the ad? The model gives the ad partial credit.
That’s coincidental traffic getting counted as ad-driven response. The numbers look good. The budget stays. But the measurement is wrong.
The reverse is also true. A viewer who saw your ad at 8:15 PM, picked up their phone at 8:17, searched your brand on Google, and visited your website at 8:18 — that genuine response gets less credit than the coincidental traffic, because the model’s weight has already decayed by the time the real response arrives.
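A toy comparison makes the inversion concrete. The half-life and gamma parameters below are illustrative, but the ordering is the point: exponential decay hands the coincidental session the most credit, while a gamma weighting gives it none.

```python
import numpy as np
from scipy.stats import gamma

# Two sessions relative to an 8:15 PM airing:
#   t = 0s    -> someone already on the site when the ad aired (coincidence)
#   t = 180s  -> the viewer from the example above (genuine response)
sessions = np.array([0.0, 180.0])

# Exponential decay with an illustrative half-life of 120 seconds
half_life = 120.0
exp_weight = 0.5 ** (sessions / half_life)

# Gamma weighting (same illustrative parameters as before: peak at 150s),
# scaled so a session at the 150-second mode would get a weight of 1.0
peak_value = gamma.pdf(150.0, a=5.0, scale=37.5)
gamma_weight = gamma.pdf(sessions, a=5.0, scale=37.5) / peak_value

print("exponential credit:", exp_weight)    # [1.0, 0.354] -> coincidence wins
print("gamma credit:      ", gamma_weight)  # [0.0, ~0.93] -> genuine response wins
```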
A 2025 analysis of over a thousand ad accounts found that 68% of multi-touch attribution models over-credited digital channels by 30% or more. When the models systematically credit coincidental activity, digital — where clicks happen fast — gets over-credited. TV — where response takes time — gets under-credited. Budget follows the overcrediting. TV gets cut. Digital gets inflated. And the advertiser loses the channel that was actually driving the search in the first place.
What the TV attribution industry actually does
You might assume the dedicated TV attribution vendors have solved this. They haven’t — at least not publicly.
EDO rejects traditional attribution models entirely. They measure search spike area above a modeled baseline — a fundamentally different approach that doesn’t attempt to weight individual sessions.
iSpot uses deterministic IP-based matching with a 14-day default window and coarsened exact matching for lift measurement. No publicly disclosed time-decay weighting within the window.
Innovid (formerly TVSquared) says they use “time decay” but their documentation specifies “higher weightings to the closest touchpoint” — meaning the most recent ad gets the most credit. That’s functionally the same problem as exponential decay: maximum weight where genuine response is least probable.
Rockerbox uses a 5-minute spike-detection window. Tatari uses a 5-minute immediate window plus a 30-day multiplier. Both are flat windows with no time-decay weighting at all: a session 15 seconds after the ad gets the same credit as one 4 minutes later.
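In code, a flat window is just a step function. The 300-second window below mirrors the 5-minute figure above; the point is that it carries no information about when, inside the window, a session arrived.

```python
import numpy as np

def flat_window_credit(seconds_after_airing, window=300):
    """1.0 inside the window, 0.0 outside: no time weighting at all."""
    inside = (seconds_after_airing >= 0) & (seconds_after_airing <= window)
    return np.where(inside, 1.0, 0.0)

sessions = np.array([15, 240, 600])          # 15s, 4 min, 10 min after the ad
print(flat_window_credit(sessions))          # [1. 1. 0.] -- 15s and 4 min identical
```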
No standards, no disclosure
Neither the MRC nor the IAB prescribes a specific decay model. The MRC requires that attribution windows be "empirically supported," but most vendors don't disclose whether theirs are.
Here’s the uncomfortable summary: flat windows don’t weight time at all. Exponential decay weights it backwards. The vendors who say “time decay” often mean “most recent gets the most credit” — which is the same backwards assumption wearing a different label. And the ones who’ve moved beyond simple decay (Nielsen’s ML-based fractional attribution, for example) have replaced one black box with another.
The result is an industry where every vendor claims to measure TV’s effect on digital behavior, no two use the same approach, most won’t tell you what their approach actually is, and none of them start from the observed physics of how people respond. We’ve built a measurement industry on undisclosed assumptions. It’s no wonder 75% of marketers say it isn’t working.
The industry is moving in the right direction
There are signs of change. Meta's Robyn added Weibull CDF and PDF adstock as alternatives to geometric decay; the PDF form allows a delayed peak, a ramp-up before decay like the gamma distribution. PyMC-Marketing includes a Weibull adstock option alongside geometric. The open-source MMM community is acknowledging that exponential decay doesn't match observed behavior.
73% of marketing leaders now view incrementality testing as essential, up from 41% in 2023. The appetite for measurement that reflects reality — not model assumptions — has never been higher.
But adding a Weibull option to an MMM tool that operates on weekly granularity isn’t the same as building attribution that operates on seconds. Marketing mix models work at the week or month level. The physics problem — the shape of response within minutes of a specific ad airing — exists at a timescale those models were never designed for.
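A quick illustration of the granularity gap, using the same illustrative gamma parameters as earlier: aggregate the second-by-second response to weekly buckets and the entire curve collapses into the week of the airing.

```python
import numpy as np
from scipy.stats import gamma

seconds_per_week = 7 * 24 * 3600
t = np.arange(0, 4 * seconds_per_week)       # four weeks at one-second resolution

# Second-level response to a single airing at t=0 (illustrative gamma shape).
response = gamma.pdf(t, a=5.0, scale=37.5)

# Aggregate to the weekly granularity an MMM actually sees.
weekly = response.reshape(4, seconds_per_week).sum(axis=1)
print(weekly / weekly.sum())    # [1. 0. 0. 0.] -- all response lands in week 0

# At this resolution, a 150-second peak and an instantaneous response are
# identical; the within-minutes shape is invisible to the model.
```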
Demand outpacing the models
Performance TV became the number-one investment channel in 2026, with 71% of marketers increasing their budgets. CTV ad spending is projected at $38 billion, up 14% year over year. Netflix and Comcast launched Conversion APIs. Everyone wants TV — linear and streaming — to prove itself like digital does. The demand for accurate TV-to-digital measurement has never been higher. But the models most of the industry relies on assume that humans respond to TV ads instantaneously. That’s the gap. And AI doesn’t close it — because AI can only optimize within the framework it’s given. A smarter algorithm trained on a backwards assumption produces a more confident wrong answer.
What I built instead
I didn’t start with a model and look for data to validate it. I started with data and looked for the shape.
Nearly a million web sessions. Decades of observation. The gamma curve emerged because it fit what we measured — zero start, ramp-up, peak at 150 seconds, decay. We didn’t impose it on the data.
But the response shape is only one piece. Two more things have to be true before any attribution weight gets assigned:
Geography and context filter
Geography — did the ad actually air in the market where the response occurred? If a local affiliate’s broadcast didn’t reach your zip code, no amount of time-weighting can create influence that wasn’t there.
Context — what was the environment? The same ad during news programming drives a dramatically different response than it does during entertainment. Viewers in a learning mindset are already engaged in active information processing; they see an ad and they search. Entertainment viewers are in a passive consumption state. Response between the two can differ by an order of magnitude. If your model ignores context, it averages signal with noise and tells you a meaningless number.
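Here is a hypothetical sketch of how those filters could compose with the time weighting. The function name, the context labels, and the multipliers are illustrative, not the engine's actual implementation.

```python
from scipy.stats import gamma

# Hypothetical sketch of how the filters described above could compose.
# Names and multipliers are illustrative, not the engine's actual code.
CONTEXT_MULTIPLIER = {"news": 1.0, "entertainment": 0.1}   # order-of-magnitude gap

def session_weight(seconds_after_airing, ad_reached_market, programming_context):
    """Attribution weight for one session against one airing."""
    if not ad_reached_market:                 # geography gate: no reach, no credit
        return 0.0
    time_weight = gamma.pdf(seconds_after_airing, a=5.0, scale=37.5)  # response shape
    return time_weight * CONTEXT_MULTIPLIER.get(programming_context, 0.5)

# A session 150s after an in-market airing, during news vs. entertainment:
print(session_weight(150, True, "news"))            # highest-weight case
print(session_weight(150, True, "entertainment"))   # ~10x lower
print(session_weight(150, False, "news"))           # 0.0: ad never reached the market
```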
Feeding truth back to platforms
And we do something else that most measurement platforms don’t: we feed the truth back to your advertising platforms. When Google Ads learns that a specific click led to a phone call that became a $4,200 job — not just that a click happened — Smart Bidding starts optimizing for revenue instead of clicks. Facebook learns which impressions preceded real customers. Every platform in the mix gets smarter because the training data reflects reality. That’s the closed loop that turns measurement from a backward-looking report into a system that actually improves performance.
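Mechanically, that feedback is typically an offline-conversion upload: the measured value travels back to the platform keyed to the original click. The record below is a generic illustration of the fields such an upload carries; the names are illustrative, not any platform's API schema.

```python
from dataclasses import dataclass

@dataclass
class OfflineConversion:
    """Illustrative record for feeding a measured outcome back to an ad platform.

    Field names are generic, not a specific platform's API schema.
    """
    click_id: str          # the click identifier captured on the landing page
    conversion_time: str   # when the phone call became a booked job
    conversion_value: float
    currency: str = "USD"

# The $4,200 job from the example above, expressed as training data
# the bidding algorithm can actually learn from.
record = OfflineConversion(
    click_id="EXAMPLE_CLICK_ID",
    conversion_time="2025-06-03 14:27:00",
    conversion_value=4200.0,
)
print(record)
```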
That’s what the Insights & Data Engine does. It applies the observed response shape — the gamma curve — filtered through context, geography, and time to separate genuine influence from coincidence. The methodology is published. The math is available. We’ll show you the curve. Ask the other guys to show you theirs.
The fix isn’t more AI. It’s better physics.
The industry is spending billions on AI-powered measurement. AI that runs on top of models that assume humans respond to TV ads in zero seconds. AI that optimizes within flat attribution windows where coincidental traffic gets the same weight as genuine response. AI that nobody can audit because the methodology isn’t published.
Smarter algorithms on a broken foundation produce more precise wrong answers.
The fix is simpler than anyone wants to admit: start with how humans actually behave. Measure it. Validate it against published research. Then build the model from the observation, not the other way around.
That’s not an AI problem. It’s a physics problem. And physics doesn’t care about your vendor’s roadmap.