The first question every new AI-visibility client asks is the same: can you actually measure this? The honest answer is yes, but it requires accepting that the measurement is harder, noisier, and lower-frequency than Google Search Console — and designing the programme around that fact.
What we measure
Four numbers, on a weekly cadence, across six surfaces (a scoring sketch follows the list):
- Brand mention rate. For your 60–120 buyer-intent queries, what fraction of LLM answers name your brand?
- Share of voice. Of all the brands cited across those answers, what fraction is yours?
- Source-page distribution. When an LLM cites a URL, how often is that URL from your domain vs. third parties?
- Sentiment / framing. When you are mentioned, are you described as the go-to option for a particular use case, or as one of several roughly equivalent vendors?
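
The first three numbers reduce to simple counting over a week of logged answers. Here is a minimal sketch, assuming a hypothetical `AnswerRecord` shape whose field names are ours rather than any particular tool's; sentiment/framing is left out because it needs labelling rather than counting.

```python
from dataclasses import dataclass

@dataclass
class AnswerRecord:
    """One logged LLM answer for one query on one surface (hypothetical shape)."""
    query: str
    surface: str               # e.g. "chatgpt-plus", "perplexity"
    brands_named: list[str]    # brands mentioned anywhere in the answer
    cited_urls: list[str]      # URLs cited as sources, if the surface shows them

def weekly_metrics(records: list[AnswerRecord], our_brand: str, our_domain: str) -> dict:
    """Brand mention rate, share of voice, and source-page share for one surface-week."""
    answers = len(records)
    answers_naming_us = sum(our_brand in r.brands_named for r in records)
    all_brand_mentions = sum(len(r.brands_named) for r in records)
    our_brand_mentions = sum(r.brands_named.count(our_brand) for r in records)
    all_urls = [u for r in records for u in r.cited_urls]
    our_urls = [u for u in all_urls if our_domain in u]
    return {
        "brand_mention_rate": answers_naming_us / answers if answers else 0.0,
        "share_of_voice": our_brand_mentions / all_brand_mentions if all_brand_mentions else 0.0,
        "source_page_share": len(our_urls) / len(all_urls) if all_urls else 0.0,
    }
```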
The six surfaces
- ChatGPT (Free and Plus tiers, tracked separately; they retrieve differently)
- Claude
- Perplexity
- Google Gemini
- Google AI Overviews (which has its own retrieval stack)
- Bing Copilot
Each surface has its own quirks. We do not aggregate them into a single “AI visibility” number, because the levers that move ChatGPT are not the same as the ones that move AI Overviews. Reporting them separately preserves the signal.
The query set
Buyer-intent queries only. We split them into three intents:
- Solution-seeking. "Best [category] for [use case]": high competitive density, expensive to win, and where most engagements focus first.
- Comparison. "Brand A vs Brand B": being absent here often costs you specific deals.
- Problem-shaped. "How do I do X": being cited here builds trust upstream of any purchase intent.
Sample size: 60 queries minimum, 120 typical, scaled for category breadth.
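
A low-tech way to keep that set honest is to store it as plain tagged records and check the intent balance before each weekly run. The sketch below is illustrative only; the example queries, brand names, and field names are placeholders, not client data.

```python
from collections import Counter

# Hand-curated query set, one record per query, tagged with the three intents above.
# All entries here are placeholders.
QUERY_SET = [
    {"query": "best crm for small agencies", "intent": "solution-seeking"},
    {"query": "AcmeCRM vs BetaCRM", "intent": "comparison"},
    {"query": "how do I sync crm contacts with my email tool", "intent": "problem-shaped"},
    # ...expanded to 60-120 entries in practice
]

def intent_balance(queries: list[dict]) -> Counter:
    """Count queries per intent so no bucket is quietly neglected as the set evolves."""
    return Counter(q["intent"] for q in queries)
```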
What the report looks like
One dashboard. Four numbers per surface, plotted weekly. A separate table of the top movers (queries where citation rate jumped or dropped most this week, with a hypothesis for why). A monthly written commentary — three pages — explaining what we changed and what we believe it caused.
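
The top-movers table is just a week-over-week delta per query, ranked by magnitude. A minimal sketch, assuming per-query citation rates have already been computed for two consecutive weeks (the function name and input shape are ours):

```python
def top_movers(this_week: dict[str, float], last_week: dict[str, float], n: int = 10):
    """Rank queries by absolute change in citation rate between two weekly snapshots.

    Each input maps query text -> citation rate (fraction of sampled runs citing us).
    Queries missing from last week are treated as starting at zero.
    """
    deltas = {q: rate - last_week.get(q, 0.0) for q, rate in this_week.items()}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]

# Example: top_movers({"best crm for agencies": 0.75}, {"best crm for agencies": 0.25})
# returns [("best crm for agencies", 0.5)]
```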
What we do not pretend to measure
- Causal attribution to revenue. The link from “cited in ChatGPT” to “deal closed” is real but rarely deterministic.
- Surface-by-surface model versions. Models update. We hold our query set and methodology constant so week-over-week movements are interpretable.
- Citations that happen with personalised search history. We measure on clean sessions only — what a new buyer would see, not what a logged-in power user with extensive history sees.
The honest summary
AI visibility measurement is real but young. The methodology will improve. What matters more than chasing perfect measurement is having a methodology you can hold constant, so the week-over-week movements you do see actually mean something. Most teams skip the methodology and chase whatever number an AI-visibility tool surfaces. That number is almost always less reliable than a smaller, hand-curated, consistently-tracked query set.
