Methodology

How the benchmark is measured.

Plain language. Sample sizes. Caveats. If a number looks weird, the explanation is below — not buried.

The prompt corpus

We track 640 prompts across eight samples: an "All" sample that mixes every industry, plus seven industry-specific samples (B2B SaaS, E-commerce, Marketing, AI tools, DevTools, Fintech, and Healthcare). Prompts come from real search demand (Google Search Console exports, Perplexity prompt logs, and the buyer questions surfaced by the Visibly audit pipeline) and are intent-typed as informational, commercial, or navigational.

The corpus is fixed week-to-week for stability; we expand it once per quarter and publish the diff so the dataset stays comparable across time.
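As a rough illustration of what a quarterly corpus diff could contain, the sketch below compares two corpus versions as sets of prompt strings. The function name `corpus_diff` and the sample prompts are hypothetical; the published diff format is not specified here.

```python
def corpus_diff(previous: set[str], current: set[str]) -> dict[str, list[str]]:
    """Return the prompts added and removed between two corpus versions."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


# Hypothetical prompts, for illustration only.
last_quarter = {"best crm for startups", "what is soc 2"}
this_quarter = {"best crm for startups", "what is soc 2", "stripe vs adyen fees"}

diff = corpus_diff(last_quarter, this_quarter)
print(diff)
```

Prompts present in both versions are the ones that stay comparable across quarters.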

The five AI surfaces

We run each prompt against five live-browsing AI surfaces: ChatGPT Search, Perplexity, Claude (with web), Gemini, and Google AI Overviews. Each surface returns a small set of citations; we capture the cited URLs verbatim, classify each URL's content type, and aggregate the results per surface.
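The aggregation step can be sketched as follows. This is a minimal sketch, assuming the aggregate is each content type's share of citations per surface; the exact metric behind the published matrix is not stated here, and `citation_matrix` is a hypothetical name.

```python
from collections import Counter

SURFACES = (
    "ChatGPT Search",
    "Perplexity",
    "Claude (with web)",
    "Gemini",
    "Google AI Overviews",
)


def citation_matrix(citations):
    """Aggregate (surface, content_type) pairs, one pair per cited URL.

    Returns {surface: {content_type: share}}, where the shares for a
    surface sum to 1. A surface with no citations gets an empty row.
    Assumes every surface name in `citations` is one of SURFACES.
    """
    counts = {surface: Counter() for surface in SURFACES}
    for surface, content_type in citations:
        counts[surface][content_type] += 1

    matrix = {}
    for surface, counter in counts.items():
        total = sum(counter.values())
        matrix[surface] = (
            {ct: n / total for ct, n in counter.items()} if total else {}
        )
    return matrix


# Hypothetical classified citations, for illustration only.
sample = [
    ("Perplexity", "Guide"),
    ("Perplexity", "Guide"),
    ("Perplexity", "Listicle"),
    ("Gemini", "News"),
]
matrix = citation_matrix(sample)
```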

The seven content types

Every cited URL is labeled with exactly one of seven content types by an LLM-judge classifier:

Comparison
"X vs Y", "best X for Y", multi-vendor evaluations. Head-to-head structure.
Listicle
Numbered or bulleted lists of items in a category ("12 best X", "top 7 Y").
Guide
Step-by-step procedural content. HowTo schema or its equivalent.
Explainer
Definition-led content. "What is X?", concept articles.
Case study
A specific customer or scenario walked through with outcome metrics.
Tool
An interactive calculator, comparison engine, or other utility hosted on the page.
News
Time-sensitive product announcements, releases, or industry coverage.
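One way to enforce the "exactly one of seven" rule on a classifier's output is a closed enumeration plus a strict parser. The sketch below is an assumption about how that could look, not the actual classifier code; `ContentType` and `parse_label` are hypothetical names.

```python
from enum import Enum


class ContentType(Enum):
    COMPARISON = "Comparison"
    LISTICLE = "Listicle"
    GUIDE = "Guide"
    EXPLAINER = "Explainer"
    CASE_STUDY = "Case study"
    TOOL = "Tool"
    NEWS = "News"


def parse_label(raw: str) -> ContentType:
    """Map a classifier's raw text output to exactly one content type.

    Matching ignores case and surrounding whitespace. Anything outside
    the seven labels raises ValueError instead of being guessed.
    """
    cleaned = raw.strip().lower()
    for content_type in ContentType:
        if content_type.value.lower() == cleaned:
            return content_type
    raise ValueError(f"unrecognized content type: {raw!r}")
```

Rejecting unknown labels, rather than mapping them to a fallback, keeps a misbehaving classifier from silently inflating one category.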

Refresh cadence

The full corpus is re-run every Monday at 09:00 UTC. Numbers on the benchmark page reflect last week's run. Once a month we publish a narrative interpretation of the four-week trend at /benchmark/reports.
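For readers who want to know when the next run starts, the Monday 09:00 UTC schedule can be computed with the standard library. This is an illustrative helper (`next_run` is a hypothetical name), not the scheduler the pipeline uses.

```python
from datetime import datetime, timedelta, timezone


def next_run(now: datetime) -> datetime:
    """Return the next Monday 09:00 UTC strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    days_ahead = (0 - now.weekday()) % 7  # Monday is weekday 0
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Monday at or past 09:00, so take next week's run.
        candidate += timedelta(days=7)
    return candidate


print(next_run(datetime(2026, 5, 26, 12, 0, tzinfo=timezone.utc)).isoformat())
```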

Limits of the dataset

Three caveats we want to be loud about:

What we publish — and what we don't

We publish: the matrix, the deltas, and our interpretation of the deltas. We don't publish: the raw cited URLs (publisher privacy), prompt-level results (gives away the corpus), or any per-domain ranking.
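A week-over-week delta can be sketched as a cell-by-cell difference between two runs. The sketch assumes each run is a nested mapping of surface to content type to a percentage, and that a cell missing from one week counts as 0; both are assumptions, and the numbers below are made up for illustration.

```python
def matrix_deltas(previous, current):
    """Change per (surface, content type) cell, in percentage points.

    Both arguments map surface -> {content_type: percent}. A cell that
    is missing from one week is treated as 0 for that week.
    """
    deltas = {}
    for surface in sorted(set(previous) | set(current)):
        prev_row = previous.get(surface, {})
        curr_row = current.get(surface, {})
        deltas[surface] = {
            ct: round(curr_row.get(ct, 0.0) - prev_row.get(ct, 0.0), 2)
            for ct in sorted(set(prev_row) | set(curr_row))
        }
    return deltas


# Hypothetical percentages, for illustration only.
last_week = {"Perplexity": {"Guide": 40.0, "Listicle": 60.0}}
this_week = {"Perplexity": {"Guide": 45.5, "Listicle": 50.0, "News": 4.5}}

deltas = matrix_deltas(last_week, this_week)
```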

Reproducibility

The data file backing the page lives at src/data/benchmark.json in the marketing site repo and is regenerated on each Monday's Prompt Monitor run. The shape of the file is stable; this methodology page is versioned as the dataset evolves.

Brand marks

OpenAI, Perplexity, Anthropic (Claude), and Google (Gemini, AI Overviews) brand marks appearing on this page and on the Index are used for editorial benchmark comparison. All trademarks belong to their respective owners. The Visibly Index is independent of, and unaffiliated with, any of these companies.

Last updated · 2026-05-26 · Questions: hello@visibly.so
