Share of Model: The Defensible AEO Metric, and How to Measure It

In this article, you will learn what Share of Model is, the five methodology choices that make it a defensible number on a client deck in 2026, the published research behind those choices, and how GenPicked builds them in so an agency owner can quote the metric with confidence.


The metric AEO has been waiting for

Share of Model is the AI-search equivalent of share of voice in traditional advertising. It measures how often a brand appears in answers produced by large language models within a specific product or service category. The framing comes from INSEAD's 2025 paper "Meet the Model: How to Market to LLMs" and Harvard Business Review's 2025 article "Forget What You Know About Search: Optimize Your Brand for LLMs." Both pieces position Share of Model as the most intellectually rigorous candidate metric for the AI brand visibility category, and the demand is undeniable: HBR reports that 58 percent of consumers had used generative AI tools for product or service recommendations as of 2025, up from 25 percent in 2023, alongside a 1,300 percent surge in AI search referrals to US retail sites during the 2024 holiday season.

The good news for agency owners and fractional CMOs is that Share of Model is ready for the client deck. The discipline has matured. Four published research findings tell us exactly which methodology choices are required to report the metric defensibly, and GenPicked applies the resulting five choices by default in its visibility scans. This article walks through each finding, the methodology it demands, and what it means for an agency that wants to anchor a retainer on a number that holds up to CFO scrutiny.


Why the metric is harder to measure than it looks

A traditional share of voice number is calculable because the underlying observations are deterministic. The ad ran or it didn't. The placement appeared or it didn't. You count, divide, and report.

Share of Model is structurally different. The observation is generated by a stochastic system that produces different outputs for the same prompt depending on temperature, model version, system-prompt configuration, and user context the engine infers. Two scans run thirty minutes apart can produce different brand recommendations. Two scans of the same brand across two engines can produce wildly different mention counts. A naive share-of-model calculation that treats one query as one observation is reporting noise as signal.

Four published research findings spell out the specific failure modes.

Validity challenge 1: AI recommendations are genuinely inconsistent

Rand Fishkin and Paul O'Donnell of SparkToro published a study in January 2026 in which 600 volunteers ran twelve identical prompts through ChatGPT, Claude, and Google AI. Across 2,961 total runs, fewer than one in 100 produced the same list of brands, and fewer than one in 1,000 produced the same list in the same order. Nearly every response was unique in three dimensions: which brands appeared, what order they appeared in, and how many items the engine returned.

The implication for Share of Model is direct. A single measurement is a single sample from a high-variance distribution. If you query ChatGPT once and find that your brand appeared in two of fifteen recommendation lists, you have learned almost nothing. Run the same query 100 times and the answer might be 17 of 100, or 31 of 100, or 8 of 100, depending on which side of the noise you happened to sample.
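A quick simulation makes the sampling problem concrete. This is a minimal sketch in Python, assuming, purely for illustration, a brand whose true visibility rate is 25 percent and treating each prompt as an independent draw:

```python
import random

def simulate_scans(true_rate: float, n_queries: int, n_scans: int = 10_000) -> list[float]:
    """Simulate repeated Share of Model scans of a brand with a fixed true
    visibility rate, one independent Bernoulli draw per prompt."""
    return [
        sum(random.random() < true_rate for _ in range(n_queries)) / n_queries
        for _ in range(n_scans)
    ]

single = simulate_scans(0.25, n_queries=1)
thirty = simulate_scans(0.25, n_queries=30)

# A one-query scan can only ever report 0% or 100% -- pure noise.
print(f"1 query:    scans range from {min(single):.0%} to {max(single):.0%}")

# Thirty queries concentrate near the true 25% but still wander.
middle = sorted(thirty)
print(f"30 queries: middle 95% of scans fall between "
      f"{middle[250]:.0%} and {middle[9750]:.0%}")
```

With thirty prompts, the middle 95 percent of simulated scans still spans roughly 10 to 40 percent, which is why the reporting guidance later in this article insists on confidence intervals rather than point estimates.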

The study also surfaces a useful nuance. Visibility percentage across many queries is more stable than ranking position. In tight categories such as cloud computing, top brands appeared in most responses; broader categories showed more scatter. The metric design matters: a frequency-based Share of Model with enough sample size is more defensible than a rank-based version with thin sampling.

Validity challenge 2: Sycophancy contamination

Atwell and Alikhani's 2025 paper "BASIL: Bayesian Assessment of Sycophancy in LLMs" formalizes a problem the AEO category prefers to ignore. LLMs do not update their beliefs the way a rational reasoner would. When the user's query embeds a judgment, the model shifts to stay in line with it. This overcorrection is not random; it is systematic and measurable through a Bayesian framework that distinguishes sycophantic belief shifts from rational updates driven by new evidence.

For Share of Model, the consequence is that any measurement prompt that includes brand names, category-shaping context, or stylistic hints about which answer the asker wants will inflate the score for the favored brand. An agency that builds its Share of Model scan with prompts like "What are the best AI search platforms for agencies, including Acme?" will report a higher number than the same agency using the blind prompt "What AI search platforms do you recommend for marketing agencies?" The difference is not a measurement artifact. It is the model performing the social behavior the prompt invites.
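To make the contrast concrete, here is the blind rule expressed as a pair of prompt templates. The category string and the "Acme" brand name are hypothetical placeholders, not GenPicked's actual templates:

```python
CATEGORY = "AI search platforms for marketing agencies"  # hypothetical

# Blind: identical wording regardless of which brand is being measured.
blind_prompt = f"What {CATEGORY} do you recommend?"

# Anchored: embeds the target brand, inviting exactly the sycophantic
# shift the BASIL framework documents. Never build a scan on this form.
anchored_prompt = f"What are the best {CATEGORY}, including Acme?"
```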

The vault concept page on blind versus named measurement covers the experimental evidence in detail. The procurement implication for Share of Model: the prompt template policy is a first-order methodology decision, and any tool that does not publish its prompt template policy is reporting an uncalibrated mixture of brand visibility and sycophantic inflation.

Validity challenge 3: Earned-media dominance

A 2025 University of Toronto analysis found that 82 to 89 percent of AI citations come from earned media rather than brand-owned websites. In the United States, the figure climbed to 92.1 percent. Social platforms were almost entirely excluded from AI citation footprints. Google, by contrast, maintained a more balanced ecosystem that still lifted brand-controlled content into the answer set.

The Share of Model implication is uncomfortable for many AEO programs. If most AI citations come from earned media, then the brand-owned content optimization work that agencies often sell as "AEO" is targeting the wrong citation pool. A high Share of Model score is partly a measurement of the brand's earned-media footprint, which is largely outside the brand's direct control. A low Share of Model score may reveal a media-relations gap more than a content-strategy gap.

The earned-media finding does not invalidate Share of Model. It refines what Share of Model is actually measuring. Agencies that report the metric without acknowledging the earned-media composition are giving clients a partial picture.

Validity challenge 4: Traffic materiality

Conductor's 2026 AEO/GEO Benchmarks Report analyzed 3.3 billion sessions across 13,000+ enterprise domains and 17 million AI-generated responses with 100+ million citations. The headline finding: AI referral traffic averages 1.08 percent of all website traffic across the sample. The IT industry leads at 2.80 percent. ChatGPT generates 87.4 percent of all AI referral traffic.

The 1.08 percent number does not mean Share of Model is unimportant. It does mean the metric must be priced and resourced honestly. A 100 percent improvement in Share of Model on a base of 1.08 percent of traffic translates to a different business outcome than the same lift on 30 percent of traffic. Agencies that pitch Share of Model as the primary KPI without contextualizing the traffic share are setting up an expectations problem at renewal.

Putting the validity story together

Each individual challenge is manageable. The combination is the actual problem.

Stochastic noise (Fishkin) plus sycophantic inflation (Atwell) plus earned-media dominance (U Toronto) plus traffic-share materiality (Conductor) means that a Share of Model number reported without methodology disclosure is communicating four overlapping ambiguities under a single label. The number might be high because the brand is genuinely cited often. It might be high because the measurement prompt anchored it upward. It might be high because of inherited earned-media reach. It might be high while still translating to one percent of website traffic. The dashboard cannot tell the client which of these explanations is dominant.

That is why methodology transparency matters in this category. Without it, Share of Model is a single number standing in for at least four distinct underlying realities.


What defensible Share of Model measurement requires

A Share of Model number a client can defend in a board meeting requires five specific methodology choices. Each one addresses at least one of the validity challenges above.

Choice 1: Sample size large enough to overcome stochastic noise. Single queries are sampling artifacts. The minimum useful sample is dozens of independent observations per engine. GenPicked's default is thirty prompts per engine per scan, producing 150 observations per cross-engine scan. The right sample size for a different methodology may vary; the wrong sample size is "one."

Choice 2: Blind prompt construction. Prompts must not include the target brand name, category-shaping language, or stylistic hints. The same query must be issued the same way regardless of which brand the agency is measuring. This neutralizes the sycophantic inflation Atwell's framework documents.

Choice 3: Engine weighting based on documented buyer behavior. A Share of Model composite that averages all engines equally is reporting a number that does not match how any real buyer uses AI search. Engine weighting must reflect actual usage patterns, must be published, and must be revisable as the engine landscape evolves. GenPicked's current weights and the reasoning are documented in our methodology transparency article.

Choice 4: Citation type classification. Not all mentions are equal. A brand named in a top-three recommendation list is different from a brand mentioned as a passing example. A defensible Share of Model distinguishes citation position and reports the composite with the underlying breakdown available (a scoring sketch appears after these five choices).

Choice 5: Earned-media context disclosure. If 82 to 89 percent of citations come from earned media, the Share of Model report should disclose which portion of the brand's mentions is attributable to earned channels. Agencies that fold the earned-media composition into the headline number without breakout are answering the easy question (does the brand get cited?) but not the harder one (does the brand control the citation lever?).

A vendor that publishes these five choices is reporting a Share of Model number you can defend at renewal. A vendor that refuses to publish them is reporting a number that looks like Share of Model and might be something else.
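As a sketch of how choices 1 through 4 combine into a single number, the calculation below averages citation-weighted scores within each engine and then applies engine weights. All weights shown are illustrative assumptions for the sketch, not GenPicked's published values:

```python
from collections import defaultdict

ENGINE_WEIGHTS = {"chatgpt": 0.55, "gemini": 0.20, "perplexity": 0.15, "claude": 0.10}
CITATION_WEIGHTS = {"top_list": 1.0, "mid_list": 0.6, "comparison": 0.4, "passing": 0.2}

def share_of_model(observations: list[dict]) -> float:
    """Weighted Share of Model composite.

    Each observation is one blind prompt run against one engine:
    {"engine": str, "mentioned": bool, "citation_type": str | None}
    """
    per_engine: dict[str, list[float]] = defaultdict(list)
    for obs in observations:
        score = CITATION_WEIGHTS.get(obs["citation_type"], 0.0) if obs["mentioned"] else 0.0
        per_engine[obs["engine"]].append(score)

    # Average within each engine first, then apply engine weights, so a
    # heavily sampled engine cannot dominate the composite. (A production
    # version would renormalize over the engines actually sampled.)
    return sum(
        weight * (sum(scores) / len(scores))
        for engine, weight in ENGINE_WEIGHTS.items()
        if (scores := per_engine.get(engine))
    )

observations = [
    {"engine": "chatgpt", "mentioned": True, "citation_type": "top_list"},
    {"engine": "chatgpt", "mentioned": False, "citation_type": None},
    {"engine": "claude", "mentioned": True, "citation_type": "passing"},
]
print(f"Composite Share of Model: {share_of_model(observations):.1%}")  # 29.5%
```

Averaging within each engine before weighting is the design choice that keeps the composite aligned with the published engine weights rather than with whichever engine happened to receive the most prompts.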


When Share of Model is the right metric to lead with

Despite the validity story, Share of Model is the right headline metric in several specific scenarios.

When the agency's client is operating in a category with established consumer-AI adoption (consumer products, B2B SaaS, professional services categories with documented LLM consultation behavior), Share of Model is a leading indicator that other category-level metrics miss.

When the agency is preparing a quarterly business review and needs a single number to track over time, Share of Model is interpretable to non-technical stakeholders in a way that engine-level citation breakdowns are not. The metric communicates direction even when the absolute level is contested.

When the agency is building a competitive analysis, Share of Model done blind across the entire competitive set provides a comparable cross-brand signal. The methodology must be identical across all measured brands, but if it is, the relative rankings carry information even if the absolute percentages do not.
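Here is a minimal sketch of that cross-brand scoring, with placeholder brand names and naive substring matching standing in for real mention detection. The one discipline that matters is that every brand is scored against the same blind response set:

```python
COMPETITIVE_SET = ["BrandA", "BrandB", "BrandC"]  # placeholder names

responses = [
    "For agencies I'd start with BrandA and BrandC...",
    "BrandA is the most established option in this space...",
    # ... one entry per blind prompt run, pooled across engines
]

def competitive_share(brands: list[str], texts: list[str]) -> dict[str, float]:
    """Mention rate per brand over one shared response set, so the relative
    ranking stays comparable even where absolute percentages are contested."""
    return {b: sum(b.lower() in t.lower() for t in texts) / len(texts) for b in brands}

for brand, rate in sorted(competitive_share(COMPETITIVE_SET, responses).items(),
                          key=lambda kv: kv[1], reverse=True):
    print(f"{brand}: {rate:.0%}")
```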

When Share of Model is the wrong metric to lead with

Share of Model is the wrong headline when the client's primary success criterion is traffic or revenue from AI search, not visibility. The 1.08 percent traffic share documented by Conductor means Share of Model improvements may not move the bottom line for several quarters. In that case, lead with traffic-share metrics and report Share of Model as a leading indicator.

Share of Model is also the wrong headline when the client is in a regulated industry where earned-media citations dominate and brand-owned content is excluded from AI citation pools. In healthcare, finance, and legal verticals, the metric measures earned-media performance more than agency-controlled optimization work. Lead with citation-source analysis and report Share of Model as one input.


How to communicate Share of Model to a client

A defensible client-facing Share of Model report contains four elements.

The headline number, with confidence interval. A point estimate without a confidence interval is misleading given the underlying noise. A range like "23 to 31 percent at 95 percent confidence" is honest (a sketch of the calculation appears below).

The breakout by engine. Clients should see ChatGPT, Claude, Gemini, Perplexity, and any other tracked engines separately. The composite is a summary; the per-engine numbers are the underlying observations.

The citation-type breakdown. Top-list mentions, mid-list mentions, comparison-context mentions, generic examples. The composition tells the client whether the visibility is high-quality or filler.

The methodology disclosure. One page, attached to the report. Prompt template policy, sample size, engine weights, citation classification rules. The same disclosure for every client engagement. If you cannot produce this page from your current AEO platform, the platform is making your renewal harder.
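One reasonable way to produce the interval in the first element is the Wilson score method, a minimal sketch of which follows; the 40-of-150 mention count is an illustrative input, not a benchmark:

```python
import math

def wilson_interval(mentions: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for an observed mention rate; better behaved
    than the plain normal approximation at per-engine sample sizes."""
    p = mentions / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

low, high = wilson_interval(40, 150)  # brand cited in 40 of 150 observations
print(f"Share of Model: {40/150:.0%} ({low:.0%} to {high:.0%} at 95% confidence)")
```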


Frequently asked questions

Who coined "Share of Model"?

The earliest documented use of the term in the AEO context comes from INSEAD's 2025 paper "Meet the Model: How to Market to LLMs" and Harvard Business Review's 2025 article "Forget What You Know About Search: Optimize Your Brand for LLMs." Both pieces frame the metric as the LLM-era analogue to traditional share of voice.

Is Share of Model the same as share of voice?

Conceptually similar, structurally different. Share of voice in traditional advertising is calculable from deterministic ad placements. Share of Model is calculated from stochastic LLM outputs and requires sample-size, prompt-construction, and engine-weighting decisions that share of voice does not.

Is one Share of Model number enough, or do I need engine-level breakouts?

You need both. The composite is the headline for stakeholders who want a single number. The breakouts are the defense when a sophisticated client asks how the number was constructed.

Can Share of Model be gamed?

Partially. Earned-media work that increases the brand's third-party citation footprint will move Share of Model in most categories. Brand-owned content optimization moves the metric more slowly because most AI citations come from earned media. Pure prompt-manipulation games (paying for mentions in specific publications LLMs cite) raise both ethical and methodological concerns and are outside the scope of this article.

What sample size do I need for a defensible Share of Model number?

Single-query measurements are not defensible regardless of how the engine responds. Minimum useful samples are in the dozens per engine per measurement period. GenPicked uses thirty prompts per engine as the default. The right minimum for a specific category depends on the within-category response variance, which can be estimated from a pilot study.
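The pilot-study arithmetic is the standard binomial sample-size formula. A minimal sketch, assuming a pilot scan that suggested roughly 25 percent visibility:

```python
import math

def required_sample_size(pilot_rate: float, margin: float, z: float = 1.96) -> int:
    """Prompts per engine needed for the mention-rate estimate to land within
    +/- margin of the true rate at ~95% confidence (normal approximation)."""
    return math.ceil(z**2 * pilot_rate * (1 - pilot_rate) / margin**2)

print(required_sample_size(0.25, 0.10))  # 73 prompts for a 10-point margin
print(required_sample_size(0.25, 0.15))  # 33 prompts for a 15-point margin
```

Note how the thirty-prompt default corresponds to roughly a fifteen-point margin at this visibility level; categories with rates nearer 50 percent need larger samples for the same margin.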

Does sycophancy actually move Share of Model that much?

Yes. Empirical work by Atwell and Alikhani shows LLMs systematically overcorrect their beliefs toward user-implied judgments. In an AEO context, this means brand-anchored prompts can inflate the target brand's appearance rate by twenty-plus percentage points. The exact magnitude varies by model and category; the direction is consistent.



Measure Share of Model defensibly

If your current AEO platform reports Share of Model without publishing the five methodology choices above, you are reporting a number you cannot defend. Run a free GenPicked AEO audit to see the metric reported with the full methodology disclosure attached.

Start your 14-day free trial of GenPicked Growth →


Dr. William L. Banks III is Co-Founder of GenPicked. References to INSEAD, Harvard Business Review, SparkToro, Conductor, the University of Toronto, and Atwell and Alikhani are documented in the underlying research wiki. Specific citations available on request.

Dr. William L. Banks III

Co-Founder, GenPicked

#academy #blog #original-research #share-of-model #measurement #methodology #r3