The Evidence: What We Actually Know About AI Search Behavior

In this article, you will learn what the research actually shows about how AI systems recommend brands, cite sources, and generate answers. We'll walk through ten independent studies covering five questions: Who's using AI for recommendations? How much traffic does it actually drive? Where do AI citations come from? How consistent are the answers? And what holds stable underneath the noise? By the end, you'll have the empirical foundation to evaluate any AEO claim you encounter.

This is Part 2 of the Defining AEO series on GenPicked Academy. Part 1 gave you the standard definition. This part builds the evidence base underneath it.


How to read the research in this article

Before we dive in, one short note on how to treat the numbers you're about to see.

Every statistic in this article comes from a named study with a real methodology. Some are peer-reviewed. Some are industry research from companies with commercial interests. Some are analyses by independent practitioners. I'll tell you which is which, because the quality of the evidence matters as much as the number itself.

No single study proves a field. What matters in AEO research is convergence — when five different teams, using five different methods, point to the same finding, that finding is real. When only one team reports it, that finding is a hypothesis waiting for replication. I'll flag which is which as we go.

Question 1: Are people actually using AI for product recommendations?

Short answer: Yes, significantly, and adoption is accelerating.

The most-cited number comes from a 2025 Harvard Business Review article: 58% of consumers now use generative AI tools for product and service recommendations, up from 25% in 2023. That's more than doubling in two years. The same article reported a 1,300% surge in AI search referrals to US retail sites during the 2024 holiday season.

Firstpagesage's early 2026 market-share tracking estimated that AI-driven search traffic reached roughly 15% of all search sessions across major platforms. Gartner's 2024 prediction was that traditional search volume would drop 25% by 2026 as users migrate to AI alternatives.

These numbers aren't noise. They describe a real consumer behavior shift. When you hear vendors and consultants say "AI search is growing," they're pointing to this adoption data — and the data supports the claim.

What to watch for: adoption numbers are often reported as percentages without base rates. "58% of consumers" tells you reach. It doesn't tell you what those consumers are doing. The next question gets at that.

Question 2: How much website traffic does AI actually drive?

Short answer: Currently a small fraction of the total, but growing quickly.

This is where the evidence gets more nuanced. Conductor's 2026 AEO/GEO Benchmarks Report analyzed 3.3 billion sessions across 13,000+ enterprise domains. Their headline finding: AI referral traffic accounts for only 1.08% of all website traffic on average.

One percent. Compared to the enormous coverage AEO is receiving, that's a sobering number.

A few details matter:

  • AI referral traffic is growing at roughly 1% month-over-month, so the base is expanding.
  • The IT industry sees the highest AI referral share at 2.80%.
  • 87.4% of all AI referral traffic comes from ChatGPT — a single dominant source.

So: many consumers use AI for research, but most of them don't click through to brand websites from the AI tool. They read the answer and act on the information without leaving the chat. That's why Question 3 — where AI pulls from — is the right question, not "how does the traffic look in Google Analytics?"

The tension to hold: adoption is real (58%) and traffic is small (1.08%). These aren't contradictory. They reflect a world where AI is the new reading surface — people get the answer there, not by clicking away. If you're measuring AEO success by traffic to your site, you're measuring a thin slice of the actual impact.

Question 3: Where do AI citations come from?

Short answer: Overwhelmingly from earned media — not from brand-owned content and not from Google's top-ranked pages.

This is the most strategically important finding in AEO research right now. Two studies triangulate the same conclusion.

Study 1: Xibeijia Guan at Ahrefs ran a 15,000-query study in September 2025 across Google, ChatGPT, Gemini, and Copilot. Only 12% of the links AI systems cited also appeared in Google's top 10 results for the same query. Eighty percent of AI citations came from pages that don't rank in Google at all for the original query.

Study 2: A 2025 University of Toronto analysis found that 82–89% of AI citations come from earned media — third-party content about a brand, not content the brand owns. In the US specifically, the figure was 92.1%.

Together these tell a clear story: the sources AI prefers are neither the ones brands publish nor the ones that win Google's ranking algorithm. AI draws from a third pool — news articles, Reddit threads, analyst coverage, community forums, review sites. If your AEO strategy is built around optimizing your own website, you're working on the smallest lever.

This has concrete implications for how AEO differs from SEO — and the full mechanics are in the earned media bias glossary entry.

Question 4: How consistent are AI answers to the same question?

Short answer: Very inconsistent at the surface level. This is not a bug the field is close to fixing.

If you ask ChatGPT the same question twice, you usually get different wording. That's expected. What's less obvious is how different the substance of the answer can be — and two studies measured this directly.

Study 1: Rand Fishkin's team at SparkToro ran a large consistency study in January 2026: 600 volunteers, 2,961 total prompts across ChatGPT, Claude, and Google AI. Fewer than 1 in 100 runs produced the same brand list. Fewer than 1 in 1,000 produced the same list in the same order.

Study 2: SE Ranking tested same-day URL consistency on identical queries. Result: only 9.2% URL overlap between three runs.

Study 3: Ahrefs' own longitudinal tracking found that Google AI Overviews change their cited sources every 2 days on average. Not every 2 weeks. Every 2 days.

Translation: a single check of "what does ChatGPT say about our brand today?" is a coin flip. Any AEO report or dashboard built on one-shot queries is structurally unreliable. The measurement unit has to be the distribution across many runs — not the individual answer.
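The "distribution across many runs" idea is simple to operationalize. Here's a minimal sketch, using hypothetical brand names and made-up run data, of how you'd turn many one-shot answers into a stable metric (the mention rate), instead of trusting any single answer:

```python
from collections import Counter

def mention_rate(runs: list[list[str]], brand: str) -> float:
    """Fraction of runs whose recommended-brand list includes `brand`."""
    hits = sum(1 for brands in runs if brand in brands)
    return hits / len(runs)

# Hypothetical data: each inner list is the brand list one AI answer returned
# for the same prompt, collected across repeated runs.
runs = [
    ["Acme", "Globex", "Initech"],
    ["Globex", "Acme"],
    ["Initech", "Umbrella", "Acme"],
    ["Globex", "Umbrella"],
]

print(mention_rate(runs, "Acme"))  # 0.75: Acme appears in 3 of 4 runs
print(Counter(b for run in runs for b in run))  # mention counts per brand
```

The individual lists disagree with each other (as the SparkToro and SE Ranking studies predict), but the mention rate is a quantity you can track over time and compare across competitors.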

Question 5: What actually stays stable?

Short answer: The meaning. AI systems agree on what to say. They disagree on which sources to name.

This is the finding that makes AEO worth doing despite the chaos above. Ahrefs researcher Gavoyannis analyzed 730,000 query pairs comparing Google's AI Overviews and AI Mode: two different AI features answering the same underlying questions.

  • Semantic similarity — how similar the meaning was: 86%
  • Citation overlap — which specific URLs were named: 13.7%

Think about that contrast. The AI systems produce conceptually similar answers 86% of the time. But the specific URLs they point at match only 14% of the time. Another way to say this: the synthesis layer is stable. The citation layer is volatile.
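The two layers are measured with different yardsticks. A minimal sketch with hypothetical answers and URLs: citation overlap is set overlap (Jaccard) on the named URLs, while the semantic side is approximated here with a crude word-count cosine (the actual studies use embedding models, but the contrast is the same):

```python
import math
from collections import Counter

def jaccard(a: set[str], b: set[str]) -> float:
    """Citation overlap: shared URLs divided by all URLs either answer named."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(text_a: str, text_b: str) -> float:
    """Crude semantic proxy: cosine similarity of word-count vectors."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Hypothetical: two AI answers that say the same thing with different citations.
answer_1 = "Acme leads the category for reliability and support"
answer_2 = "For reliability and support Acme leads the category"
urls_1 = {"example.com/review", "forum.example/thread-1"}
urls_2 = {"example.com/review", "news.example/acme-story"}

print(cosine(answer_1, answer_2))  # 1.0: same meaning, same words
print(jaccard(urls_1, urls_2))     # ~0.33: only one citation in common
```

High semantic similarity with low citation overlap is exactly the 86% vs. 13.7% pattern: the answers converge while the sources churn.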

This is the signal worth tracking in AEO. You want to know:

  • Is the AI including your brand in its synthesis of your category?
  • Is the description of your brand accurate?
  • Are you positioned correctly relative to competitors in the AI's mental model?

You don't particularly want to know which specific third-party URL got cited this afternoon. That will change by tomorrow. The stable layer is where the insight lives.

The ten-studies view

Here's how the evidence stacks up, by study, so you can see the full shape:

  1. HBR 2025: 58% of consumers using GenAI for recommendations (industry analysis)
  2. Conductor 2026: AI referrals = 1.08% of traffic (industry study, large sample)
  3. Ahrefs (Guan) 2025: 12% overlap between AI citations and Google's top 10 (industry study)
  4. U of Toronto 2025: 82–89% of AI citations from earned media (academic)
  5. SparkToro (Fishkin) 2026: <1% brand list repeatability (industry study, 2,961 prompts)
  6. SE Ranking 2025: 9.2% URL overlap across 3 same-day runs (industry study)
  7. Ahrefs 2025 (Overviews): Overview citations change every ~2 days (industry tracking)
  8. Ahrefs (Gavoyannis) 2025: 86% semantic vs. 13.7% citation overlap (industry analysis, 730K pairs)
  9. Firstpagesage 2026: AI-driven search ≈ 15% of sessions (market tracking)
  10. Gartner 2024: traditional search volume projected to drop 25% by 2026 (analyst prediction)

These aren't ten studies that all say the same thing. They're ten studies that, read together, describe a coherent picture: adoption is real, traffic is small but growing, citations come from unexpected places, surface answers are chaotic, and the semantic meaning is stable underneath it all.

That picture is the empirical foundation of the field. Any AEO claim — from a vendor, a consultant, a course — should be compatible with that picture. If someone tells you AI search is a 50% market share today, or that you can guarantee ChatGPT citations, or that your Google ranking directly determines your AEO outcome, they're contradicting this evidence base.

What we still don't know

Being honest about the limits of the evidence matters more than inflating it.

  • Self-reported adoption data is soft. Consumer surveys asking "do you use AI for product research?" overcount intent and undercount actual behavior. The 58% number is directionally right but not precise.
  • Enterprise-skewed samples. Conductor's 1.08% traffic figure is based on 13,000+ enterprise domains. Small businesses and local services may see very different patterns.
  • Language and geography. Most of this research is English-language and US/UK-focused. AEO research in Spanish, Mandarin, Hindi, or Arabic is sparse.
  • Platform coverage is uneven. ChatGPT is heavily studied. Perplexity has partial coverage. Claude and Gemini get less direct measurement. The full picture across all major platforms doesn't exist yet.
  • Causality is thin. We know AI cites earned media heavily. We don't yet know what makes one piece of earned media get cited over another at the model level.

The research is solid enough to support the thesis that AEO is a real and distinct discipline. It's not yet solid enough to support strong predictive claims about "do X and you'll get cited Y more times." Be skeptical of anyone who claims otherwise.

Try this

A short exercise to strengthen your ability to read AEO evidence.

  1. Pick any recent AEO article, vendor pitch deck, or LinkedIn thread that cites a statistic about AI search.
  2. Find the underlying source. Follow the citation. Many won't have one.
  3. For the ones that do, ask: Is this a peer-reviewed study? An industry research report? A consumer survey? A vendor's internal data?
  4. Check whether the statistic is consistent with the ten findings above, or if it claims something the evidence can't yet support.

The skill you're building here is evidence hygiene — the ability to distinguish findings that are grounded from ones that are aspirational. In a field this young, that skill is what separates competent practitioners from confident ones.

What's next

You now know what the research actually says about AI search behavior. The next question is the most uncomfortable one in the field: even when you measure carefully, AI answers have built-in biases that distort the numbers. Sycophancy. Popularity bias. Position bias. These aren't flaws to be worked around — they're mechanical properties of how AI systems are trained.

Part 3: The Bias Problem — Why AI Recommendations Aren't What They Seem walks through the four biases that matter most for AEO and the evidence that shows how much they distort raw measurement. That's where the series earns its "Brand Intelligence Gap" thesis.

If you want to go deeper on the adoption side before moving on, the AEO glossary entry has a compressed version of Questions 1-2. For the measurement side, perception drift shows what happens to a brand's AI-described identity over time as the underlying sources shift.

The numbers are the foundation. The biases are what move the numbers. Let's keep going.

Dr. William L. Banks III

Co-Founder, GenPicked

