Writing the Audit Report

In this lesson, you will learn how to write an AEO audit report that is credible as evidence and useful as a portfolio piece: the five-section report structure, the executive-summary conventions that busy CMOs read first, how to present methodology honestly, how to translate metrics into recommendations, and how to write the limitations section that earns your reader's trust.

This is Lesson 6.5, the last lesson of Module 6. You have done the hands-on work. You have metrics. Now you write the document that makes the work legible to someone who was not in the audit with you.

The report is the deliverable. It is also your portfolio piece. Take it seriously. A well-written audit report can get you hired, get you promoted, open a consulting engagement, or lead to a speaking invitation. A sloppy one closes doors quietly.


The report is where impact happens

The uncomfortable truth: a brilliant audit, badly reported, has no impact. A competent audit, well reported, changes decisions.

Your reader, whether a CMO, a hiring manager, a client, or a conference organizer, will spend 90 seconds on the executive summary and decide whether to keep reading. If they don't engage in those 90 seconds, the three weeks of work behind the report are invisible. The report is not the packaging of the audit. The report is the audit, for any reader who wasn't the one running it.

Write for one specific reader

Before you write a word, name your target reader. A CMO at your target brand? A hiring manager at an AEO agency? A client you hope to land? Write the report as a letter to that one person. When you can't decide between two phrasings, ask: which one would my named reader prefer? That's your style guide.


Part 1, The five-section structure

Every credible AEO audit report has the same five sections, in the same order.

  1. Executive summary (200-350 words)
  2. Methodology (400-700 words)
  3. Findings (800-1,500 words, the bulk)
  4. Recommendations (400-700 words)
  5. Limitations (200-400 words)

Plus an optional appendix with raw data tables.

This structure is not arbitrary. It maps directly onto the Design Science Research Methodology of Peffers et al. (2007): identify the problem, define objectives, design and demonstrate the method, evaluate, and communicate. The audit report is the "communicate" step, and its shape is what lets the prior steps earn a reader.

The order matters. Readers who will only read one section will read the executive summary: put your most important finding there. Skeptical readers will turn to the methodology next: make it airtight. Readers who trust you and want details will read findings and recommendations. Readers who are evaluating your rigor will read the limitations section last; this is what separates amateurs from practitioners.


Part 2, The executive summary

The executive summary is 200-350 words. It is the most important section. Write it last, after the findings are clear, but put it first in the document.

The three-bullet scaffold

The executive summary should answer three questions in three bullets:

  1. What did you find? The headline number. Not the methodology, not the context, the finding.
  2. What does it mean? The business implication. Why does this matter to the named reader?
  3. What should happen next? The one recommendation you are most confident about.

Then close with a short paragraph naming the scope: brand audited, models tested, time window, sample size.

A worked example

AEO Audit: [Target Brand], April 2026, Executive Summary

  • Finding: [Brand]'s organic visibility across four frontier AI models was 42% at the category level, with a 53-percentage-point sycophancy gap when the brand is named in the prompt (named-prompt visibility of 95%). Cross-model variance was high (range 15%-71%), meaning AEO performance differs dramatically by model.

  • Meaning: Any AEO measurement tool using named prompts is overstating [Brand]'s visibility by roughly 2x. The real competitive picture, where [Brand] appears when buyers ask category questions without naming the brand, shows [Brand] losing to [Competitor A] on ChatGPT and Claude, and winning on Gemini and Perplexity.

  • Recommendation: Prioritize content investments targeted at the training-data layer of ChatGPT and Claude: podcast mentions, expert-author bylined articles, and Reddit threads in the category. The Gemini and Perplexity strength is already earned; the ChatGPT and Claude weakness is the gap to close.

Scope: 15 questions (5 blind, 5 named, 3 comparison, 2 adversarial) run across ChatGPT, Claude, Gemini, and Perplexity in two sampling passes 72 hours apart. 120 total responses. Audit conducted April 2-8, 2026.

Notice what's here and what isn't. The headline finding is quantified. The business implication is concrete. The recommendation is specific enough to act on. The scope paragraph gives the reader enough to evaluate credibility.

Notice what's missing. No methodology details, those are in section 2. No caveats, those are in section 5. No adjectives like "groundbreaking" or "comprehensive." Just numbers and consequences.
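None of these headline numbers needs a tool to compute. A minimal sketch in Python, assuming hypothetical per-model blind and named visibility rates; the model labels and values below are placeholders, not figures from any real audit:

```python
# Hypothetical per-model visibility rates (share of questions where the brand
# was mentioned). Placeholder labels and values, not real audit figures.
blind = {"model_a": 0.30, "model_b": 0.15, "model_c": 0.55, "model_d": 0.70}
named = {"model_a": 0.95, "model_b": 0.90, "model_c": 0.97, "model_d": 0.86}

overall_blind = sum(blind.values()) / len(blind)            # category-level visibility
overall_named = sum(named.values()) / len(named)
sycophancy_gap_pp = (overall_named - overall_blind) * 100   # in percentage points
cross_model_range_pp = (max(blind.values()) - min(blind.values())) * 100

print(f"Blind visibility: {overall_blind:.0%}")
print(f"Sycophancy gap: {sycophancy_gap_pp:.0f} pp")
print(f"Cross-model range: {cross_model_range_pp:.0f} pp")
```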


Part 3, The methodology section

The methodology section is where the skeptical reader goes. It must be complete enough that another analyst could replicate your audit and produce comparable results.

What to include

  1. The question set. Summarize the structure: 5 blind, 5 named, 3 comparison, 2 adversarial. Link to or appendix the actual questions.
  2. The models. Name each model with version string (gpt-5, claude-sonnet-4.5, gemini-2.5-flash, perplexity-sonar).
  3. The protocol. Fresh chat per question, verbatim prompt, first-turn response only, logged within 5 seconds.
  4. Latin Square counterbalancing. State that comparison questions were run in both orders (the run-plan sketch after this list shows one way to schedule this).
  5. Sampling frequency. Two passes, minimum 48 hours apart.
  6. Scoring conventions. Define what "mentioned," "position," "sentiment," "tie" actually mean in your rubric. Reasonable people disagree about sentiment; make your rubric explicit.
  7. Sample size. 120 total responses, broken down by cell (model × question type).
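If it helps to see the protocol as data, here is a minimal run-plan sketch. It assumes a hypothetical question set shaped the way this lesson describes (5 blind, 5 named, 3 comparison, 2 adversarial), placeholder model IDs, and a simple both-orders counterbalance across the two passes; it is one way to enumerate the 120 runs, not a prescribed tool.

```python
from itertools import product

# Hypothetical question set, structured as the lesson describes.
# Real question text would come from your own question-set file.
QUESTIONS = (
    [("blind", f"blind_{i}") for i in range(1, 6)]
    + [("named", f"named_{i}") for i in range(1, 6)]
    + [("comparison", f"comparison_{i}") for i in range(1, 4)]
    + [("adversarial", f"adversarial_{i}") for i in range(1, 3)]
)
MODELS = ["model_a", "model_b", "model_c", "model_d"]  # placeholder model IDs
PASSES = [1, 2]  # run at least 48 hours apart

run_plan = []
for pass_n, model, (qtype, qid) in product(PASSES, MODELS, QUESTIONS):
    # Counterbalance comparison questions: brand-first in pass 1,
    # competitor-first in pass 2, so both orders appear in the data.
    order = None
    if qtype == "comparison":
        order = "brand_first" if pass_n == 1 else "competitor_first"
    run_plan.append({"pass": pass_n, "model": model, "question": qid,
                     "type": qtype, "comparison_order": order})

print(len(run_plan))  # 15 questions x 4 models x 2 passes = 120 runs
```

Logging the plan before you start also gives you the raw-data appendix table for free.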

The methodology section is a trust-builder

Write it as if a skeptical reviewer will audit it line by line. Because they will. Any practitioner who takes AEO seriously will turn to this section first to evaluate whether your findings are credible. This is also where brand-measurement orthodoxy applies: Aaker (1996) and Keller (1993), the foundational brand-equity literature, established that valid brand measurement requires explicit multi-dimensional construct disclosure, not a single score presented without scaffolding. Your methodology section inherits that standard.

The methodology section is where you demonstrate that you have read Blind vs. Named Measurement, Model Susceptibility Spectrum, and the research behind them. You do not need to cite them explicitly, but your protocol should reflect their lessons. A reader who knows the research will recognize that you do too.

AEO claim, transparent methodology as credibility: Ekamoira's (2026) meta-analysis of 27+ AI visibility tools found that methodology transparency (explicit disclosure of prompt design, sampling cadence, and scoring conventions) was the single strongest predictor of practitioner trust in AEO reports. Peffers et al. (2007) arrived at the same conclusion from the Design Science side: the "communicate" step of DSRM requires explicit disclosure of problem, objectives, method, demonstration, and evaluation; anything less leaves the artifact unevaluable. A report with full methodology disclosure is structurally more credible than one that hides its protocol, regardless of the sophistication of the underlying analysis.


Part 4, The findings section

The findings section is 800-1,500 words, the bulk of the report. It presents your metrics, organized by story, not by data column.

Organize by story, not by table

A common mistake: structure the findings section as "here's the mention rate table, here's the sycophancy gap table, here's the variance table." That produces a report that reads like a data dump. The reader extracts nothing.

Instead, organize by the three to five stories your data tells. Each story is a sub-section. Each sub-section combines multiple metrics to make a single point.

Example story structure

Story 1: The brand is invisible at the category level.

  • Blind mention rate: 42% overall, ranging from 15% (Claude) to 71% (Perplexity)
  • Where the brand appears: position 4 average in Perplexity, position 2 average in ChatGPT
  • Quote a verbatim blind-question response where the brand did NOT appear
  • Implication: most category-level buyer queries will not surface the brand

Story 2: Named-prompt measurement is inflating visibility by 2x.

  • Sycophancy gap: 53 pp overall
  • Cross-model sycophancy: Claude 75 pp, Perplexity 15 pp
  • Quote a named-question response where the brand appears in a hedged or qualified way
  • Implication: any commercial AEO tool using named prompts is showing the brand as roughly 2x more visible than it actually is. See Non-Uniform Distortion for why this matters beyond simple inflation.

Story 3: AEO performance is model-specific.

  • Cross-model blind rate range: 56 percentage points
  • Pairwise win rates differ by model (Oura beats Whoop on Gemini, loses on ChatGPT)
  • Quote two responses: one where the brand wins a comparison, one where it loses the same comparison on a different model
  • Implication: a single AEO strategy won't work. Each frontier model needs model-specific content investments.

Story 4: Hallucinations and hedges as risk signals.

  • 3 hallucination instances across 120 responses (~2.5%)
  • 12 hedged recommendations (~10%)
  • Quote one hallucination and one hedge
  • Implication: risk-tracking is a first-class AEO activity, not a footnote.
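All four stories come from the same flat response log. A minimal aggregation sketch, assuming a hypothetical log format with one dict per scored response; the field names (mentioned, hallucination, hedged) are illustrative, not a required schema:

```python
# Hypothetical scored-response log: one dict per logged response.
responses = [
    {"model": "model_a", "type": "blind", "mentioned": True,
     "hallucination": False, "hedged": False},
    {"model": "model_b", "type": "named", "mentioned": True,
     "hallucination": False, "hedged": True},
    # ... the remaining rows of a real 120-response audit
]

def rate(rows, field):
    """Share of rows where the boolean field is True."""
    return sum(r[field] for r in rows) / len(rows) if rows else 0.0

models = sorted({r["model"] for r in responses})
for m in models:
    blind = [r for r in responses if r["model"] == m and r["type"] == "blind"]
    named = [r for r in responses if r["model"] == m and r["type"] == "named"]
    gap_pp = (rate(named, "mentioned") - rate(blind, "mentioned")) * 100
    print(m, f"blind {rate(blind, 'mentioned'):.0%}", f"gap {gap_pp:.0f} pp")

# Risk signals (Story 4): hallucination and hedge rates across all responses.
print(f"hallucination rate {rate(responses, 'hallucination'):.1%}")
print(f"hedge rate {rate(responses, 'hedged'):.1%}")
```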

Use quotations, not just numbers

For every major finding, include at least one verbatim response quotation. Numbers tell the reader what happened. Quotations let the reader hear it. A reader who sees "ChatGPT's blind mention rate was 60%" engages very differently from a reader who sees "ChatGPT's blind mention rate was 60%. Example: when asked 'what are the best fitness wearables for serious athletes?', ChatGPT replied: 'For serious athletes, Garmin, Apple Watch, and Whoop are the top choices. Oura is also an option for sleep-focused athletes.'"

The quotation makes the finding tangible. Use them generously.

Use visuals, sparingly

If your readers are visual, include 2-3 simple charts: a bar chart of mention rates by model, a bar chart of sycophancy gaps, a scatter or range plot showing cross-model variance. Do not clutter the report with charts of every table. The rule: if a chart doesn't tell a story faster than prose, don't include it.
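If you do build the charts, keep them as plain as the prose. A minimal matplotlib sketch for the mention-rate bar chart, assuming hypothetical per-model values and placeholder model labels:

```python
import matplotlib.pyplot as plt

# Hypothetical blind mention rates by model (illustrative values only).
models = ["Model A", "Model B", "Model C", "Model D"]
blind_rates = [0.30, 0.15, 0.55, 0.70]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(models, blind_rates)
ax.set_ylabel("Blind mention rate")
ax.set_ylim(0, 1)
ax.set_title("Category-level visibility by model")
fig.tight_layout()
fig.savefig("blind-mention-rate-by-model.png", dpi=150)
```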


Part 5, The recommendations section

Recommendations are where the audit becomes strategy. This section is 400-700 words. It translates findings into actions.

The format

For each recommendation:

  1. What to do: one concrete action
  2. Why: the finding that supports it
  3. How to know if it worked: the metric that would move

Example

Recommendation 1, Invest in podcast and expert-author content for ChatGPT and Claude training-data visibility.

Why: [Brand]'s blind mention rate on ChatGPT (30%) and Claude (15%) is dramatically lower than on Gemini (55%) and Perplexity (71%). Gemini and Perplexity lean more on live web retrieval; ChatGPT and Claude lean more on pre-training data. The weakness is in training-data visibility, not live-retrieval visibility.

How to know if it worked: Re-run the audit at 3 months and 6 months. A successful campaign will raise ChatGPT and Claude blind mention rate by 10+ pp. Live-retrieval scores on Gemini and Perplexity should stay steady.
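The re-audit check is a straight comparison of two snapshots. A minimal sketch, assuming hypothetical baseline and re-audit blind mention rates and the 10-pp threshold from the recommendation; the model labels and values are placeholders:

```python
# Hypothetical blind mention rates: baseline audit vs. a 3-month re-audit.
baseline = {"model_a": 0.30, "model_b": 0.15, "model_c": 0.55, "model_d": 0.71}
reaudit  = {"model_a": 0.42, "model_b": 0.28, "model_c": 0.54, "model_d": 0.70}

TARGET_LIFT_PP = 10  # success criterion named in the recommendation

for model in baseline:
    lift_pp = (reaudit[model] - baseline[model]) * 100
    status = "hit target" if lift_pp >= TARGET_LIFT_PP else "below target"
    print(f"{model}: {lift_pp:+.0f} pp ({status})")
```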

Keep recommendations specific and measurable. Avoid general advice like "invest in thought leadership." Name the channel, the rationale, and the measurable outcome.

How many recommendations

Three to five. Fewer than three looks thin. More than five looks unfocused. Pick the actions with the highest expected impact and the clearest success criteria.


Part 6, The limitations section

The limitations section is where practitioners earn the reader's trust. Weak auditors hide limitations. Strong auditors state them plainly and let the reader judge.

What to include

  1. Sample size. You ran 15 questions × 4 models × 2 passes = 120 responses. State it. That's small. Note the width of your confidence bands (a sketch for computing them follows this list).
  2. Temporal snapshot. AI models change. State the exact dates of both sampling passes. Caveat that findings may drift.
  3. Category scope. You audited one category and one target brand. Findings do not generalize to other categories.
  4. Sentiment scoring subjectivity. You made the calls. Another analyst might score differently.
  5. Adversarial question count. You used 2. Not enough to characterize the brand's reputation risk fully.
  6. First-turn only. You measured first-turn responses. Multi-turn conversations behave differently.
  7. What this audit cannot tell you. Buyer conversion rates. Attribution. Dollar ROI. Those need other measurement instruments.
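One way to make the sample-size caveat concrete is to report a confidence interval alongside each rate. A minimal sketch using the Wilson score interval, assuming a hypothetical cell of 10 blind responses for one model (5 blind questions × 2 passes) with 3 mentions; the numbers are illustrative:

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a proportion; wide when n is small."""
    if n == 0:
        return (0.0, 0.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, centre - half), min(1.0, centre + half))

# Hypothetical cell: 10 blind responses for one model, brand mentioned in 3.
low, high = wilson_interval(successes=3, n=10)
print(f"blind mention rate 30%, 95% CI {low:.0%}-{high:.0%}")  # roughly 11%-60%
```

With cells this small the interval spans tens of percentage points, which is exactly the point the limitations section should state.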

Why full limitations disclosure is the credibility move

Many auditors leave the limitations section thin, fearing it will make the report look weak. The opposite is true. A thin limitations section makes the reader wonder what you're hiding. A thorough one says: I understand the bounds of what this audit can and cannot tell you. Here they are, explicitly. Readers who know the field will recognize this as a marker of competence. See the full R3 register note in The Starter Kit; limitations disclosure is where R3 meets R1.

AEO claim, honest limitations sections improve adoption: Churchill's (1979) measurement paradigm explicitly requires bounded-validity disclosure as part of any new measure's release; construct measurement without stated limitations is not measurement at all. Fishkin's (2026) analyses of AI-brand-measurement reports confirm the pattern empirically: reports with explicit, bounded limitations sections are more likely to be circulated internally, cited in follow-up work, and acted on than reports that omit or minimize limitations. The honesty is not a weakness to hide; it is the credibility move that makes findings durable.


Part 7, Writing for the portfolio use case

If this audit is a portfolio piece, and on your first time through it should be, write it with the portfolio reader in mind. That's a hiring manager or a prospective client who is evaluating whether you can do this kind of work.

The portfolio features to include

  1. A redacted version. If you audited a real brand (especially your employer), produce an anonymized version for public sharing. Replace brand names with [Target Brand], [Competitor A], etc. The methodology and structure remain; that's what's being evaluated. A sketch for producing the redacted file follows this list.
  2. An author bio paragraph at the end. One paragraph introducing yourself, your background, and why you did the audit. Links to LinkedIn and any relevant prior work.
  3. A scannable table of contents. Make the section structure easy to see so a hiring manager can grasp it in 10 seconds.
  4. Version control. Version 1.0 today. If you update it with a 6-month re-audit, version 1.1. Show the work continuing.
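The redacted version does not need to be produced by hand. A minimal sketch, assuming a hypothetical mapping from real names to placeholders and report filenames that follow the pattern in the exercise below; adjust both to your own audit:

```python
from pathlib import Path

# Hypothetical name -> placeholder mapping; extend it with product names,
# executives, and anything else that identifies the brand.
REDACTIONS = {
    "Acme Wearables": "[Target Brand]",
    "Rival One": "[Competitor A]",
    "Rival Two": "[Competitor B]",
}

report = Path("aeo-audit-acme-2026-04.md").read_text(encoding="utf-8")
for real, placeholder in REDACTIONS.items():
    report = report.replace(real, placeholder)

Path("aeo-audit-wearables-2026-04-redacted.md").write_text(report, encoding="utf-8")
# Always re-read the redacted file manually; simple string replacement misses
# nicknames, abbreviations, and quotes that identify the brand indirectly.
```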

Distribution

You have three options:

  • Internal only. Share with your boss or CMO if the target was your employer. Positions you as the person who saw AEO early.
  • Public with anonymization. Publish a redacted version on Substack, Medium, or LinkedIn. Builds your practitioner credibility publicly.
  • Portfolio gate. Keep it private but link to it from your LinkedIn under "Featured," with access on request. Good for interviews.

All three are valid. The one you pick depends on your career goals and your organization's tolerance for public methodology discussion.


Part 8, The editorial pass

Before you call the report done, do three editorial passes.

Pass 1, Claim audit

For every claim, ask: does the data on my analysis sheet support this? If not, soften the claim or remove it. It is vastly better to underclaim than to have a reader find a claim your data can't support.

Pass 2, Plain-language pass

Read the report aloud. Every sentence that's hard to say is hard to read. Rewrite it. Cut jargon that isn't defined. A CMO with a marketing degree should be able to read this report without opening a glossary.

Pass 3, The 90-second test

Time yourself reading the executive summary. It should take 90 seconds or less. If it takes longer, cut. The executive summary is the only section that has to survive a busy reader; everything else is optional reading.
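If you want a mechanical check before the stopwatch, word count is a fair proxy: at a typical silent reading speed of roughly 200-250 words per minute, 350 words is about 90 seconds. A minimal sketch, assuming the executive summary is saved in a hypothetical plain-text file:

```python
from pathlib import Path

WORDS_PER_MINUTE = 230  # rough average silent reading speed; adjust to taste

summary = Path("executive-summary.txt").read_text(encoding="utf-8")  # hypothetical file
word_count = len(summary.split())
seconds = word_count / WORDS_PER_MINUTE * 60

print(f"{word_count} words, ~{seconds:.0f} seconds to read")
if word_count > 350 or seconds > 90:
    print("Too long for an executive summary: cut until both numbers fit.")
```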


Exercise, write your audit report

Using your analysis sheet from Lesson 6.4:

  1. Draft the methodology section first. It's the most factual and the easiest to write.
  2. Draft the findings section next, organized by 3-5 stories.
  3. Draft the recommendations section from the findings.
  4. Draft the limitations section, be thorough and honest.
  5. Draft the executive summary last. Compress to 200-350 words, three bullets plus scope.
  6. Run the three editorial passes.

Save the file as aeo-audit-[brand]-[yyyy-mm].md. If you are publishing, also produce a redacted aeo-audit-[category]-[yyyy-mm]-redacted.md version.

The file you just saved is a portfolio-grade AEO audit report. Two or three of these, in a shared folder, are the artifact an agency or hiring manager asks for when evaluating an AEO Strategist.


Common report mistakes

"My executive summary is 600 words"

Cut it to 350. Anything longer stops being an executive summary.

"I led with methodology instead of findings"

Readers want to know what happened before they care how you measured it. Findings first, methodology as support.

"I buried the sycophancy gap in a table footnote"

The sycophancy gap is the most diagnostic single number in the audit. It deserves its own story section.

"My limitations section is two sentences"

A thin limitations section is a red flag to any informed reader. Write it fully. Honesty is the credibility move.

"I used generic recommendations"

"Invest in thought leadership" is not a recommendation. "Sponsor 3 podcasts in the [specific niche] category over the next quarter and measure the lift in ChatGPT blind mention rate" is.

"I didn't quote the models at all"

Numbers tell; quotations show. Include verbatim response snippets for every major finding. Readers remember the quotations.


Takeaways

  1. The executive summary is the report, for most readers. Make it three bullets plus scope, total 200-350 words, readable in 90 seconds.
  2. Organize findings by story, not by table. Three to five stories, each combining multiple metrics into a single point, with a verbatim quotation.
  3. The limitations section is the credibility move. Thorough and honest. Weak auditors hide limitations; strong ones state them plainly.

What's next

You have completed Module 6. You set up an environment, designed a question set, ran a four-model audit, calculated diagnostic metrics, and wrote a portfolio-grade report. You are operating as an AEO Strategist, not theoretically, but in practice.

Module 7 begins next: 27+ Platforms, A Map of the AEO Market. You learned how to run an audit yourself. Module 7 shows you the commercial AEO tool landscape: what's on the market, how vendors claim to measure brand visibility, and how to evaluate any vendor's methodology against what you now know to be a credible protocol. Your ability to audit the audit tools is where your expertise becomes a competitive advantage.

Reflection prompt

Open your completed report. Imagine showing it to the CMO of the brand you audited, in person, with one minute to explain why they should care. What is the one sentence you would say? Write it in your notebook. That sentence is your elevator pitch, for this audit, and for your AEO Strategist career.


Templates referenced: Audit Report (template forthcoming). Use the five-section structure in this lesson as the authoritative reference until the markdown template ships.


About this course

This lesson is part of AEO A to Z, the open course on Answer Engine Optimization published by GenPicked Academy. GenPicked Academy is where practitioners learn to measure AI recommendations with the same rigor a clinical trial demands: blind sampling, balanced question sets, and confidence intervals that hold up.

About the author: Dr. William L. Banks III is the lead researcher at GenPicked Academy and the architect of the three-layer AEO measurement architecture taught in this course. His work on sycophancy, popularity bias, and construct validity in AI search informs every lesson you just read.

See the methods in practice: GenPicked runs monthly brand-intelligence audits using the exact pipeline taught in Module 6. Read the case studies and audit walkthroughs on the GenPicked blog.

Dr. William L. Banks III

Co-Founder, GenPicked
