The last cold-email reply I personally got from a stranger this year — not a meeting request, not an out-of-office, an actual reply with a question in it — opened with one sentence: “Did you really run all five engines, or is this template fishing?”
She had received the email at 7:14 on a Tuesday. I had run her brand through ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews the night before, and pasted a five-line summary. By the time her question landed in my inbox, she had already re-run the prompts on her phone. She did not reply because the email was clever. She replied because it was checkable, and she had just checked it.
That is what cold outbound looks like right now if you are an agency selling AEO services. The only opener that still works is one sentence: “I just ran your brand through ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews — here is what came back.” Everything else — “I noticed your site,” “quick question,” “we help SMBs grow with AEO” — reads like AI-generated outreach. Per Mailshake’s 2026 State of Cold Email, average reply rates have collapsed to 3.43%, down from roughly 5% the year before, driven by the flood of AI outreach and tightened sender rules from Google and Microsoft.
The five-engine hook breaks through because it is specific, free, and trivially disprovable. The artifact is the personalization — the only way a fifty-emails-a-month operator can sound personalized without lying about it. What follows: five templates, the four-touch cadence, subject-line patterns that work, and compliance lines for three jurisdictions.
What changed underneath the inbox
Three demand-side numbers explain the reply gap between the five-engine opener and the generic value-prop intro that used to work.
The first is the most uncomfortable. Per the 6sense 2025 B2B Buyer Experience Report, 94% of B2B buyers now use LLMs in their purchase journey, and the winning vendor comes from the buyer’s Day-One shortlist 95% of the time, up from 85% the year before. The shortlist is now the answer ChatGPT gave at 9pm the night before the first call. If your prospect is not on that answer, they are not in the consideration set.
The second is the one that lets you sell. Per Loamly’s 2,089-brand analysis, 77% of brands are completely absent from AI platform responses. The visible 23% see AI-sourced visitors who convert at roughly three times the rate of Google search. Microsoft’s Clarity study reports the same pattern from a different dataset, so the conversion premium is not a sample artifact. The selling motion becomes “77% of your category is invisible, you are one of them, and the 23% who are visible convert their AI traffic at roughly 3x the rate of Google search traffic.”
The third number gives Template 3 its punch with SEO-savvy buyers. Per Ahrefs’ December 2025 update across 300,000 keywords, AI Overviews now correlate with a 58% lower CTR for the top organic result, up from 34.5% in March 2025. The slope of that line is the entire reason a head of growth who used to push back on AEO will now book the call: ranking number one no longer pays what it used to if Google is answering the query above the link.
The math of a reply in this market
Top-quartile reply rates are not 20% anymore — they are mid-single digits, and most operators are anchored to a benchmark from a market that no longer exists. The four numbers worth memorizing:
Average reply sits at 3.43% per Mailshake — the lowest figure their long-running State of Cold Email survey has ever recorded. Top-quartile senders come in at 5.5% per Instantly’s benchmark. Personalization, measured properly across more than twenty million emails by Woodpecker, lifts replies by 142% when real and by approximately zero when it is a merged-field name in the salutation. And 42% of replies arrive on follow-ups, while 48% of senders never send a second message at all.
Lemlist calls 5%+ reply “good,” 8%+ “excellent,” and tells senders to optimize for reply rather than open, because Apple Mail Privacy Protection has been inflating open data since iOS 15. Woodpecker’s 20M+ email analysis puts the average in the 1-5% band; Apollo matches at 1-5%, with 5-8.5% for highly personalized.
On length the data is decisive. Lemlist’s template study found 50-125 word emails get a 2.4x higher reply rate than emails over 200 words; 120-word emails booked at 52% versus 20% for 300-word emails. Gong agrees — reply rates fall sharply once an email crosses 100 words. Every template below sits inside the 50-125 word window, because the discipline has to live in the template, not in the operator’s willpower at 11pm on a Sunday.
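One way to keep that discipline in the template rather than in willpower is a pre-send length check. This is a minimal sketch, not anything Lemlist publishes: the helper names are hypothetical, and counting whitespace-separated tokens is an assumption about how "words" are measured.

```python
# Hypothetical pre-send check. The 50-125 window is the Lemlist figure
# cited above; the whitespace-based word count is an assumption.
MIN_BODY_WORDS = 50
MAX_BODY_WORDS = 125


def word_count(body: str) -> int:
    """Count whitespace-separated tokens in an email body."""
    return len(body.split())


def in_reply_window(body: str) -> bool:
    """Return True when the body sits inside the 50-125 word window."""
    return MIN_BODY_WORDS <= word_count(body) <= MAX_BODY_WORDS
```

Run it on the filled-in email, not the blank template, because competitor names and the quoted prompt change the count.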
“The five-engine audit hook is what lets a fifty-email-a-month operator hit the 5% personalization tier without spending fifty hours.”
Mailshake’s data shows the 5% of senders who personalize every email see 2-3x better reply rates. Doing that by hand requires hours no agency owner has. Generating a five-engine audit per prospect is bounded work (ten to fifteen minutes manual, about sixty seconds tool-assisted), and the artifact is more personalized than any merged-field intro could ever be. The artifact is the personalization signal. The email writes itself around it.
Five templates, each sized to fit the inbox
Five emails follow. Each sits inside the 50-125 word sweet spot Lemlist identified. Each assumes you actually ran the audit before sending. The templates are the surface of a system, not the system itself.
Template 1 — The sixty-second AEO check
Use this with an SMB founder or CMO at a 10-200 employee company who has no in-house SEO or AEO lead. Day 0 cold send for the broadest band of your list. Subjects worth testing: “quick check on [BrandName],” “[BrandName] in ChatGPT,” “ran your brand through 5 AI engines.” Body:
Spent 60 seconds running [BrandName] through ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews this morning.
Here is what came back when I asked “best [category] for [their ICP]”:
• ChatGPT: [competitor A, B, C] — no [BrandName]
• Perplexity: [competitor A, D] — no [BrandName]
• Gemini: [competitor A, B] — no [BrandName]
• Claude: [competitor B] — no [BrandName]
• Google AIO: [competitor A] cited — no [BrandName]
77% of brands are completely absent from AI answers right now (Loamly, 2,089-brand study). You are one of them. The 23% who show up convert at ~3x Google Search traffic.
Worth a 15-minute look at why?
[Sender first name]
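If you record each engine's answer by hand, the bullet block in Template 1 can be assembled mechanically. The sketch below assumes a hypothetical `cited` mapping that you fill in yourself. It does not query any engine, and the parenthetical status format is a simplification of the bullets above.

```python
# Engine labels as they appear in Template 1.
ENGINES = ("ChatGPT", "Perplexity", "Gemini", "Claude", "Google AIO")


def audit_lines(brand: str, cited: dict[str, list[str]]) -> list[str]:
    """Build one bullet per engine from manually recorded audit results.

    `cited` maps an engine label to the brand names that engine returned.
    Engines missing from the mapping are treated as naming no brands.
    """
    lines = []
    for engine in ENGINES:
        names = cited.get(engine, [])
        others = ", ".join(n for n in names if n != brand) or "no brands named"
        status = f"{brand} cited" if brand in names else f"no {brand}"
        lines.append(f"• {engine}: {others} ({status})")
    return lines
```

Calling `audit_lines("Acme", {"ChatGPT": ["Rival", "Other"]})` yields five bullets, the first reading `• ChatGPT: Rival, Other (no Acme)`.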
Template 2 — Your competitor just got cited in ChatGPT
Use this with a marketing director at a B2B brand who watches competitors closely. Day 0 alternate or Day 7 follow-up to Template 1 when your audit produced a clean head-to-head. Subjects: “[CompetitorName] in ChatGPT,” “competitor cited, you were not,” “head-to-head: [BrandName] vs [Competitor].” Body:
I asked the five major AI engines “which [category] platform should I use for [vertical]?”
[CompetitorName] showed up in 4 out of 5 engines. [BrandName] showed up in 0.
94% of B2B buyers now use LLMs in their purchase journey, and the winning vendor comes from the buyer’s Day-One shortlist 95% of the time (6sense, 2025). If [CompetitorName] is the one being recommended, that is the shortlist.
Happy to send the exact prompt set + screenshot. Want it?
[Sender]
Template 3 — Your DA went up. Your AI citations did not.
Use this with an SEO-savvy in-house team or a head of growth who has already invested in SEO and is watching rankings hold while traffic falls. Day 0 cold or Day 7 follow-up. Subjects: “DA up, AI citations flat,” “[BrandName]: ranking #1, cited #0,” “the SEO/AEO disconnect.” Body:
[BrandName] ranks page-1 for “[their core keyword]” — I checked.
But I also ran the same query through ChatGPT, Perplexity, Gemini, Claude and Google AI Overviews. [BrandName] was not cited in any of them.
That is the disconnect Ahrefs documented: queries with an AI Overview now show a 58% lower CTR for the top organic result (up from 34.5% earlier in 2025). Ranking #1 does not pay what it used to if the AI Overview answers the query without you.
SE Ranking’s 300,000-domain study also found llms.txt has zero correlation with AI citations — so that is not the fix either.
15-minute call to walk through what is moving the needle on AI citation right now?
[Sender]
Template 4 — Do you know what ChatGPT says about your brand?
Use this with an SMB owner or operator who has never thought about AI search. Consumer-friendly tone, low jargon, Day 0 cold. Subjects: “what ChatGPT says about [BrandName],” “your brand in AI,” “quick — have you Googled yourself in ChatGPT?” Body:
Quick one: have you ever asked ChatGPT about [BrandName]?
I just did. Here is the first thing it said:
“[3-line summary of the actual ChatGPT output, including any wrong claim, missing detail, or competitor recommendation]”
If that is not how you would describe [BrandName], or if a competitor’s name came up where yours should, that is the problem we fix. Roughly 94% of B2B buyers now use AI engines in their purchase journey (6sense, 2025), and ChatGPT alone drives 87% of AI referral traffic (Conductor, 2026).
Want the full audit across all 5 engines? Free, takes me 10 minutes.
[Sender]
Template 5 — The vertical pattern (composite framing)
Use this for vertical-specific outreach where the prospect has obvious peers. Day 0 or Day 7. Composite framing only — never name a client without permission, never invent results. Subjects: “pattern I keep seeing in [vertical],” “[vertical] AI visibility,” “the one thing [vertical] brands miss.” Body:
The pattern I keep seeing with small [vertical] brands: they rank fine on Google, but ChatGPT, Perplexity and Claude do not mention them at all.
The ones that do break through usually fix three things: (1) FAQ-style answer pages with first-party data, (2) listings on the sources Perplexity actually pulls from (Reddit, G2, vertical directories), (3) schema that LLMs can parse cleanly. Our research team published the methodology behind this scoring in the GenPicked Fitness Wearables Study — same Bradley-Terry approach we use across all five engines.
I ran [BrandName] through all 5 engines this morning. Want the audit + the 3-fix list?
[Sender]
The four-touch cadence
One email is not a campaign. Per Belkins, the sweet spot is four to seven touches; below four, reply rate stalls in single digits, and beyond seven, spam complaints climb. 42% of replies arrive on follow-ups while 48% of senders never send one: nearly half the market voluntarily leaves roughly two-fifths of its potential replies on the floor.
Day 0 — the audit drop
Open with the five-engine summary and a screenshot. Template 1 or Template 4. The artifact is the entire email; the prose around it is connective tissue. If you are tempted to add a paragraph of context, cut it.
Day 3 — the one-stat follow-up
Reattach the three-line summary, then one sentence: “77% of brands are completely absent from AI answers, and you are one of them.” That is the entire email. Day 3 follow-ups that try to reframe the pitch do worse than ones that simply restate the stat and make the ask smaller.
Day 7 — the competitor wedge
Switch to Template 2. Name a real competitor that did show up in your audit. Specificity is the lift — the moment your prospect sees a competitor name they recognize next to their own missing entry, the email stops feeling like a template and starts feeling like a memo.
Day 14 — the graceful break-up
“Should I close this loop, or is now just a bad time?” Two sentences at most, no meeting ask, no PS. Graceful break-ups recover replies the first three touches could not, because the break-up itself signals you will stop sending, which is exactly what unlocks the prospect who was always going to reply but needed the activation energy.
An optional Day 21 value drop — a link to your most-cited post or original research, no ask — re-opens the loop without re-asking for the meeting, and tends to produce inbound replies weeks later that no sequencer would credit to the campaign.
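The cadence above reduces to a list of day offsets, which makes it easy to compute send dates per prospect. This is a minimal sketch with hypothetical names. It counts calendar days and ignores weekends and holidays, which your sequencer may handle differently.

```python
from datetime import date, timedelta

# Day offsets taken from the cadence above; the labels are shorthand.
CADENCE = [
    (0, "audit drop"),
    (3, "one-stat follow-up"),
    (7, "competitor wedge"),
    (14, "graceful break-up"),
]
OPTIONAL_VALUE_DROP = (21, "value drop")


def schedule(start: date, include_value_drop: bool = False) -> list[tuple[date, str]]:
    """Return (send_date, touch_label) pairs for one prospect."""
    touches = list(CADENCE)
    if include_value_drop:
        touches.append(OPTIONAL_VALUE_DROP)
    return [(start + timedelta(days=offset), label) for offset, label in touches]
```

Starting on Tuesday 6 January 2026, the four touches land on 6, 9, 13, and 20 January, with the optional value drop on 27 January.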
Subject lines that survive a Tuesday morning inbox
The shorter the subject, the higher the open rate. Gong is unambiguous: open rates decline as subject lines lengthen, and one-to-four-word subjects perform best. Lowercase. Specific. No buzzwords. Successful AEO subjects in my own sent folder look like Slack messages, not press releases.
Question subjects like “what ChatGPT says about [BrandName]” prime an answer; per Lemlist, real campaigns hitting 75%+ open rates skew heavily question-led. Specificity — “[CompetitorName] in ChatGPT” — reads as a 1:1 note rather than a blast. Per Gong’s personalization data, company-specific topics roughly triple reply rates from director-level and above; industry-specific personalization correlates with an 88% reply lift. Curiosity-gap subjects that name a concrete artifact — “ran your brand through 5 AI engines” — outperform tease subjects because the artifact is the curiosity, not a withheld noun.
Patterns to retire on sight: “Quick question?”, “Touching base,” “Following up,” and anything with “AI” framed as a buzzword. Save AI talk for the body where you have actual proof.
One CTA lever sits on top of all this. Per Gong’s 304,174-email study, interest-based CTAs (“worth a 15-minute look?”) convert to meetings at roughly 15%; meeting-request CTAs (“can we book 30 minutes Tuesday?”) perform 44% worse on reply. Every template above uses an interest CTA. The booked-slot CTA is a Day 7 conversation, not a Day 0 one.
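The subject-line rules above are simple enough to lint before a send. The sketch below is an illustration with hypothetical names. It flags the retired patterns and treats length as a warning rather than a hard rule, since some artifact-led subjects run past four words on purpose. It leaves casing alone because brand names are legitimately capitalized.

```python
# Retired patterns listed above, matched case-insensitively.
RETIRED_PHRASES = ("quick question", "touching base", "following up")
MAX_SUBJECT_WORDS = 4  # upper end of the one-to-four-word range cited from Gong


def lint_subject(subject: str) -> list[str]:
    """Return a list of warnings for a subject line (empty means clean)."""
    warnings = []
    lowered = subject.lower()
    for phrase in RETIRED_PHRASES:
        if phrase in lowered:
            warnings.append(f"retired pattern: {phrase!r}")
    if len(subject.split()) > MAX_SUBJECT_WORDS:
        warnings.append("longer than four words")
    return warnings
```

A subject like "Acme in ChatGPT" comes back clean, while "Quick question?" is flagged as a retired pattern.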
Personalization is now an artifact, not an intro
The hard part of cold email used to be: how do I personalize a thousand emails without spending a thousand hours? The honest answer for a decade was that you could not, and the operators who succeeded mostly lied about it — merged-field names dressed up as bespoke notes. That trick is dead. Inbox filters cluster it. Prospects spot it on sight.
The answer that works today is structurally different: do not write a thousand intros, generate a thousand artifacts, and let the artifact be the personalization. A five-engine audit is the artifact in this market. Run the prospect’s brand and two or three buyer-intent queries through ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews. The platform returns an AI Citation Score from 0-100 with band labels — invisible (0-24), emerging (25-49), competitive (50-74), category-leader (75-100). Drop the band label and the three-line summary into the email. That is the personalization signal.
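The band labels are a straightforward lookup on the score. The sketch below only maps a 0-100 score to the four labels listed above. How the score itself is computed is not described here, and the function name is hypothetical.

```python
def citation_band(score: int) -> str:
    """Map a 0-100 AI Citation Score to its band label."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 24:
        return "invisible"
    if score <= 49:
        return "emerging"
    if score <= 74:
        return "competitive"
    return "category-leader"
```

The band label is the part that goes into the email, since it reads faster than a raw number.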
From my conversations with agencies running this hook for the last twelve months, the pattern is that the artifact-led opener produces a longer second-message reply — more questions in the response, fewer one-liners. The lift is anecdotal, not benchmarked, so I will not put a number on it. But the qualitative signal has been consistent across every operator I have watched run this play.
Compliance in sixty seconds, three jurisdictions
Three jurisdictions, three rule sets. The operational overlap is high enough that one compliant footer handles all three.
USA — CAN-SPAM (FTC). Per the FTC’s compliance guide, subject lines must be accurate and non-misleading, the message must include a valid physical postal address, and a clear opt-out must be honored within ten business days. Penalty: up to $53,088 per violating email — a single bad blast can wipe out a year of agency profit.
EU/UK — GDPR and PECR. Per ICO B2B guidance, corporate-subscriber email can rely on legitimate interests, provided you have a documented Legitimate Interests Assessment, a privacy notice, and an easy opt-out. The sender must be identifiable by name and business name. Sole traders and unincorporated partnerships are the grey area — the ICO treats them more like consumers than businesses, so opt-in is safer if you want to sleep at night.
Australia — Spam Act 2003 (ACMA). Per ACMA, you need consent (express or inferred), the email must include the sender’s ABN or legal name, and a working unsubscribe has to be honored within five business days. Penalties run up to A$220,000 per breach and A$2.1M for repeat offenders. Many UK “legitimate interest” arguments simply do not fly here.
Minimum-viable footer for all three: sender legal name, postal address, business identifier, one-click unsubscribe. Five minutes of setup, applied to every template before you send template one. The cheapest insurance an agency outbound program ever buys.
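A footer with four required fields is easy to generate once and append everywhere. This sketch assumes a hypothetical `Sender` record and refuses to build a footer when any field is blank. It checks only that the fields are present, not that the values satisfy any regulator.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Sender:
    legal_name: str
    postal_address: str
    business_id: str  # for example an ABN or company number
    unsubscribe_url: str


def build_footer(sender: Sender) -> str:
    """Render the four-line footer, failing loudly if a field is blank."""
    missing = [name for name, value in asdict(sender).items() if not value.strip()]
    if missing:
        raise ValueError(f"footer is missing: {', '.join(missing)}")
    return "\n".join([
        sender.legal_name,
        sender.postal_address,
        sender.business_id,
        f"Unsubscribe: {sender.unsubscribe_url}",
    ])
```

Failing at build time is the point: a template with a blank address should never reach the sequencer.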
The work of the next five days
Five steps, all inside one work week. At the end you will have data on whether the hook works for your specific list. The point is not to perfect the campaign before sending — the point is to ship it and let the reply rate tell you what to fix.
Monday is the list. Build a 25-prospect target list in your sweet-spot vertical. Per Smartlead’s 14.3B-email aggregation, smaller targeted campaigns outperform broad blasts by 2.76x. Read as a reply-rate multiple, that makes 25 carefully chosen prospects worth roughly 69 randomly scraped ones (25 × 2.76), at a fraction of the sending volume. Pick a vertical you already know.
Tuesday is the audit. Run the five-engine audit on every prospect. Three buyer-intent queries per prospect through each engine. Capture the screenshot or the three-line summary. Manual is roughly ten to fifteen minutes per prospect; tool-assisted is closer to sixty seconds. At twenty-five prospects, manual is fine. Past fifty, tool-assisted is the only path.
Wednesday is the matching. Template 1 for SMB founders, Template 2 where you have a clean competitor wedge, Template 3 for SEO-savvy buyers, Template 4 for non-technical operators, Template 5 for vertical specialists. The matching is what separates a 5% reply campaign from a 1.5% one.
Thursday is the cadence. Schedule the four-touch cadence into your sequencer. Add the compliance footer. Set sending windows to 7-11am or 8-11pm in the prospect’s local time — industry data consistently identifies these as the highest-reply windows for B2B cold.
Friday is the send. Send the Day 0 batch. Track reply rate, not opens. Anything above 5% is “good” per Lemlist’s benchmark, 8% is “excellent.” Below 3%, the problem is almost always the audit, not the email — specific, real, checkable audits get replies; generic ones do not.
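Thursday's sending windows and Friday's reply-rate thresholds can both be checked in a few lines. This is a sketch with hypothetical names. Treating each window's end as exclusive is an assumption, and "keep iterating" is my label for the 3-5% gap that the benchmarks above leave unnamed.

```python
from datetime import time

# Sending windows from the Thursday step, in the prospect's local time.
SEND_WINDOWS = ((time(7, 0), time(11, 0)), (time(20, 0), time(23, 0)))


def in_send_window(local_time: time) -> bool:
    """True when local_time falls inside either window (end exclusive)."""
    return any(start <= local_time < end for start, end in SEND_WINDOWS)


def reply_rate(replies: int, delivered: int) -> float:
    """Replies divided by delivered emails, as a fraction."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return replies / delivered


def grade(rate: float) -> str:
    """Label a reply rate using the thresholds quoted above."""
    if rate >= 0.08:
        return "excellent"
    if rate >= 0.05:
        return "good"
    if rate < 0.03:
        return "revisit the audit"
    return "keep iterating"
```

On a 25-prospect list, one reply is 4% and two replies is 8%, so a single reply moves the grade by a full band. Read the first week's result with that in mind.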
If you reach week four sitting on more replies than you can handle, the bottleneck has moved from outbound to fulfillment. That is the right problem to have — and the one most agency operators are not yet set up to absorb, because every audit-led meeting demands you run those audits in production, ship the citation gains, and prove them in a monthly client report.
If this is the workflow your agency is now in, GenPicked’s Growth plan is built for it. Start your 14-day free trial — Growth plan free for 14 days, five AI engines, full agency dashboard.