GEO in 2026: Getting Cited by AI Answer Engines

22 June, 2026 Web

The first time I noticed a referral from perplexity.ai in my analytics, I treated it as a novelty. This year it is a channel I actively optimise for, with its own tactics, its own measurement, and its own acronym: GEO, Generative Engine Optimisation. The premise is simple and a little unsettling: instead of optimising to rank in a list of links, you optimise to be one of the three to five sources an AI quotes when it answers a question. The user may never see your site; they see your words, attributed, inside someone else's chat window. Some of them click. All of them now associate the answer with your brand.

I have written before about how AI search engines discover and cite content; that piece is the conceptual map. This one is the field manual — the concrete, 2026-specific moves I make to improve citation odds, including a few that did not exist a year ago. It assumes you have already nailed the foundation; if you have not, fix the SEO fundamentals that still win in the AI era first, because GEO is a layer on top of a working base, not a replacement for it.

SEO vs AEO vs GEO

The acronyms have multiplied, so let me draw the lines before going further:

Approach	Optimising for	Win condition
SEO	Traditional search ranking	Top of the blue links
AEO (Answer Engine Optimisation)	Featured snippets, voice answers	Being the extracted answer
GEO (Generative Engine Optimisation)	LLM-generated answers	Being a cited source in the synthesis

They overlap heavily — the same clean, structured, authoritative content tends to win all three — but GEO has its own distinct tactics, and those are what changed most in 2026.

Let the AI Crawlers In

This is the most basic GEO move and the one I most often find broken. AI engines use named crawlers — GPTBot for OpenAI, ClaudeBot for Anthropic, PerplexityBot for Perplexity, and others. If your robots.txt blocks them, or a blanket bot-mitigation rule does, you are invisible to those engines no matter how good your content is. Audit your robots.txt deliberately and decide, per crawler, who you let in. Most sites that want AI citations should allow them; the mistake is blocking them by accident through an over-broad rule and never noticing.

The flip side is a genuine business decision: some publishers block AI crawlers to protect content. That is legitimate. Just make it a choice, not an accident — and know that blocking is binary. You cannot be cited by an engine you have locked out.
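
A per-crawler audit is easy to script. The sketch below uses Python's standard `urllib.robotparser` to check one URL against a robots.txt body for each of the crawlers named above. The sample rules and the example.com URL are hypothetical, and the check covers robots.txt only; it will not catch a firewall or bot-mitigation rule that blocks the same crawlers.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the article; extend the tuple for other engines.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def audit_robots(robots_txt: str, url: str, agents=AI_CRAWLERS) -> dict[str, bool]:
    """Return {crawler: allowed?} for one URL, given the text of a robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt: a blanket block with one explicit exception.
# This is the "over-broad rule" failure mode: only GPTBot gets through.
SAMPLE = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    report = audit_robots(SAMPLE, "https://example.com/guide")
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

Run it against your real robots.txt and the handful of URLs you most want cited; any crawler reported as blocked should be blocked on purpose.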

Publish an llms.txt

The newest tactic on this list, and one that did not exist in mainstream practice a year ago: an llms.txt file at your site root. Think of it as a curated, LLM-friendly index of your most important content — a deliberate map you hand to models, pointing them at your best, most citable pages in a clean, low-noise format. Where robots.txt says what crawlers may access, llms.txt says what they should prioritise.

It is early, adoption is uneven, and no engine treats it as gospel yet. But it is cheap to publish, it signals intent, and the cost of being early is near zero. I treat it the way I treated sitemaps fifteen years ago — not yet mandatory, obviously heading that way.
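
For a sense of what the file contains, here is a minimal sketch that generates one. The layout (a title line, a one-line summary in a blockquote, then headed sections of annotated links) follows the public llms.txt proposal as I understand it; the `build_llms_txt` helper, the site name, and every URL are hypothetical placeholders.

```python
def build_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Render an llms.txt body.

    sections maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in pages:
            # One annotated link per line keeps the file low-noise.
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site and pages, for illustration only.
    print(build_llms_txt(
        "Example Dev Blog",
        "Practical guides on web metadata and structured data.",
        {
            "Guides": [
                ("Meta tag checklist", "https://example.com/meta-tags",
                 "Required and optional tags, with limits"),
                ("JSON-LD schema types", "https://example.com/json-ld",
                 "Which types to add and when"),
            ],
        },
    ))
```

Save the output as /llms.txt at the site root and keep the list short: the point is curation, not a second sitemap.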

Structure Content in Citable Chunks

AI engines quote at the chunk level, not the page level. The content that gets cited in 2026 is broken into self-contained 200-300 word sections, each under a clear, descriptive heading, each answering one question completely. A model retrieving your page does not read it like a human; it locates the chunk that matches the query and lifts it. If your answer is smeared across three sections with the key fact buried in paragraph four, it does not get lifted.

Two refinements that measurably help:

Lead with the answer. State the fact in the first sentence of the section, then elaborate. The inverted pyramid is not just good journalism, it is what makes a chunk extractable.
Descriptive headings. "Image Specifications Per Platform" beats "Getting It Right" — the heading is a routing signal the engine uses to find the relevant chunk.

This is the same structural discipline that makes the right JSON-LD schema types worth adding: you are removing ambiguity so a machine can find and trust the useful part fast.
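
A quick way to check a draft against the 200-300 word guideline is to split it on headings and count. This sketch assumes the draft is Markdown with "## " section headings, which is my assumption rather than anything the engines require; `audit_chunks` is a hypothetical helper and the word range is just the figure quoted above.

```python
import re


def audit_chunks(markdown: str, low: int = 200, high: int = 300):
    """Return (heading, word_count, in_range) for each '## ' section."""
    # Splitting on a capturing group yields:
    # [preamble, heading1, body1, heading2, body2, ...]
    parts = re.split(r"^## +(.+)$", markdown, flags=re.MULTILINE)
    results = []
    for heading, body in zip(parts[1::2], parts[2::2]):
        words = len(body.split())
        results.append((heading.strip(), words, low <= words <= high))
    return results


if __name__ == "__main__":
    # Hypothetical draft: one section in range, one far too short.
    draft = (
        "Intro text.\n\n"
        "## Image Specifications Per Platform\n" + "word " * 250 + "\n\n"
        "## Getting It Right\nToo short to stand alone.\n"
    )
    for heading, words, ok in audit_chunks(draft):
        print(f"{'ok ' if ok else 'FIX'} {words:4d} words  {heading}")
```

The count is only a proxy. A section can hit the range and still bury its answer in the last sentence, so read the flagged sections rather than padding them.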

Density Beats Prose: Cite Sources and Use Numbers

The clearest pattern I see in what gets cited: models strongly favour factual, quantitative, source-backed content. Specific numbers, dates, benchmarks, and explicit citations get pulled into answers far more often than smooth, generic prose. "UUID v4 uses 122 random bits" is citable; "UUIDs are a popular way to make unique identifiers" is not. The denser your verifiable facts, the more hooks an engine has to quote you and attribute the claim.

This rewards a writing style that is almost the opposite of marketing copy — concrete, specific, willing to commit to numbers, and transparent about where they come from. It is also, conveniently, just better technical writing.

Recency Is a Ranking Factor Now

AI engines weight freshness heavily when choosing sources, more so than traditional search did. A guide from 2024 with no updates loses ground to a 2026 article on the same topic — even if the older one is more thorough. The practical implication: keep dateModified accurate and actually update your important pages, then reflect the update in your structured data. A stale-but-good page quietly slides out of the citation pool. Generate clean Article markup with current dates using the JSON-LD generator, and verify the rest of your metadata with the Meta Tag Analyser — the same basics from the developer's meta tag checklist, now feeding a freshness signal that matters more than it used to.
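
If you would rather generate the markup yourself, a minimal sketch looks like this. The property names (`headline`, `datePublished`, `dateModified`) are standard schema.org Article properties; the `article_jsonld` helper, the 2025 publication date, and the URL are hypothetical. The guard against a modified date earlier than the published date is my own addition.

```python
import json
from datetime import date


def article_jsonld(headline: str, url: str, published: date, modified: date) -> dict:
    """Build schema.org Article markup with ISO 8601 dates."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


if __name__ == "__main__":
    markup = article_jsonld(
        "GEO in 2026: Getting Cited by AI Answer Engines",
        "https://example.com/geo-2026",  # hypothetical URL
        published=date(2025, 6, 22),     # hypothetical original date
        modified=date(2026, 6, 22),
    )
    # Paste the output inside <script type="application/ld+json"> in the page head.
    print(json.dumps(markup, indent=2))
```

Only bump `modified` when the page content has actually changed; a date that moves while the text stays the same is the opposite of a trust signal.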

Brand Mentions, Not Just Backlinks

Generative engines do not have a clean ranking the way Google does, so they lean on a fuzzier signal: how often, and how authoritatively, your brand is mentioned across the web. Guest posts, genuine community participation, being referenced by others — these build the association the model draws on when deciding who is a credible source on a topic. It is closer to old-fashioned reputation than to link-building. Slow, unglamorous, and durable.

Measure Share of Voice Across Engines

Here is the part most people skip: you cannot improve what you do not measure, and GEO measurement looks different from SEO measurement. There is no single rank to track. Instead, build a habit of testing your commercially important queries against the major engines (ChatGPT, Perplexity, Claude, Gemini) and tracking your share of voice: how often you are cited, on which queries, on which engines. Users are forming platform loyalty, sticking to a preferred engine, so being cited on Perplexity but invisible on ChatGPT is a real gap, not a rounding error.

A workable starter stack:

Check analytics for referrals from perplexity.ai, chatgpt.com, and similar — your ground truth that citation is happening.
Manually test your top queries across all four engines, weekly, and log who gets cited.
Watch recency and structured data on the pages you most want cited, and refresh them on a schedule.
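
The weekly log from the second step turns into a share-of-voice number with a few lines of code. This is a sketch under my own assumptions: each log entry is an (engine, query, cited domains) tuple recorded by hand, and share of voice is simply the fraction of tested queries on which your domain appeared. The sample data and domains are invented for illustration.

```python
from collections import defaultdict


def share_of_voice(log, domain: str) -> dict[str, float]:
    """Fraction of tested queries per engine on which `domain` was cited.

    log: iterable of (engine, query, cited_domains) tuples.
    """
    tested = defaultdict(int)
    cited = defaultdict(int)
    for engine, _query, cited_domains in log:
        tested[engine] += 1
        if domain in cited_domains:
            cited[engine] += 1
    return {engine: cited[engine] / tested[engine] for engine in tested}


if __name__ == "__main__":
    # Hypothetical manual test log for one week.
    log = [
        ("Perplexity", "uuid v4 random bits", {"example.com", "other.org"}),
        ("Perplexity", "og image sizes", {"example.com"}),
        ("ChatGPT", "uuid v4 random bits", {"other.org"}),
        ("ChatGPT", "og image sizes", {"example.com", "third.net"}),
    ]
    for engine, share in share_of_voice(log, "example.com").items():
        print(f"{engine}: cited on {share:.0%} of tested queries")
```

Keep the raw log as well as the summary. The per-query rows are what tell you which page to refresh when an engine stops citing you.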

Competing for the Answer

GEO is not a reinvention of SEO — it is a new distribution channel with its own rules, sitting on the same technical foundation. Let the AI crawlers in deliberately, publish an llms.txt while it is still cheap to be early, structure content into self-contained, fact-dense chunks that lead with the answer, keep your important pages genuinely fresh, and measure your citation share across engines rather than chasing a single rank. Do that on top of solid fundamentals and you stop competing only for clicks and start competing for the answer itself — which, increasingly, is where the audience already is.