How AI Engines Decide Who Gets Cited

"Citability" is the buzzword of the moment in GEO (generative engine optimisation) – and the one most coated in hype. This article cuts through it: what actually decides whether ChatGPT, Claude or Perplexity cite you as a source? We walk through the levers that current data supports – and say plainly which widespread assumptions the data does not support.

📑 Contents

Citability is not ranking
The levers the data supports
An honest myth check
What you can influence – and what you can't
FAQ

Citability is not ranking

The biggest misconception is assuming AI engines pick their sources the way Google picks its top 10. They don't. Systems like Perplexity or ChatGPT's search break a question apart, pull individual passages from many sources, score them for relevance and assemble an answer with citations. What gets judged is not the page as a link position, but the passage as an answer component.
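To make the passage-level view concrete, here is a toy sketch. It is not how any of these engines is implemented: the scoring function (plain word overlap), the sample pages and every name in it are assumptions for illustration. It only shows the shape of the process: split pages into passages, score each passage against the question, and pick the best passages regardless of which page they sit on.

```python
import re

def tokenize(text):
    """Lowercase a text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(question, passage):
    """Toy relevance score: share of question words found in the passage."""
    q = tokenize(question)
    return len(q & tokenize(passage)) / len(q) if q else 0.0

def top_passages(question, pages, k=2):
    """Rank individual passages (not whole pages) and return the k best."""
    candidates = []
    for url, text in pages.items():
        # Treat each blank-line-separated paragraph as one passage.
        for passage in text.split("\n\n"):
            candidates.append((score(question, passage), url, passage))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:k]

# Hypothetical sample data: two pages, two passages each.
pages = {
    "https://example.com/a": (
        "Our company history since 1999.\n\n"
        "robots.txt controls which crawlers may fetch a page."
    ),
    "https://example.org/b": (
        "A long intro about nothing.\n\n"
        "To block a crawler, add a Disallow rule to robots.txt."
    ),
}
results = top_passages("how does robots.txt block a crawler", pages)
for s, url, passage in results:
    print(f"{s:.2f}  {url}  {passage}")
```

In this toy run the second paragraph of each page wins, while both opening paragraphs score near zero. The unit that competes is the passage, not the page.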

How far ranking and citation have drifted apart shows in Ahrefs data: the overlap between the sources cited in Google AI Overviews and the organic top 10 fell within seven months from around 76 percent (mid-2025) to about 38 percent (early 2026). Roughly three in five AI citations now come from pages a user would never see on Google's first page. Classic ranking still matters – but it is no longer synonymous with being citable.
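The overlap figure is simple set arithmetic, and you can reproduce it for your own queries. The sketch below assumes you have already collected the URLs cited in an AI answer and the organic top 10 for the same query. The function name and the sample URLs are hypothetical, and Ahrefs' exact methodology may differ.

```python
def citation_overlap(cited_urls, top10_urls):
    """Share of cited URLs that also appear in the organic top 10."""
    cited = set(cited_urls)
    if not cited:
        return 0.0
    return len(cited & set(top10_urls)) / len(cited)

# Hypothetical data for one query.
cited = [
    "https://a.example/1",
    "https://b.example/2",
    "https://c.example/3",
    "https://d.example/4",
    "https://e.example/5",
]
top10 = ["https://a.example/1", "https://b.example/2", "https://x.example/9"]

overlap = citation_overlap(cited, top10)
print(f"in top 10: {overlap:.0%}, outside top 10: {1 - overlap:.0%}")
```

Averaged over many queries, a falling overlap means a growing share of citations goes to pages outside the first results page.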

The levers the data supports

In May 2026, Cyrus Shepard (Zyppy) was the first to condense 54 experiments, patents and case studies into an evidence-weighted ranking of 23 factors – scored by repeatability and strength of evidence, not opinion. Combined with the large correlation studies from Ahrefs and SE Ranking, a surprisingly clear order emerges:

1. Technical accessibility. The single best-evidenced factor. What an AI crawler can't cleanly retrieve, it can't cite – reachable pages, a correct robots.txt without accidental AI blocks, clean HTML. Unspectacular, but the entry ticket.

2. Classic ranking as a foundation. Necessary, not sufficient (see above). Pages that are technically and editorially strong enough to rank well get pulled into the candidate pool more often – the citation itself is then decided by other criteria.

3. Brand mentions beyond your own domain. Perhaps the sharpest break with classic SEO: web-wide mentions of your brand correlate roughly three times more strongly with AI visibility than backlinks. AI models learn from textual context and co-occurrence, not from the link graph. This is the part your website can't deliver on its own.

4. Topical depth. Engines effectively form an authority score per subject area. A connected topic cluster beats scattered single pages – the linked cluster article explains why and how.

5. Extractability. Here it becomes measurable. Around 44 percent of all LLM citations come from the first third of a text, so the core answer belongs up top, not at the end. In SE Ranking's analysis of 129,000 domains, pages with expert quotes averaged 4.1 ChatGPT citations versus 2.4 for pages without them, and pages with 19 or more data points averaged 5.4 versus 2.8 for pages with fewer. Clear H2 structure, self-contained paragraphs, tables and FAQ schema make passages easier to lift out.
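FAQ schema is the most mechanical item on this list. Here is a minimal sketch, assuming you keep questions and answers as plain pairs. The helper name `faq_jsonld` is hypothetical, while the `FAQPage`, `Question` and `Answer` types and the `mainEntity`, `name`, `acceptedAnswer` and `text` properties are schema.org vocabulary.

```python
import json

def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

faq = faq_jsonld([
    (
        "Does a good Google ranking mean AI will cite me?",
        "No. Good rankings remain a foundation, but they do not guarantee a citation.",
    ),
])

# Serialise for embedding in a <script type="application/ld+json"> element.
snippet = json.dumps(faq, indent=2, ensure_ascii=False)
print(snippet)
```

The questions and answers in the markup should match the FAQ text that is visible on the page.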

An honest myth check

Just as important as the levers is what does not work – because half-truths burn budget:

⚖️ Three widespread assumptions the data qualifies:

• "Fresh content is preferred." Recency helps with live-search engines, but in the evidence-weighted ranking it lands much lower than often claimed – a much-copied freshness statistic does not hold up under scrutiny.

• "llms.txt makes me citable." Not a direct factor. The file improves agent readiness (Lighthouse audits it experimentally) but does not directly affect citation frequency per current analyses. Cheap preparation – not a lever.

• "AI is AI." Wrong. In an analysis of 680 million citations, only 11 percent of domains overlapped between ChatGPT and Perplexity; brand citation rates differ by a wide margin. A source ChatGPT loves can be invisible on Perplexity.

What you can influence – and what you can't

Sort the levers by how much you control, and citability splits in two. The on-site part – accessibility, structure, extractability, schema, topical depth – you can steer directly. The first step per lever is simple: don't block AI crawlers in robots.txt; put each page's core answer in the opening sentences; include at least one solid data point or quote per article; add FAQ schema to your most important pages.
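The robots.txt step can be checked in a few lines. This is a minimal sketch using Python's standard `urllib.robotparser`. The crawler list and the sample file are assumptions: `GPTBot`, `ClaudeBot` and `PerplexityBot` are user-agent tokens the vendors have published, but verify the current names in each vendor's crawler documentation. The helper name `blocked_ai_crawlers` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed list of AI crawler tokens; check vendor docs for current names.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

def blocked_ai_crawlers(robots_txt, url):
    """Return the AI crawlers that robots_txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]

# Hypothetical robots.txt with an accidental block for one AI crawler.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

blocked = blocked_ai_crawlers(sample, "https://example.com/blog/post")
print(blocked)
```

To check a live site, the same parser can load the file itself via `set_url()` and `read()`. Note that this only tests robots.txt rules and not blocks at the firewall or CDN level.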

The off-site part, on the other hand – whether your brand shows up as an authority across the web, in what tone, with what consistency – is partly decided by others. It can be influenced, but not with a switch on your own site. That is exactly where the limit of a free self-check lies:

🔭 The depth an on-site audit can't deliver: Which individual passages are citable, how often and in what sentiment your brand appears beyond your domain, and how per-page E-E-A-T stands – that requires analysis beyond your own website.

Plain talk: Nobody can guarantee you a citation in ChatGPT – anyone who promises that is selling hope. What the data does give you is a sober priority list: be accessible, rank solidly, build your brand across the web, go deep on a topic, structure cleanly – in that order. The on-site part you can start today. The rest you earn; you don't buy it.

Frequently asked questions

Does a good Google ranking mean AI will cite me?

No. The overlap between Google's top 10 and the sources in AI answers fell from around 76 to about 38 percent in 2025/26 (Ahrefs). Good rankings remain a foundation, but they do not guarantee a citation.

What drives AI citations the most?

Per an evidence-weighted meta-analysis of 54 experiments, patents and case studies (Zyppy, May 2026), in this order: technical accessibility, classic ranking, brand mentions across the web, topical depth and extractable structure. Brand mentions correlate roughly three times more strongly with visibility than backlinks.

Does an llms.txt file make my content citable?

No, not a direct factor. llms.txt improves agent readiness (Lighthouse audits it experimentally) but does not directly affect citation frequency per current data. Worthwhile as cheap preparation, not as a lever.

Why am I cited by ChatGPT but not Perplexity?

Because the engines select differently. In an analysis of 680 million citations, only 11 percent of domains overlapped between ChatGPT and Perplexity; brand citation rates differ by a wide margin. There is no single AI target.

Read on in the same cluster

🕸 Topic Clusters for AI Search – connect your content right →

How topical depth is built that AI engines read as authority.

🔍 GEO vs. SEO – how optimising for AI differs →

Why GEO complements classic SEO rather than replacing it.