A single, isolated page rarely gets cited as a source by AI search engines. What actually counts is topical authority: a connected cluster of one pillar page and detail articles that visibly belong together – for humans and machines alike. This article shows how to build that the right way – and where the honest limits are.
The mistake: AI doesn't read your site page by page
Classic SEO thinks in single pages ranking for single keywords. AI search systems work differently. Engines like ChatGPT, Claude or Perplexity break a question apart, pull relevant passages from many sources and assemble an answer with citations. In doing so they don't just judge whether one page fits the topic – they judge whether the source as a whole stands for that topic.
A well-written but isolated page is at a structural disadvantage here: the signal of real thematic depth is missing. A connected cluster sends exactly that signal – to people, to classic crawlers and to AI engines alike.
Why clusters make a difference to AI engines
This isn't a marketing claim; it follows from how the systems work. Several independent analyses of Perplexity's source selection reach a striking conclusion: in its ranking logic, topical authority weighs more heavily than raw domain rating. A smaller site that covers one topic consistently and deeply can outrank a large but thematically scattered one.
The rule of thumb: depth beats breadth. Twenty connected articles on one topic build more authority than two articles on each of ten topics. AI engines effectively form an authority score per subject area – and a site that writes "a little about everything" becomes a citable source for nothing.
⚖️ Staying honest: Structure is necessary but not sufficient. A clean cluster improves your starting position – it is not a citation guarantee. Perplexity, for instance, also favours earned media and tends to discount your own site as "promotional"; ChatGPT, for many queries, runs no live search at all. What you do on your own site creates the precondition for being cited. Anyone promising more is selling hope.
Step 1 – Define the pillar topic and map the cluster
Before you touch any tool: pick one core topic you want to be the obvious source for. Not five. One. Beneath it, group the concrete questions your audience asks around that topic – each becomes a cluster article.
Step 2 – Connect it machine-readably: internal links and Schema.org
A cluster isn't created by articles happening to sit on the same domain, but by recognisable connections, on two levels. First, internal links: the pillar page links to every cluster article, and each article links back to the pillar and to 2–3 thematically close siblings, with descriptive anchor text instead of "click here". Second, Schema.org / JSON-LD: make the relationship explicit for machines too, e.g. Article with isPartOf, mentions and about. Structured data is no magic bullet, but it's well documented: cleanly marked-up pages appear measurably more often among the top picks in AI answers – simply because they're easier to extract.
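To make the Schema.org part concrete, here is a minimal Python sketch that builds Article markup with isPartOf, about and mentions and wraps it in a JSON-LD script tag. The function names, the example.com URLs and the choice of WebPage and Thing as nested types are assumptions made for illustration; adapt them to your own pages and validate the result before publishing.

```python
import json


def article_jsonld(headline, url, pillar_url, pillar_name, about, mentions):
    """Build Schema.org Article markup that ties a cluster article to its pillar."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        # isPartOf points at the pillar page this article belongs to.
        "isPartOf": {"@type": "WebPage", "url": pillar_url, "name": pillar_name},
        # about names the main subject; mentions lists related entities.
        "about": {"@type": "Thing", "name": about},
        "mentions": [{"@type": "Thing", "name": m} for m in mentions],
    }


def as_script_tag(data):
    """Wrap the markup in the script element that goes into the page's HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical example values, not taken from a real site.
markup = article_jsonld(
    headline="Does an llms.txt file make my content citable?",
    url="https://example.com/geo/llms-txt",
    pillar_url="https://example.com/geo/",
    pillar_name="Generative engine optimisation",
    about="llms.txt",
    mentions=["robots.txt", "sitemap.xml"],
)
print(as_script_tag(markup))
```

Generating the markup from one function per article type keeps the pillar URL identical across the whole cluster, which is the point of isPartOf.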
A second, often underrated lever: answer each article's core question in the first few sentences. Analyses of AI source selection show that the large majority of cited passages come from the upper part of a page.
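The internal-linking rule from this step (pillar links to every article, every article links back to the pillar and to a few siblings) can be checked mechanically. The following is a rough sketch under stated assumptions: the link map is typed in by hand rather than crawled, the paths are hypothetical, and only the lower bound of two siblings is enforced. Anchor text quality is not checked.

```python
def check_cluster_links(pillar, links, min_siblings=2):
    """Return a list of problems; an empty list means the linking rule holds.

    `links` maps each page URL to the set of internal URLs it links to.
    Every key other than `pillar` is treated as a cluster article.
    """
    articles = set(links) - {pillar}
    problems = []

    # The pillar page must link to every cluster article.
    for missing in sorted(articles - links.get(pillar, set())):
        problems.append(f"pillar does not link to {missing}")

    for article in sorted(articles):
        targets = links[article]
        # Each article must link back to the pillar ...
        if pillar not in targets:
            problems.append(f"{article} does not link back to the pillar")
        # ... and to at least `min_siblings` other cluster articles.
        siblings = (targets & articles) - {article}
        if len(siblings) < min_siblings:
            problems.append(
                f"{article} links to {len(siblings)} sibling(s), "
                f"expected at least {min_siblings}"
            )
    return problems


# Hypothetical cluster: the last article links to only one sibling.
cluster = {
    "/geo/": {"/geo/llms-txt", "/geo/schema", "/geo/geo-vs-seo"},
    "/geo/llms-txt": {"/geo/", "/geo/schema", "/geo/geo-vs-seo"},
    "/geo/schema": {"/geo/", "/geo/llms-txt", "/geo/geo-vs-seo"},
    "/geo/geo-vs-seo": {"/geo/", "/geo/schema"},
}
for problem in check_cluster_links("/geo/", cluster):
    print(problem)
```

For the sample data this reports a single problem: /geo/geo-vs-seo links to one sibling where at least two are expected.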
🔧 Generate Schema markup
Validated JSON-LD for your cluster articles – free.
Open the schema tools →
Step 3 – Make the cluster discoverable for AI
For an engine to classify your cluster, it has to find it and understand its structure. Solid basics first: reachable pages, a correct robots.txt (no accidental blocking of AI crawlers), an up-to-date sitemap.xml. The optional extra is an llms.txt / llms-full.txt – a machine-readable content index pointing AI agents straight to your cluster's key pages. And here we stay honest:
⚖️ What llms.txt really does today – and what it doesn't: Not an official citation or ranking factor. None of the major engines (OpenAI, Google, Anthropic, Meta) has confirmed using llms.txt in production for source selection. An analysis of around 300,000 domains found no statistically significant correlation between having an llms.txt and how often a site is cited. Still worth doing: as a routing layer for AI agents it's already established, the effort is minimal and the risk zero – cheap future-proofing, not a miracle.
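Both basics from this step can be sketched in a few lines of Python. The first function uses the standard library's robots.txt parser to list which AI crawlers a given robots.txt would block; the second renders a simple llms.txt index. The crawler names, the example.com URLs and the exact llms.txt layout (H1 title, blockquote summary, H2 sections with link lists, as in the public llms.txt proposal) are assumptions here, so check each vendor's documentation and the current proposal before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler names; verify the current list with each vendor.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt, url, agents=AI_CRAWLERS):
    """Return the agents that the given robots.txt text forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


def render_llms_txt(title, summary, sections):
    """Render a Markdown index: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}"]
    for heading, pages in sections.items():
        lines += ["", f"## {heading}", ""]
        lines += [f"- [{name}]({url}): {note}" for name, url, note in pages]
    return "\n".join(lines) + "\n"


# Hypothetical robots.txt that blocks one AI crawler by accident.
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_ai_crawlers(robots, "https://example.com/geo/"))

# Hypothetical cluster index.
print(render_llms_txt(
    "Example Site",
    "Articles on how AI search engines pick sources.",
    {"GEO cluster": [
        ("Pillar: topic clusters", "https://example.com/geo/", "overview"),
        ("llms.txt", "https://example.com/geo/llms-txt", "what it does today"),
    ]},
))
```

With the sample robots.txt the first call reports GPTBot as blocked, which is exactly the kind of accidental exclusion worth catching before you invest in the optional llms.txt.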
📄 Generate llms.txt for your cluster
Free generator – a clean content index in minutes.
Open the generator →
Measure where you actually stand
Gut feeling is not a metric. Before you rebuild, get a sober snapshot: technical basis, structured data, platform readiness and the discoverability of your AI files – as a score with a grade and one concrete first to-do per weak point.
📊 GEO Audit – free
Four categories, a clear score, a first action per weak point.
Run the free GEO check →
Plain talk: Topic clusters aren't a trick for outsmarting AI search engines. They're the honest, well-evidenced answer to how these systems recognise authority: through thematic depth, clear structure and machine-readable connections. Build a cluster, answer questions high up the page, mark everything up cleanly – and treat llms.txt as cheap preparation, not a promise. The part your website can't do on its own, nobody can sell you as a one-click fix. That's the uncomfortable but fair truth.
Frequently asked questions
Do connected topic clusters get cited more by AI?
Clusters are no guarantee. But analyses of source selection – at Perplexity, for instance – show that topical authority weighs more than raw domain rating. A consistently connected cluster improves your chances of being cited as a source.
Does an llms.txt file make my content citable?
No. No major AI engine officially uses llms.txt as a citation factor; an analysis of around 300,000 domains found no significant correlation. It's still worth doing – as a routing layer for AI agents and as cheap future-proofing.
What is a pillar topic?
The one core topic you want to be the obvious source for. Beneath it you group your audience's concrete questions – each becomes a cluster article that links internally and via Schema.org back to the pillar page.
Is on-site structure enough to get cited?
No. Structure is necessary but not sufficient. AI engines also weigh brand mentions beyond your domain. On-site work creates the precondition; what happens off your site decides the rest.
Read on in the same cluster
GEO vs. SEO – how optimising for AI differs →
By the way: this article is itself a cluster node – the links above aren't decoration, they demonstrate what the text is about.