llms.txt is a file worth knowing about. Whether it belongs on your website is a shorter conversation than most guides suggest.
This post explains what llms.txt is, what AI systems actually do with it, and how to create one. It also covers what llms.txt does not do. The most common mistake is implementing it instead of the higher-leverage AI crawlability work that actually moves citations.
Key Takeaways
- → llms.txt is a curated Markdown file at /llms.txt that helps AI systems find your most important pages.
- → Perplexity and Claude (Anthropic) are confirmed to use it. ChatGPT has no official support. Google says it is not needed for Search or AI Overviews.
- → Search Engine Land's 90-day study found no measurable AI traffic change in 8 of 10 sites after implementation.
- → Worth implementing — the cost is near zero — but it does not fix Bing indexing, robots.txt gaps, or content extractability problems.
- → The bigger AI visibility question is Access, Understanding, and Authority. llms.txt touches only the Access layer.
What is llms.txt?
llms.txt is a plain Markdown file hosted at the root of your website, at the path /llms.txt, that provides a curated, structured summary of your site's most important pages for large language models.
The standard was proposed by Jeremy Howard of Answer.AI in September 2024. The idea: instead of asking AI systems to parse full HTML pages with navigation menus, scripts, ads, and noise, give them a clean Markdown list of the pages that actually matter.
The file typically includes the site name as an H1, a short description of what the site covers and who it is for, and a series of sections linking to important pages — service descriptions, methodology pages, key guides, documentation.
It is not a W3C or IETF standard. It is a community convention that has gained adoption, particularly among documentation platforms and developer tools, since Mintlify rolled it out across thousands of docs sites in late 2024. The 2026 State of llms.txt report from Presenc AI tracked adoption across tens of thousands of sites, with developer tools and SaaS documentation leading implementation.
How is llms.txt different from robots.txt and sitemap.xml?
The three crawler-guidance files are often confused. They serve different purposes for different audiences.
| File | Purpose | Primary audience |
|---|---|---|
| robots.txt | Access control — tells crawlers what not to fetch | All web crawlers |
| sitemap.xml | Full page inventory — helps search engines find all indexable pages | Search engine crawlers |
| llms.txt | Curated AI-readable summary — highlights the most important pages | AI systems and LLMs |
robots.txt restricts access. sitemap.xml maps everything. llms.txt curates a shortlist.
Think of it this way: sitemap.xml hands search engines a complete index. llms.txt hands AI systems a recommended reading list.
What is llms-full.txt?
Some sites also publish an llms-full.txt file at /llms-full.txt. This extended variant contains the full machine-readable content of the site: not just links, but the actual text of key pages in Markdown format. It is larger, more complete, and more useful for agentic AI systems (IDE agents, MCP servers) that need to read the content rather than just navigate to it.
Do AI crawlers actually read llms.txt?
Some do. Not in the way most guides imply, though.
Confirmed use cases
- → Perplexity retrieves llms.txt and uses it to prioritize which pages to read.
- → Microsoft and OpenAI crawlers (OAI-SearchBot, GPTBot) actively fetch llms.txt files, primarily for agentic tooling — IDE agents like Cursor, MCP servers, and developer tools that navigate documentation.
- → Anthropic has publicly confirmed that Claude Desktop and Claude.ai both respect llms.txt directives in retrieval workflows, according to the Presenc AI State of llms.txt 2026 report.
Not a Google signal
Google does not use llms.txt for Search or AI Overviews. Google's official May 2026 AI optimization guide explicitly named the file as unnecessary for its systems. Google does include llms.txt in its Agent2Agent (A2A) protocol for agentic interoperability — but that is separate from search ranking or AI Overview citation. Ahrefs covers this distinction in their 2026 breakdown.
The empirical picture
Search Engine Land tracked 10 sites across five industries for 90 days before and after implementing llms.txt. Eight of the ten sites saw no measurable change in AI traffic. The two that did show improvement had simultaneous changes that could explain the result.
The honest verdict: implement llms.txt. The cost is near zero, and the upside (Perplexity prioritization, agentic tooling) is real. Do not implement it expecting a jump in ChatGPT or Google AI Overview citations. That gap requires different work. See how to check if your brand is cited in ChatGPT for the diagnostic starting point.
How to create an llms.txt file
Creating an llms.txt file takes 15 to 30 minutes for most sites. The format is Markdown.
The file must live at exactly /llms.txt, at the root of the domain and not in a subdirectory.
- → What to include: an H1 with your site or brand name, a short description of what you do and who you serve, then sections with links to your most important pages.
- → Link to: service and product pages, methodology or how-it-works pages, key guides, About or contact pages.
- → Leave out: every post you have ever published, dynamic or gated pages, boilerplate or low-value pages.
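The include list above maps onto a small generator. This is a minimal sketch, not a reference implementation: the `Section` class, the `build_llms_txt` function, and the example.com site are all hypothetical names made up for illustration, and writing each entry as a Markdown link is an assumption about formatting (plain "Title: URL" lines are also seen in live files).

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One H2 section of the file: a heading plus (title, url) links."""
    heading: str
    links: list[tuple[str, str]] = field(default_factory=list)


def build_llms_txt(site_name: str, description: str, sections: list[Section]) -> str:
    """Render an llms.txt body: H1, short description, then linked sections."""
    lines = [f"# {site_name}", "", description.strip(), ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        # Assumption: each entry is written as a Markdown link.
        lines.extend(f"- [{title}]({url})" for title, url in section.links)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site, used only for illustration.
example = build_llms_txt(
    "Example Co",
    "Example Co sells widgets to small manufacturers.",
    [
        Section("Key Pages", [
            ("Home", "https://example.com/"),
            ("Pricing", "https://example.com/pricing/"),
        ]),
        Section("Guides", [("Getting started", "https://example.com/guides/start/")]),
    ],
)
print(example)
```

Keeping the page list in code (or in a small config file) makes the curation step explicit: a page only appears in the file if someone deliberately added it.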
To verify: visit yourdomain.com/llms.txt in a browser. It should render as plain text.
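The browser check can also be scripted. The sketch below is one way to do it under stated assumptions: `check_llms_txt` and `fetch_and_check` are hypothetical helper names, and the specific checks (not served as HTML, first non-empty line is an H1, at least one absolute link) are a reasonable reading of the guidance above rather than an official validator.

```python
import urllib.request


def check_llms_txt(body: str, content_type: str) -> list[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    if "html" in content_type.lower():
        problems.append("served as HTML, expected plain text or Markdown")
    non_empty = [line for line in body.splitlines() if line.strip()]
    if not non_empty:
        problems.append("file is empty")
    elif not non_empty[0].startswith("# "):
        # Assumption: the file opens with the site name as an H1.
        problems.append("first non-empty line is not an H1")
    if "http://" not in body and "https://" not in body:
        problems.append("no absolute links found")
    return problems


def fetch_and_check(domain: str) -> list[str]:
    """Fetch https://<domain>/llms.txt and run the checks above (needs network access)."""
    url = f"https://{domain}/llms.txt"
    # urlopen raises urllib.error.HTTPError if the file is missing (for example a 404).
    with urllib.request.urlopen(url, timeout=10) as response:
        content_type = response.headers.get("Content-Type", "")
        body = response.read().decode("utf-8", errors="replace")
    return check_llms_txt(body, content_type)
```

Calling `fetch_and_check("yourdomain.com")` returns an empty list when the file passes. A result that mentions HTML usually means the path is falling through to a 404 page or a site template instead of serving the file itself.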
Example llms.txt structure
This is an excerpt from uygen.com/llms.txt — a real, live file you can open in your browser.
    # Uygen — AI Visibility Audit
    # https://uygen.com

    Uygen helps brands understand why they are missing from AI answers and what to fix first.
    The primary offer is an AI Visibility Audit that checks whether AI systems can access, understand, and trust the brand across its site and the wider web.

    ## Primary Offer

    AI Visibility Audit:
    - checks AI crawler access and robots directives
    - checks Bing indexing and sitemap health
    - reviews page extractability and entity clarity
    - tests prompt visibility across AI search systems
    - maps the citation pool and competitor gap
    - delivers top blockers, top 3 priorities, and a 90-day roadmap

    ## Audit Framework
    - Access
    - Understanding
    - Authority

    ## Key Pages
    - Home: https://uygen.com/
    - AI Visibility Audit: https://uygen.com/ai-visibility-audit/
    - Methodology: https://uygen.com/methodology/
    - Sample Audit: https://uygen.com/sample-audit/
    - What an AI Visibility Audit Includes: https://uygen.com/blog/what-an-ai-visibility-audit-includes/
    - How to Track AI Search Visibility: https://uygen.com/blog/how-to-track-ai-search-visibility/

    ## Contact
    hi@uygen.com
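To see roughly what a tool gets out of a file like this, here is a small sketch that groups links by section. It is an illustration only: `parse_links` and the `LINK_ITEM` pattern are hypothetical, real AI systems may read the file quite differently, and the parser assumes entries are list items in either "Title: URL" form (as in the excerpt) or Markdown link form.

```python
import re

# Matches "- [Title](https://...)" and "- Title: https://..." list items.
LINK_ITEM = re.compile(
    r"^-\s+(?:\[(?P<md_title>[^\]]+)\]\((?P<md_url>https?://[^)\s]+)\)"
    r"|(?P<title>[^:]+):\s+(?P<url>https?://\S+))"
)


def parse_links(text: str) -> dict[str, list[tuple[str, str]]]:
    """Group (title, url) pairs by the H2 section they appear under."""
    sections: dict[str, list[tuple[str, str]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_ITEM.match(line)
            if match:
                title = match.group("md_title") or match.group("title")
                url = match.group("md_url") or match.group("url")
                sections[current].append((title.strip(), url))
    return sections


# A few lines taken from the excerpt above.
sample = """\
## Audit Framework
- Access

## Key Pages
- Home: https://uygen.com/
- Methodology: https://uygen.com/methodology/
"""
links = parse_links(sample)
print(links["Key Pages"])
```

Bullets without a URL, such as the Audit Framework entries, are kept as empty sections rather than links, which is a quick way to spot sections that give a reader nothing to follow.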
Does my website need an llms.txt file?
Implement it now if:
You run a documentation site, developer tool, or API product. This is where llms.txt has the clearest confirmed value — IDE agents and MCP servers actively read it.
Implement it, but do not prioritize over:
- → Bing indexing. ChatGPT and Perplexity depend on the Bing index for real-time web answers. If your site is not indexed in Bing, llms.txt cannot compensate. Bing indexing gaps are the #1 actionable AI access blocker for most sites. For a practical guide to fixing this, see how ChatGPT search ranking actually works.
- → robots.txt permissions. If OAI-SearchBot or PerplexityBot is blocked in your robots.txt, those crawlers cannot reach your site regardless of what llms.txt says.
- → Cloudflare and CDN bot protection. Approximately half of all websites silently block AI crawlers through default WAF or bot-fight rules — with no robots.txt entry and no visible error. If your site is behind Cloudflare, check your Security → Bots settings and verify AI crawlers are not being challenged or blocked before assuming llms.txt is the issue.
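The robots.txt check in the list above is easy to automate with Python's standard-library `urllib.robotparser`. This is a sketch under assumptions: `blocked_crawlers` is a hypothetical helper, the robots.txt content and example.com URL are invented for the demo, and the check covers robots.txt rules only, so a CDN or WAF block will not show up here.

```python
from urllib.robotparser import RobotFileParser

# Crawler names mentioned in this post.
AI_CRAWLERS = ("OAI-SearchBot", "GPTBot", "PerplexityBot")


def blocked_crawlers(
    robots_txt: str, url: str, agents: tuple[str, ...] = AI_CRAWLERS
) -> list[str]:
    """Return the user agents that robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that blocks one AI crawler and allows everything else.
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_crawlers(robots, "https://example.com/llms.txt"))
```

For the sample rules, the function reports GPTBot as blocked and the other two crawlers as allowed. To test a live site, read its robots.txt text first (for example with `urllib.request`) and pass that string in.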
The rule: llms.txt is 30 minutes of work with real upside. Fix your Bing indexing, crawler permissions, and CDN settings first if you have not already.
What else affects whether AI can find and cite your content?
llms.txt addresses one corner of the AI access question. The complete picture has three layers.
Access
Can AI crawlers reach and read your pages? Bing indexing, robots.txt permissions, page rendering, and redirect chains all factor in. Even brands with strong Google rankings appear in only 18% of AI answers on average, which means access problems are often invisible until you specifically check.
Understanding
Can AI systems extract a useful answer from your content? Pages need direct-answer openings, clear heading structure, and consistent entity descriptions to be cited rather than just retrieved.
Authority
Do third-party sources corroborate your brand? AI systems use off-site evidence — reviews, directories, comparison pages, media coverage — to decide whether a brand belongs in a response.
llms.txt is one tool in the Access layer. If you are not sure which layer is actually blocking your AI visibility, that is the question an AI Visibility Audit is built to answer.
Not sure if your site is AI-readable beyond llms.txt?
An AI Visibility Audit checks your Bing indexing, crawler permissions, content extractability, and entity clarity, then delivers a prioritized fix roadmap.
Book an AI Visibility Audit
Frequently asked questions
What is the difference between llms.txt and robots.txt?
robots.txt controls which crawlers can access which pages on your site. llms.txt does not restrict access — it curates a shortlist of your most important pages specifically for AI systems. Think of robots.txt as a fence and llms.txt as a guided tour.
Does llms.txt help with ChatGPT or Perplexity citations?
Perplexity retrieves llms.txt and uses it to prioritize pages. Anthropic has confirmed Claude Desktop and Claude.ai respect it in retrieval workflows. ChatGPT has no official support. Search Engine Land tracked 10 sites for 90 days and found no measurable AI traffic change in 8 of them after implementing it. Implement it for the marginal upside and near-zero cost, but do not treat it as a fix for AI citation gaps.
How do I check if AI can find my website?
Start with three checks: verify your site is indexed in Bing (not just Google), confirm OAI-SearchBot and PerplexityBot are not blocked in your robots.txt, and run a few buyer prompts in ChatGPT and Perplexity to see whether your brand or pages appear.
What is llms-full.txt?
llms-full.txt is an extended variant of llms.txt that contains the full machine-readable content of key pages in Markdown format, not just links. It is hosted at /llms-full.txt and is primarily useful for agentic AI tooling such as IDE agents and MCP servers that need to read the content, not just navigate to it.
llms.txt is worth implementing. It takes 30 minutes, costs nothing, and positions your site for the growing agentic web. Just do not mistake it for the solution to AI visibility.
That requires looking at Access, Understanding, and Authority, and knowing which one is actually blocking your brand. If you are not sure, that is exactly what the AI Visibility Audit is designed to answer.