LLMs.txt Best Practices: Setup, Implementation & Indexing

Gita D.
Last Updated: April 1, 2026

Your blog posts, product pages, and guides might rank on Google.

But when someone asks ChatGPT or Claude a question in your space, your content doesn't show up.

That's the problem llms.txt was built to solve.

So, in this guide, I cover how to create an llms.txt file, what to include in it, and how to give your content the best shot at being indexed by LLMs.

Key Takeaways:
  • LLMs.txt is a Markdown file at your site root (yoursite.com/llms.txt) that highlights your best content for AI models. Jeremy Howard (fast.ai co-founder) proposed it in September 2024.
  • SE Ranking's study of 300,000 domains found only 10.13% have adopted it — and zero correlation between having the file and getting cited by LLMs.
  • Search Engine Land tracked 10 sites for 90 days post-implementation. Eight saw no change. The two that grew had major content and PR pushes running simultaneously.
  • No major AI provider — not Google, OpenAI, or Anthropic — has confirmed they use llms.txt for indexing decisions.
  • It takes under 30 minutes to set up and carries zero technical risk. The sites seeing real AI traffic gains pair it with extraction-ready content, structured data, and external authority signals.

What is LLMs.txt?

LLMs.txt is a plain-text Markdown file that sits at your site root. Its job: tell AI models which pages on your site matter most.

You already have robots.txt for crawler access rules and sitemap.xml for listing every indexable URL. LLMs.txt does something different — it curates.

Instead of listing 5,000 URLs or setting allow/disallow directives, it highlights your top 20–50 pages with enough context for an AI to understand what each one covers. No HTML parsing required.

Jeremy Howard published the proposal on September 3, 2024. His reasoning: LLMs waste tokens on navigation menus, JavaScript, cookie banners, and ad scripts.

An llms.txt file hands them a clean summary with direct links to your best content instead.

By mid-2025, over 600 sites had adopted it — including Anthropic, Cloudflare, Stripe, Perplexity, Zapier, and Hugging Face.

How LLMs.txt Differs from Robots.txt and Sitemaps

| | robots.txt | sitemap.xml | llms.txt |
|---|---|---|---|
| Purpose | Controls crawler access | Lists all indexable URLs | Curates priority content for AI |
| Format | Plain text directives | XML | Markdown |
| Audience | Search engine bots | Search engine bots | LLM crawlers and AI agents |
| Contains | Allow/Disallow rules | URLs with metadata | Summaries + links to key pages |
| Scope | Every crawlable page | Every indexable page | 20–50 most important pages |
| Adoption | Near-universal | Near-universal | ~10% |

Robots.txt controls access. Sitemaps list everything. LLMs.txt curates.

Who Actually Reads Your LLMs.txt File?

As of early 2026, no major AI provider has confirmed they parse llms.txt for indexing or citations.

Google says AI Overviews and AI Mode rely on traditional SEO signals. Google briefly added llms.txt to several developer docs in December 2025, then pulled them within 24 hours. John Mueller confirmed it was a CMS rollout, not a strategic move.

OpenAI recommends allowing OAI-SearchBot in robots.txt. Some practitioners have seen GPTBot fetching llms.txt in server logs, but this hasn't been linked to citation outcomes.

Anthropic publishes its own llms.txt on anthropic.com — the only major AI company actively using the standard. But it hasn't confirmed its crawlers read the file for indexing decisions.

Nobody is confirming they use it. But some are clearly watching. So...

Should You Still Create One?

SE Ranking analyzed around 300,000 domains and found only 10.13% used an llms.txt file, with no link to AI citations.

Search Engine Land tracked 10 sites for 90 days and saw no impact. Growth came from PR and content efforts, not the file itself.

So why bother? Three reasons:

  1. It takes 30 minutes with zero risk. If the standard matures, you're already in place.
  2. The curation exercise has standalone value. Choosing your top pages and writing descriptions for each clarifies your content strategy.
  3. The standard is 18 months old. Robots.txt took years to gain universal crawler support after its 1994 proposal. LLMs.txt hasn't been tested at scale yet.

This file won't move the needle on AI traffic by itself.

But paired with extraction-ready content and external authority, it becomes one signal in a broader AI visibility strategy — which we'll build out next.

How to Create an LLMs.txt File

Start with the minimum viable file. You can expand later.

The exact structure the llmstxt.org specification requires:

```text
# Your Website Name

> One-sentence description of what your site does and who it serves.

## Core Content
- [Page Title](https://yoursite.com/page): What this page covers.
- [Another Page](https://yoursite.com/another): What this page covers.

## Optional
- [Secondary Resource](https://yoursite.com/resource): Supplementary context.
```

Let's break down each element.

Required Elements

H1 Title: Your site or company name. Single #. One line. Nothing else.

Blockquote Summary: Use > followed by 1–3 sentences. Think elevator pitch for machines — who you are, what you do, who you serve.

Skip the marketing language. "AI-powered SEO platform" works. "Revolutionary game-changing solution" doesn't.

H2 Sections: Group links by content type — Products, Guides, Blog, Documentation, FAQs. Each ## heading acts as a category label for AI models.

Links with Descriptions: Format as - [Title](URL): Description. That inline description gives the AI model context about a page without requiring it to crawl and parse the full HTML.
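If your page list already lives somewhere structured (a CMS export, a spreadsheet), you can assemble these required elements programmatically. Here's a minimal Python sketch; `build_llms_txt` is a hypothetical helper for illustration, not official tooling:

```python
# Hypothetical helper: assembles a spec-compliant llms.txt from page metadata.
def build_llms_txt(site_name, summary, sections):
    """sections: dict mapping H2 heading -> list of (title, url, description)."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        for title, url, desc in pages:
            lines.append(f"- [{title}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

content = build_llms_txt(
    "YourSaaS",
    "Project management platform for remote teams.",
    {"Core Content": [("Pricing", "https://yoursite.com/pricing",
                       "Plans and feature comparison.")]},
)
print(content)
```

Write the returned string to a file named llms.txt and deploy it to your site root.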

File Requirements

| Requirement | Specification |
|---|---|
| File name | llms.txt (lowercase, plural) |
| Location | Site root — yoursite.com/llms.txt |
| Format | Plain text, Markdown syntax |
| Encoding | UTF-8 |
| MIME type | text/plain or text/markdown |
| Number of pages | 20–50 |
| Max size | Under 100 KB (under 10 KB recommended) |
| Access | Public — no auth, no redirects |
| Protocol | HTTPS |

The Optional Section and llms-full.txt

The ## Optional heading has a specific purpose in the spec. It tells AI models: "You can skip everything below this line if you're running low on context."

Use it for secondary content — blog posts, case studies, or supplementary docs that add depth but aren't essential for understanding your core offering.

There's also a companion file called llms-full.txt. Where your llms.txt is a curated index, llms-full.txt contains the complete flattened Markdown text of your key pages in a single file.

Anthropic specifically requested Mintlify build this format for their documentation — they needed a cleaner way to feed entire docs into LLMs without HTML overhead. It was later adopted into the official llmstxt.org standard.

Most non-technical sites don't need llms-full.txt.

If you run developer documentation, API references, or a knowledge base with 100+ pages, it's worth exploring. For everyone else, the standard llms.txt file is enough.

If you want to skip the manual work, use our free LLMs.txt Generator to build yours in under two minutes.

LLMs.txt Example: How Stripe Structures its LLMs.txt

Templates are helpful. But seeing how a company with thousands of documentation pages actually structures their file is more useful.

Stripe publishes its llms.txt at docs.stripe.com/llms.txt.

It's one of the most detailed implementations out there — and a solid reference for how to organize a large site's content for AI consumption.

A condensed look at how Stripe's file is structured:

```text
# Stripe Documentation

## Docs
- [Testing](https://docs.stripe.com/testing.md): Simulate payments to test your integration.
- [API Reference](https://docs.stripe.com/api.md)
- [Receive payouts](https://docs.stripe.com/payouts.md): Set up your bank account to receive payouts.
- [Supported currencies](https://docs.stripe.com/currencies.md): See what currencies you can use.
- [Security at Stripe](https://docs.stripe.com/security.md): Learn how Stripe handles security.

## Payment Methods
- [Payment Methods API](https://docs.stripe.com/payments/payment-methods.md): Learn more about the API that powers global payment methods.
- [How cards work](https://docs.stripe.com/payments/cards/overview.md): Learn how an online card payment works.
- [Buy now, pay later](https://docs.stripe.com/payments/buy-now-pay-later.md): Learn about BNPL methods with Stripe.

## Checkout
- [Use a prebuilt Stripe-hosted payment page](https://docs.stripe.com/payments/checkout.md)
- [How Checkout works](https://docs.stripe.com/payments/checkout/how-checkout-works.md): Learn how to use Checkout to collect payments.
- [Customize Checkout](https://docs.stripe.com/payments/checkout/customization.md): Customize the appearance and behavior.
```

(Stripe's actual file is far longer — it covers dozens of product areas. This is a representative sample.)

What Makes this Work

Sections mirror how developers think. Stripe doesn't organize by internal team or department. The H2 headings — Docs, Payment Methods, Checkout, Payments — map to what a developer is trying to accomplish.

An AI agent answering "how do I accept Apple Pay with Stripe?" can jump straight to the right section.

Every link points to a .md version. Notice the URLs end in .md, not .html. Stripe serves Markdown versions of their documentation pages, following the llmstxt.org recommendation to provide clean, parseable content at the same URL with a .md extension.

This means an AI tool can fetch the linked page and get pure Markdown — no nav bars, no JavaScript, no cookie banners.

Descriptions are task-oriented. "Simulate payments to test your integration" tells an AI exactly what the page solves. Compare that to a vague label like "Testing Documentation" — which could mean anything.

Task-oriented descriptions help LLMs match the right page to the right user query.

The file is long — and that's intentional. Stripe's documentation covers payments, billing, subscriptions, Connect, Radar, Terminal, and more. Their llms.txt doesn't try to squeeze that into 10 links.

It runs hundreds of lines because their product surface is massive. The takeaway: your file length should match your content depth. A 5-page site doesn't need 200 links. A platform with 50+ product areas might.

How this Applies to Your Site

You don't need Stripe's scale to use the same principles. Here's what to borrow:

Organize by user intent, not internal structure: Your CMS might group content by "Blog," "Resources," and "Company."

But your llms.txt should group by what someone is trying to do — learn about your product, compare pricing, integrate with your API, get support.

Link to the cleanest version of each page: If you can serve .md versions of your key pages, do it.

If not, make sure the linked pages have clean HTML with clear heading hierarchies and minimal JavaScript clutter.

Write descriptions that answer "what does this page help me do?" Not "Our comprehensive guide to..." — just the task. "Compare pricing across plans." "Set up webhook notifications." "Troubleshoot payment declines."

Quick Templates for Other Site Types

Same principles, different industries:

SaaS / Software:

```text
# YourSaaS

> Project management platform for remote teams. Task tracking, reporting, and integrations.

## Product
- [Features](https://yoursite.com/features): Core platform capabilities and use cases.
- [Pricing](https://yoursite.com/pricing): Plans, pricing tiers, and feature comparison.
- [API Docs](https://yoursite.com/docs): Developer integration guides.

## Resources
- [Case Studies](https://yoursite.com/cases): Customer results across industries.
- [Blog](https://yoursite.com/blog): Product updates and workflow tips.
```

E-commerce:

```text
# YourStore

> Outdoor gear and equipment for hiking, climbing, and camping.

## Products
- [Hiking Gear](https://yourstore.com/hiking): Boots, packs, and trail accessories.
- [Climbing Equipment](https://yourstore.com/climbing): Ropes, harnesses, and protection.

## Support
- [Size Guide](https://yourstore.com/sizing): Product sizing across categories.
- [Returns](https://yourstore.com/returns): Return policy and exchange process.
```

Content Publisher:

```text
# YourPublication

> Technology news and analysis covering AI, cloud infrastructure, and developer tools.

## Coverage Areas
- [AI & Machine Learning](https://yoursite.com/ai): Analysis of AI industry trends.
- [Product Reviews](https://yoursite.com/reviews): In-depth evaluations of developer tools.

## About
- [Editorial Standards](https://yoursite.com/editorial): Our reporting methodology.
- [Team](https://yoursite.com/team): Writers and subject matter experts.
```

Notice the pattern across all three: factual descriptions, intent-based sections, and 5–10 links maximum. You can always expand — but start lean.

Run any llms.txt file through our free LLMs.txt Validator to check for structural errors and broken links before publishing.

How to Use LLMs.txt (Best Practices)

You've built the file. Now it needs to be live, accessible, and working.

1. Where to Upload

The process varies by platform, but the destination is always the same: your site root, accessible at yoursite.com/llms.txt.

WordPress: Upload via FTP or cPanel File Manager to your public_html/ directory.

Or use a plugin — Yoast and Rank Math both support auto-generating llms.txt files now.

Shopify: Go to Online Store → Themes → Edit Code → Assets folder → Add a new asset → name it llms.txt, paste your content, and save.

Static sites (Next.js, Hugo, Jekyll, Gatsby): Drop the file into your public/ or static/ directory. It'll deploy with your next build.

Custom CMS or app: Either create a /llms.txt route that returns plain text, or serve a static file from your public directory.
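For the custom-route option, here's a minimal sketch using Python's standard library. The handler class and file contents are illustrative, and a real app would plug this into its own framework's routing. It also sets the 24-hour Cache-Control header covered below:

```python
# Minimal stdlib sketch of a /llms.txt route; names and content are illustrative.
from http.server import BaseHTTPRequestHandler

LLMS_TXT = b"""# YourSite

> One-sentence description of what your site does and who it serves.

## Core Content
- [Pricing](https://yoursite.com/pricing): Plans and feature comparison.
"""

class LlmsTxtHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/llms.txt":
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Cache-Control", "public, max-age=86400")  # 24-hour cache
            self.end_headers()
            self.wfile.write(LLMS_TXT)
        else:
            self.send_error(404)
```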

2. Pre-Launch Checklist

Before you announce it, verify these:

| Check | What to look for |
|---|---|
| HTTP status | yoursite.com/llms.txt returns 200 (not 301, 404, or 403) |
| Content-Type | Header shows text/plain or text/markdown |
| Encoding | UTF-8 |
| Authentication | No login wall — publicly accessible |
| HTTPS | Served over a secure connection |
| H1 present | File starts with # Your Site Name |
| Blockquote present | Summary line using > directly after H1 |
| Links working | Every URL in the file returns 200 |

Quick terminal check: run curl -I https://yoursite.com/llms.txt and verify the status code and Content-Type header.
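If you'd rather script those checks, here's a Python sketch using only the standard library. The function names (`audit_llms_body`, `check_llms_txt`) are illustrative:

```python
# Pre-launch check sketch; the offline audit covers the structural rules,
# and check_llms_txt runs it against a live URL.
import urllib.request

def audit_llms_body(content_type, body):
    problems = []
    if not content_type.startswith(("text/plain", "text/markdown")):
        problems.append(f"Content-Type is {content_type!r}, expected text/plain or text/markdown")
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 title")
    elif len(lines) < 2 or not lines[1].startswith("> "):
        problems.append("missing blockquote summary after H1")
    return problems

def check_llms_txt(url):
    # urlopen raises on 4xx/5xx responses, which covers the HTTP-status check
    with urllib.request.urlopen(url) as resp:
        return audit_llms_body(resp.headers.get("Content-Type", ""),
                               resp.read().decode("utf-8"))

# check_llms_txt("https://yoursite.com/llms.txt")  # returns [] when clean
```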

3. Caching and Performance

Set your cache headers to 24 hours. AI crawlers don't need to re-fetch this file every request, and caching keeps your server load minimal.

Target a response time under 200ms. AI crawlers operate under tighter time budgets than traditional search engine bots — slow responses may get abandoned entirely.

How to Get Indexed by LLMs Through the LLMs.txt File

SE Ranking's study didn't just measure llms.txt adoption. It also identified what does correlate with AI citations. The findings paint a clear picture:

  • Sites with over 32K referring domains are 3.5x more likely to be cited by ChatGPT than those with fewer than 200.
  • Domains with active profiles on platforms like Trustpilot, G2, and Capterra have 3x higher citation rates.
  • Pages with a First Contentful Paint under 0.4 seconds average 6.7 citations vs. 2.1 for slower pages.

The takeaway: authority signals, third-party validation, and technical performance drive citations.

The llms.txt file supports these by making your best content easier to parse — but it can't replace them. So...

1. Structure Content for AI Extraction

LLMs don't read a page top-to-bottom like a human. They break content into semantic chunks and match the most relevant chunks to sub-queries.

So, your content needs to be written so each section can stand alone as a complete answer.

What that looks like in practice:

  • Lead every section with the key point. Don't build up to the answer — start with it. AI models extract the first few sentences of a section more readily than buried conclusions.
  • Use specific headings that mirror real questions. "How much does X cost?" beats "Pricing Information." LLMs match headings to user prompts.
  • Add comparison tables with concrete data. Pricing tiers, feature comparisons, specifications — anything with structured data points. Vercel credits this approach as part of the strategy that drove 10% of their new signups from ChatGPT.
  • Include FAQ sections with direct Q&A pairs. Each question-and-answer block is a self-contained chunk that maps naturally to how people prompt AI models.

2. Allow AI Crawlers in robots.txt

This step is separate from llms.txt — but skip it, and your llms.txt is useless.

So, make sure these user agents are allowed in your robots.txt:

```text
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /
```

If you've previously added blanket Disallow: / rules for these bots, your content is invisible to those AI systems regardless of what your llms.txt says.
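You can verify your current rules with Python's built-in robots.txt parser. A sketch, with `blocked_ai_bots` as an illustrative name; pass it the contents of your live robots.txt:

```python
# Check which AI crawlers your robots.txt rules currently block.
from urllib import robotparser

AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "OAI-SearchBot", "PerplexityBot"]

def blocked_ai_bots(robots_txt, test_path="/"):
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not rp.can_fetch(bot, test_path)]

# Example: blocked_ai_bots(open("robots.txt").read()) -> list of blocked bots
```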

3. Apply Schema Markup to Key Pages

Structured data (Article, FAQ, HowTo, Product schema) helps both search engines and AI crawlers understand content context.

It's not mandatory for LLM indexing. But it increases the odds your content gets correctly categorized and selected for AI-generated summaries.

At minimum, apply Article schema to your cornerstone content and FAQ schema to any page with a Q&A section.
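For reference, a minimal FAQPage JSON-LD block looks like this (the question text is illustrative; embed the block in a script tag of type application/ld+json on the relevant page):

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is llms.txt?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "A Markdown file at your site root that highlights your most important pages for AI models."
    }
  }]
}
```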

4. Monitor AI Crawler Activity

Check your server logs for visits from these user agents:

  • GPTBot (OpenAI)
  • ClaudeBot (Anthropic)
  • Google-Extended (Google AI training)
  • OAI-SearchBot (OpenAI search)
  • PerplexityBot (Perplexity)

If these bots are hitting your key URLs, your content is being ingested.

If they're not showing up at all, troubleshoot: check your robots.txt rules, verify your pages are publicly accessible, and confirm your server isn't rate-limiting or blocking unknown user agents.
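A quick way to run that check is to scan your access log for those user agent strings. A Python sketch, assuming a common or combined log format where the user agent appears verbatim in each line; the sample lines are illustrative, not real traffic:

```python
# Log-scan sketch: count AI crawler hits in an access log.
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "Google-Extended", "OAI-SearchBot", "PerplexityBot")

def count_ai_hits(log_lines):
    hits = Counter()
    for line in log_lines:
        for bot in AI_BOTS:
            if bot in line:
                hits[bot] += 1
    return hits

# Illustrative log lines, not real traffic
sample = [
    '1.2.3.4 - - [01/Apr/2026] "GET /llms.txt HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '5.6.7.8 - - [01/Apr/2026] "GET /pricing HTTP/1.1" 200 9001 "-" "Mozilla/5.0"',
]
```

Point it at your server's access log file (the path varies by host and web server) to see which bots visited in the current log window.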

So, the combined checklist — the file and the fundamentals, together:

| Category | Action |
|---|---|
| File | llms.txt live at site root with correct structure and all links returning 200 |
| Access | robots.txt allows GPTBot, ClaudeBot, Google-Extended, OAI-SearchBot, PerplexityBot |
| Content | Key pages use clear headings, comparison tables, and Q&A format |
| Schema | Article, FAQ, or HowTo markup applied to cornerstone pages |
| Authority | At least one external signal (backlink, press mention, third-party review) per key page |
| Performance | Pages load with FCP under 0.4 seconds |
| Freshness | Content updated within the last 6 months |
| Monitoring | Server logs checked monthly for AI crawler activity |

Common LLMs.txt Mistakes (And How to Avoid Them)

Most of these take under five minutes to fix. But they can make the difference between a functional file and one that gets ignored.

1. Technical Mistakes

| Mistake | Fix |
|---|---|
| File in a subdirectory (/seo/llms.txt) | Move to site root (/llms.txt) |
| Served as text/html | Set Content-Type to text/plain |
| Behind authentication or login | Make publicly accessible — no login wall |
| Returns 404 or redirect | Confirm file exists and returns HTTP 200 directly |
| Links in the file point to broken pages | Audit every URL before publishing |

2. Content Mistakes

| Mistake | Fix |
|---|---|
| Listing every page on the site | Curate 20–50 high-priority pages |
| Promotional language ("revolutionary platform!") | Factual descriptions ("Project management software with task tracking and reporting") |
| Missing blockquote summary | Add > summary directly after H1 |
| Outdated links or descriptions | Review quarterly |
| Including private or sensitive pages | Only list public, high-value content |

3. Strategy Mistakes

| Mistake | Fix |
|---|---|
| Expecting the file alone to drive AI traffic | Pair with content quality, structured data, and authority signals |
| Never updating after initial setup | Review when publishing cornerstone content or restructuring URLs |
| Blocking AI crawlers in robots.txt | Verify robots.txt allows AI user agents |
| No monitoring | Check server logs monthly for AI crawler visits |

LLMs.txt Maintenance and Update Schedule

The file is live. Now don't let it go stale — but there's no need to update it on a rigid calendar, either.

Only update when something changes:

  • New cornerstone content published
  • URLs restructured or pages consolidated
  • Product or pricing pages updated
  • Contact information changed
  • Old pages removed from the site

Review Cadence:

| Frequency | Action |
|---|---|
| Weekly | Verify file loads (HTTP 200, correct MIME type) |
| Monthly | Check linked URLs for 404s; review server logs for AI crawler visits |
| Quarterly | Full content review — add new pages, remove outdated ones, refresh descriptions |

Version Control: Keep a simple changelog.

Date each update, note what changed and why. Even a Google Doc or a Git commit history works.
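One line per update is enough (dates and entries here are illustrative):

```text
2026-04-01  Added /guides/ai-visibility; removed two retired blog posts
2026-01-15  Initial file: 24 links across 4 sections
```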

The important thing: when you change URLs on your site, update your llms.txt before the old URLs start returning 404s. Broken links in the file undermine the entire point of curated content.

What to do next

LLMs.txt is useful infrastructure. What you put behind it determines whether AI models actually cite your content.

Three steps to take today:

  1. Build your file: Use our free LLMs.txt Generator to create a properly formatted file in minutes.
  2. Validate it: Run it through our LLMs.txt Validator to catch structural errors and broken links.
  3. Audit your content for extraction-readiness: Do your key pages lead with clear answers? Do they include comparison tables, FAQ sections, and schema markup?

That's what gets cited.

Note: The llms.txt standard is still early. But the content fundamentals that drive AI citations — clear structure, factual depth, and external authority — have already proven their value across every study we've referenced in this guide.

Written By

Gita D.
Co-founder and Growth Marketer
