The llms.txt File: The robots.txt for AI Agents - Implementation Guide

2026-05-20 3 min read

TL;DR: llms.txt is a plain-text manifest that tells AI crawlers what your site does, where to find key pages, and how to interact with your content. Host it at the root, keep it under 2,000 tokens, and update it whenever your site structure changes.


What Is llms.txt?

Proposed by Jeremy Howard of Answer.AI in September 2024, llms.txt is a discovery and context file for AI crawlers and LLM-based agents. It sits at the root of your domain (https://yoursite.com/llms.txt) and provides a structured summary that agents can ingest in a single request.

Unlike robots.txt, which is a list of crawl rules, llms.txt is a semantic guide. It tells the agent what your site is about, what actions it can take, and where to find machine-readable data.
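Because the file lives at a fixed root path, an agent can derive its location from any page URL on the site. A minimal sketch in Python, assuming only the root-level convention described above (the function name `llms_txt_url` is hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


print(llms_txt_url("https://yoursite.com/blog/post?id=7#top"))
# prints https://yoursite.com/llms.txt
```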


Why It Matters for Agent Commerce

When an AI agent is asked to "buy me a pair of running shoes from this store," it needs to know:

  1. Is this a store?
  2. Does it sell running shoes?
  3. Can an agent complete a purchase?
  4. Where is the product catalogue?
  5. What payment methods are supported?

A well-structured llms.txt can answer all five questions in under 500 words.


Format and Structure

The format is intentionally simple: a Markdown file with one top-level heading for the site name, second-level headings for sections, and bullet lists of links. No YAML, no JSON.

# AI Readiness Scanner

## Overview
AI Readiness Scanner is a developer-first tool that audits websites for AI agent compliance. It checks structured data, protocol support, payment readiness, and crawler accessibility.

## Important Pages
- [Home](https://aireadinesscanner.com) - free URL scan
- [Pricing](https://aireadinesscanner.com/pricing) - developer-friendly plans
- [Blog](https://aireadinesscanner.com/blog) - technical guides on AI agent standards
- [Setup Service](https://aireadinesscanner.com/setup-service) - done-for-you implementation

## API
- [OpenAPI Spec](https://aireadinesscanner.com/api/openapi.json)
- [Documentation](https://docs.aireadinesscanner.com)

## Product Catalogue
- GET /api/products - list all products with pricing

## Actions an AI Agent Can Take
- Scan any URL for AI readiness
- Subscribe to a monitoring plan
- Request a setup service quote

## Payment Methods
- Stripe Checkout (credit card)
- Annual billing available

## Contact
- support@actonceapi.com
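To show how little machinery an agent needs to read a file like the one above, here is a rough parsing sketch in Python. It assumes the layout used in this example (one `# ` title, `## ` sections, `- ` bullets with optional Markdown links); the names `parse_llms_txt` and `links_in` are hypothetical and not part of any standard library for this format.

```python
import re

# Matches Markdown links of the form [label](url).
LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_llms_txt(text):
    """Split an llms.txt body into a title and a mapping of section -> lines."""
    result = {"title": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line and current is not None:
            # Store bullets without their "- " prefix; keep prose lines as-is.
            item = line[2:].strip() if line.startswith("- ") else line
            result["sections"][current].append(item)
    return result


def links_in(items):
    """Return (label, url) pairs for every Markdown link in the given lines."""
    return [(m.group(1), m.group(2)) for item in items for m in LINK.finditer(item)]


SAMPLE = """\
# AI Readiness Scanner

## Overview
A developer-first tool that audits websites.

## API
- [OpenAPI Spec](https://aireadinesscanner.com/api/openapi.json)
- [Documentation](https://docs.aireadinesscanner.com)
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"])
print(links_in(parsed["sections"]["API"]))
```

An agent could use the section names as a lookup table, for example checking whether a "Payment Methods" section exists before attempting a purchase.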

Where to Host It

Place llms.txt at the root of your domain:

https://yoursite.com/llms.txt

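You can also add a <link> tag in your HTML head as a hint for crawlers that start at your homepage. Treat it as optional: rel="llms" is a convention rather than a registered link relation, so crawlers are not guaranteed to honor it, and the root-level file remains the primary way to be discovered: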

<link rel="llms" href="https://yoursite.com/llms.txt" />
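From the crawler's side, discovery could look roughly like the sketch below: look for the `<link>` tag first and fall back to the root path. It assumes the `rel="llms"` convention used in this article, which crawlers are not guaranteed to support; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LlmsLinkFinder(HTMLParser):
    """Record the href of the first <link> whose rel includes "llms"."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "llms" in rels and attr.get("href"):
            self.href = attr["href"]


def discover_llms_txt(page_url, html):
    """Return the advertised llms.txt URL, or the root-level default."""
    finder = LlmsLinkFinder()
    finder.feed(html)
    # urljoin resolves relative hrefs and the "/llms.txt" fallback against the page URL.
    return urljoin(page_url, finder.href or "/llms.txt")


page = '<html><head><link rel="llms" href="/docs/llms.txt" /></head></html>'
print(discover_llms_txt("https://yoursite.com/shop/", page))
# prints https://yoursite.com/docs/llms.txt
```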

Size Limits and Best Practice

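Aim to keep llms.txt under 2,000 tokens, which is roughly 1,500 words of English text. This is a practical guideline rather than a limit defined by the llms.txt proposal: a short file fits comfortably in an agent's context window alongside the rest of its task. If you need more detail, create an llms-full.txt alongside it and link to it from the main file.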
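The word-to-token conversion can be approximated without a tokenizer. The sketch below uses this article's rule of thumb (2,000 tokens for about 1,500 words, so 4 tokens per 3 words); real tokenizers differ by model and by content, so treat the result as an estimate only. The function names are hypothetical.

```python
TOKEN_BUDGET = 2000


def estimate_tokens(text):
    """Estimate tokens as ceil(words * 4 / 3), using integer arithmetic."""
    words = len(text.split())
    return (words * 4 + 2) // 3


def within_budget(text, budget=TOKEN_BUDGET):
    """True when the estimated token count is strictly under the budget."""
    return estimate_tokens(text) < budget


print(estimate_tokens("Scan any URL for AI readiness"))
# prints 8
```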

Do include:

  • What your site does (1-2 sentences)
  • Key pages with descriptions
  • Actions an agent can take
  • API endpoints and spec links
  • Payment and checkout information

Do not include:

  • Marketing copy or brand stories
  • Long prose about your team
  • Duplicate content from your About page

Testing Your llms.txt

Use the AI Readiness Scanner to check for llms.txt presence, structure, and token count. Alternatively, test manually:

curl -s https://yoursite.com/llms.txt | wc -w
# Should return under 1500 words

curl -s https://yoursite.com/llms.txt | grep -c "^# "
# Should print 1 or more; "^# " matches only top-level headings, not "## " sections
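The same manual checks can be scripted so they run in CI whenever the file changes. This is a minimal sketch that works on the file's text, however you fetch it; the 1,500-word threshold is the guideline from this article, and the function name `check_llms_txt` is hypothetical.

```python
def check_llms_txt(text, max_words=1500):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    lines = text.splitlines()

    # Same count as `wc -w`: whitespace-separated words.
    words = len(text.split())
    if words >= max_words:
        problems.append(f"too long: {words} words (should be under {max_words})")

    # Top-level headings start with "# " (a "## " line does not match).
    if not any(line.startswith("# ") for line in lines):
        problems.append("no top-level '# ' heading found")

    if not any(line.startswith("## ") for line in lines):
        problems.append("no '## ' sections found")

    return problems


print(check_llms_txt("# My Store\n\n## Overview\nWe sell running shoes.\n"))
# prints []
```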

E-Commerce Specific Tips

For a store, add these sections:

## Product Categories
- Running Shoes
- Training Gear
- Accessories

## Checkout Flow
- Add to cart → Shipping → Payment → Confirmation
- Guest checkout supported
- Stripe Link for one-click payment

## Shipping
- UK: 2-3 days, free over £50
- International: 5-10 days, flat £15
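Since the TL;DR advises updating the file whenever your site structure changes, stores may prefer to generate these sections from existing data rather than edit them by hand. A rough sketch, assuming your store data is already available as a plain dictionary (the `render_sections` helper and the `store` structure are hypothetical):

```python
def render_sections(sections):
    """Render {heading: [items]} as llms.txt sections, in insertion order."""
    blocks = []
    for heading, items in sections.items():
        lines = [f"## {heading}"] + [f"- {item}" for item in items]
        blocks.append("\n".join(lines))
    # Separate sections with one blank line and end with a newline.
    return "\n\n".join(blocks) + "\n"


store = {
    "Product Categories": ["Running Shoes", "Training Gear", "Accessories"],
    "Checkout Flow": ["Guest checkout supported", "Stripe Link for one-click payment"],
}

print(render_sections(store))
```

Regenerating the file in your build or deploy step keeps it in sync with the catalogue.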

Recommended Next Step

If you have not created an llms.txt file yet, start today. A first version takes about 15 minutes to write, and any agent that requests the file can use it as soon as it is published. Run a free scan to see if your site is already being crawled and what agents are finding, or missing.
