Validation

Confirm your AI-readable files are correctly formatted, publicly accessible, and being discovered by search engines and AI crawlers.

Step 1 — Confirm Files Are Live

Visit each file directly in your browser. Each one should display as plain text or JSON, not a 404 or 403 error.

https://yourdomain.com/robots.txt
https://yourdomain.com/llms.txt
https://yourdomain.com/llms-full.txt
https://yourdomain.com/semantic/index.json
https://yourdomain.com/markdown/index.md

If any of these fail, see Troubleshooting before continuing.
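The same check can be scripted once there are several files to re-test. The following is a minimal sketch, not part of any official tooling: the function names `fetch_status` and `check_files` and the `file-check-sketch/0.1` user-agent string are made up for illustration, and the domain is a placeholder.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# Paths taken from the list above.
PATHS = [
    "/robots.txt",
    "/llms.txt",
    "/llms-full.txt",
    "/semantic/index.json",
    "/markdown/index.md",
]


def fetch_status(url, timeout=10):
    """Return (status_code, content_type), or (None, reason) on a network failure."""
    # The user-agent string is an arbitrary label chosen for this sketch.
    request = Request(url, headers={"User-Agent": "file-check-sketch/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("Content-Type", "")
    except HTTPError as error:
        # 4xx/5xx responses are raised as exceptions but still carry a status code.
        content_type = error.headers.get("Content-Type", "") if error.headers else ""
        return error.code, content_type
    except URLError as error:
        return None, str(error.reason)


def check_files(base_url, fetch=fetch_status):
    """Map each path to (ok, status, content_type), where ok means HTTP 200."""
    results = {}
    for path in PATHS:
        status, content_type = fetch(base_url.rstrip("/") + path)
        results[path] = (status == 200, status, content_type)
    return results
```

Calling `check_files("https://yourdomain.com")` returns one entry per file. A 403 or 404 in the status position is the same failure the browser check would show. The `fetch` parameter exists only so the function can be exercised without network access.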

Step 2 — Validate Structured Data

Use Google's Rich Results Test to check your Schema.org markup for errors. Paste your homepage URL and review any flagged issues — common problems include missing required fields (such as a BreadcrumbList missing its item field) or incorrect nesting.
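For a quick local pre-check before using the Rich Results Test, the missing-field case can be approximated in a few lines. This is a rough sketch under stated assumptions: it only inspects top-level JSON-LD objects (it does not walk `@graph` or nested nodes), the names `JsonLdExtractor` and `breadcrumb_problems` are hypothetical, and it is not a substitute for Google's validator. Google's guidelines may not require `item` on the final crumb, so read a flag on the last entry as a prompt to check rather than a definite error.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def breadcrumb_problems(html):
    """Return human-readable problems found in top-level BreadcrumbList markup."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    problems = []
    for raw in extractor.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as error:
            problems.append(f"invalid JSON-LD: {error.msg}")
            continue
        nodes = data if isinstance(data, list) else [data]
        for node in nodes:
            if not isinstance(node, dict) or node.get("@type") != "BreadcrumbList":
                continue
            elements = node.get("itemListElement", [])
            if not isinstance(elements, list):
                elements = [elements]
            for element in elements:
                if isinstance(element, dict) and "item" not in element:
                    position = element.get("position", "?")
                    problems.append(f"ListItem {position} is missing 'item'")
    return problems
```

Passing the page's HTML source to `breadcrumb_problems` returns an empty list when nothing is flagged.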

Step 3 — Check Google Search Console

Google Search Console is the most reliable way to see whether Google has actually discovered and crawled your new files.

URL Inspection:

  1. Go to URL Inspection in Search Console
  2. Paste the full URL of a file (e.g., https://yourdomain.com/llms.txt)
  3. Review the result:
    • "URL is not on Google" — expected for brand-new files; this is not an error, it just means Google hasn't crawled it yet
    • Check the Discovery section for "Sitemaps" and "Referring page" — if both say "none detected," Google doesn't yet know the file exists
  4. Click Request Indexing to manually prompt a crawl

A known limitation: Google's indexer is built primarily around HTML pages. .txt and .md files are often crawled but not reliably indexed in standard search results. This is normal and doesn't mean your setup is broken. AI crawlers fetch these files directly regardless of Google's indexing status. See Markdown Knowledge Base for more detail.

Crawl Stats:

  1. Go to Settings → Crawl Stats
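  2. Review the "Other agent type" category in the Googlebot type breakdown. It covers Google crawlers other than the main Googlebot types. This report only shows requests made by Google's own crawlers, so third-party AI crawlers such as GPTBot or ClaudeBot will not appear here (use your server logs in Step 5 for those)
  3. Look for your /semantic/ and /markdown/ URLs appearing in the report with 200 OK responses. This confirms that Google's crawlers are fetching your AI-readable files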

Step 4 — Check for Content Signals and Bot Access

Run a free automated scan at isitagentready.com:

POST https://isitagentready.com/api/scan
Content-Type: application/json

{"url": "https://yourdomain.com"}

This checks your robots.txt configuration, Content Signals implementation, and overall bot access control in one pass.
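The request above can be sent from a script so the scan is easy to repeat. The sketch below uses only the endpoint, header, and body shown in this step. The response schema is not described here, so the code returns the decoded JSON unchanged instead of assuming field names. The function names `build_scan_request` and `run_scan` are illustrative.

```python
import json
from urllib.request import Request, urlopen

SCAN_ENDPOINT = "https://isitagentready.com/api/scan"


def build_scan_request(site_url):
    """Build the POST request shown above without sending it."""
    body = json.dumps({"url": site_url}).encode("utf-8")
    return Request(
        SCAN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scan(site_url, timeout=30):
    """Send the scan request and return the decoded JSON response as-is."""
    with urlopen(build_scan_request(site_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Printing the result of `run_scan("https://yourdomain.com")` is enough to inspect what the service reports. Any key names used after that point would be assumptions until checked against a real response.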

Step 5 — Monitor Server Logs

Your hosting provider's raw access logs are the most direct evidence of AI crawler activity. Look for requests from these user-agents:

GPTBot
ClaudeBot
OAI-SearchBot
Claude-SearchBot
PerplexityBot
Google-Extended
Amazonbot

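See AI Crawlers for what each one does. Note that Google-Extended is a robots.txt control token rather than a crawler with its own user-agent string, so it will not show up in logs; Google fetches pages with its regular Googlebot user agents. Most hosting control panels (cPanel, Plesk) expose raw access logs; if yours does not, ask your host directly.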
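Once you have a raw log file, counting crawler hits is a short script. The sketch below assumes the common "combined" access log format, where the user-agent is the last double-quoted field on each line; other formats would need a different pattern. The function name `count_ai_crawler_hits` is illustrative.

```python
import re
from collections import Counter

# User-agent substrings from the list above. Google-Extended is left out
# because it is a robots.txt token, not a string sent in request headers.
AI_CRAWLERS = [
    "GPTBot",
    "ClaudeBot",
    "OAI-SearchBot",
    "Claude-SearchBot",
    "PerplexityBot",
    "Amazonbot",
]

# Assumption: combined log format, user-agent is the final quoted field.
USER_AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_ai_crawler_hits(log_lines):
    """Count requests per AI crawler, matching on the user-agent field only."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_PATTERN.search(line)
        if match is None:
            continue  # line does not end with a quoted field; skip it
        user_agent = match.group(1).lower()
        for name in AI_CRAWLERS:
            if name.lower() in user_agent:
                counts[name] += 1
                break
    return counts
```

With a log on disk, `count_ai_crawler_hits(open("access.log", encoding="utf-8", errors="replace"))` returns a `Counter` keyed by crawler name (the file name is a placeholder). Matching only on the user-agent field avoids counting ordinary visitors who request a URL that happens to contain a crawler name.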

Step 6 — Test AI Answers Directly

The most direct test of whether your AI readability work is paying off: periodically ask the same questions across multiple AI platforms and track whether your business appears.

Example questions to test:

  • "Who are the best [your service] providers in [your city]?"
  • "Tell me about [your company name]"
  • Direct questions about your specific offerings or expertise

Test across ChatGPT, Perplexity, Claude, and Gemini. Screenshot and date your results so you can track changes over time.

Realistic Timeline

Timeframe      What to Expect
24–72 hours    Manually requested URLs crawled by Google
1 week         Most new files indexed (where applicable) after sitemap resubmission
2–4 weeks      AI crawlers begin regularly hitting your new files
1–3 months     Measurable change in AI answer visibility, if any

AI readability is a foundation, not an instant-results channel — the goal is to ensure that when AI systems do look, they find accurate, well-structured information.
