Robots.txt Validator

Validate your robots.txt files before deploying them. The tool catches syntax errors, detects conflicting Allow/Disallow rules, and annotates directives with known bot names, including AI crawlers.
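For example, a file like the following (an illustrative example using a placeholder domain) contains the kind of overlap the validator reports:

```
User-agent: *
Disallow: /private/
Allow: /private/press/    # overlaps the Disallow above; Allow wins by longest match
Sitemap: https://example.com/sitemap.xml
```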

Features

  • Syntax validation against RFC 9309 (robots exclusion standard)
  • Semantic conflict detection with longest-match specificity resolution
  • Known bot annotations (search engines, AI crawlers, social bots)
  • Sitemap URL format validation
  • Engine-specific divergence warnings (Google vs Bing vs Yandex)
  • Fully client-side — no data leaves your browser
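Longest-match specificity resolution can be sketched roughly as follows. This is a minimal illustration of the RFC 9309 rule, not the tool's actual implementation: the most specific (longest) matching pattern wins, and a tie between Allow and Disallow goes to Allow. Wildcard (`*`, `$`) handling is omitted here, and the function name `resolve` is made up for the example.

```python
def resolve(path, rules):
    """Pick the winning directive for a path per RFC 9309 longest-match.

    rules: list of (directive, pattern) tuples, e.g. ("allow", "/public/").
    Ties between Allow and Disallow of equal length go to Allow.
    Wildcard support ('*' and '$') is omitted in this sketch.
    """
    best = None  # (pattern_length, directive)
    for directive, pattern in rules:
        if path.startswith(pattern):
            length = len(pattern)
            if best is None or length > best[0] or (
                length == best[0] and directive == "allow"
            ):
                best = (length, directive)
    # No matching rule means crawling is allowed.
    return best[1] if best else "allow"
```

Given `Disallow: /private/` and `Allow: /private/page`, the path `/private/page.html` is allowed because the Allow pattern is the longer match.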

What It Checks

  • Syntax errors: invalid directives, missing User-agent, malformed lines
  • Semantic conflicts: overlapping Allow and Disallow rules within the same group, with specificity winner shown
  • Best practice warnings: missing Sitemap directive, unqualified Disallow: / blocks, non-standard directives
  • Bot coverage: which known crawlers (Googlebot, GPTBot, ClaudeBot, Bingbot) are explicitly addressed
  • Sitemap URLs: validates absolute URL format and HTTPS scheme
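The sitemap check above amounts to parsing the directive value and requiring an absolute URL. A minimal sketch of that logic (the function name and warning strings are assumptions for illustration; the HTTPS check is a best-practice warning, not a spec requirement):

```python
from urllib.parse import urlparse

def check_sitemap_url(value):
    """Return a list of warning strings for one Sitemap directive value."""
    warnings = []
    parsed = urlparse(value.strip())
    if not parsed.scheme or not parsed.netloc:
        # Relative paths are not valid Sitemap values; an absolute URL is required.
        warnings.append("Sitemap URL must be absolute (scheme and host required)")
    elif parsed.scheme != "https":
        warnings.append("Sitemap URL should use HTTPS")
    return warnings
```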

Engine Divergence

Search engines interpret robots.txt differently. This tool validates against RFC 9309 as a baseline and flags known divergences:

  • Crawl-delay is honored by Bing and Yandex but ignored by Google
  • Allow/Disallow precedence uses longest-match in Google's parser, not first-match
  • Non-standard directives (Host, Clean-param) are recognized but flagged as warnings
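A per-line lint pass that separates RFC 9309 directives from known extensions might look like the sketch below. The directive sets and message strings are assumptions for illustration, not the tool's actual output:

```python
STANDARD = {"user-agent", "allow", "disallow", "sitemap"}
# Engine-specific extensions: recognized, but flagged as warnings.
NON_STANDARD = {"crawl-delay", "host", "clean-param"}

def lint_line(line):
    """Classify one robots.txt line: None if ok/blank/comment, else a message string."""
    stripped = line.split("#", 1)[0].strip()  # drop comments
    if not stripped:
        return None
    if ":" not in stripped:
        return f"error: malformed line (missing ':'): {line!r}"
    field = stripped.split(":", 1)[0].strip().lower()
    if field in STANDARD:
        return None
    if field in NON_STANDARD:
        return f"warning: non-standard directive '{field}' (not in RFC 9309)"
    return f"error: unknown directive '{field}'"
```

Unknown fields (often typos like `Disalow`) are reported as errors, while recognized extensions only produce warnings, matching the behavior described above.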