# Robots.txt Validator
Validate your robots.txt files before deploying. Catches syntax errors, detects conflicting Allow/Disallow rules, and annotates directives with known bot names — including AI crawlers.
## Features
- Syntax validation against RFC 9309 (the Robots Exclusion Protocol)
- Semantic conflict detection with longest-match specificity resolution
- Known bot annotations (search engines, AI crawlers, social bots)
- Sitemap URL format validation
- Engine-specific divergence warnings (Google vs Bing vs Yandex)
- Fully client-side — no data leaves your browser
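To illustrate the longest-match specificity resolution listed above, here is a minimal sketch in Python. It is a simplified model, not the validator's actual implementation: it does prefix-only matching (no `*` or `$` wildcard support), and the `resolve` helper name is hypothetical.

```python
def resolve(path: str, rules: list[tuple[str, str]]) -> str:
    """Return 'allow' or 'disallow' for a path using longest-match
    precedence (RFC 9309): the most specific (longest) matching rule
    wins; on a tie in length, Allow wins.

    Simplified sketch: prefix matching only, no wildcard support.
    """
    best_len = -1
    verdict = "allow"  # no matching rule means the path is crawlable
    for directive, pattern in rules:
        if path.startswith(pattern):
            longer = len(pattern) > best_len
            tie_allow = len(pattern) == best_len and directive == "allow"
            if longer or tie_allow:
                best_len = len(pattern)
                verdict = directive
    return verdict

rules = [("disallow", "/private/"), ("allow", "/private/public/")]
resolve("/private/public/page.html", rules)  # → "allow" (longer match wins)
resolve("/private/secret.html", rules)       # → "disallow"
```

This is the conflict the validator surfaces: both rules match `/private/public/page.html`, and the tool reports which one wins under longest-match precedence.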
## What It Checks
- Syntax errors: invalid directives, missing User-agent, malformed lines
- Semantic conflicts: overlapping Allow and Disallow rules within the same group, with specificity winner shown
- Best practice warnings: missing Sitemap directive, unqualified `Disallow: /` blocks, non-standard directives
- Bot coverage: which known crawlers (Googlebot, GPTBot, ClaudeBot, Bingbot) are explicitly addressed
- Sitemap URLs: validates absolute URL format and HTTPS scheme
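The syntax checks above can be sketched roughly as a line-by-line lint pass. This is an illustrative approximation, not the tool's actual parser; the `lint` function and its message strings are hypothetical.

```python
# Directives the sketch recognizes; Host and Clean-param are
# non-standard but known (they draw warnings, not syntax errors).
KNOWN = {"user-agent", "allow", "disallow", "sitemap",
         "crawl-delay", "host", "clean-param"}

def lint(text: str) -> list[str]:
    """Flag malformed lines, unknown directives, and group rules
    that appear before any User-agent line. Illustrative sketch only."""
    issues = []
    seen_agent = False
    for n, raw in enumerate(text.splitlines(), 1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if ":" not in line:
            issues.append(f"line {n}: malformed (no ':' separator)")
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field == "user-agent":
            seen_agent = True
        elif field in ("allow", "disallow", "crawl-delay") and not seen_agent:
            issues.append(f"line {n}: {field} appears before any User-agent")
        if field not in KNOWN:
            issues.append(f"line {n}: unknown directive '{field}'")
    return issues
```

A rule with no preceding `User-agent` line belongs to no group and is silently ignored by most crawlers, which is why it is reported rather than accepted.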
## Engine Divergence
Search engines interpret robots.txt differently. This tool validates against RFC 9309 as a baseline and flags known divergences:
- Crawl-delay is honored by Bing and Yandex but ignored by Google
- Allow/Disallow precedence uses longest-match in Google's parser, not first-match
- Non-standard directives (Host, Clean-param) are recognized but flagged as warnings
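A hypothetical robots.txt (the paths and domain are made up for illustration) that triggers each of the divergence warnings above:

```text
User-agent: *
Crawl-delay: 10        # honored by Bing and Yandex; Google ignores it
Disallow: /search
Allow: /search/about   # longest match wins in Google's parser
Host: example.com      # non-standard (Yandex); recognized but flagged
Sitemap: https://example.com/sitemap.xml
```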