Robots.txt Tester
Enter any domain to analyze its robots.txt. See which search engines and AI crawlers are blocked, find sitemaps, and identify access issues.
What This Tool Checks
AI crawler access
GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, PerplexityBot, Google-Extended
Search engine access
Googlebot, Bingbot, DuckDuckBot, YandexBot, Baiduspider, Slurp
Sitemap directives
Checks for Sitemap URLs declared in robots.txt
Rule analysis
Parses all Allow, Disallow, and Crawl-delay rules per User-Agent
Blocking issues
Flags bots that appear to be unintentionally blocked from crawling
Raw file view
Shows the complete robots.txt contents for manual review
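A simplified version of the access and sitemap checks above can be sketched with Python's standard-library urllib.robotparser. The robots.txt contents, domain, and paths below are hypothetical, chosen only to illustrate the parsing logic:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks GPTBot entirely, blocks /private/
# for everyone else, and declares a sitemap.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# GPTBot matches its own group, so it is blocked everywhere.
print(rp.can_fetch("GPTBot", "https://example.com/page"))         # False
# OAI-SearchBot falls through to the wildcard group.
print(rp.can_fetch("OAI-SearchBot", "https://example.com/page"))  # True
print(rp.can_fetch("*", "https://example.com/private/x"))         # False
# Sitemap directives declared in the file.
print(rp.site_maps())  # ['https://example.com/sitemap.xml']
```

A real checker would fetch the live file (for example with rp.set_url(...) and rp.read()) and test each known crawler's User-Agent against a sample of site URLs.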
Frequently Asked Questions
What is robots.txt?
robots.txt is a plain-text file at the root of your website (example.com/robots.txt) that tells crawlers which pages they may and may not access. It uses 'Allow' and 'Disallow' directives per User-Agent to control crawler behavior, and it's the first file any well-behaved crawler checks before crawling your site.
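For example, a minimal robots.txt (the paths here are purely illustrative) might look like:

```txt
User-agent: *
Disallow: /admin/
Allow: /admin/public/

Sitemap: https://example.com/sitemap.xml
```

Each 'User-agent' line starts a group of rules; a blank line ends it. 'Sitemap' lines stand alone and apply to all crawlers.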
Does robots.txt affect AI search visibility?
Yes — critically. In 2026, AI search engines like ChatGPT (OAI-SearchBot), Claude (Claude-SearchBot), and Perplexity (PerplexityBot) respect robots.txt. If you block these crawlers, your content won't appear in AI search results. Many sites accidentally block AI crawlers, losing all AI search visibility.
What's the difference between GPTBot and OAI-SearchBot?
GPTBot collects data for OpenAI model training. OAI-SearchBot crawls content for ChatGPT search results. You can block GPTBot (preventing training use) while allowing OAI-SearchBot (keeping ChatGPT search visibility). Same pattern for Anthropic: ClaudeBot (training) vs Claude-SearchBot (search).
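Put together, a robots.txt that opts out of model training while keeping AI search visibility could look like this sketch (the paths and policy are an example, not a recommendation):

```txt
# Block model-training crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Allow AI search crawlers
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /
```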
Should I have a Sitemap directive in robots.txt?
Yes. The Sitemap directive in robots.txt tells all crawlers where to find your XML sitemap. While Google can find sitemaps through Search Console, other crawlers (including AI bots) rely on this directive. Format: Sitemap: https://example.com/sitemap.xml
What happens if I don't have a robots.txt?
Without a robots.txt file, all crawlers assume they can access everything on your site. This is fine for most sites, but it means you can't control which bots access your content. If you want to block AI training bots while allowing search bots, you need a robots.txt file.
Can robots.txt block Google from indexing a page?
robots.txt prevents crawling, not indexing. If Google finds a link to a blocked page, it may still index the URL (with a 'URL is blocked by robots.txt' note); it just can't see the content. To prevent indexing, use a noindex meta tag instead, and make sure that page is not blocked in robots.txt, or crawlers will never see the tag. Use robots.txt for crawler access control, not indexation control.
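A noindex directive goes in the page's HTML head, or can be sent as an HTTP response header (the snippet below is a generic example):

```html
<!-- In the page's <head>: -->
<meta name="robots" content="noindex">

<!-- Or as an HTTP response header:
X-Robots-Tag: noindex -->
```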
Audit crawler access across your entire site
CrawlRaven audits robots.txt, crawl access, and 200+ other technical SEO factors in a single site-wide crawl.