If you've ever added a rule like Disallow: / to your robots.txt to block scrapers, there's a good chance you've also blocked every major AI search engine — including ChatGPT, Perplexity, Gemini, and Claude — from reading your site.
This is one of the most common and most impactful GEO mistakes. A blocked crawler means no citations, regardless of how good your content is.
How to check if you're blocking AI crawlers
Visit your robots.txt file directly:
https://yourdomain.com/robots.txt

Look for any of these patterns that would block AI crawlers:
- User-agent: * followed by Disallow: / — blocks everything, including all AI crawlers
- User-agent: GPTBot followed by Disallow: / — explicitly blocks ChatGPT's crawler
- User-agent: ClaudeBot followed by Disallow: / — explicitly blocks Anthropic's crawler
If you're not sure how to read robots.txt syntax, use GEO Auditor's free scan — it checks all 14 AI crawlers against your robots.txt and tells you exactly which ones are blocked.
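If you'd rather check programmatically than eyeball the syntax, Python's standard-library robots.txt parser shows the effect of these patterns directly (a minimal sketch; the sample file content below is hypothetical):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt exhibiting the first blocking pattern above.
sample = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(sample.splitlines())

# The wildcard Disallow applies to every AI crawler too.
for bot in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    print(bot, "blocked" if not rp.can_fetch(bot, "/") else "allowed")
```

All three print as blocked: a wildcard rule covers any crawler that has no group of its own.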
The complete list of AI crawlers to allow
There are 14 AI user-agents you should be aware of. Here's what each one is used for:
- GPTBot — OpenAI's primary training and search crawler
- OAI-SearchBot — OpenAI's real-time search crawler (used by ChatGPT search)
- ChatGPT-User — Browsing requests made by ChatGPT during conversations
- ClaudeBot — Anthropic's web crawler (Claude)
- anthropic-ai — Legacy Anthropic crawler identifier
- PerplexityBot — Perplexity AI's search crawler
- Google-Extended — Google's crawler for Gemini and AI Overviews
- Applebot-Extended — Apple's crawler for Apple Intelligence
- Amazonbot — Amazon's AI crawler (Alexa, Rufus)
- FacebookBot — Meta AI crawler
- Bytespider — ByteDance AI crawler
- CCBot — Common Crawl (used to train many open-source AI models)
- Googlebot — Standard Google crawler (also feeds AI Overviews)
- Bingbot — Microsoft's crawler (also feeds Copilot)
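The list above can double as a self-check. A small sketch using urllib.robotparser, which returns whichever of the 14 user-agents a given robots.txt blocks (this is a rough approximation of what an audit tool does, not GEO Auditor's actual logic):

```python
from urllib.robotparser import RobotFileParser

# The 14 user-agents from the list above.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "Applebot-Extended", "Amazonbot",
    "FacebookBot", "Bytespider", "CCBot", "Googlebot", "Bingbot",
]

def blocked_crawlers(robots_text: str, path: str = "/") -> list[str]:
    """Return the AI crawlers that robots_text blocks from `path`."""
    rp = RobotFileParser()
    rp.parse(robots_text.splitlines())
    return [bot for bot in AI_CRAWLERS if not rp.can_fetch(bot, path)]

print(blocked_crawlers("User-agent: GPTBot\nDisallow: /\n"))  # → ['GPTBot']
```

Paste your own robots.txt contents into `blocked_crawlers` to see which names come back.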
The correct robots.txt setup
If you want to allow all AI crawlers while still blocking malicious scrapers, the safest approach is to explicitly allow the crawlers you want, using a specific User-agent rule for each:
# Allow AI search crawlers
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Applebot-Extended
Allow: /
# Block generic scrapers
User-agent: *
Disallow: /private/
Disallow: /admin/

This pattern is explicit: specific crawlers get full access, while the wildcard rule restricts only the paths you actually want to protect.
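Before deploying a layout like this, you can sanity-check it with Python's urllib.robotparser (a sketch using a trimmed copy of the file above):

```python
from urllib.robotparser import RobotFileParser

# A trimmed version of the recommended robots.txt.
robots = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots.splitlines())

print(rp.can_fetch("GPTBot", "/private/"))      # True: named group, full access
print(rp.can_fetch("SomeScraper", "/private/")) # False: wildcard blocks this path
print(rp.can_fetch("SomeScraper", "/blog/"))    # True: other paths stay open
```

The named groups win for the crawlers they name; everything else falls through to the wildcard rules.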
What if you used a wildcard Disallow?
A User-agent: * / Disallow: / rule blocks every robot that doesn't have its own, more specific group in the file. If no group names GPTBot with an Allow: / directive, GPTBot is blocked.

The fix: add an explicit Allow: / group for each AI crawler you want to admit. Under the robots.txt standard (RFC 9309), a compliant crawler obeys only the most specific User-agent group that matches it, wherever that group appears in the file, so a named group takes precedence over the wildcard.
Firewall rules and JavaScript challenges
Robots.txt isn't the only place crawlers get blocked. If you're using Cloudflare or another CDN/WAF, check whether any firewall rules target bots. Common culprits:
- Cloudflare's "Bot Fight Mode" — can intercept legitimate AI crawlers before they reach your server
- JavaScript challenge pages — most AI crawlers don't execute JavaScript, so a JS challenge silently blocks them even when robots.txt allows access
- Rate limiting rules targeting high-frequency crawlers — can intermittently block AI crawlers that request pages quickly
The GEO Auditor free scan checks for these patterns as part of the technical health audit.
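One way to spot a WAF-level block that robots.txt won't reveal is to request a page with a crawler user-agent string and compare status codes. A sketch (yourdomain.com is a placeholder; note that some WAFs fingerprint more than the User-Agent header, so a 200 here doesn't guarantee the real crawler gets through):

```python
import urllib.error
import urllib.request

def fetch_status(url: str, user_agent: str) -> int:
    """Return the HTTP status code the server sends for this user-agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 from a firewall rule

# Example usage (placeholder domain):
# for ua in ("GPTBot", "ClaudeBot", "PerplexityBot"):
#     print(ua, fetch_status("https://yourdomain.com/", ua))
```

A 403 or challenge page for the crawler user-agent while a browser user-agent gets 200 points at a firewall rule rather than robots.txt.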
After you fix it
Once you've updated robots.txt, the change takes effect on the crawlers' next fetch of the file — there's nothing to redeploy on your side. Crawlers do cache robots.txt, typically for up to about a day, so expect the update to register within roughly 24 hours. For ChatGPT specifically, OpenAI's crawlers generally reflect robots.txt changes within a few days.
Run a free GEO audit to verify all 14 AI crawlers now show as accessible, and to check whether any other signals are suppressing your AI search visibility.