
🛡️ Block AI Crawlers Code Generator

Stop AI crawlers from scraping your content for training, while normal Google search stays intact

Standard robots.txt spec
# AI crawler blocking - generated by theutilhub.com
# Normal search engines (Googlebot, Bingbot) are NOT affected.

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: meta-externalagent
Disallow: /
📁 Upload this file to your site root (/robots.txt).
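Before uploading, the rules can be sanity-checked locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `is_allowed` helper, the shortened rule set, and the `example.com` URL are illustrative assumptions, not part of the generator.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the generated rules (assumption: same shape as the file above).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical page URL, used only for the check.
    page = "https://example.com/blog/post"
    for agent in ("GPTBot", "ClaudeBot", "Googlebot", "Bingbot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

With these rules, the AI bot tokens are reported as disallowed while Googlebot and Bingbot remain allowed, because no rule group names them and there is no `User-agent: *` group.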
🤖 Blocking Effect by AI Bot

| Bot | Operator | Purpose |
|---|---|---|
| GPTBot | OpenAI | Scrapes data to train ChatGPT models |
| ChatGPT-User | OpenAI | Fetches pages during ChatGPT live browsing |
| ClaudeBot | Anthropic | Scrapes data to train Claude models |
| Google-Extended | Google | Google AI (Gemini) training, separate from normal Search |
| CCBot | Common Crawl | Open crawl dataset used by many AI models |
| PerplexityBot | Perplexity | Perplexity AI search indexing and answers |
| Bytespider | ByteDance | ByteDance (TikTok) AI training data scraping |
| meta-externalagent | Meta | Meta AI (Llama) training data scraping |
✅ How to Verify
After deployment, visit yoursite.com/robots.txt in a browser. If the generated rules appear, the file is being served correctly. To confirm that bots are actually staying away, look for a drop in requests from those User-agents in your server access logs.
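The log check can be sketched as a small script. This assumes the common "combined" access log format, where the User-Agent is the last quoted field; the sample log lines and User-Agent strings below are made up for illustration. Google-Extended is left out of the list because, as far as Google documents it, it is a robots.txt control token and does not appear as its own User-Agent string in logs.

```python
from collections import Counter

# Tokens to look for in the User-Agent field (taken from the bot list above).
AI_BOT_TOKENS = (
    "GPTBot", "ChatGPT-User", "ClaudeBot", "CCBot",
    "PerplexityBot", "Bytespider", "meta-externalagent",
)


def user_agent_of(line: str) -> str:
    """Extract the last quoted field, which is the User-Agent in combined log format."""
    parts = line.rstrip().rsplit('"', 2)
    return parts[1] if len(parts) == 3 else ""


def count_ai_bot_hits(log_lines) -> Counter:
    """Count requests per AI bot token, matching case-insensitively."""
    counts = Counter()
    for line in log_lines:
        agent = user_agent_of(line).lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in agent:
                counts[token] += 1
                break
    return counts


# Hypothetical sample lines; real User-Agent strings will differ.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/May/2025:10:00:00 +0000] "GET /blog/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [10/May/2025:10:00:05 +0000] "GET /about HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [10/May/2025:10:01:00 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [10/May/2025:10:02:00 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/126.0"',
]

if __name__ == "__main__":
    for bot, hits in count_ai_bot_hits(SAMPLE_LOG).most_common():
        print(f"{bot}: {hits}")
```

Running the same count on logs from before and after deployment gives a rough view of whether requests from each bot have dropped.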


What is the Block AI Crawlers Code Generator?

The Block AI Crawlers Code Generator creates standards-based code that tells major AI crawlers, including OpenAI (GPTBot), Anthropic (ClaudeBot), Google (Google-Extended), Meta, and Perplexity, not to collect your website content as training data. Choose which bots to block, the scope (entire site, specific folders, or images only), and the output type (robots.txt, HTML meta tag, server header, or llms.txt), and ready-to-paste code is generated instantly. This blocking does not affect indexing by normal search engines such as Google or Bing: Googlebot and Google-Extended are separate robots.txt user-agent tokens, so you keep your search visibility while opting out of AI training only. All code generation happens entirely in your browser; your paths and settings are never sent to a server. The tool includes a database of 18+ AI bots with per-bot purpose descriptions and application guides.

Common Use Cases

✍️ Protect Blogs & Creative Work
Prevent your articles, art, and photos from being scraped as AI training data, preserving the value of your creative work. A short robots.txt rule group per bot covers the entire site.

🏢 Corporate Site Policy
Apply a standard blocking policy so your company's product info and docs are not used for competitor AI or unauthorized training, while keeping search visibility intact.

🖼️ Block Image Scraping
Stop your portfolio and artwork images from being collected for image-generation AI training. Allow text while selectively blocking images only.

📁 Block Specific Folders
Exclude only paid content or members-only areas (/premium/, /members/) from AI training while leaving public areas open, for fine-grained control.
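As an illustration of the images-only scope, the sketch below builds one rule group per bot using `*` and `$` path patterns, which RFC 9309 defines for robots.txt. The exact rules this generator emits are not shown on the page, so the extension list and the `image_only_group` helper are assumptions, and not every crawler is guaranteed to honor wildcard patterns.

```python
# Assumed set of image extensions; adjust to the formats your site serves.
IMAGE_EXTENSIONS = ("jpg", "jpeg", "png", "gif", "webp")


def image_only_group(bot: str, extensions=IMAGE_EXTENSIONS) -> str:
    """Build a robots.txt group that disallows image URLs only for one bot."""
    lines = [f"User-agent: {bot}"]
    # "*" matches any characters in the path, "$" anchors the end of the URL.
    lines += [f"Disallow: /*.{ext}$" for ext in extensions]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(image_only_group("GPTBot"))
```

Text pages stay crawlable for that bot because only URLs ending in the listed extensions are disallowed.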

How to Use

  1. Select AI bots to block
     The 8 major bots (GPTBot, ClaudeBot, Google-Extended, etc.) are selected by default. Expand the "More bots" accordion for additional bots, or use the quick buttons to select or deselect all.
  2. Choose block scope
     Pick entire site, specific folders, or images only. For folders, the path auto-normalizes to /blog/ format on blur, and you can specify multiple folders separated by commas.
  3. Pick the output format
     robots.txt is the most standard and is the recommended choice. Switch tabs to HTML meta tag, server header (Apache/Nginx), or llms.txt depending on your environment.
  4. Copy and apply
     Copy the generated code or download it as a file. Upload robots.txt to your site root, then visit yoursite.com/robots.txt after deployment to confirm the file is live.
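The steps above can be sketched in a few lines: normalize a comma-separated folder field to the /blog/ form, then emit one rule group per selected bot. This is a rough approximation under assumptions, not the tool's actual code; the function names `normalize_folder`, `parse_folders`, and `build_robots_txt` are hypothetical.

```python
def normalize_folder(raw: str) -> str:
    """Normalize user input such as 'blog' or '/blog' to the '/blog/' form."""
    trimmed = raw.strip().strip("/")
    return f"/{trimmed}/" if trimmed else "/"


def parse_folders(field: str) -> list[str]:
    """Split a comma-separated folder field and normalize each non-empty entry."""
    return [normalize_folder(part) for part in field.split(",") if part.strip()]


def build_robots_txt(bots, paths=("/",)) -> str:
    """Emit one User-agent group per bot with a Disallow line per path."""
    groups = []
    for bot in bots:
        lines = [f"User-agent: {bot}"]
        lines += [f"Disallow: {path}" for path in paths]
        groups.append("\n".join(lines))
    # Groups are separated by a blank line, as in the generated file above.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    folders = parse_folders("premium, /members")
    print(build_robots_txt(["GPTBot", "ClaudeBot"], folders))
```

Calling `build_robots_txt` with the default `paths` reproduces the entire-site form, with `Disallow: /` under each selected bot.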

Frequently Asked Questions