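If I block AI training, will my site disappear from Google Search too?

No. Google-Extended (used for AI training) and Googlebot (used for search indexing) are completely separate. This tool blocks only AI training bots, so your visibility in regular Google and Bing search is unaffected.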

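Can robots.txt really block AI bots?

Major AI companies such as OpenAI, Anthropic, and Google officially state that they honor robots.txt, so most major bots are blocked. robots.txt is a voluntary convention rather than an access control, so if you need stronger blocking, combine it with a server header (X-Robots-Tag) or firewall blocking.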
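
As a rough illustration of the firewall-style blocking mentioned above, the following Python sketch rejects requests at the application level based on the User-Agent header. The names `BLOCKED_TOKENS`, `is_blocked_ai_agent`, and `block_ai_bots` are hypothetical and are not part of this tool's output. Google-Extended is left out on purpose: it is a robots.txt token and does not appear as its own User-Agent string in requests. User-Agent strings can be spoofed, so treat this as a complement to robots.txt rather than a guarantee.

```python
# Sketch only: a WSGI middleware that answers 403 to known AI crawler user agents.
# The token list mirrors the bots named on this page; adjust it to your own policy.
BLOCKED_TOKENS = (
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "CCBot",
    "PerplexityBot",
    "Bytespider",
    "meta-externalagent",
)


def is_blocked_ai_agent(user_agent: str) -> bool:
    """Return True if the User-Agent contains one of the blocked tokens."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BLOCKED_TOKENS)


def block_ai_bots(app):
    """Wrap a WSGI app so that blocked user agents receive 403 Forbidden."""

    def middleware(environ, start_response):
        if is_blocked_ai_agent(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everything else (browsers, Googlebot, Bingbot) passes through unchanged.
        return app(environ, start_response)

    return middleware
```

Any WSGI application can be wrapped this way, for example `application = block_ai_bots(application)`. A dedicated firewall or CDN rule does the same job before the request reaches your server.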

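Should I use robots.txt, a meta tag, or a server header?

robots.txt is the most standard option and applies to the whole site at once, so it is the recommended choice. To block only specific folders, robots.txt or a server header is suitable. An HTML meta tag works per page, which makes it a poor fit for folder-level blocking.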

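Can I use this tool's output as an official reference?

The tool generates code based on standard specifications, but AI bot policies change frequently. robots.txt alone is not sufficient for copyright protection that needs legal force, so combine it with terms of service, DRM, legal action, and similar measures.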

Block AI Crawlers — Stop GPTBot, ClaudeBot & More

1. Select AI bots to block

Major AI bots

GPTBot (OpenAI): Scrapes data to train ChatGPT models
ChatGPT-User (OpenAI): Fetches pages during ChatGPT live browsing
ClaudeBot (Anthropic): Scrapes data to train Claude models
Google-Extended (Google): Google AI (Gemini) training, separate from normal Search
CCBot (Common Crawl): Open crawl dataset used by many AI models
PerplexityBot (Perplexity): Perplexity AI search indexing and answers
Bytespider (ByteDance): ByteDance (TikTok) AI training data scraping
meta-externalagent (Meta): Meta AI (Llama) training data scraping


3. Output type

Standard robots.txt spec

# AI crawler blocking — generated by theutilhub.com
# Normal search engines (Googlebot, Bingbot) are NOT affected.

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: meta-externalagent
Disallow: /

📁 Upload this file to your site root (/robots.txt).
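
Before uploading, you can check how a standards-following parser reads these rules. The sketch below uses Python's standard-library `urllib.robotparser`; the helper name `allowed` and the `example.com` URL are placeholders chosen for illustration. It only shows how the rules are interpreted, not whether a particular crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# The eight default bots from the generated file above.
AI_BOTS = [
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "Google-Extended",
    "CCBot",
    "PerplexityBot",
    "Bytespider",
    "meta-externalagent",
]

# Rebuild the same rule blocks as the generated robots.txt.
ROBOTS_TXT = "\n\n".join(f"User-agent: {bot}\nDisallow: /" for bot in AI_BOTS) + "\n"


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ["GPTBot", "Google-Extended", "Googlebot", "Bingbot"]:
        print(agent, allowed(ROBOTS_TXT, agent, "https://example.com/blog/post"))
```

With these rules, the AI bots listed are reported as disallowed, while Googlebot and Bingbot are reported as allowed because no rule block names them. To check a deployed file instead of a string, `RobotFileParser` also offers `set_url()` and `read()`.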

🤖 Blocking Effect by AI Bot

Bot	Operator	Purpose
GPTBot	OpenAI	Scrapes data to train ChatGPT models
ChatGPT-User	OpenAI	Fetches pages during ChatGPT live browsing
ClaudeBot	Anthropic	Scrapes data to train Claude models
Google-Extended	Google	Google AI (Gemini) training — separate from normal Search
CCBot	Common Crawl	Open crawl dataset used by many AI models
PerplexityBot	Perplexity	Perplexity AI search indexing and answers
Bytespider	ByteDance	ByteDance (TikTok) AI training data scraping
meta-externalagent	Meta	Meta AI (Llama) training data scraping

What is the Block AI Crawlers Code Generator?

The Block AI Crawlers Code Generator creates standards-based code that tells major AI crawlers, including OpenAI (GPTBot), Anthropic (ClaudeBot), Google (Google-Extended), Meta, and Perplexity, not to scrape your website content as training data. Choose which bots to block, the scope (entire site, specific folders, or images only), and the output type (robots.txt, HTML meta tag, server header, or llms.txt), and ready-to-paste code is generated instantly. Crucially, this blocking does NOT affect indexing by normal search engines like Google or Bing: Googlebot and Google-Extended are separate user-agent tokens, so you keep your search visibility while selectively blocking only AI training. All code generation happens 100% in your browser; your paths and settings are never sent to a server. The tool also includes a database of 18+ AI bots with per-bot purpose descriptions and application guides.

Common Use Cases

✍️

Protect Blogs & Creative Work

Prevent your articles, art, and photos from being scraped as AI training data, preserving the value of your creative work. A single robots.txt file at the site root covers the entire site.

🏢

Corporate Site Policy

Apply a standard blocking policy so your company's product info and docs are not used for competitor AI or unauthorized training, while keeping search visibility intact.

🖼️

Block Image Scraping

Stop your portfolio and artwork images from being collected for image-generation AI training. Allow text while selectively blocking images only.

📁

Block Specific Folders

Exclude only paid content or members-only areas (/premium/, /members/) from AI training while leaving public areas open — fine-grained control.

How to Use

1. Select AI bots to block
The 8 major bots (GPTBot, ClaudeBot, Google-Extended, etc.) are selected by default. Expand the "More bots" accordion for additional bots, or use the quick buttons to select or deselect all.

2. Choose block scope
Pick entire site, specific folders, or images only. For folders, the path is normalized to the /blog/ format when the input field loses focus, and you can list multiple folders separated by commas.

3. Pick the output format
robots.txt is the most standard format and is recommended. Switch tabs to HTML meta tag, server header (Apache/Nginx), or llms.txt depending on your environment.

4. Copy and apply
Copy the generated code or download it as a file. Upload robots.txt to your site root, then visit yoursite.com/robots.txt after deployment to confirm it is being served.
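
The folder handling in step 2 can be sketched roughly as follows. This is an assumption about the behavior, not the tool's actual code: the function names `normalize_folder`, `parse_folders`, and `build_robots_txt` are hypothetical, and the sketch simply forces one leading and one trailing slash, splits on commas, and emits one rule block per bot.

```python
def normalize_folder(path: str) -> str:
    """Return the path with exactly one leading and one trailing slash."""
    parts = [part for part in path.strip().split("/") if part]
    if not parts:
        return "/"
    return "/" + "/".join(parts) + "/"


def parse_folders(raw: str) -> list[str]:
    """Split a comma-separated folder list, normalizing and de-duplicating it."""
    folders: list[str] = []
    for item in raw.split(","):
        if not item.strip():
            continue  # skip empty entries such as a trailing comma
        folder = normalize_folder(item)
        if folder not in folders:
            folders.append(folder)
    return folders


def build_robots_txt(bots: list[str], folders: list[str]) -> str:
    """Build one User-agent block per bot; no folders means the entire site."""
    targets = folders if folders else ["/"]
    blocks = []
    for bot in bots:
        lines = [f"User-agent: {bot}"]
        lines.extend(f"Disallow: {target}" for target in targets)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    folders = parse_folders("premium, /members")
    print(build_robots_txt(["GPTBot", "ClaudeBot"], folders))
```

Here `parse_folders("premium, /members")` yields `["/premium/", "/members/"]`, so each bot gets two `Disallow` lines. Passing an empty folder list falls back to `Disallow: /`, which matches the entire-site scope.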
