3 of 12 AI bots correctly configured · 4 critical issues
Search bots are missing from your allowlist. Two deprecated user-agent strings do nothing. Cloudflare is serving a different robots.txt at the CDN. Four issues to address below, roughly 10 minutes of edits.
AI bot matrix
| User-agent | Operator | Role | Status |
|---|---|---|---|
| GPTBot | OpenAI | Training | ● Blocked |
| OAI-SearchBot | OpenAI | Search | ○ Implicit allow |
| ChatGPT-User | OpenAI | User-fetch | ○ Implicit allow |
| ClaudeBot | Anthropic | Training | ● Blocked |
| Claude-SearchBot | Anthropic | Search | ○ Implicit allow |
| Claude-User | Anthropic | User-fetch | ○ Implicit allow |
| anthropic-ai | Anthropic (deprecated) | Training | ⊘ Deprecated · no-op |
| Claude-Web | Anthropic (deprecated) | Training | ⊘ Deprecated · no-op |
| PerplexityBot | Perplexity | Search | ○ Implicit allow |
| Perplexity-User | Perplexity | User-fetch | ○ Implicit allow |
| CCBot | Common Crawl | Training | ● Blocked |
| Google-Extended | Google AI | Training | ✓ Allowed |
Top 4 issues to fix
- Search bots only implicitly allowed. OAI-SearchBot, Claude-SearchBot, and PerplexityBot have no explicit Allow rule. They fall back to the wildcard group (User-agent: * with Allow: /), but operators recommend an explicit allowlist for clarity and to override CDN bot-management defaults that block bots with no matching rule. Fix: add explicit Allow blocks per the recommended robots.txt below.
- Two deprecated user-agent strings (anthropic-ai, Claude-Web) doing nothing. Anthropic deprecated both. Your robots.txt has Disallow rules targeting them, but the current ClaudeBot does not match those tokens, so the rules have no effect. Fix: remove the deprecated entries (purely cosmetic, but the stale rules also mislead any human auditor reading the file).
- Cloudflare is overriding your origin robots.txt. The CDN serves a different file to bot user-agents than to humans, with broader Disallow rules. This is the Cloudflare-managed AI crawler block enabled under Security > Bots. Fix: in the Cloudflare dashboard, disable "Manage your robots.txt" so your origin file takes precedence, or move the AI bot rules into Cloudflare directly.
- No /admin/ or /internal/ Disallow rules. Best practice for crypto sites is selective access: allow /docs/, /blog/, and /pricing/, but Disallow gated and admin areas. Your file has no Disallow on common gated paths. Fix: add Disallow: /admin/, Disallow: /internal/, and Disallow: /api-docs/private/.
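The difference between an explicit group, an implicit fallback to the wildcard, and a deprecated token can be checked mechanically. The following is a minimal Python sketch, not the checker's actual implementation: the helper names `named_agents` and `coverage` are hypothetical, `SAMPLE` is an invented file that merely resembles the situation described above, and matching is done on exact lowercased tokens only.

```python
# Hypothetical helpers for illustration; token names come from the matrix above.
DEPRECATED_TOKENS = {"anthropic-ai", "claude-web"}


def named_agents(robots_txt: str) -> set[str]:
    """Collect every User-agent token that has its own line in the file."""
    agents = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def coverage(robots_txt: str, bots: list[str]) -> dict[str, str]:
    """Label each bot as deprecated, explicitly named, or covered only by '*'."""
    named = named_agents(robots_txt)
    report = {}
    for bot in bots:
        token = bot.lower()
        if token in DEPRECATED_TOKENS:
            report[bot] = "deprecated token"
        elif token in named:
            report[bot] = "explicit group"
        else:
            report[bot] = "implicit (falls back to *)"
    return report


# Invented sample file (assumption), not the audited domain's real robots.txt.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    bots = ["GPTBot", "OAI-SearchBot", "anthropic-ai"]
    for bot, status in coverage(SAMPLE, bots).items():
        print(f"{bot}: {status}")
```

This only reports which tokens are named in the file; it does not evaluate Allow or Disallow paths, and it cannot see a file rewritten at the CDN unless it is given the body the CDN actually serves.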
Recommended robots.txt for crypto AEO
# ============================================================
# robots.txt for crypto AEO baseline
# Block AI training crawlers, allow AI search crawlers
# ============================================================

# --- AI search & user-fetch (allow for AEO visibility) ---
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

# --- AI training bulk crawlers (block by default) ---
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

# --- Default policy + crypto-specific Disallow ---
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /internal/
Disallow: /api-docs/private/

Sitemap: https://example-protocol.xyz/sitemap.xml
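To sanity-check a file like the one above before deploying it, the rules can be evaluated with a small longest-match evaluator in the style of RFC 9309. This is a simplified Python sketch under stated assumptions: `parse_groups` and `is_allowed` are hypothetical names, `RECOMMENDED_EXCERPT` is a shortened copy of the file above, user-agent matching is exact on the lowercased token, and the `*` and `$` path wildcards are not supported.

```python
def parse_groups(robots_txt):
    """Map each lowercased User-agent token to its (directive, path) rules."""
    groups = {}
    current = []      # agents named by the group currently being read
    in_rules = False  # True once the current group has at least one rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif key in ("allow", "disallow") and current:
            in_rules = True
            for agent in current:
                groups[agent].append((key, value))
    return groups


def is_allowed(groups, agent, path):
    """Longest matching prefix wins; Allow wins a tie; no match means allowed."""
    # A bot with its own group uses only that group; '*' is the fallback.
    rules = groups.get(agent.lower(), groups.get("*", []))
    best_len, allowed = -1, True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed


# Shortened excerpt of the recommended file (assumption: same rules, fewer bots).
RECOMMENDED_EXCERPT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /internal/
Disallow: /api-docs/private/

Sitemap: https://example-protocol.xyz/sitemap.xml
"""

if __name__ == "__main__":
    groups = parse_groups(RECOMMENDED_EXCERPT)
    checks = [("OAI-SearchBot", "/docs/"), ("GPTBot", "/docs/"),
              ("SomeOtherBot", "/admin/users"), ("OAI-SearchBot", "/admin/")]
    for agent, path in checks:
        print(agent, path, is_allowed(groups, agent, path))
```

One consequence worth noting: a crawler that matches its own group does not inherit rules from the `*` group, so in this sketch OAI-SearchBot is still allowed on /admin/. If the gated paths should be closed to every crawler, the Disallow lines likely need to be repeated inside each explicit Allow group.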
Want the full AEO readiness audit, not just robots.txt?
Robots.txt is one of four readiness dimensions in AI visibility. The full audit also covers schema readiness, factual density, and authority signals, plus 30+ prompts per category tested across the 3 LLMs. From $25 one-time per domain.
The output above is representative. Real checks return the actual bot-by-bot status for the domain you submit.
