Crawler Logs show you exactly which AI bots are visiting your website, how frequently they crawl, and which pages they prioritize. This data helps you understand how AI search engines discover and index your content.

What Are AI Crawlers?

AI crawlers are automated bots operated by AI companies to index web content. This indexed content powers AI search engines and large language models. Each major AI provider operates its own crawler:
| Crawler | Operator | Purpose |
| --- | --- | --- |
| GPTBot | OpenAI | Indexes content for ChatGPT and OpenAI’s search features |
| OAI-SearchBot | OpenAI | Dedicated search crawler for ChatGPT search |
| ChatGPT-User | OpenAI | Fetches pages in real time when a ChatGPT user browses the web |
| ClaudeBot | Anthropic | Indexes content for Claude’s training and retrieval |
| PerplexityBot | Perplexity | Indexes content for Perplexity’s AI search engine |
| Google-Extended | Google | Indexes content for Gemini and Google AI features |
| Applebot-Extended | Apple | Indexes content for Apple Intelligence features |
| meta-externalagent | Meta | Indexes content for Meta AI products |
| Bytespider | ByteDance | Indexes content for ByteDance AI products |
| cohere-ai | Cohere | Indexes content for Cohere’s language models |
PromptAlpha continuously updates its crawler detection database as AI companies introduce new bots. You do not need to take any action to track newly identified crawlers.

How PromptAlpha Detects Crawlers

PromptAlpha identifies AI crawlers by matching the User-Agent string in incoming HTTP requests against a maintained database of known AI bot signatures. This detection happens server-side and does not depend on the JavaScript tracking snippet. When a crawler visits your site, PromptAlpha logs:
  • Crawler identity — which bot made the request
  • Timestamp — when the visit occurred
  • URL path — which page was requested
  • Response status — whether the page was served successfully (200), blocked (403), or returned an error
  • Crawl frequency — how often the bot returns to the same page
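
As a sketch, signature-based User-Agent matching works roughly like this. The signature list below is a small hypothetical subset for illustration; PromptAlpha's actual database is larger and maintained for you.

```python
# Hypothetical subset of AI bot signatures (crawler name -> operator).
AI_BOT_SIGNATURES = {
    "GPTBot": "OpenAI",
    "OAI-SearchBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity",
    "Google-Extended": "Google",
    "Bytespider": "ByteDance",
}

def detect_crawler(user_agent: str):
    """Return (crawler, operator) if the User-Agent matches a known AI bot,
    else None. Matching is a case-insensitive substring check, since bots
    embed their name inside a longer User-Agent string."""
    ua = user_agent.lower()
    for signature, operator in AI_BOT_SIGNATURES.items():
        if signature.lower() in ua:
            return signature, operator
    return None

detect_crawler("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)")
# → ("GPTBot", "OpenAI")
```

A request whose User-Agent matches no signature (an ordinary browser, for example) is treated as regular traffic rather than a crawler visit.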

Understanding Crawler Frequency and Patterns

The Crawler Logs dashboard in Agent Analytics displays crawler activity over time. You can filter by:
  • Date range — view activity for a specific period
  • Crawler — isolate visits from a single bot (e.g., only GPTBot)
  • URL path — see which crawlers visited a specific page
Common patterns to look for:
  • Regular crawling — a bot that visits on a consistent schedule (daily or weekly) indicates your content is actively indexed.
  • Spike in crawl activity — a sudden increase may indicate new content was discovered or an AI provider is re-indexing your site.
  • Declining crawl frequency — fewer visits over time may suggest the crawler is deprioritizing your site, or that a robots.txt change is partially blocking access.
Compare crawler frequency against your content publishing schedule. If you publish new pages but do not see corresponding crawler visits, check your indexability configuration.
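
To quantify a pattern like "regular crawling," you can compute the average gap between consecutive visits per crawler and page. A minimal sketch, assuming log rows in the shape PromptAlpha records (crawler, timestamp, URL path, response status) — for example from a CSV export:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log rows: (crawler, ISO timestamp, URL path, HTTP status).
logs = [
    ("GPTBot", "2024-06-01T08:00:00", "/blog/post-1", 200),
    ("GPTBot", "2024-06-02T08:05:00", "/blog/post-1", 200),
    ("GPTBot", "2024-06-03T07:58:00", "/blog/post-1", 200),
    ("ClaudeBot", "2024-06-01T12:00:00", "/docs", 200),
    ("ClaudeBot", "2024-06-08T12:10:00", "/docs", 200),
]

def average_revisit_days(logs):
    """Average number of days between consecutive visits,
    grouped by (crawler, path). Pages seen only once are skipped."""
    visits = defaultdict(list)
    for crawler, ts, path, _status in logs:
        visits[(crawler, path)].append(datetime.fromisoformat(ts))
    result = {}
    for key, times in visits.items():
        times.sort()
        if len(times) > 1:
            gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
            result[key] = round(sum(gaps) / len(gaps), 1)
    return result

average_revisit_days(logs)
# → {("GPTBot", "/blog/post-1"): 1.0, ("ClaudeBot", "/docs"): 7.0}
```

A stable average (here, GPTBot returning daily) suggests active indexing; a growing average over successive date ranges is the "declining crawl frequency" signal described above.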

Which Pages Crawlers Visit Most

The Top Crawled Pages section ranks your pages by total crawler visits. This tells you which content AI systems consider most valuable or relevant. Use this data to:
  • Identify high-value content — pages crawled frequently are likely being used as source material by AI search engines.
  • Find gaps — important pages that receive few or no crawler visits may have indexability issues.
  • Prioritize optimization — focus your AI search optimization efforts on pages that crawlers already visit.
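
The ranking itself is a simple aggregation. A sketch of the same computation over exported log rows, assuming a (crawler, URL path) shape:

```python
from collections import Counter

# Hypothetical visit rows: (crawler, URL path).
visits = [
    ("GPTBot", "/pricing"),
    ("ClaudeBot", "/pricing"),
    ("PerplexityBot", "/pricing"),
    ("GPTBot", "/blog/launch"),
    ("ClaudeBot", "/blog/launch"),
    ("GPTBot", "/about"),
]

def top_crawled_pages(visits, n=3):
    """Rank URL paths by total crawler visits, most visited first."""
    counts = Counter(path for _crawler, path in visits)
    return counts.most_common(n)

top_crawled_pages(visits)
# → [("/pricing", 3), ("/blog/launch", 2), ("/about", 1)]
```

Running this over your own export surfaces the same ranking the Top Crawled Pages section shows, and lets you cross-reference it with pages you consider important but that never appear.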

Using Crawler Data to Inform robots.txt Decisions

Your robots.txt file controls which crawlers can access your site. Crawler Logs help you make informed decisions about what to allow or block. Before modifying your robots.txt, review your crawler logs to understand the current state:
  • Which crawlers are visiting your site today?
  • Are there crawlers you want to block?
  • Are crawlers you expected to see missing from the logs?
Blocking a crawler in robots.txt instructs it to stop crawling and indexing your content, which means the associated AI search engine may no longer surface your pages in its results. Consider the trade-offs carefully before blocking any crawler. See the Indexability Audits page for guidance.
Example robots.txt entries:
# Allow individual AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Block a specific crawler
User-agent: Bytespider
Disallow: /

# Allow crawlers but block specific paths
User-agent: GPTBot
Allow: /
Disallow: /admin/
Disallow: /private/
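
Before deploying a robots.txt change, you can sanity-check a draft with Python's standard-library parser. One caveat with this particular tool: `urllib.robotparser` applies the first matching rule in file order (unlike Google's longest-match precedence), so in the draft below the `Disallow` lines are listed before the broad `Allow`.

```python
from urllib.robotparser import RobotFileParser

# Draft rules mirroring the examples above, reordered for the
# first-match semantics of Python's parser.
rules = """\
User-agent: GPTBot
Disallow: /admin/
Disallow: /private/
Allow: /

User-agent: Bytespider
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

parser.can_fetch("GPTBot", "/blog/post")     # → True
parser.can_fetch("GPTBot", "/admin/panel")   # → False
parser.can_fetch("Bytespider", "/blog/post") # → False
```

After deploying, confirm the change took effect by watching Crawler Logs: a blocked crawler's requests should taper off, or start returning non-200 statuses if you also block it at the server level.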

Exporting Crawler Log Data

You can export your crawler log data as a CSV file for further analysis:
  1. Navigate to Agent Analytics > Crawler Logs.
  2. Set your desired date range and filters.
  3. Click the Export button in the top-right corner.
  4. Select CSV as the export format.
The export includes the crawler name, timestamp, URL path, and response status for each logged visit.
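
Once exported, the CSV is straightforward to analyze with standard tools. A minimal sketch using Python's csv module — the column names below are assumptions based on the fields listed above, so adjust them to match the actual header row in your export:

```python
import csv
import io

# Stand-in for an exported file; the real export's header names may differ.
export = io.StringIO("""\
crawler,timestamp,path,status
GPTBot,2024-06-01T08:00:00,/pricing,200
ClaudeBot,2024-06-01T09:30:00,/pricing,200
GPTBot,2024-06-02T08:05:00,/blog/launch,403
""")

def visits_per_crawler(csv_file):
    """Count successfully served (status 200) visits per crawler."""
    counts = {}
    for row in csv.DictReader(csv_file):
        if row["status"] == "200":
            counts[row["crawler"]] = counts.get(row["crawler"], 0) + 1
    return counts

visits_per_crawler(export)
# → {"GPTBot": 1, "ClaudeBot": 1}
```

Filtering on the status column is a quick way to spot pages that crawlers request but cannot retrieve — for instance, the 403 row above would indicate GPTBot being blocked from /blog/launch.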