2026-04-07·3 min read

Track AI Crawlers on Your Website With Lumen's Cloudflare Integration

If you're working on answer engine optimization, you're probably already watching how AI models mention and cite your brand. Good. But here's the part most teams completely overlook: what are AI bots actually doing on your website?

GPTBot, ClaudeBot, PerplexityBot, plus all their search and assistant variants, are hitting your site every single day. Some are training crawlers vacuuming up content for the next model version. Others are search bots grabbing pages in real time so they can answer somebody's query right now. And then there are the assistant bots, acting on behalf of individual users mid-conversation. They all behave differently. They all matter.

We just shipped Lumen's Cloudflare integration. It puts all of that activity in one place.


What You Get

Hook up your Cloudflare account and Lumen pulls AI crawler data for you daily. No exports. No grepping through access logs. No clicking around the Cloudflare dashboard trying to piece things together.

Crawl volume and trends

A line chart showing total AI crawler requests over time, broken down by day. Hit "Compare bots" and you'll see exactly which ones are most active and when their behavior shifts. You can filter by category (training crawlers, AI search, AI assistants) or by date range, anywhere from 1 month to 12 months.

Bot breakdown

Every AI bot that touches your site, grouped into three buckets:

  • Training Crawlers like GPTBot and ClaudeBot. These are the ones collecting your content for future model training.
  • AI Search bots like PerplexityBot and OAI-SearchBot. They pull your pages to answer search queries live.
  • AI Assistants like ChatGPT-User and Claude-User. These hit your site because a real person asked about something in a conversation.

Each bot gets a request count and its share of your total AI traffic as a percentage.
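If you want to sanity-check those numbers against your own logs, here's a rough sketch of the same idea. It is not how Lumen does it internally. The mapping covers only the six bots named above, and the names BOT_CATEGORIES, identify_bot, and traffic_shares are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical mapping built only from the bots named in this post.
# A real list would be longer and would need regular updates.
BOT_CATEGORIES = {
    "GPTBot": "Training Crawlers",
    "ClaudeBot": "Training Crawlers",
    "PerplexityBot": "AI Search",
    "OAI-SearchBot": "AI Search",
    "ChatGPT-User": "AI Assistants",
    "Claude-User": "AI Assistants",
}


def identify_bot(user_agent: str) -> Optional[str]:
    """Return the known bot token found in a User-Agent string, or None."""
    ua = user_agent.lower()
    for token in BOT_CATEGORIES:
        if token.lower() in ua:
            return token
    return None


def traffic_shares(user_agents: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Map each bot to (request count, percent of all AI bot requests)."""
    counts = Counter()
    for ua in user_agents:
        bot = identify_bot(ua)
        if bot is not None:  # ignore browsers and unknown agents
            counts[bot] += 1
    total = sum(counts.values())
    return {
        bot: (n, round(100 * n / total, 1))
        for bot, n in counts.most_common()
    }


if __name__ == "__main__":
    sample = [
        "Mozilla/5.0 (compatible; GPTBot/1.2)",
        "Mozilla/5.0 (compatible; GPTBot/1.2)",
        "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
        "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0",
    ]
    for bot, (count, pct) in traffic_shares(sample).items():
        print(f"{bot:15} {BOT_CATEGORIES[bot]:18} {count:3} {pct:5.1f}%")
```

Matching on the User-Agent string is the simplest approach, but user agents can be spoofed, so treat the output as an estimate.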

Allowed vs. blocked

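This one's straightforward. The dashboard splits bot responses into allowed (2xx), blocked (4xx), and errors (5xx). If you've set up Cloudflare bot management to block certain crawlers, you'll know whether those rules are actually doing their job and how much traffic they're catching. One caveat on robots.txt: it's advisory, so a bot that honors it simply stops requesting the disallowed pages. A working robots.txt rule typically shows up as a drop in that bot's requests, not as a pile of 4xx responses.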
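The split itself is simple to reproduce. Here's a minimal sketch of the bucketing by status code. The function names are hypothetical, and the "other" bucket for 1xx and 3xx responses is an assumption, since the dashboard description above doesn't say how redirects are counted.

```python
from collections import Counter
from typing import Iterable


def response_bucket(status: int) -> str:
    """Map an HTTP status code to one of the buckets described above."""
    if 200 <= status <= 299:
        return "allowed"
    if 400 <= status <= 499:
        return "blocked"
    if 500 <= status <= 599:
        return "errors"
    # Assumption: the post does not cover 1xx/3xx, so keep them separate.
    return "other"


def summarize(statuses: Iterable[int]) -> Counter:
    """Count responses per bucket."""
    return Counter(response_bucket(s) for s in statuses)


if __name__ == "__main__":
    print(summarize([200, 200, 403, 429, 503, 301]))
```

Keep in mind that 4xx covers more than deliberate blocks. A 404 on a deleted page lands in the same bucket as a 403 from a firewall rule.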

Bandwidth consumed

Training bots in particular can eat through bandwidth fast, especially on bigger sites. The dashboard shows total bandwidth from AI bots alongside the daily average. Useful for catching unexpected spikes.
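For a sense of what "total alongside the daily average" looks like in practice, here's a small sketch. The input format (a dict of date to bytes) and the spike rule (anything over twice the daily average) are both assumptions made for the example, not Lumen's actual logic.

```python
from typing import Dict, List, Union

Summary = Dict[str, Union[int, float, List[str]]]


def bandwidth_summary(daily_bytes: Dict[str, int],
                      spike_factor: float = 2.0) -> Summary:
    """Return total bytes, daily average, and days above the spike threshold.

    spike_factor is a hypothetical knob: a day counts as a spike when its
    bandwidth exceeds spike_factor times the daily average.
    """
    total = sum(daily_bytes.values())
    average = total / len(daily_bytes) if daily_bytes else 0.0
    spikes = sorted(
        day for day, used in daily_bytes.items()
        if used > spike_factor * average
    )
    return {"total": total, "daily_average": average, "spike_days": spikes}


if __name__ == "__main__":
    usage = {
        "2026-04-01": 100,
        "2026-04-02": 120,
        "2026-04-03": 500,
        "2026-04-04": 80,
    }
    print(bandwidth_summary(usage))
```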


Why This Matters for AEO

Answer engine optimization isn't only about what AI models say about you. It's about what they can see in the first place.

Block GPTBot and your content stays out of OpenAI's future training crawls. PerplexityBot can't reach your pages? Perplexity has nothing fresh to cite in its answers. ClaudeBot getting 403s on your best content? Claude may be working with stale information. Or none at all.

Lumen's Cloudflare integration pulls this signal out automatically. One glance tells you which AI bots can access your content and which ones can't. From there, you can make real decisions about your robots.txt and bot management rules instead of guessing.
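Before you change a robots.txt rule, it helps to check what the file actually says to each bot. Here's a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the example.com URL, and the access_report helper are all invented for the example.

```python
from typing import Dict, Iterable
from urllib.robotparser import RobotFileParser

# Example rules only: block GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def access_report(robots_txt: str, bots: Iterable[str],
                  url: str) -> Dict[str, bool]:
    """Return, per bot, whether robots.txt permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


if __name__ == "__main__":
    bots = ["GPTBot", "ClaudeBot", "PerplexityBot"]
    report = access_report(ROBOTS_TXT, bots, "https://example.com/pricing")
    for bot, allowed in report.items():
        print(f"{bot:15} {'allowed' if allowed else 'disallowed'}")
```

This only tells you what robots.txt asks for. Whether a given bot honors it, and what your firewall rules do on top, is what the allowed vs. blocked numbers show.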


Getting Started

Takes about a minute:

  1. Head to the Integrations page in your Lumen dashboard
  2. Click "Connect" on the Cloudflare card
  3. Paste in a Cloudflare API token with Zone Analytics permissions
  4. Pick your zone (domain)
  5. That's it

The AI Crawlers tab on your Analytics page will light up as soon as the backfill finishes, usually within a few minutes.
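If the connection fails, it can help to check the token on its own first. Here's a rough sketch that calls Cloudflare's token verification endpoint using only the Python standard library. The helper names are hypothetical, and this is not part of Lumen's setup flow. It also only confirms the token is active. It does not confirm that the token carries Zone Analytics permissions.

```python
import json
import urllib.error
import urllib.request

# Cloudflare's endpoint for verifying an API token.
VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify"


def build_verify_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the verification request."""
    return urllib.request.Request(
        VERIFY_URL,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def token_is_active(token: str, timeout: float = 10.0) -> bool:
    """Return True if Cloudflare reports the token as active."""
    try:
        with urllib.request.urlopen(build_verify_request(token),
                                    timeout=timeout) as resp:
            body = json.load(resp)
    except urllib.error.HTTPError:
        # Invalid or revoked tokens come back as an HTTP error status.
        return False
    result = body.get("result") or {}
    return bool(body.get("success")) and result.get("status") == "active"


if __name__ == "__main__":
    import os

    # CLOUDFLARE_API_TOKEN is a placeholder variable name for this example.
    token = os.environ.get("CLOUDFLARE_API_TOKEN", "")
    print("active" if token and token_is_active(token) else "not active")
```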
