What CDN Logs Can Tell You About AI SEO

By Ves Ivanov
3 min read·701 words

AI-generated TLDR

If a page gets a hit from ChatGPT-User, Claude-User, or Perplexity-User, that does not prove the page shaped the final answer. A retrieval event means the page was pulled into a live answer workflow by a RAG system in response to a user question. If a page gets retrieval hits and never earns citations or referrals, you should start asking whether it is quotable. And if an important page never gets retrieval hits at all, that is often the clearest optimization target.

AI SEO is hard to measure because the most important step is usually the least visible.

You can see crawler activity. You can track citation or a referrals but the middle of the pipeline when an AI system actually pulls your page into an answer is usually hidden.

CDN logs are one of the few places where that middle layer is visible.

If a page gets a hit from ChatGPT-User, Claude-User, or Perplexity-User, that does not prove the page shaped the final answer. It does tell you, though, that your content was fetched during a user session inside an AI product, not just discovered by a crawler.

That distinction matters because it gives you a much better diagnostic model than "we got cited" or "we didn't."

Start with the concrete case

Say a page has been getting repeated hits from ChatGPT-User and Perplexity-User, but you still do not see visible citations or meaningful referral traffic.

That is not a dead end. It is a clue.

It tells you the page is probably clearing the discovery gate and failing later. The issue is less likely to be "AI systems do not know this page exists" and more likely to be "the page was considered, but something else won." That something else is often clarity, extractability, freshness, source fit, or the presence of a more direct answer on a competing page.

Without retrieval data, both cases look the same. With retrieval data, you can separate "not found" from "not chosen."

The model to use: crawl -> retrieval -> citation

An easy way to think about AI visibility is:

crawl -> retrieval -> citation

A crawl event means the page was discovered, refreshed, or indexed for some later use.

A retrieval event means the page was pulled into a live answer workflow by a RAG system in response to a user question.

A citation means the page made it to the surface through a visible link or a downstream referral.

Those are different signals, and they answer different questions.

  • Crawl: Can the system find me?
  • Retrieval: Am I making it into the candidate set?
  • Citation: Am I getting visibility?

We tend to focus on the first and third, but the second is the most actionable.
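The crawl-vs-retrieval split above can be sketched as a small classifier over User-Agent strings. The agent names here are the ones the vendors document (see the notes at the end); verify against the current bot documentation before relying on this mapping.

```python
# Sketch: map a raw User-Agent string onto the crawl/retrieval model.
# Agent lists are based on vendor bot docs and may change; treat them
# as assumptions to verify, not a complete inventory.
CRAWL_AGENTS = {"GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"}
RETRIEVAL_AGENTS = {"ChatGPT-User", "Claude-User", "Perplexity-User"}

def classify(user_agent: str) -> str:
    """Classify a request as crawl, retrieval, or other."""
    if any(a in user_agent for a in RETRIEVAL_AGENTS):
        return "retrieval"  # user-triggered fetch during a live answer
    if any(a in user_agent for a in CRAWL_AGENTS):
        return "crawl"      # discovery/indexing, not a live answer
    return "other"

print(classify("Mozilla/5.0; ChatGPT-User/1.0"))   # retrieval
print(classify("Mozilla/5.0; OAI-SearchBot/1.0"))  # crawl
```

Checking retrieval agents before crawl agents matters only if a UA string could contain both; the order here is a defensive choice, not a vendor requirement.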

What retrieval hits can help you diagnose

If a URL gets repeated *-User fetches over time, that page is likely relevant: it keeps showing up in the consideration set.

If a page gets retrieval hits and never earns citations or referrals, you should start asking whether it is quotable. Is the page too broad? Does it bury the key fact under a long intro? Does it lack a clean definition, a direct comparison table, a timestamp, or a strong opening summary?

And if an important page never gets retrieval hits at all, that is often the clearest optimization target. The page may be misaligned with the way real user questions are phrased, too vague to rank as a source, or too thin to survive retrieval.
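The diagnostic above, separating "not found" from "not chosen," can be sketched as a simple bucketing function. The counts and page names are illustrative; the retrieval numbers would come from your CDN logs and the citation numbers from whatever citation tracking you already run.

```python
# Hypothetical sketch: bucket pages by retrieval and citation counts,
# following the crawl -> retrieval -> citation model. Inputs are
# assumed to be per-URL counts you have gathered elsewhere.
def diagnose(retrievals: int, citations: int) -> str:
    if retrievals == 0:
        return "not found: realign with how real user questions are phrased"
    if citations == 0:
        return "not chosen: retrieved but never cited, check quotability"
    return "winning: retrieved and cited"

# Illustrative data: URL -> (retrieval hits, citations)
pages = {"/pricing": (14, 0), "/old-guide": (0, 0), "/glossary": (9, 3)}
for url, (r, c) in pages.items():
    print(url, "->", diagnose(r, c))
```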

How to track retrievals

First, filter CDN logs for these user agents:

  • ChatGPT-User
  • Claude-User
  • Perplexity-User

Then group by URL, count repeat hits, and compare those URLs against the pages you actually want to win with.

In practice, that gives you a sensible workflow:

  • Match on *-User in your logs.
  • Look for repeated patterns by URL or section.
  • Then compare retrieval activity against citations, referrals, and business-priority pages.
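The workflow above can be sketched in a few lines of Python. This assumes your CDN writes something close to the combined log format; the regex and field positions are assumptions to adapt to your CDN's actual log schema.

```python
import re
from collections import Counter

AI_USER_AGENTS = ("ChatGPT-User", "Claude-User", "Perplexity-User")

# Combined log format (assumed):
# ip - - [time] "METHOD /path HTTP/x" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<url>\S+) HTTP/[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def retrieval_hits(lines):
    """Count *-User fetches per URL from raw access-log lines."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        if any(agent in m.group("ua") for agent in AI_USER_AGENTS):
            hits[m.group("url")] += 1
    return hits

sample = [
    '1.2.3.4 - - [01/Mar/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 ChatGPT-User/1.0"',
    '1.2.3.4 - - [01/Mar/2026:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Perplexity-User/1.0"',
    '5.6.7.8 - - [01/Mar/2026:10:02:00 +0000] "GET /blog HTTP/1.1" 200 512 "-" "Mozilla/5.0 (regular browser)"',
]
print(retrieval_hits(sample).most_common())  # /pricing counted twice, /blog ignored
```

From here, sorting `most_common()` against your business-priority URL list is the comparison step the last bullet describes.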

Use the docs

If you're unsure what a request means, check the vendor documentation; it is the best ground truth you have. Then use your own logs to see what is actually happening.

Notes:

  • In a March 2026 ChatGPT study from Airops (https://www.airops.com/report/influence-of-retrieval-fanout-and-google-serps-in-chatgpt), 548,534 pages were retrieved during answer generation, but only 15% were cited in the final response. That is exactly why CDN logs matter: a ChatGPT-User, Claude-User, or Perplexity-User hit is not telling you that you won the citation. It is telling you something earlier and often more useful — that your page made it into the candidate pool at all.

  • OpenAI explicitly separates OAI-SearchBot from ChatGPT-User, saying OAI-SearchBot is used to surface sites in ChatGPT search while ChatGPT-User is used for certain user-triggered actions and is not used for automatic crawling or Search inclusion. Anthropic and Perplexity document similar distinctions between search crawlers and user-triggered fetch agents in their own bot documentation (Anthropic, Perplexity).
