How to Rank in AI Search


I recently left a comment on Reddit about ranking in AI search that got some attention, so I figured I’d write a detailed article about it.

Here’s the premise:

This blog gets maybe 5 clicks per month from Google, but I was able to optimize it to receive traffic from AI search. Now I’ll walk you through how I did that.

TL;DR

Ranking in AI search is less about traditional SEO and more about optimizing for Retrieval-Augmented Generation (RAG): making your content easy for AI models to retrieve, quote, and cite.

In this article, I’ll walk through the tactics I’ve found most effective, from deep, structured coverage and citing primary research to including expert quotes and keeping your entity consistent across platforms.

Two Ways to Get Cited by LLMs

Before we get into it, let’s quickly go over how ranking inside LLMs works. There are two ways to get cited:

1. Training data inclusion – Your content gets absorbed during model training. This is largely out of your control and mostly a waste of effort to chase: your content just becomes part of the text the model generates, with no links back to you.

2. RAG (Retrieval-Augmented Generation) – The LLM searches the web in real-time, retrieves your content, and cites it in responses. This is what you should optimize for.

Most AI search products are built on RAG. They fetch pages, feed them into the model’s context window, and then generate answers based on that retrieved content.

This is now the default and preferred pathway because:

  • It’s not limited by training cut-off dates
  • The actual URLs get cited
  • You can optimize your content for retrieval
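
To make that concrete, here’s a minimal sketch of the loop. The search_web, fetch_page, and llm pieces are hypothetical stand-ins, not any specific product’s API:

```python
def answer_with_rag(question: str, search_web, fetch_page, llm, top_k: int = 5) -> str:
    """Hypothetical RAG loop: search, fetch, stuff context into the prompt, generate."""
    results = search_web(question)[:top_k]              # ranked URLs from a web index
    context_blocks = []
    for result in results:
        page_text = fetch_page(result.url)              # fetched and cleaned page text
        context_blocks.append(f"Source: {result.url}\n{page_text}")

    # Retrieved text goes into the prompt with its URL attached, which is why
    # pages that get retrieved are the pages that get cited.
    prompt = (
        "Answer the question using only the sources below, and cite their URLs.\n\n"
        + "\n\n---\n\n".join(context_blocks)
        + f"\n\nQuestion: {question}"
    )
    return llm(prompt)
```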

All the tactics below focus on optimizing for RAG.

What Doesn’t Work

Schema Markup

Schema markup powers Google’s rich results, but most RAG ingestion pipelines grab the rendered DOM, strip <script> tags, and collapse the rest to Markdown. JSON-LD never makes it into the context window, so the LLM literally cannot see it.
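
If you want to see why, here’s a rough approximation of that ingestion step using requests and BeautifulSoup. Details vary between pipelines, but dropping <script> tags before the text reaches the model is near-universal:

```python
import requests
from bs4 import BeautifulSoup

def page_to_plain_text(url: str) -> str:
    """Rough approximation of RAG ingestion: fetch, strip scripts, flatten to text."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # JSON-LD lives inside <script type="application/ld+json"> blocks,
    # so it disappears right here, before the model ever sees the page.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    return soup.get_text(separator="\n", strip=True)
```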

E-E-A-T

E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) is a guideline for Google’s human search raters. Inside an LLM, there is no fixed vector for “experience,” “expertise,” “authoritativeness,” or “trust” — the model is just doing next-token prediction built on co-occurrence statistics.

What Actually Works

Deep, Structured Coverage

Go deeper than anyone else on the topic: cover edge cases, counterarguments, and technical specifics. Structure it well.

What matters is information density + clear segmentation. Break your article into short, well-labeled sections so retrievers can isolate the right span. Add a 40-word max “micro-summary” under every H2; LLMs grab it when the context window is tight.
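
To see why segmentation matters, here’s a simplified sketch of heading-based chunking, roughly what many RAG pipelines do before embedding (real pipelines also split on token counts and add overlap):

```python
import re

def chunk_by_headings(markdown_text: str):
    """Split a Markdown article into (heading, body) chunks at H2/H3 boundaries."""
    chunks = []
    current_heading, current_lines = "Intro", []
    for line in markdown_text.splitlines():
        if re.match(r"^#{2,3}\s", line):        # a new H2/H3 starts a new chunk
            if current_lines:
                chunks.append((current_heading, "\n".join(current_lines).strip()))
            current_heading, current_lines = line.lstrip("# ").strip(), []
        else:
            current_lines.append(line)
    if current_lines:
        chunks.append((current_heading, "\n".join(current_lines).strip()))
    return chunks

article = """## What Is RAG?
RAG retrieves pages at query time and feeds them to the model.

## How to Optimize for It
Use clear headings, short sections, and a micro-summary under each H2.
"""

for heading, body in chunk_by_headings(article):
    print(f"[{heading}] {body}")
```

Each (heading, body) pair is what typically gets embedded and retrieved, which is why a tight heading plus a dense opening paragraph tends to surface well.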

Cite Primary Research and Data

Back claims with primary, verifiable sources:

  • Peer-reviewed papers (Google Scholar, PubMed, arXiv)
  • Industry studies with clear methodology
  • Government or NGO datasets and reports
  • Original data you collected

RAG systems favor documents that read as reliable. Clear sourcing strengthens perceived credibility, and that can be the difference between getting cited and being passed over.

Include Expert Quotes

Use properly formatted blockquotes with attribution:

“Quote here.” – Expert Name, Organization

This improves readability and makes authority relationships machine-parsable. It’s also another credibility signal.

Keep Entities Consistent Across Platforms

Consistency of your name, brand, and expertise area across LinkedIn, X (Twitter), and your site helps systems identify you as a distinct entity.

Use Lists, Tables, and Headings

Structured formatting is one of the highest-leverage optimizations. Preserving HTML hierarchy (H2/H3s, lists, tables) improves retrieval and generation quality compared with flattened text. Keep the heading hierarchy intact, with no skipped levels.

A clear structure makes it easier for RAG systems to chunk, quote, and ground from your page.

Common Crawl

Common Crawl is the giant open-web dataset used by many models during training, and large parts of it overlap with what retrieval systems index today.

Being visible on crawlable, authoritative domains boosts both training-time inclusion and retrieval-time discoverability.

How to leverage it:

  • Write guest posts on trusted, well-archived sites that show up consistently in Common Crawl
  • Get quoted in publications or research outlets
  • Ensure your site isn’t blocking crawlers via robots.txt or meta tags (a quick check is sketched below)
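
For that last point, Python’s standard library gives you a quick robots.txt check. The user-agent tokens below are ones commonly used by AI crawlers and training opt-out controls; swap in your own domain and whichever agents you care about:

```python
from urllib.robotparser import RobotFileParser

SITE = "https://example.com"        # swap in your own domain
AI_CRAWLERS = ["GPTBot", "CCBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

robots = RobotFileParser()
robots.set_url(f"{SITE}/robots.txt")
robots.read()

for agent in AI_CRAWLERS:
    status = "allowed" if robots.can_fetch(agent, f"{SITE}/") else "blocked"
    print(f"{agent}: {status}")
```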

Write Articles with Lists and Include Your Brand

RAG systems love lists.

When people ask “best tools,” “top methods,” or “X vs Y,” AI models look for list-based articles to build their answers. Enumerated lists are easy for retrieval systems to parse, extract, and cite directly.

That means if you write list-style content like “Top 10 AI SEO Tools,” your article has a decent chance of being retrieved and quoted.

Include your brand, product, or site naturally within that list, not as an ad, but as one credible entry among others. When the model compiles its answer, your brand is more likely to appear alongside the rest.

Reddit (Maybe)

I hear a ton of SEOs talk about Reddit, but I don’t have personal experience with it. Reddit links do slip into AI answers, but I’m not convinced they get cited as often as the LinkedIn posts claim, or often enough to be a reliable RAG play.

Experimentation

That’s what’s working for me right now. Don’t take it as gospel; go test it, see what happens. It’s the best time to experiment.
