AI Overviews Cite AI-Generated Content More Than Human Writing

Si Quan Ong
Content marketer @ Ahrefs. I've been in digital marketing for the past 6 years and have spoken at some of the industry’s largest conferences in Asia (TIECon and Digital Marketing Skill Share). I also write about my curiosities on my Substack.
AI Overviews are AI-generated content, which means they can contain hallucinations.

Google uses “grounding” to improve their accuracy, but according to our research, AI Overviews are more likely to cite AI-generated content than human-written content.

Here’s what we found:

We took one million SERPs showing AI Overviews from Ahrefs Keywords Explorer and extracted the top three cited links from each. That gave us 1.9 million URLs in total, of which 500,000 were in our database.

We ran each URL through our own AI content detector, which is part of Page Inspect in Site Explorer.

Running my blog post through AI Content Detector

Here’s what our content detector found:

  • 3.6% of pages cited in AI Overviews were categorized as “pure AI.”
  • 8.6% were categorized as “pure human.”
  • 87.8% were categorized as a mix of the two.

The pages that mixed human and AI content break down as follows (percentages are shares of all cited pages, so they add up to 87.8%):

  • 11.2% showed minimal AI use (1-10% of the page content was categorized as AI).
  • 44% showed moderate AI use (11-40%).
  • 24.7% showed substantial AI use (41-70%).
  • 7.9% showed dominant AI use (71-99%).
How many pages in AI Overviews are created/assisted with AI
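
The bucketing above can be sketched in a few lines of Python. This is only an illustration of the ranges listed in this article, not the detector's actual code: the function name `ai_usage_bucket`, the sample scores, and the decision to treat fractional scores between 0 and 1 as "minimal" are all assumptions.

```python
from collections import Counter


def ai_usage_bucket(ai_percent: float) -> str:
    """Map a page's AI-content percentage to the labels used in this article.

    Hypothetical helper: the thresholds mirror the ranges listed above,
    but the handling of fractional scores is an assumption.
    """
    if not 0 <= ai_percent <= 100:
        raise ValueError("ai_percent must be between 0 and 100")
    if ai_percent == 0:
        return "pure human"
    if ai_percent == 100:
        return "pure AI"
    if ai_percent <= 10:
        return "minimal"      # 1-10%
    if ai_percent <= 40:
        return "moderate"     # 11-40%
    if ai_percent <= 70:
        return "substantial"  # 41-70%
    return "dominant"         # 71-99%


# Made-up scores for illustration only; these are not study data.
sample_scores = [0, 5, 25, 25, 60, 85, 100]
bucket_counts = Counter(ai_usage_bucket(score) for score in sample_scores)
```

Dividing each count in `bucket_counts` by the number of pages gives the kind of percentage breakdown shown in the lists above.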

These findings become even more striking when compared to our previous research on AI content across the web. In our analysis of 900,000 new pages, we found that:

  • 2.5% of pages were categorized as “pure AI.”
  • 25.8% were categorized as “pure human.”
  • 71.7% were categorized as a mix of the two.

Even though this research looked only at new pages (and not all cited URLs in AI Overviews will be new), this suggests that Google’s AI Overviews might show a bias toward citing AI-generated or AI-assisted content compared to the general distribution of content on the web.
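
To make the comparison concrete, here is the arithmetic behind it as a short Python sketch. The percentages come from the two studies above; the dictionary and function names are hypothetical, and counting "pure AI" plus "mixed" as "contains AI content" is the grouping assumed here.

```python
# Category shares (in %) as reported in the two studies above.
ai_overview_citations = {"pure_ai": 3.6, "pure_human": 8.6, "mixed": 87.8}
new_web_pages = {"pure_ai": 2.5, "pure_human": 25.8, "mixed": 71.7}


def ai_touched_share(shares: dict) -> float:
    """Share of pages with any AI content: pure AI plus mixed."""
    return round(shares["pure_ai"] + shares["mixed"], 1)


cited = ai_touched_share(ai_overview_citations)  # pages cited in AI Overviews
baseline = ai_touched_share(new_web_pages)       # new pages across the web
gap_points = round(cited - baseline, 1)          # gap in percentage points

print(f"Cited in AI Overviews: {cited}%")
print(f"New pages on the web:  {baseline}%")
print(f"Gap: {gap_points} percentage points")
```

The gap is expressed in percentage points, and because the two samples were collected differently, it is indicative rather than a controlled comparison.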

Sidenote.
No AI content detector is perfect. Like LLMs, AI detectors are statistical models. They deal in probabilities, not certainty. They can be incredibly accurate, but they always carry the risk of false positives. You can learn more about how AI detectors work, and why they’re useful, in these articles:

We calculated the correlation between AI content percentage and the order of citations in AI Overviews across our entire dataset. The correlation was 0.017, effectively zero.

This suggests that Google doesn’t explicitly penalize or reward content based on whether it’s human or AI-generated when selecting sources for AI Overviews.
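
For readers who want to run a similar check on their own data, here is a minimal sketch of a correlation calculation. It assumes a Pearson correlation, which this article does not specify; the function name and the toy data are made up for illustration and are not the study's dataset.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return covariance / math.sqrt(var_x * var_y)


# Toy data (not study data): AI-content percentage of each cited page
# and its position (1-3) in the AI Overview's citation list.
ai_percent = [10, 80, 45, 45, 10, 80]
citation_position = [1, 1, 2, 2, 3, 3]

r = pearson_correlation(ai_percent, citation_position)
print(f"correlation: {r:.3f}")
```

A value near zero, as in this toy example, means a page's AI percentage tells you nothing about where it appears in the citation order. Because citation position is an ordinal rank, a rank-based measure such as Spearman's would be a reasonable alternative.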

But considering that 91.4% of cited pages contain at least some AI-generated content (87.8% mixed plus 3.6% pure AI), we’re watching AI eat its own tail in real time. Google’s own AI-generated content is citing other AI-generated content, creating a feedback loop.

I don’t think Google is necessarily being careless either. Part of what we’re observing may simply reflect the current state of the web. For example, in our study of 900K pages, we found that 74% of new webpages include AI-generated content.

How many pages are created/assisted with AI

Google depends on creators for content. But creators increasingly use AI to write or assist with that content. For example, in our State of AI in Content Marketing report, which surveyed 879 marketers, 87% of respondents said they use AI to help create content.

Bar chart showing how many marketers use AI to create content

And even though Google’s trying to improve accuracy through retrieval-augmented generation (RAG), we’ve also found that 86.5% of top-ranking pages contain some amount of AI-generated content.

How many pages in the top 20 search results are created/assisted with AI

This means AI Overviews are drawing from a web that’s increasingly AI-generated. We may be witnessing the emergence of a content ecosystem where machines talk to machines.

Start using AI Content Detector

Ahrefs’ AI Content Detector is part of Site Explorer. Just enter any URL, go to Page Inspect, then click on the AI Detector tab.

AI content detector

It’ll tell you what percentage of the content is AI-generated and which LLM was used.

AI content level in AI content detector