{"id":196421,"date":"2026-04-15T09:33:27","date_gmt":"2026-04-15T14:33:27","guid":{"rendered":"https:\/\/ahrefs.com\/blog\/?p=196421"},"modified":"2026-05-31T02:58:00","modified_gmt":"2026-05-31T07:58:00","slug":"why-chatgpt-cites-pages","status":"publish","type":"post","link":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/","title":{"rendered":"Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)"},"content":{"rendered":"<div class=\"intro-txt\"> We\u2019ve all got used to the little numbered blue links in ChatGPT\u2019s responses. They\u2019re the citations that back up ChatGPT\u2019s responses with external information.<\/div>\n<p>But, although ChatGPT retrieves dozens of URLs to answer a single query, according to our research, it only ends up citing ~50% of&nbsp;them.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1476\" height=\"1749\" class=\"wp-image-196422\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png\" alt=\"Pie chart shows ChatGPT cites about half the URLs it retrieves: 49.98% cited (23.4M URLs) vs. 
50.02% not cited.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png 1476w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls-359x425.png 359w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls-768x910.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls-1296x1536.png 1296w\" sizes=\"auto, (max-width: 1476px) 100vw, 1476px\"><\/p>\n<p>Why does one page get the credit while another, which the AI clearly retrieved, gets nothing?<\/p>\n<p>According to <a href=\"https:\/\/dejanmarketing.com\/gpt-search\/\">studies<\/a> by AI expert <a href=\"https:\/\/au.linkedin.com\/in\/seoguy\">Dan Petrovic<\/a>, when ChatGPT retrieves results, each one comes back with the page title, a brief snippet or summary, the URL, and an ID number.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1698\" height=\"976\" class=\"wp-image-196423\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/text-describing-raw-search-results-title-descrip.png\" alt=\"Text describing raw search results: title, description, URL, and an ID for each relevant webpage, highlighted with an orange box.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/text-describing-raw-search-results-title-descrip.png 1698w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/text-describing-raw-search-results-title-descrip-680x391.png 680w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/text-describing-raw-search-results-title-descrip-768x441.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/text-describing-raw-search-results-title-descrip-1536x883.png 1536w\" sizes=\"auto, (max-width: 1698px) 100vw, 1698px\"><\/p>\n<p>ChatGPT uses this data to decide which pages are worth opening and eventually 
citing in its response.<\/p>\n<p>In other words, there\u2019s a gatekeeping layer <em>before<\/em> ChatGPT opens and reads any of your actual page content. The title, snippet, and URL are doing the heavy lifting in that initial decision.<\/p>\n<div class=\"sidenote\"><div class=\"sidenote-title\">Sidenote.<\/div>The URLs in this study were returned as part of ChatGPT\u2019s retrieval pipeline\u2014but that doesn\u2019t necessarily mean every one was fetched and read in&nbsp;full.\n<p>Based on external research into the pipeline, ChatGPT evaluates candidates using the retrieval data returned with each result (title, URL, snippet etc.) before deciding which pages to&nbsp;open.<\/p>\n<p>Some non-cited URLs were likely never opened in the first place. Our 50% figure captures the full journey from retrieval to citation, not just the final decision after a page has been&nbsp;read.<\/p><\/div>\n<p>In this study, we wanted to know: <strong>what actually influences citation?<\/strong> Does higher semantic similarity between a page\u2019s retrieval data and the user query increase citation likelihood? Which fields matter most? 
Do human-readable URLs outperform opaque ones?<\/p>\n<p>To find out, we analyzed 1.4 million ChatGPT 5.2 prompts from February 2025 (desktop) with the help of Ahrefs data scientist <a href=\"https:\/\/sg.linkedin.com\/in\/xibeijia-guan\">Xibeijia Guan<\/a>.<\/p>\n<p>But before we get into the findings, you need to understand how ChatGPT actually gathers its sources\u2014because not all URLs enter the system the same&nbsp;way.<\/p>\n<h2><a id=\"post-196421-_1y08u24h51tt\"><\/a><div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"Not all sources are created equal: the ref_type hierarchy\" data-section=\"ref-type\"> Not all sources are created equal: the ref_type hierarchy&nbsp;<\/div><\/div><\/h2>\n<p>When ChatGPT retrieves results, it categorizes sources using an internal field called <code>ref_type<\/code>\u2014essentially a label for the retrieval channel the URL came through.<\/p>\n<p>We discovered five categories: search, news, reddit, youtube, and academia.<\/p>\n<p>The citation rates between them are wildly uneven:<\/p>\n\n<table id=\"tablepress-532\" class=\"tablepress tablepress-id-532 tablepress-responsive tablepress-ahrefs-width-720px\">\n<thead>\n<tr class=\"row-1 odd\">\n\t<th class=\"column-1\">ref_type<\/th><th class=\"column-2\">Citation %<\/th><th 
class=\"column-3\">Total data points<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"row-2 even\">\n\t<td class=\"column-1\">search<\/td><td class=\"column-2\">88.46%<\/td><td class=\"column-3\">25,563,589<\/td>\n<\/tr>\n<tr class=\"row-3 odd\">\n\t<td class=\"column-1\">news<\/td><td class=\"column-2\">12.01%<\/td><td class=\"column-3\">3,940,537<\/td>\n<\/tr>\n<tr class=\"row-4 even\">\n\t<td class=\"column-1\">reddit<\/td><td class=\"column-2\">1.93%<\/td><td class=\"column-3\">16,182,976<\/td>\n<\/tr>\n<tr class=\"row-5 odd\">\n\t<td class=\"column-1\">youtube<\/td><td class=\"column-2\">0.51%<\/td><td class=\"column-3\">953,693<\/td>\n<\/tr>\n<tr class=\"row-6 even\">\n\t<td class=\"column-1\">academia<\/td><td class=\"column-2\">0.40%<\/td><td class=\"column-3\">185,337<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-532 from cache -->\n<p>The general \u201csearch\u201d index dominates\u2014both in volume and citation rate\u2014and 88% of the URLs retrieved through search end up being cited by ChatGPT.<\/p>\n<p>If you want to be cited by ChatGPT, you need to be in that search selection pool\u2014which means your content needs to&nbsp;rank.<\/p>\n<p>This isn\u2019t new information. By now, most people are already aware that ranking plays a part, but it\u2019s nice to have some more data to back it&nbsp;up.<\/p>\n<p>Specialized verticals like YouTube (e.g. youtube.com) and Academia (e.g. <a href=\"http:\/\/arxiv.org\">arXiv.org<\/a>), on the other hand, are pulled in at scale but barely ever get surfaced as actual citations.<\/p>\n<div class=\"sidenote\"><div class=\"sidenote-title\">Sidenote.<\/div>The \u201csearch\u201d <code>ref_type<\/code> does include Reddit and YouTube results too\u2014any Reddit or YouTube page that comes back through a standard web search will show up&nbsp;there.&nbsp;<\/div>\n<p>The separate \u201cReddit\u201d and \u201cYouTube\u201d <code>ref_types<\/code> likely represent <em>additional<\/em> results\u2014i.e. 
those pulled in via dedicated API integrations\u2014on top of whatever the web search already returned.<\/p>\n<p>That\u2019s why the volume on those channels is so high; ChatGPT is supplementing its search results with a separate feed of Reddit and YouTube content.<\/p>\n<p>This matters a lot for interpreting the rest of the analysis.<\/p>\n<p>On average, ChatGPT pulls ~16.57 cited URLs and ~16.58 non-cited URLs per prompt.<\/p>\n<p>But because Reddit makes up 67.8% of the non-cited pool, any aggregate comparison of \u201ccited vs. non-cited\u201d is really comparing search results to Reddit API output. Not apples to apples.<\/p>\n<p>So throughout this research, we\u2019ve isolated the analysis by <code>ref_type<\/code> wherever possible to avoid that distortion.<\/p>\n<h2><a id=\"post-196421-_ahckcl9n1t6c\"><\/a><div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"67.8% of non-cited URLs are from Reddit\" data-section=\"reddit\"> 67.8% of non-cited URLs are from Reddit&nbsp;<\/div><\/div><\/h2>\n<p>This is probably the most striking finding in the dataset.<\/p>\n<p>Reddit has its own dedicated <code>ref_type<\/code> in ChatGPT\u2019s retrieval system, with over 16 million data points in our dataset.<\/p>\n<p>Yet it\u2019s cited at a rate of just 
1.93%.<\/p>\n<p>Meanwhile, 67.8% of all non-cited URLs come from Reddit.<\/p>\n<p>In other words: ChatGPT is using Reddit extensively to understand topics, gauge consensus, and build context\u2014but it almost never gives Reddit the credit.<\/p>\n<p>It learns from the crowd, then cites another institution.<\/p>\n<h2><a id=\"post-196421-_t2e34kqj821\"><\/a><div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"Non-cited pages have 3x more retrieval data\u2014but that\u2019s not the full story\u2026\" data-section=\"noncited-pages\"> Non-cited pages have 3x more retrieval data\u2014but that\u2019s not the full&nbsp;story\u2026&nbsp;<\/div><\/div><\/h2>\n<p>As we\u2019ve briefly covered, when ChatGPT retrieves search results, each one comes back with a set of fields including a title, URL, and sometimes a snippet\u2014a short extract of page content stored in ChatGPT\u2019s retrieval data.<\/p>\n<p>We expected that having more of these fields populated would correlate with higher citation rates.<\/p>\n<p>At first glance, the aggregate data seemed to tell a different story: non-cited pages actually have <em>more<\/em> populated fields in ChatGPT\u2019s retrieval data than cited&nbsp;ones.<\/p>\n<p>Non-cited URLs had <strong>snippets<\/strong> 14.81% of the time versus 4.36% 
for cited URLs, and were far more likely to carry a <strong>publication date<\/strong> (92.72% vs. 35.98%).<\/p>\n<p>We almost ran with that as a finding, but I\u2019m glad we didn\u2019t.<\/p>\n<p>When we dug into it, the discrepancy turned out to be almost entirely a compositional artifact\u2014driven by Reddit and the mechanics of ChatGPT\u2019s retrieval pipeline.<\/p>\n<p>Because the non-cited pool is overwhelmingly Reddit (67.8%), and Reddit content pulled via API naturally carries <code>pub_date<\/code> metadata, the 92.72% figure is a Reddit artifact\u2014not a signal about how ChatGPT evaluates web pages in general.<\/p>\n<p>The snippet gap is explained differently. According to <a href=\"https:\/\/queryburst.com\/blog\/how-chatgpt-works\/\">David McSweeney\u2019s research<\/a> on ChatGPT\u2019s retrieval process, the model actually abandons the snippet field (the short content extract) once it\u2019s decided to cite a URL, and opens the full page instead.<\/p>\n<p>So, it\u2019s not a matter of ChatGPT preferring pages with no snippets. 
The low snippet percentage for cited pages is likely a byproduct of how the pipeline works.<\/p>\n<p>When we isolated the data to just the \u201csearch\u201d <code>ref_type<\/code>\u2014stripping out Reddit, news, YouTube, and the rest\u2014the picture became a lot clearer:<\/p>\n\n<table id=\"tablepress-533\" class=\"tablepress tablepress-id-533 tablepress-responsive tablepress-ahrefs-width-720px\">\n<thead>\n<tr class=\"row-1 odd\">\n\t<th class=\"column-1\">Search ref_type<\/th><th class=\"column-2\">Has snippet<\/th><th class=\"column-3\">Has pub_date<\/th><th class=\"column-4\">Total URLs<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"row-2 even\">\n\t<td class=\"column-1\">Cited<\/td><td class=\"column-2\">2.52%<\/td><td class=\"column-3\">33.79%<\/td><td class=\"column-4\">22,612,529<\/td>\n<\/tr>\n<tr class=\"row-3 odd\">\n\t<td class=\"column-1\">Not cited<\/td><td class=\"column-2\">0.09%<\/td><td class=\"column-3\">49.00%<\/td><td class=\"column-4\">2,951,060<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-533 from cache -->\n<p>Snippet data is basically non-existent for both groups within the search vertical\u2014it\u2019s not a usable signal. And the publication date percentages are closer, but non-cited search pages are still <em>more<\/em> likely to carry a <code>pub_date<\/code> (49%) than cited ones (33.79%).<\/p>\n<p>The differences we initially saw between cited and non-cited URLs seem to have been distorted by the data composition and retrieval mechanics. Any signal\u2014if there is one\u2014is buried under the&nbsp;noise.<\/p>\n<p>The honest takeaway: from this data, we can\u2019t draw strong conclusions about whether the snippet or publication date fields play a meaningful role in&nbsp;citation.<\/p>\n<p>It\u2019s worth flagging that this problem likely applies to other citation studies too. Any research comparing \u201ccited vs. 
non-cited\u201d URLs without accounting for where those URLs came from risks mistaking quirks of the data for real patterns.<\/p>\n<div class=\"recommendation\"><div class=\"recommendation-title\">Find your own citation gaps in Brand&nbsp;Radar<\/div><div class=\"recommendation-content\">\n<p>The data in this study tells you <em>what<\/em> ChatGPT values. <a href=\"https:\/\/ahrefs.com\/brand-radar\">Brand Radar<\/a> tells you <em>where<\/em> you\u2019re falling short.<\/p>\n<p>Open Brand Radar, set up your brand and competitors, and head straight to the Cited Pages report.<\/p>\n<p>Then, filter for responses where competitors are cited and you aren\u2019t.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1229\" height=\"863\" class=\"wp-image-196424\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-cited-pages-dashboard-showing.png\" alt=\"A screenshot of a &quot;Cited pages&quot; dashboard showing trends over time and a table of AI visibility tools.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-cited-pages-dashboard-showing.png 1229w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-cited-pages-dashboard-showing-605x425.png 605w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-cited-pages-dashboard-showing-768x539.png 768w\" sizes=\"auto, (max-width: 1229px) 100vw, 1229px\"><\/p>\n<p>That gap analysis gives you a concrete list of content to create, refresh, or restructure.<br>\n<\/p><\/div><\/div>\n<h2><a id=\"post-196421-_4cqu8ey219nr\"><\/a><div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 
4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"Titles need to be semantically relevant to fan-out queries\" data-section=\"fan-out-queries\"> Titles need to be semantically relevant to fan-out queries&nbsp;<\/div><\/div><\/h2>\n<p>To figure out what\u2019s \u201ccitable,\u201d ChatGPT estimates relevance, in a process sometimes described as \u201c<a href=\"https:\/\/queryburst.com\/blog\/how-chatgpt-works\/\">semantic scoring<\/a>\u201d, to judge whether an article and a query are related.<\/p>\n<p>Since ChatGPT is a closed-source model, we don\u2019t have visibility into <em>exactly<\/em> <em>how<\/em> it determines relevance internally.<\/p>\n<p>So, in this study, we used cosine similarity computed from embeddings generated by open-source models, to quantify and approximate how ChatGPT may&nbsp;work.<\/p>\n<p>ChatGPT matches URLs against its own \u201c<a href=\"https:\/\/ahrefs.com\/blog\/query-fan-out\/\">fanout queries<\/a>\u201d\u2014the sub-questions it generates internally (from a user\u2019s seed prompt) to hunt for specific facts.<\/p>\n<p>The data confirms that title relevance to fanout queries is an important factor in citation:<\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Prompt vs. cited URL title:<\/b><span style=\"font-weight: 400;\"> 0.602<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Prompt vs. non-cited URL title:<\/b><span style=\"font-weight: 400;\"> 0.484<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Fanout query vs. 
cited URL title (max match*):<\/b> 0.656<\/li>\n<\/ul>\n<div class=\"sidenote\"><div class=\"sidenote-title\">Sidenote.<\/div> For each fanout query, we compute its cosine similarity with the article title. The \u201cmax match\u201d score is the highest similarity among them\u2014for example, if scores are 0.45, 0.71, and 0.38, the max match is 0.71. This captures the best-aligned sub-question rather than averaging across all interpretations, which would dilute the signal.<\/div>\n<p>The box plots tell the story clearly. Across all <code>ref_types<\/code>, cited URLs have consistently higher similarity between their title and the original prompt:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"1717\" class=\"wp-image-196425\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-that-cited-pages-have-higher-cosi.png\" alt=\"Box plot showing that cited pages have higher cosine similarity between their titles and original ChatGPT prompts than uncited pages.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-that-cited-pages-have-higher-cosi.png 1600w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-that-cited-pages-have-higher-cosi-396x425.png 396w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-that-cited-pages-have-higher-cosi-768x824.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-that-cited-pages-have-higher-cosi-1431x1536.png 1431w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\"><\/p>\n<p>The gap widens further when we compare against fanout queries instead of the original prompt\u2014reinforcing that creating content relevant to ChatGPT\u2019s internal sub-questions is what really drives selection:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"1717\" class=\"wp-image-196426\" 
src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-titles.png\" alt=\"Box plot showing cosine similarity between titles and fan-out queries for cited vs. not cited pages. Cited pages show higher similarity.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-titles.png 1600w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-titles-396x425.png 396w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-titles-768x824.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-titles-1431x1536.png 1431w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\"><\/p>\n<p>When we isolate the search <code>ref_type<\/code> specifically, the pattern gets even sharper. Cited pages are clearly more relevant, and the non-cited distribution drops significantly:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"1717\" class=\"wp-image-196427\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-comparing-cosine-similarity-between-title.png\" alt=\"Box plot comparing cosine similarity between title and original prompt for cited vs. 
not-cited search results.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-comparing-cosine-similarity-between-title.png 1600w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-comparing-cosine-similarity-between-title-396x425.png 396w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-comparing-cosine-similarity-between-title-768x824.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-comparing-cosine-similarity-between-title-1431x1536.png 1431w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\"><\/p>\n<p>We also found that search results with natural language URL slugs had an <strong>89.78% citation rate<\/strong>, compared to 81.11% for those without.<\/p>\n<p>Ultimately, if your URL and title don\u2019t semantically align with the AI\u2019s internal fanout queries, you\u2019re less likely to get&nbsp;cited.<\/p>\n<div class=\"recommendation\"><div class=\"recommendation-title\">Optimize for fan-out queries using Brand&nbsp;Radar<\/div><div class=\"recommendation-content\">\n<p>You can study fanout queries directly inside Brand Radar. 
Head to the AI Responses report, pick any prompt, and you\u2019ll see the fanout queries ChatGPT generated alongside the cited&nbsp;URLs.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"2048\" height=\"1228\" class=\"wp-image-196428\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/screenshot-of-ahrefs-ai-responses-page-showing.png\" alt=\"Screenshot of Ahrefs' &quot;AI responses&quot; page, showing listed prompts, responses, fanout queries, mentions, citations, and updates.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/screenshot-of-ahrefs-ai-responses-page-showing.png 2048w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/screenshot-of-ahrefs-ai-responses-page-showing-680x408.png 680w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/screenshot-of-ahrefs-ai-responses-page-showing-768x461.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/screenshot-of-ahrefs-ai-responses-page-showing-1536x921.png 1536w\" sizes=\"auto, (max-width: 2048px) 100vw, 2048px\"><\/p>\n<p>This is the actual set of sub-questions your content needs to answer.<\/p>\n<p>From there, use the <a href=\"https:\/\/ahrefs.com\/ai-content-helper\">AI Content Helper<\/a> to check how well your page covers the topics those fanout queries address. 
It measures the cosine similarity between your content and the topics the SERP or AI response is trying to cover\u2014and gives you a colored highlight as you write, showing which gaps remain.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"2048\" height=\"1161\" class=\"wp-image-196429\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-content-optimization-tool-showi.png\" alt=\"A screenshot of a content optimization tool, showing text being edited and highlighted, with content score and topic suggestions.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-content-optimization-tool-showi.png 2048w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-content-optimization-tool-showi-680x385.png 680w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-content-optimization-tool-showi-768x435.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-content-optimization-tool-showi-1536x871.png 1536w\" sizes=\"auto, (max-width: 2048px) 100vw, 2048px\"><\/p>\n<p>If a competitor\u2019s page is getting cited for a query where yours isn\u2019t, this is one of the fastest ways to diagnose why.<\/p><\/div><\/div>\n<h2><a id=\"post-196421-_tk6chq42byr9\"><\/a><div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 
9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"The average cited page is 500 days old (and still getting picked)\" data-section=\"page-age\"> The average cited page is 500 days old (and still getting picked)&nbsp;<\/div><\/div><\/h2>\n<p>It\u2019s common knowledge that fresher content gets cited more by AI\u2014and, in fact, our own study of 17 million citations supports that. We found that <a href=\"https:\/\/ahrefs.com\/blog\/do-ai-assistants-prefer-to-cite-fresh-content\/\">ChatGPT cited URLs that were 458 days newer<\/a> than Google\u2019s organic results\u2014the strongest freshness preference of any platform we tested.<\/p>\n<p>This study doesn\u2019t contradict that narrative, but it does add an extra layer of nuance.<\/p>\n<p>For instance, when we look at the search index, cited pages span a wide range of ages\u2014the median is around 500 days (~1.3 years old), with some cited pages over 2,700 days old (~7.4 years&nbsp;old).<\/p>\n<p>The median age is actually far lower than our initial freshness study linked above (958 days back in July vs 500 days in this dataset), suggesting that ChatGPT is skewing even younger in its citation preferences.<\/p>\n<p>That said, we also found that non-cited pages are overwhelmingly very&nbsp;young.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"1618\" class=\"wp-image-196430\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-shows-search-results-cited-by-chatgpt-are.png\" alt=\"Box plot shows search results cited by ChatGPT are significantly older than non-cited results, with a median age of 500 days.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-shows-search-results-cited-by-chatgpt-are.png 1600w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-shows-search-results-cited-by-chatgpt-are-420x425.png 420w, 
https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-shows-search-results-cited-by-chatgpt-are-768x777.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-shows-search-results-cited-by-chatgpt-are-1519x1536.png 1519w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-shows-search-results-cited-by-chatgpt-are-120x120.png 120w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\"><\/p>\n<p>So within a single prompt\u2019s retrieval set, it\u2019s the older, more established pages that tend to get cited, and the freshest content that tends to get discarded.<\/p>\n<p>In other words, ChatGPT prefers fresh content, but tends to cite comparatively \u201colder\u201d content more often. That sounds counterintuitive, but both things can be true at the same&nbsp;time.<\/p>\n<p>Across the broader population of AI citations, ChatGPT does skew fresher when compared against Google results, and even against its own citation preferences from only last&nbsp;year.<\/p>\n<p>But within a given retrieval set, freshness alone <strong>isn\u2019t enough<\/strong>. Relevance still does the heavy lifting.<\/p>\n<p>A new page that matches fanout queries well will get cited. 
A new page that <em>doesn\u2019t<\/em> will be retrieved, yet ignored.<\/p>\n<p>It\u2019s also worth pointing out that the pool of non-cited pages (~3M) across the search <code>ref_type<\/code> is far smaller than the cited group (~23M), which limits how confidently we can interpret the age&nbsp;gap.<\/p>\n<p>Where freshness matters most is in \u201cnews\u201d.<\/p>\n<p>In this category, title relevance scores for cited and non-cited pages are nearly identical:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"1801\" class=\"wp-image-196431\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-title-a.png\" alt=\"Box plot showing cosine similarity between title and original prompt for cited (blue) and not cited (red) news articles.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-title-a.png 1600w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-title-a-378x425.png 378w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-title-a-768x864.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-showing-cosine-similarity-between-title-a-1365x1536.png 1365w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\"><\/p>\n<p>The AI can\u2019t decide based on relevance alone, so it defaults to a temporal tie-breaker: page age. 
Cited news pages skew younger:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"1618\" class=\"wp-image-196432\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-cited-pages-blue-have-a-median-age-o.png\" alt=\"Box plot: &quot;Cited&quot; pages (blue) have a median age of ~200 days, younger than &quot;Not Cited&quot; pages (red) with a median of ~300 days.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-cited-pages-blue-have-a-median-age-o.png 1600w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-cited-pages-blue-have-a-median-age-o-420x425.png 420w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-cited-pages-blue-have-a-median-age-o-768x777.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-cited-pages-blue-have-a-median-age-o-1519x1536.png 1519w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/box-plot-cited-pages-blue-have-a-median-age-o-120x120.png 120w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\"><\/p>\n<p>For news queries, younger pages have a clear advantage, even when relevance scores between cited and non-cited pages are similar.<\/p>\n<div class=\"recommendation\"><div class=\"recommendation-title\">Create the freshest news content using Firehose<\/div><div class=\"recommendation-content\">\n<p>If you publish news or time-sensitive content, freshness is non-negotiable.<\/p>\n<p>Be the first to break news on certain stories using <a href=\"https:\/\/firehose.com\/\">Ahrefs Firehose<\/a>\u2014our real-time web monitoring API that gives you a streaming feed of data from our huge crawler infrastructure.<\/p>\n<p>For example, if you work in SaaS journalism, you can track content changes on pages like Google\u2019s official blog, so you can be the first one to cover a new Google update as soon as it goes&nbsp;live.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1377\" height=\"524\" 
class=\"wp-image-196433\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-firehose-platform-dashboard-s.png\" alt=\"A screenshot of a &quot;Firehose&quot; platform dashboard, showing Taps, specifically a &quot;Google Blog&quot; feed with recent articles.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-firehose-platform-dashboard-s.png 1377w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-firehose-platform-dashboard-s-680x259.png 680w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/a-screenshot-of-a-firehose-platform-dashboard-s-768x292.png 768w\" sizes=\"auto, (max-width: 1377px) 100vw, 1377px\"><\/p>\n<p>Then, use Brand Radar\u2019s Mentions history in the AI Responses report to track whether your ChatGPT visibility spikes after publication.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"1574\" class=\"wp-image-196434\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/ahrefs-ai-responses-dashboard-shows-competitor-men.jpg\" alt=\"Ahrefs AI responses dashboard shows competitor mentions over time, with a graph tracking Ahrefs, Moz, SE Ranking, and Similarweb.\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/ahrefs-ai-responses-dashboard-shows-competitor-men.jpg 1920w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/ahrefs-ai-responses-dashboard-shows-competitor-men-518x425.jpg 518w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/ahrefs-ai-responses-dashboard-shows-competitor-men-768x630.jpg 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/ahrefs-ai-responses-dashboard-shows-competitor-men-1536x1259.jpg 1536w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\"><\/p>\n<\/div><\/div>\n<h2><a id=\"post-196421-_eeblggnm5yze\"><\/a><div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" 
rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"What this all means for being \u201ccitable\u201d\" data-section=\"being-citable\"> What this all means for being \u201ccitable\u201d <\/div><\/div><\/h2>\n<p>The 1.4 million prompts paint a pretty clear picture. ChatGPT is an aggressive editor. It favors its general search index, uses semantic similarity to select and cite sources, and treats Reddit as a textbook it\u2019s embarrassed to admit it&nbsp;read.<\/p>\n<p>But the data also taught us a lesson in analytical caution.<\/p>\n<p>Aggregate comparisons between \u201ccited\u201d and \u201cnon-cited\u201d URLs can be misleading if the non-cited pool is dominated by a single source type with its own retrieval mechanics.<\/p>\n<p>What initially looked like a paradox\u2014less-optimized pages getting cited more\u2014turned out to be a matter of dataset composition.<\/p>\n<p>We would have got that one very wrong if we hadn\u2019t isolated by <code>ref_type<\/code>.<\/p>\n<p>Ultimately, the pages that get cited are the ones whose titles and content match the questions ChatGPT is asking behind the scenes, and that surface through the right retrieval channel.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>But, although ChatGPT retrieves dozens of URLs to answer a single query, according to our research, it only ends up citing ~50% of&nbsp;them. 
Why does one page get the credit while another, which the AI clearly retrieved, gets nothing? According<span class=\"ellipsis\">\u2026<\/span><\/p>\n<div class=\"read-more\">Read more \u203a<\/div>\n<p><!-- end of .read-more --><\/p>\n","protected":false},"author":197,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"wp_typography_post_enhancements_disabled":false,"footnotes":""},"categories":[469,414],"tags":[],"coauthors":[464,467],"class_list":["post-196421","post","type-post","status-publish","format-standard","hentry","category-ai-search","category-data-studies","odd"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v28.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)<\/title>\n<meta name=\"description\" content=\"We studied 1.4M prompts to see how titles, URLs, and snippets influence AI citation. Here&#039;s why ChatGPT cites certain sources over others.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)\" \/>\n<meta property=\"og:description\" content=\"We studied 1.4M prompts to see how titles, URLs, and snippets influence AI citation. 
Here&#039;s why ChatGPT cites certain sources over others.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/\" \/>\n<meta property=\"og:site_name\" content=\"SEO Blog by Ahrefs\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Ahrefs\/\" \/>\n<meta property=\"article:published_time\" content=\"2026-04-15T14:33:27+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-05-31T07:58:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1476\" \/>\n\t<meta property=\"og:image:height\" content=\"1749\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Louise Linehan, Xibeijia Guan\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@ahrefs\" \/>\n<meta name=\"twitter:site\" content=\"@ahrefs\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/\"},\"author\":{\"name\":\"Louise Linehan\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/person\\\/444b3643c35b16b94b763446c5562388\"},\"headline\":\"Why ChatGPT Cites One Page Over Another (Study of 1.4M 
Prompts)\",\"datePublished\":\"2026-04-15T14:33:27+00:00\",\"dateModified\":\"2026-05-31T07:58:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/\"},\"wordCount\":2423,\"publisher\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/04\\\/why-chatgpt-cites-one-page-over-by-louise-linehan-data-studies.jpg\",\"articleSection\":[\"AI Search\",\"Data &amp; Studies\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/\",\"name\":\"Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/04\\\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png\",\"datePublished\":\"2026-04-15T14:33:27+00:00\",\"dateModified\":\"2026-05-31T07:58:00+00:00\",\"description\":\"We studied 1.4M prompts to see how titles, URLs, and snippets influence AI citation. 
Here's why ChatGPT cites certain sources over others.\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/why-chatgpt-cites-pages\\\/#primaryimage\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/04\\\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png\",\"contentUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/04\\\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png\",\"width\":1476,\"height\":1749,\"caption\":\"Pie chart shows ChatGPT cites about half the URLs it retrieves: 49.98% cited (23.4M URLs) vs. 50.02% not cited.\"},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/\",\"name\":\"SEO Blog by Ahrefs\",\"description\":\"Link Building Strategies &amp; SEO 
Tips\",\"publisher\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#organization\",\"name\":\"Ahrefs\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/ahrefs-logo.png\",\"contentUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/ahrefs-logo.png\",\"width\":2048,\"height\":768,\"caption\":\"Ahrefs\"},\"image\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/Ahrefs\\\/\",\"https:\\\/\\\/x.com\\\/ahrefs\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/ahrefs\\\/\",\"https:\\\/\\\/www.youtube.com\\\/c\\\/ahrefscom\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/person\\\/444b3643c35b16b94b763446c5562388\",\"name\":\"Louise Linehan\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/Louise-Linehan.jpg02b05bbed9b25ec9b04e39f0d88f15b0\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/Louise-Linehan.jpg\",\"contentUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/Louise-Linehan.jpg\",\"caption\":\"Louise Linehan\"},\"description\":\"Louise is a Content Marketer at Ahrefs. 
Over the past ten years, she has held senior content positions at SaaS brands: Pi Datametrics, BuzzSumo, and Cision. By day, she writes about content and SEO; by night, you'll find her playing football or screaming down the mic at karaoke.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/louise-linehan\\\/\"],\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/author\\\/louise-linehan\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)","description":"We studied 1.4M prompts to see how titles, URLs, and snippets influence AI citation. Here's why ChatGPT cites certain sources over others.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/","og_locale":"en_US","og_type":"article","og_title":"Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)","og_description":"We studied 1.4M prompts to see how titles, URLs, and snippets influence AI citation. 
Here's why ChatGPT cites certain sources over others.","og_url":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/","og_site_name":"SEO Blog by Ahrefs","article_publisher":"https:\/\/www.facebook.com\/Ahrefs\/","article_published_time":"2026-04-15T14:33:27+00:00","article_modified_time":"2026-05-31T07:58:00+00:00","og_image":[{"width":1476,"height":1749,"url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png","type":"image\/png"}],"author":"Louise Linehan, Xibeijia Guan","twitter_card":"summary_large_image","twitter_creator":"@ahrefs","twitter_site":"@ahrefs","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/#article","isPartOf":{"@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/"},"author":{"name":"Louise Linehan","@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/person\/444b3643c35b16b94b763446c5562388"},"headline":"Why ChatGPT Cites One Page Over Another (Study of 1.4M Prompts)","datePublished":"2026-04-15T14:33:27+00:00","dateModified":"2026-05-31T07:58:00+00:00","mainEntityOfPage":{"@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/"},"wordCount":2423,"publisher":{"@id":"https:\/\/ahrefs.com\/blog\/#organization"},"image":{"@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/#primaryimage"},"thumbnailUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/why-chatgpt-cites-one-page-over-by-louise-linehan-data-studies.jpg","articleSection":["AI Search","Data &amp; Studies"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/","url":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/","name":"Why ChatGPT Cites One Page Over Another (Study of 1.4M 
Prompts)","isPartOf":{"@id":"https:\/\/ahrefs.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/#primaryimage"},"image":{"@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/#primaryimage"},"thumbnailUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png","datePublished":"2026-04-15T14:33:27+00:00","dateModified":"2026-05-31T07:58:00+00:00","description":"We studied 1.4M prompts to see how titles, URLs, and snippets influence AI citation. Here's why ChatGPT cites certain sources over others.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ahrefs.com\/blog\/why-chatgpt-cites-pages\/#primaryimage","url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png","contentUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2026\/04\/pie-chart-shows-chatgpt-cites-about-half-the-urls.png","width":1476,"height":1749,"caption":"Pie chart shows ChatGPT cites about half the URLs it retrieves: 49.98% cited (23.4M URLs) vs. 
50.02% not cited."},{"@type":"WebSite","@id":"https:\/\/ahrefs.com\/blog\/#website","url":"https:\/\/ahrefs.com\/blog\/","name":"SEO Blog by Ahrefs","description":"Link Building Strategies &amp; SEO Tips","publisher":{"@id":"https:\/\/ahrefs.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ahrefs.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/ahrefs.com\/blog\/#organization","name":"Ahrefs","url":"https:\/\/ahrefs.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/06\/ahrefs-logo.png","contentUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/06\/ahrefs-logo.png","width":2048,"height":768,"caption":"Ahrefs"},"image":{"@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Ahrefs\/","https:\/\/x.com\/ahrefs","https:\/\/www.linkedin.com\/company\/ahrefs\/","https:\/\/www.youtube.com\/c\/ahrefscom"]},{"@type":"Person","@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/person\/444b3643c35b16b94b763446c5562388","name":"Louise Linehan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2024\/08\/Louise-Linehan.jpg02b05bbed9b25ec9b04e39f0d88f15b0","url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2024\/08\/Louise-Linehan.jpg","contentUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2024\/08\/Louise-Linehan.jpg","caption":"Louise Linehan"},"description":"Louise is a Content Marketer at Ahrefs. Over the past ten years, she has held senior content positions at SaaS brands: Pi Datametrics, BuzzSumo, and Cision. 
By day, she writes about content and SEO; by night, you'll find her playing football or screaming down the mic at karaoke.","sameAs":["https:\/\/www.linkedin.com\/in\/louise-linehan\/"],"url":"https:\/\/ahrefs.com\/blog\/author\/louise-linehan\/"}]}},"as_json":null,"as_tables":null,"as_images":null,"json_reviewers":[194],"as_coauthors":[198],"as_post_info":null,"as_sticky":null,"as_hreflang":null,"as_related":null,"_links":{"self":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/posts\/196421","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/users\/197"}],"replies":[{"embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/comments?post=196421"}],"version-history":[{"count":0,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/posts\/196421\/revisions"}],"wp:attachment":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/media?parent=196421"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/categories?post=196421"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/tags?post=196421"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/coauthors?post=196421"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}