{"id":185204,"date":"2025-01-31T07:19:42","date_gmt":"2025-01-31T12:19:42","guid":{"rendered":"https:\/\/ahrefs.com\/blog\/?p=185204"},"modified":"2026-05-28T04:19:50","modified_gmt":"2026-05-28T09:19:50","slug":"how-do-ai-content-detectors-work","status":"publish","type":"post","link":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/","title":{"rendered":"How Do AI Content Detectors Work? Answers From a Data Scientist"},"content":{"rendered":"<div class=\"intro-txt\">There are tons of tools promising that they can tell AI content from human content, but until recently, I thought they didn\u2019t work.<\/div>\n<p><a href=\"https:\/\/ahrefs.com\/blog\/ai-content-is-short-term-arbitrage\/\">AI-generated content<\/a> isn\u2019t as simple to spot as old-fashioned \u201cspun\u201d or plagiarised content. Most AI-generated text could be considered original, in some sense\u2014it isn\u2019t copy-pasted from somewhere else on the internet.<\/p>\n<p>But as it turns out, we\u2019re building an AI content detector at Ahrefs.<\/p>\n<p>So to understand how AI content detectors work, I interviewed somebody who actually understands the science and research behind them: <a href=\"https:\/\/www.linkedin.com\/in\/yong-keong-yap-40735232\/\">Yong Keong Yap<\/a>, a data scientist at Ahrefs and part of our machine learning team.<\/p>\n<div class=\"further-reading\"><div class=\"reading-title\">Further reading<\/div><div class=\"reading-content\">\n<ul>\n<li>Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Lidia Sam Chao, Derek Fai Wong. 2025. <a href=\"https:\/\/direct.mit.edu\/coli\/article\/doi\/10.1162\/coli_a_00549\/127462\/A-Survey-on-LLM-Generated-Text-Detection-Necessity\">A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions<\/a>.<\/li>\n<li>Simon Corston-Oliver, Michael Gamon, Chris Brockett. 2001. 
<a href=\"https:\/\/aclanthology.org\/P01-1020\/\">A Machine Learning Approach to the Automatic Evaluation of Machine Translation<\/a>.<\/li>\n<li>Kanishka Silva, Ingo Frommholz, Burcu Can, Fred Blain, Raheem Sarwar, Laura Ugolini. 2024. <a href=\"https:\/\/aclanthology.org\/2024.eacl-srw.26\/\">Forged-GAN-BERT: Authorship Attribution for LLM-Generated Forged Novels<\/a>.<\/li>\n<li>Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon. 2024. <a href=\"https:\/\/arxiv.org\/abs\/2402.14904\">Watermarking Makes Language Models Radioactive<\/a>.<\/li>\n<li>Elyas Masrour, Bradley Emi, Max Spero. 2025. <a href=\"https:\/\/aclanthology.org\/2025.genaidetect-1.9\/\">DAMAGE: Detecting Adversarially Modified AI Generated Text<\/a>.<\/li>\n<\/ul>\n<\/div><\/div>\n<div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"How AI content detectors work\" data-section=\"how-ai-detectors-work\">\n<h2><a id=\"post-185204-_gsd06ll7puiw\"><\/a>How AI content detectors work<\/h2>\n<\/div><\/div>\n<p>All AI content detectors work in the same basic way: they look for patterns or abnormalities in text that appear slightly different from those in human-written text.<\/p>\n<p>To do that, you need two things: lots of examples of both human-written and LLM-written text to compare, 
and a mathematical model to use for the analysis.<\/p>\n<p>There are three common approaches in&nbsp;use:<\/p>\n<h3>1. <a id=\"post-185204-_chywt81n3r4z\"><\/a>Statistical detection (old school but still effective)<\/h3>\n<p>Attempts to detect machine-generated writing have been around since the 2000s. Some of these older detection methods still work well&nbsp;today.<\/p>\n<p>Statistical detection methods distinguish human-written text from machine-generated text by counting particular writing patterns,&nbsp;like:<\/p>\n<ul>\n<li><strong>Word frequencies<\/strong> (how often certain words appear)<\/li>\n<li><strong>N-gram frequencies<\/strong> (how often particular sequences of words or characters appear)<\/li>\n<li><strong>Syntactic structures<\/strong> (how often particular writing structures appear, like Subject-Verb-Object (SVO) sequences such as <em>\u201cshe eats apples\u201d<\/em>)<\/li>\n<li><strong>Stylistic nuances<\/strong> (like writing in the first person, using an informal style, etc.)<\/li>\n<\/ul>\n<p>If these patterns are very different from those found in human-written text, there\u2019s a good chance you\u2019re looking at machine-generated text.<\/p>\n\n<table id=\"tablepress-381\" class=\"tablepress tablepress-id-381 tablepress-responsive tablepress-ahrefs-width-720px\">\n<thead>\n<tr class=\"row-1 odd\">\n\t<th class=\"column-1\">Example text<\/th><th class=\"column-2\">Word frequencies<\/th><th class=\"column-3\">N-gram frequencies<\/th><th class=\"column-4\">Syntactic structures<\/th><th class=\"column-5\">Stylistic notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"row-2 even\">\n\t<td class=\"column-1\">\u201cThe cat sat on the mat. 
Then the cat yawned.\u201d<\/td><td class=\"column-2\">the: 3<br>\ncat:&nbsp;2<br>\nsat:&nbsp;1<br>\non:&nbsp;1<br>\nmat:&nbsp;1<br>\nthen:&nbsp;1<br>\nyawned: 1<\/td><td class=\"column-3\">Bigrams<br>\n\u201cthe cat\u201d:&nbsp;2<br>\n\u201ccat sat\u201d:&nbsp;1<br>\n\u201csat on\u201d:&nbsp;1<br>\n\u201con the\u201d:&nbsp;1<br>\n\u201cthe mat\u201d:&nbsp;1<br>\n\u201cthen the\u201d:&nbsp;1<br>\n\u201ccat yawned\u201d: 1<\/td><td class=\"column-4\">Contains S-V (Subject-Verb) pairs such as \u201cthe cat sat\u201d and \u201cthe cat yawned.\u201d<\/td><td class=\"column-5\">Third-person viewpoint; neutral tone.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-381 from cache -->\n<p>These methods are very lightweight and computationally efficient, but they tend to break when the text is manipulated (using what computer scientists call \u201c<a href=\"https:\/\/en.wikipedia.org\/wiki\/Adversarial_machine_learning\">adversarial examples<\/a>\u201d).<\/p>\n<p>Statistical methods can be made more sophisticated by training a learning algorithm on top of these counts (like Naive Bayes, Logistic Regression, or Decision Trees), or by using the probabilities a language model assigns to each word (computed from its raw output scores, known as logits).<\/p>\n<h3>2. <a id=\"post-185204-_4pl8rlg6imck\"><\/a>Neural networks (trendy deep learning methods)<\/h3>\n<p>Neural networks are computer systems that loosely mimic how the human brain works. They contain artificial neurons, and through practice (known as <em>training<\/em>), the connections between the neurons adjust to get better at their intended goal.<\/p>\n<p>In this way, neural networks can be trained to detect <a href=\"https:\/\/ahrefs.com\/blog\/ai-content-creation-tools\/\">text generated by <em>other<\/em> neural networks<\/a>.<\/p>\n<p>Neural networks have become the de facto method for AI content detection. Statistical detection methods require special expertise in the target topic and language to work (what computer scientists call \u201cfeature extraction\u201d). 
Neural networks just require text and labels, and they can learn what is and isn\u2019t important themselves.<\/p>\n<p>Even small models can do a good job at detection, as long as they\u2019re trained with enough data (at least a few thousand examples, according to the literature), making them cheap and dummy-proof, relative to other methods.<\/p>\n<p>LLMs (like ChatGPT) are neural networks, but without additional fine-tuning, they generally aren\u2019t very good at identifying AI-generated text\u2014even if the LLM itself generated it. Try it yourself: generate some text with ChatGPT and in another chat, ask it to identify whether it\u2019s human- or AI-generated.<\/p>\n<p>Here\u2019s o1 failing to recognise its own output:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1254\" height=\"881\" class=\"wp-image-185205\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png 1254w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1-605x425.png 605w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1-768x540.png 768w\" sizes=\"auto, (max-width: 1254px) 100vw, 1254px\"><\/p>\n<h3>3. <a id=\"post-185204-_s2towkrmvev\"><\/a>Watermarking (hidden signals in LLM output)<\/h3>\n<p>Watermarking is another approach to AI content detection. The idea is to get an LLM to generate text that includes a hidden signal, identifying it as <a href=\"https:\/\/ahrefs.com\/blog\/ai-content-creation\/\">AI-generated<\/a>.<\/p>\n<p>Think of watermarks like UV ink on paper money to easily distinguish authentic notes from counterfeits. These watermarks tend to be subtle to the eye and not easily detected or replicated\u2014unless you know what to look for. 
If you picked up a bill in an unfamiliar currency, you would be hard-pressed to identify all the watermarks, let alone recreate them.<\/p>\n<p>Based on the literature surveyed by Junchao Wu and colleagues, there are three ways to watermark AI-generated text:<\/p>\n<ul>\n<li><strong>Add watermarks to the datasets that you release<\/strong> (for example, inserting something like <em>\u201cAhrefs is the king of the universe!\u201d<\/em> into an open-source training corpus. When someone trains an LLM on this watermarked data, expect their LLM to start worshipping Ahrefs).<\/li>\n<li><strong>Add watermarks into LLM outputs <em>during<\/em> the generation process<\/strong>.<\/li>\n<li><strong>Add watermarks into LLM outputs <em>after<\/em> the generation process<\/strong>.<\/li>\n<\/ul>\n<p>This detection method obviously relies on researchers and model-makers choosing to watermark their data and model outputs. If, for example, GPT-4o\u2019s output was watermarked, it would be easy for OpenAI to use the corresponding \u201cUV light\u201d to work out whether the generated text came from their&nbsp;model.<\/p>\n<p>But there might be broader implications too. One <a href=\"https:\/\/arxiv.org\/abs\/2402.14904\">2024 paper<\/a> suggests that watermarking can make it easier for neural network detection methods to work. 
If a model is trained on even a small amount of watermarked text, it becomes \u201cradioactive\u201d and its output easier to detect as machine-generated.<\/p>\n<div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"3 ways AI content detectors can fail\" data-section=\"limitations\">\n<h2><a id=\"post-185204-_4zl77n4kcyxp\"><\/a>3 ways AI content detectors can&nbsp;fail<\/h2>\n<\/div><\/div>\n<p>In the literature review, many methods managed detection accuracy of around 80%, or greater in some&nbsp;cases.<\/p>\n<p>That sounds pretty reliable, but there are three big issues that mean this accuracy level isn\u2019t realistic in many real-life situations.<\/p>\n<h3><a id=\"post-185204-_jvaytu47btpg\"><\/a>Most detection models are trained on very narrow datasets<\/h3>\n<p>Most AI detectors are trained and tested on a particular <em>type<\/em> of writing, like news articles or social media content.<\/p>\n<p>That means that if you want to test a marketing blog post, and you use an AI detector trained on marketing content, then it\u2019s likely to be fairly accurate. 
But if the detector was trained on news content, or on creative fiction, the results would be far less reliable.<\/p>\n<p>Yong Keong Yap is Singaporean, and shared the example of chatting with ChatGPT in <a href=\"https:\/\/en.wikipedia.org\/wiki\/Singlish\">Singlish<\/a>, a Singaporean variety of English that incorporates elements of other languages, like Malay and Chinese:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"665\" height=\"593\" class=\"wp-image-185206\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-2.png\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-2.png 665w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-2-477x425.png 477w\" sizes=\"auto, (max-width: 665px) 100vw, 665px\"><\/p>\n<p>When tested on Singlish text, a detection model trained primarily on news articles fails, despite performing well on other types of English text:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1217\" height=\"451\" class=\"wp-image-185207\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-3.png\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-3.png 1217w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-3-680x252.png 680w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-3-768x285.png 768w\" sizes=\"auto, (max-width: 1217px) 100vw, 1217px\"><\/p>\n<h3><a id=\"post-185204-_t0lf2i46v38g\"><\/a>They struggle with partial detection<\/h3>\n<p>Almost all of the AI detection benchmarks and datasets are focused on <em>sequence classification<\/em>: that is, detecting whether or not an entire body of text is machine-generated.<\/p>\n<p>But many real-life uses for AI text involve a mixture of AI-generated and human-written text (say, using an AI generator to help write or edit a blog post that is partially 
human-written).<\/p>\n<p>This type of partial detection (known as <em>span classification<\/em> or <em>token classification<\/em>) is a harder problem to solve and has received less attention in the open literature. Current AI detection models do not handle this setting well.<\/p>\n<h3><a id=\"post-185204-_kmfpqpbc72qs\"><\/a>They\u2019re vulnerable to humanizing tools<\/h3>\n<p><a href=\"https:\/\/selfmademillennials.com\/how-to-humanize-ai-content\/\">Humanizing AI content<\/a> does work. <a href=\"https:\/\/ahrefs.com\/writing-tools\/ai-humanizer\">Humanizing tools<\/a> work by disrupting patterns that AI detectors look for. LLMs, in general, write fluently and politely. If you intentionally add typos, grammatical errors, or even hateful content to generated text, you can usually reduce the accuracy of AI detectors.<\/p>\n<p>These examples are simple \u201cadversarial manipulations\u201d designed to break AI detectors, and they\u2019re usually obvious even to the human eye. But sophisticated humanizers can go further, using another LLM that is fine-tuned specifically in a loop with a known AI detector. Their goal is to maintain high-quality text output while disrupting the predictions of the detector.<\/p>\n<p>These can make AI-generated text harder to detect, as long as the humanizing tool has access to detectors that it wants to break (in order to train specifically to defeat them). 
Humanizers may fail spectacularly against new, unknown detectors.<\/p>\n<div id=\"attachment_185220\" style=\"width: 2082px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/ahrefs.com\/writing-tools\/ai-humanizer\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-185220\" class=\"wp-image-185220 size-full\" src=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-AI-content-detectors-work-1.png\" alt width=\"2072\" height=\"1245\" srcset=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-AI-content-detectors-work-1.png 2072w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-AI-content-detectors-work-1-680x409.png 680w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-AI-content-detectors-work-1-768x461.png 768w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-AI-content-detectors-work-1-1536x923.png 1536w, https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-AI-content-detectors-work-1-2048x1231.png 2048w\" sizes=\"auto, (max-width: 2072px) 100vw, 2072px\"><\/a><p id=\"caption-attachment-185220\" class=\"wp-caption-text\">Test this out for yourself with our simple (and free) <a href=\"https:\/\/ahrefs.com\/writing-tools\/ai-humanizer\">AI text humanizer<\/a>.<\/p><\/div>\n<div class=\"post-nav-link clearfix\" id=\"section1\"><a class=\"subhead-anchor\" data-tip=\"tooltip__copielink\" rel=\"#section1\"><svg width=\"19\" height=\"19\" viewBox=\"0 0 14 14\" style><g fill=\"none\" fill-rule=\"evenodd\"><path d=\"M0 0h14v14H0z\" \/><path d=\"M7.45 9.887l-1.62 1.621c-.92.92-2.418.92-3.338 0a2.364 2.364 0 0 1 0-3.339l1.62-1.62-1.273-1.272-1.62 1.62a4.161 4.161 0 1 0 5.885 5.884l1.62-1.62L7.45 9.886zM5.527 5.135L7.17 3.492c.92-.92 2.418-.92 3.339 0 .92.92.92 2.418 0 3.339L8.866 8.473l1.272 1.273 1.644-1.643A4.161 4.161 0 1 0 5.897 2.22L4.254 3.863l1.272 1.272zm-.66 3.998a.749.749 0 0 1 0-1.06l2.208-2.206a.749.749 0 1 1 1.06 1.06L5.928 9.133a.75.75 0 0 1-1.061 
0z\" style \/><\/g><\/svg><\/a><div class=\"link-text\" data-anchor=\"How to use AI content detectors\" data-section=\"how-to-use-AI-detectors\">\n<h2><a id=\"post-185204-_qu4hagywf9oz\"><\/a>How to use AI content detectors<\/h2>\n<\/div><\/div>\n<p>To summarize, AI content detectors can be very accurate <em>in the right circumstances. <\/em>To get useful results from them, it\u2019s important to follow a few guiding principles:<\/p>\n<ul>\n<li><strong>Try to learn as much about the detector\u2019s training data as possible<\/strong>, and use models trained on material similar to what you want to&nbsp;test.<\/li>\n<li><strong>Test multiple documents from the same author. <\/strong>A student\u2019s essay was flagged as AI-generated? Run all their past work through the same tool to get a better sense of their base&nbsp;rate.<\/li>\n<li><strong>Never use AI content detectors to make decisions that will impact someone\u2019s career or academic standing. <\/strong>Always use their results in conjunction with other forms of evidence.<\/li>\n<li><strong>Use with a good dose of skepticism. <\/strong>No AI detector is 100% accurate. There will always be false positives.<\/li>\n<\/ul>\n<h2><a id=\"post-185204-_bc5jp2sbyq9e\"><\/a>Final thoughts<\/h2>\n<p>Since the detonation of the first nuclear bombs in the 1940s, every single piece of steel smelted anywhere in the world has been contaminated by nuclear fallout.<\/p>\n<p>Steel manufactured before the nuclear era is known as \u201c<a href=\"https:\/\/en.wikipedia.org\/wiki\/Low-background_steel\">low-background steel<\/a>\u201d, and it\u2019s pretty important if you\u2019re building a Geiger counter or a particle detector. But this contamination-free steel is becoming rarer and rarer. Today\u2019s main sources are old shipwrecks. Soon, it may be all&nbsp;gone.<\/p>\n<p>This analogy is relevant for AI content detection. Today\u2019s methods rely heavily on access to a good source of modern, human-written content. 
But this source is becoming smaller by the&nbsp;day.<\/p>\n<p>As AI is embedded into social media, word processors, and email inboxes, and new models are trained on data that includes AI-generated text, it\u2019s easy to imagine a world where most content is \u201ctainted\u201d with AI-generated material.<\/p>\n<p>In that world, it might not make much sense to think about AI detection\u2014everything will be AI, to a greater or lesser extent. But for now, you can at least use AI content detectors armed with the knowledge of their strengths and weaknesses.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI-generated content isn\u2019t as simple to spot as old-fashioned \u201cspun\u201d or plagiarised content. Most AI-generated text could be considered original, in some sense\u2014it isn\u2019t copy-pasted from somewhere else on the internet. But as it turns out, we\u2019re building an AI<span class=\"ellipsis\">\u2026<\/span><\/p>\n<div class=\"read-more\">Read more \u203a<\/div>\n<p><!-- end of .read-more --><\/p>\n","protected":false},"author":194,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"wp_typography_post_enhancements_disabled":false,"footnotes":""},"categories":[73,335],"tags":[],"coauthors":[457,466],"class_list":["post-185204","post","type-post","status-publish","format-standard","hentry","category-content-marketing","category-general-seo","odd"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>How Do AI Content Detectors Work? 
Answers From a Data Scientist<\/title>\n<meta name=\"description\" content=\"Learn how AI content detectors really work\u2014and how to beat them, according to the research.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How Do AI Content Detectors Work? Answers From a Data Scientist\" \/>\n<meta property=\"og:description\" content=\"Learn how AI content detectors really work\u2014and how to beat them, according to the research.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/\" \/>\n<meta property=\"og:site_name\" content=\"SEO Blog by Ahrefs\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Ahrefs\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-01-31T12:19:42+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-05-28T09:19:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1254\" \/>\n\t<meta property=\"og:image:height\" content=\"881\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Ryan Law, Yong Keong Yap\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@thinking_slow\" \/>\n<meta name=\"twitter:site\" content=\"@ahrefs\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/\"},\"author\":{\"name\":\"Ryan Law\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/person\\\/e63cf0d276886d0391667a066edafeda\"},\"headline\":\"How Do AI Content Detectors Work? Answers From a Data Scientist\",\"datePublished\":\"2025-01-31T12:19:42+00:00\",\"dateModified\":\"2026-05-28T09:19:50+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/\"},\"wordCount\":1756,\"publisher\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/how-do-ai-content-detectors-work-by-ryan-law-general-seo.jpg\",\"articleSection\":[\"Content Marketing\",\"General SEO\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/\",\"name\":\"How Do AI Content Detectors Work? 
Answers From a Data Scientist\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/word-image-185204-1.png\",\"datePublished\":\"2025-01-31T12:19:42+00:00\",\"dateModified\":\"2026-05-28T09:19:50+00:00\",\"description\":\"Learn how AI content detectors really work\u2014and how to beat them, according to the research.\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/how-do-ai-content-detectors-work\\\/#primaryimage\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/word-image-185204-1.png\",\"contentUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/01\\\/word-image-185204-1.png\",\"width\":1254,\"height\":881},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/\",\"name\":\"SEO Blog by Ahrefs\",\"description\":\"Link Building Strategies &amp; SEO 
Tips\",\"publisher\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#organization\",\"name\":\"Ahrefs\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/ahrefs-logo.png\",\"contentUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/06\\\/ahrefs-logo.png\",\"width\":2048,\"height\":768,\"caption\":\"Ahrefs\"},\"image\":{\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/Ahrefs\\\/\",\"https:\\\/\\\/x.com\\\/ahrefs\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/ahrefs\\\/\",\"https:\\\/\\\/www.youtube.com\\\/c\\\/ahrefscom\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/#\\\/schema\\\/person\\\/e63cf0d276886d0391667a066edafeda\",\"name\":\"Ryan Law\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ryan-law-pic.jpeg14222399d3ce9bff9501104131dfb0eb\",\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ryan-law-pic.jpeg\",\"contentUrl\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ryan-law-pic.jpeg\",\"caption\":\"Ryan Law\"},\"description\":\"Ryan Law is the Director of Content Marketing at Ahrefs. 
Ryan has 14 years experience as a writer, content strategist, team lead, marketing director, VP, CMO, and agency founder. He's helped dozens of companies improve their content marketing and SEO, including Google, Zapier, GoDaddy, Clearbit, and Algolia. He's also a novelist and the creator of two content marketing courses.\",\"sameAs\":[\"https:\\\/\\\/ryanlaw.me\\\/\",\"https:\\\/\\\/uk.linkedin.com\\\/in\\\/thinkingslow\",\"https:\\\/\\\/x.com\\\/thinking_slow\"],\"url\":\"https:\\\/\\\/ahrefs.com\\\/blog\\\/author\\\/ryan-law\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"How Do AI Content Detectors Work? Answers From a Data Scientist","description":"Learn how AI content detectors really work\u2014and how to beat them, according to the research.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/","og_locale":"en_US","og_type":"article","og_title":"How Do AI Content Detectors Work? 
Answers From a Data Scientist","og_description":"Learn how AI content detectors really work\u2014and how to beat them, according to the research.","og_url":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/","og_site_name":"SEO Blog by Ahrefs","article_publisher":"https:\/\/www.facebook.com\/Ahrefs\/","article_published_time":"2025-01-31T12:19:42+00:00","article_modified_time":"2026-05-28T09:19:50+00:00","og_image":[{"width":1254,"height":881,"url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png","type":"image\/png"}],"author":"Ryan Law, Yong Keong Yap","twitter_card":"summary_large_image","twitter_creator":"@thinking_slow","twitter_site":"@ahrefs","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/#article","isPartOf":{"@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/"},"author":{"name":"Ryan Law","@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/person\/e63cf0d276886d0391667a066edafeda"},"headline":"How Do AI Content Detectors Work? Answers From a Data Scientist","datePublished":"2025-01-31T12:19:42+00:00","dateModified":"2026-05-28T09:19:50+00:00","mainEntityOfPage":{"@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/"},"wordCount":1756,"publisher":{"@id":"https:\/\/ahrefs.com\/blog\/#organization"},"image":{"@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/#primaryimage"},"thumbnailUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/how-do-ai-content-detectors-work-by-ryan-law-general-seo.jpg","articleSection":["Content Marketing","General SEO"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/","url":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/","name":"How Do AI Content Detectors Work? 
Answers From a Data Scientist","isPartOf":{"@id":"https:\/\/ahrefs.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/#primaryimage"},"image":{"@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/#primaryimage"},"thumbnailUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png","datePublished":"2025-01-31T12:19:42+00:00","dateModified":"2026-05-28T09:19:50+00:00","description":"Learn how AI content detectors really work\u2014and how to beat them, according to the research.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ahrefs.com\/blog\/how-do-ai-content-detectors-work\/#primaryimage","url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png","contentUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2025\/01\/word-image-185204-1.png","width":1254,"height":881},{"@type":"WebSite","@id":"https:\/\/ahrefs.com\/blog\/#website","url":"https:\/\/ahrefs.com\/blog\/","name":"SEO Blog by Ahrefs","description":"Link Building Strategies &amp; SEO 
Tips","publisher":{"@id":"https:\/\/ahrefs.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ahrefs.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/ahrefs.com\/blog\/#organization","name":"Ahrefs","url":"https:\/\/ahrefs.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/06\/ahrefs-logo.png","contentUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/06\/ahrefs-logo.png","width":2048,"height":768,"caption":"Ahrefs"},"image":{"@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Ahrefs\/","https:\/\/x.com\/ahrefs","https:\/\/www.linkedin.com\/company\/ahrefs\/","https:\/\/www.youtube.com\/c\/ahrefscom"]},{"@type":"Person","@id":"https:\/\/ahrefs.com\/blog\/#\/schema\/person\/e63cf0d276886d0391667a066edafeda","name":"Ryan Law","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/10\/ryan-law-pic.jpeg14222399d3ce9bff9501104131dfb0eb","url":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/10\/ryan-law-pic.jpeg","contentUrl":"https:\/\/ahrefs.com\/blog\/wp-content\/uploads\/2023\/10\/ryan-law-pic.jpeg","caption":"Ryan Law"},"description":"Ryan Law is the Director of Content Marketing at Ahrefs. Ryan has 14 years experience as a writer, content strategist, team lead, marketing director, VP, CMO, and agency founder. He's helped dozens of companies improve their content marketing and SEO, including Google, Zapier, GoDaddy, Clearbit, and Algolia. 
He's also a novelist and the creator of two content marketing courses.","sameAs":["https:\/\/ryanlaw.me\/","https:\/\/uk.linkedin.com\/in\/thinkingslow","https:\/\/x.com\/thinking_slow"],"url":"https:\/\/ahrefs.com\/blog\/author\/ryan-law\/"}]}},"as_json":null,"as_tables":null,"as_images":null,"json_reviewers":[],"as_coauthors":[199],"as_post_info":null,"as_sticky":null,"as_hreflang":null,"_links":{"self":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/posts\/185204","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/users\/194"}],"replies":[{"embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/comments?post=185204"}],"version-history":[{"count":0,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/posts\/185204\/revisions"}],"wp:attachment":[{"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/media?parent=185204"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/categories?post=185204"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/tags?post=185204"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/ahrefs.com\/blog\/wp-json\/wp\/v2\/coauthors?post=185204"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}