How to Get Google to Index Your Website

Head of Content @ Ahrefs (or, in plain English, I'm the guy responsible for ensuring that every blog post we publish is EPIC).

Google’s index is a library of hundreds of billions of web pages. It’s what people search through when they use Google. Unless the pages on your website are in this index, there’s absolutely zero chance of them showing up in Google’s search results.

But how do you know if your site’s in Google’s index, and how do you get it there if not?

How to check if you’re indexed in Google

If you want a rough idea of how many pages on your website Google has indexed, go to Google, search for site:yourwebsite.com and look at the number of results below the search bar.

See roughly how many pages Google has indexed with a site: search

If you want to check whether a particular page is indexed, you’ll get the most accurate results using the URL Inspection tool in Google Search Console.

For indexed pages, you’ll see this:

If a page is indexed, it will say "URL is on Google" when using the URL Inspection tool

For pages that aren’t indexed, you’ll see this:

If a page isn't indexed, it will say "URL is not on Google" when using the URL Inspection tool

How to get indexed by Google

If you have valuable content for searchers, getting indexed tends to be easy enough. You just need to make sure Google can find your pages, and that you’re signalling their importance. Here are six steps you can take to do that.

1. Request indexing for your homepage

Sign up for Google Search Console, add your property, plug your homepage into the URL Inspection tool, and hit “Request indexing.” As long as your site structure is sound (more on this shortly), Google will be able to find (and hopefully index) all the pages on your site.

How to request indexing in Google Search Console

2. Create and submit a sitemap to Google

A sitemap tells Google where to find the pages you consider important on your site.
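To make that concrete, here is a minimal Python sketch of what a sitemap contains. The URLs and the `build_sitemap` helper are made up for illustration; in practice your platform or plugin will usually generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages you consider important.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/",
])
print(sitemap_xml)
```

The point of the sketch is the shape of the file: a flat list of the URLs you care about, nothing more.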

Most website platforms like Wix, Squarespace, and Shopify create a sitemap for you automatically. You can usually find this at yourdomain.com/sitemap.xml or yourdomain.com/sitemap_index.xml. If it’s not there, see if its location is listed in your robots.txt file at yourdomain.com/robots.txt.
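If you'd rather check robots.txt programmatically, here's a rough sketch using Python's standard library (`site_maps()` needs Python 3.8 or later). The robots.txt content below is invented, and `sitemaps_in_robots` is just a name chosen for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; yours lives at yourdomain.com/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap_index.xml
"""


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


declared_sitemaps = sitemaps_in_robots(ROBOTS_TXT)
print(declared_sitemaps)
```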

If you’re using WordPress, use a plugin like Yoast or Rank Math to create a sitemap. Don’t use the default sitemap created by WordPress as it tends to include a bunch of unimportant stuff you don’t want in there.

Once you have a sitemap created, go to the Sitemaps tab in Google Search Console, enter the sitemap URL, and hit “Submit.”

How to add a sitemap in Google Search Console

3. Structure your site properly

Search engines should be able to reach every important page on your website through internal links. If that’s not possible for a page, it’s known as an orphan page. Google is less likely to index these as they’re harder to find and have fewer signals that they’re important.

It’s up to you how you want to structure your site, but it’s common to use a pyramid structure:

With this kind of site structure, each page has an internal link from at least one page above it in the pyramid. As long as your website platform automatically adds links to new content, no page should end up orphaned.

That said, mistakes can happen, so it’s worth setting up regular audits to check your site for orphan pages. Assuming that all your important pages are in your sitemap, it’s free and easy to do this with an Ahrefs Webmaster Tools (AWT) account.

First, create a new project from your Ahrefs Dashboard:

Second, click the option to import from Google Search Console:

How to add a project from Google Search Console in Ahrefs

Finally, choose your site and import the project with the default settings:

Sidenote.

If your sitemap is at /sitemap.xml, /sitemap_index.xml, or is listed in your robots.txt file, the default settings are fine. Otherwise you’ll need to open the settings for your project in Site Audit and enter your specific sitemap URL(s).

Ahrefs will now begin crawling your website and email you the results when complete. If you see the “Orphan page (has no incoming internal links)” error in that email, there are pages on your site lacking internal links that need to be incorporated into your site structure.

Orphan pages issue via Ahrefs' Site Audit
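If you're curious what an orphan page check boils down to, here's a simplified Python sketch. It assumes you already have a map of internal links and a list of sitemap URLs (both invented here), and it is not how Ahrefs' crawler works internally: it simply walks links from the homepage and reports sitemap URLs it never reaches.

```python
from collections import deque


def find_orphans(homepage, links, sitemap_urls):
    """Return sitemap URLs that can't be reached from the homepage via internal links."""
    seen = {homepage}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(sitemap_urls) - seen)


# Hypothetical site: each key links out to the pages in its list.
links = {
    "/": ["/blog/", "/services/"],
    "/blog/": ["/blog/post-a/"],
    "/services/": [],
    "/blog/post-a/": ["/"],
    "/old-landing-page/": ["/"],  # links out, but nothing links to it
}
sitemap_urls = ["/", "/blog/", "/services/", "/blog/post-a/", "/old-landing-page/"]

orphans = find_orphans("/", links, sitemap_urls)
print(orphans)
```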

4. Build backlinks to your site

Backlinks signal to Google that the content on a website is valuable and deserves indexing.

If your website is new, the best starting point is to find your competitors’ backlinks and replicate the ones you can. These are typically links from:

Directories
Listicles that mention multiple competitors, but not you
Guest posts, interviews and podcasts

You can use the Link Intersect report in Ahrefs’ Site Explorer to find these.

For example, say you’re a plumber in New York City. If you plug in your website and a few competitors, it’ll show sites linking to one or more of them but not you. You can usually spot directories on this list quite easily.

Examples of directories found using Ahrefs' Link Intersect tool

Replicating these links is usually as straightforward as creating a free listing on the directory.
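Under the hood, this kind of report is a set comparison. Here's a rough Python sketch of the idea with invented domains; it illustrates the logic only and is not Ahrefs' implementation.

```python
def link_intersect(your_referrers, competitor_referrers):
    """Return (domain, competitor_count) pairs for domains linking to competitors but not you."""
    counts = {}
    for referrers in competitor_referrers.values():
        for domain in set(referrers) - set(your_referrers):
            counts[domain] = counts.get(domain, 0) + 1
    # Domains linking to the most competitors come first.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical referring domains for a plumber and two competitors.
your_referrers = {"nycbusinesslist.example"}
competitor_referrers = {
    "competitor-a.example": {
        "nycbusinesslist.example",
        "plumberdirectory.example",
        "localnews.example",
    },
    "competitor-b.example": {"plumberdirectory.example", "homeservices.example"},
}

prospects = link_intersect(your_referrers, competitor_referrers)
print(prospects)
```

Domains that link to several competitors at once are usually the easiest wins, which is why the sketch sorts by that count.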

5. Create a robots.txt file to exclude irrelevant pages

A robots.txt file tells search engines where they can and can’t go on your site.

Primarily, it lists the parts of your site you don’t want search engines like Google to crawl.

It sounds counterintuitive to keep crawlers away from content when you want more of your site indexed, but it works.

Robots.txt lets you block extraneous pages from being crawled, like site admin pages, account login pages, or random paginations, so that a web crawler can spend more of its crawl budget on the pages that really matter. Keep in mind that robots.txt controls crawling rather than indexing. If you need to keep a page out of the index entirely, use a noindex directive instead.

For example, here’s our own robots.txt file. You can see we’ve blocked our blog pagination and blog archive to prevent index bloat, and improve indexing.
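As a hedged illustration (a made-up file, not a copy of ours), here's how you might generate a simple robots.txt in Python and sanity-check it with the standard-library parser. The blocked paths are placeholders, and Python's parser only does simple prefix matching, so it won't mirror Google's handling of wildcards.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_paths):
    """Return robots.txt text that blocks the given path prefixes for all crawlers."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in blocked_paths]
    return "\n".join(lines) + "\n"


# Hypothetical low-value sections of a site.
robots_txt = build_robots_txt(["/wp-admin/", "/login/", "/blog/page/"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

pagination_allowed = parser.can_fetch("Googlebot", "https://example.com/blog/page/2/")
article_allowed = parser.can_fetch("Googlebot", "https://example.com/blog/get-indexed/")
print(robots_txt)
print(pagination_allowed, article_allowed)
```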

6. Improve your page load speed

Page speed is how fast a web page loads and renders for users, typically measured in seconds. It’s a confirmed ranking factor, and it can affect how thoroughly your content gets crawled and indexed.

If your site is slow to respond, web crawlers will get through fewer of your pages in the time they allot to your site. Slow pages also make for a painful user experience, which can lead to low rankings, or even to pages being left out of the index.

There are many things you can do to bump up your page speed, including removing unused plugins, minifying code, and lazy-loading images.

Start by checking your current page load speed in Google’s free PageSpeed Insights (PSI) tool, or open PageSpeed Insights in the Performance report of your Ahrefs Site Audit, and drill into problematic pages.
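If you want to check scores in bulk, PSI also has an API. The sketch below assumes the public v5 endpoint and the usual response layout (`lighthouseResult.categories.performance.score`); verify both against Google's current API documentation before relying on it. The sample response is trimmed and made up so the parsing step can run without a network call.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def psi_request_url(page_url, strategy="mobile"):
    """Build the PageSpeed Insights API request URL for one page."""
    return PSI_ENDPOINT + "?" + urlencode({"url": page_url, "strategy": strategy})


def performance_score(psi_response):
    """Return the Lighthouse performance score as a 0-100 integer."""
    score = psi_response["lighthouseResult"]["categories"]["performance"]["score"]
    return round(score * 100)


def fetch_performance_score(page_url):
    """Call the API over the network and return the score (not run in this example)."""
    with urlopen(psi_request_url(page_url)) as response:
        return performance_score(json.load(response))


request_url = psi_request_url("https://example.com/")
# Trimmed, invented response used only to demonstrate the parsing step.
sample_response = {"lighthouseResult": {"categories": {"performance": {"score": 0.87}}}}
print(request_url)
print(performance_score(sample_response))
```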

Still not indexed? Check for deeper issues

If you’ve done everything above and Google still isn’t indexing some or all of the pages you’d expect it to, there’s probably a deeper issue, so you’ll need to run some checks.

Check for rogue noindex tags

Google won’t index pages that you’ve told it not to. You do this with a noindex robots meta tag or x-robots-tag HTTP header.

Don’t worry if you have no idea what any of that means. You can check for this issue in Site Audit with a free Ahrefs Webmaster Tools (AWT) account. Just head to the “Indexability” report and look for these issues:

Noindex in HTML and HTTP header
Noindex follow page
Noindex and nofollow page

Finding noindexed pages in Ahrefs' Site Audit

Google won’t be able to index any pages with these issues.

To see which pages they affect, click the number in the “Crawled” column. If you want any of these URLs indexed, you’ll need to remove the noindex directive from the HTML or HTTP header.
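For a one-off check without a crawler, here's a simplified Python sketch that looks for a noindex directive in both places. It assumes you already have the page's HTML and response headers in hand, and it ignores edge cases such as user-agent-specific X-Robots-Tag values; the function and class names are invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindexed(html, headers):
    """Return True if the HTML or the X-Robots-Tag header carries noindex (or none)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]
    found = set(parser.directives) | set(header_directives)
    return "noindex" in found or "none" in found


blocked_by_meta = is_noindexed(
    '<html><head><meta name="robots" content="noindex, follow"></head></html>', {}
)
blocked_by_header = is_noindexed("<html><head></head></html>", {"X-Robots-Tag": "noindex"})
print(blocked_by_meta, blocked_by_header)
```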

Further reading

Robots Meta Tag & X-Robots-Tag: Everything You Need to Know

Check for manual actions and security issues

If your website or web page has a manual action or security issue, Google might not show it in the search results. You can check both of these things in Google Search Console. Just go to the Manual Actions report and Security Issues report.

Here’s what you want to see in both reports:

What you want to see in the Manual Actions report in Search Console

If you see anything different to that, seek expert help.

Further reading

Google Penalties: The Newbie-Friendly Guide

Check that your content is actually valuable for searchers

Google’s John Mueller is on record saying that the search engine never indexes all known URLs. To make the cut, pages need to be “awesome and inspiring.”

You can find potentially low-quality pages that might not be indexed using Site Audit.

Here’s the process:

Go to the Page Explorer
Filter for Indexable pages
Click the Advanced filter
Filter for pages with no keyword rankings and fewer than 300 words

Finding low-quality pages that might not be indexed in Site Audit

This will return “thin,” indexable pages that don’t rank for any keyword in the top 100.
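If you export your crawl data, the same filter is easy to reproduce yourself. Here's a small Python sketch; the field names and sample rows are invented for illustration and don't reflect any particular export format.

```python
def thin_pages(pages, max_words=300):
    """Return URLs of indexable pages with no keyword rankings and fewer than max_words words."""
    return [
        page["url"]
        for page in pages
        if page["indexable"]
        and page["ranking_keywords"] == 0
        and page["word_count"] < max_words
    ]


# Hypothetical crawl export.
pages = [
    {"url": "/blog/seo-tips/", "indexable": True, "word_count": 2400, "ranking_keywords": 35},
    {"url": "/tag/misc/", "indexable": True, "word_count": 120, "ranking_keywords": 0},
    {"url": "/thank-you/", "indexable": False, "word_count": 40, "ranking_keywords": 0},
    {"url": "/services/drain-cleaning/", "indexable": True, "word_count": 250, "ranking_keywords": 3},
]

thin = thin_pages(pages)
print(thin)
```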

Although it’s possible that some of these pages are indexed, they may as well not be because they’re not ranking for anything. This means that Google clearly doesn’t see them as valuable enough to send traffic.

If any of these pages are about topics with search volume, you’ll need to make them more valuable to get Google to show them in search results.

Check for indexable pages not in your sitemap

Sitemaps are one of the signals Google uses to understand which pages are important.

Excluding pages you want indexed from your sitemap is unlikely to be the only reason for Google not indexing them, but it’s still not a positive signal. You can find these pages in Site Audit. Just look for the “Indexable page not in sitemap” notice in the “All issues” report.

Finding indexable pages that aren't in your sitemap in Ahrefs' Site Audit

Fixing this issue depends on whether you want the affected pages indexed or not.

If you want them indexed, adjust your sitemap settings to include these pages. If you don’t want them indexed, add a noindex directive.
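The check itself is a simple comparison between two lists. Here's a minimal Python sketch that parses a sitemap and reports indexable URLs missing from it; the sitemap and the set of indexable pages are made up, and in practice you'd get the latter from a crawl.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
</urlset>"""


def sitemap_urls(sitemap_xml):
    """Return the set of URLs listed in <loc> elements of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NAMESPACES)}


# Hypothetical indexable pages found by a crawl.
indexable_pages = {
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/pricing/",
}

missing_from_sitemap = sorted(indexable_pages - sitemap_urls(SITEMAP_XML))
print(missing_from_sitemap)
```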

Check for crawl blocks in your robots.txt file

Remember robots.txt? Google rarely indexes pages that it can’t crawl, so if you’re blocking some important pages in robots.txt, they probably won’t get indexed.

To check if a specific page is blocked, you can use Google’s robots.txt tester. Just plug in your URL and hit “Test.” If all’s good, it’ll say “Allowed.”

Checking a page's crawlability with Google's robots.txt tester

If a page is blocked, it’ll say “Blocked” and highlight the line in your robots.txt file that’s causing the block.

Google will highlight the rule blocking the crawl

If you want a blocked page indexed, you should edit the directive in your robots.txt file to allow Google to crawl the page.
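To check a whole list of important URLs at once, you could run a rough local test like the Python sketch below. The robots.txt content and URLs are invented, and the standard-library parser uses simple prefix matching, so treat the result as a first pass rather than a guarantee of how Googlebot interprets your file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with an overly broad rule blocking /guides/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /guides/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


important_urls = [
    "https://example.com/",
    "https://example.com/guides/seo-basics/",
    "https://example.com/pricing/",
]

blocked = blocked_urls(ROBOTS_TXT, important_urls)
print(blocked)
```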

Further reading

Robots.txt and SEO: Everything You Need to Know

Check for rogue canonical tags

A canonical tag tells Google which version of a group of similar pages it should index. It looks something like this:

<link rel="canonical" href="/page.html"/>

Most pages either have no canonical tag, or what’s called a self-referencing canonical tag. That tells Google the page itself is the preferred and probably the only version. In other words, you want this page to be indexed.

However, if your page has a rogue canonical tag, then it could be telling Google about a preferred version that doesn’t exist or that you don’t want indexed.

To check for a rogue canonical, use the URL Inspection tool in Google Search Console. You’ll see an “Alternative page with canonical tag” warning if the canonical points to another page.

What you'll see if there's a rogue canonical tag

If this shouldn’t be there, and you want to index the page, remove the canonical tag.
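You can also spot-check a page's canonical yourself. Here's a minimal Python sketch that reads the canonical tag from a page's HTML and reports whether it points back at the page itself; the URLs and helper names are placeholders invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Captures the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_status(page_url, html):
    """Describe where the page's canonical tag points, resolving relative URLs."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical is None:
        return "no canonical"
    target = urljoin(page_url, parser.canonical)
    return "self-referencing" if target == page_url else f"points to {target}"


page_url = "https://example.com/page.html"
self_status = canonical_status(page_url, '<link rel="canonical" href="/page.html"/>')
rogue_status = canonical_status(page_url, '<link rel="canonical" href="/other.html"/>')
print(self_status)
print(rogue_status)
```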

Another trick is to look for the “Non-canonical page in sitemap” error in Ahrefs’ Site Audit. In these cases, you’re sending conflicting signals to Google. That’s not good if you want pages indexed.

You should also aim to fix issues with duplicate content because Google is unlikely to index duplicate or near-duplicate pages. Use the Duplicates report in Site Audit to check for these issues.

Learn more about how to fix these issues in our guide to duplicate content.

Check for nofollow internal links

Nofollow links are links with a rel="nofollow" attribute. Google may or may not crawl them, so it’s best not to use them on internal links.

Finding them is easy enough. Just go to the Links report in Site Audit and look for the “Page has nofollow incoming internal links only” warning.

Finding nofollowed internal links in Ahrefs' Site Audit

Remove the nofollow tag from these internal links, assuming that you want Google to index the page. If not, either delete the page or noindex it.
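For the curious, the logic behind that warning can be sketched in a few lines of Python. The link data below is invented, and the tuple format is just an assumption about how you might store crawl results.

```python
from collections import defaultdict


def nofollow_only_pages(internal_links):
    """Return pages whose incoming internal links are all nofollowed.

    internal_links is an iterable of (source, target, is_nofollow) tuples.
    """
    total = defaultdict(int)
    followed = defaultdict(int)
    for _source, target, is_nofollow in internal_links:
        total[target] += 1
        if not is_nofollow:
            followed[target] += 1
    return sorted(target for target in total if followed[target] == 0)


# Hypothetical internal links from a crawl.
internal_links = [
    ("/", "/blog/", False),
    ("/", "/pricing/", True),
    ("/blog/", "/pricing/", True),
    ("/blog/", "/", False),
]

affected = nofollow_only_pages(internal_links)
print(affected)
```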

Further reading

What Is a Nofollow Link? Everything You Need to Know (No Jargon!)

Check for internal link opportunities

Internal links do more than help Google discover new pages. They also help to boost their PageRank and signal their importance. By adding more relevant internal links to important pages, you may be able to improve your chances of Google indexing (and ranking) them.

Here’s a super quick way to find opportunities:

Go to the Page Explorer in Site Audit
Click the advanced filter
Add a filter for “Internal outlinks” > “not contains” > [URL of the page you want indexed]
Add a filter for “Page text” > “contains” > [keyword you’re targeting]

This will search for keyword mentions on pages that don’t already link to your target page.
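Those two filters translate directly into code. Here's a rough Python sketch over invented crawl data; the dictionary layout is an assumption for illustration, not a Site Audit export format.

```python
def link_opportunities(pages, target_url, keyword):
    """Return pages that mention the keyword but don't yet link to target_url."""
    keyword = keyword.lower()
    return sorted(
        url
        for url, page in pages.items()
        if url != target_url
        and target_url not in page["outlinks"]
        and keyword in page["text"].lower()
    )


# Hypothetical pages with their text and internal outlinks.
pages = {
    "/blog/seo-tips/": {"text": "Our list of SEO tips.", "outlinks": []},
    "/blog/keyword-research/": {
        "text": "These SEO tips will help you pick keywords.",
        "outlinks": ["/blog/link-building/"],
    },
    "/blog/link-building/": {
        "text": "See our SEO tips for more.",
        "outlinks": ["/blog/seo-tips/"],
    },
    "/about/": {"text": "We are a small team.", "outlinks": []},
}

opportunities = link_opportunities(pages, "/blog/seo-tips/", "SEO tips")
print(opportunities)
```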

For example, if we search for mentions of “SEO tips” on pages that don’t already link to our list of SEO tips, we get 35 results:

Finding internal link opportunities in Ahrefs' Site Audit

If we open one of these results and search for this mention on the page, here’s what we see:

Example internal linking opportunity for one of our articles

This is a perfect place to add an internal link to our list of SEO tips.

If you add internal links to a page, it’s worth plugging it into the URL Inspection tool in Google Search Console and clicking “Request indexing”. This tells Google something on the page has changed and prompts them to recrawl.

This may speed up the process of them discovering the internal link and consequently, the page you want indexing.

Check for crawl budget issues

Crawl budget is how fast and how many pages a search engine wants to crawl on your site. If the number of pages on your site exceeds your crawl budget, some won’t get crawled or indexed. This is why you need to keep the number of low-quality pages on your site to a minimum.

Here’s what Google says on the matter:

Wasting server resources on [low-value-add pages] will drain crawl activity from pages that do actually have value, which may cause a significant delay in discovering great content on a site.

Google does state that “crawl budget […] is not something most publishers have to worry about,” and that “if a site has fewer than a few thousand URLs, most of the time it will be crawled efficiently.”

Still, removing low-quality pages from your website is never a bad thing. It can only have a positive effect on your site’s crawl budget.

You can use our content audit template to find potentially low-quality pages that can be deleted.

Further reading

What Is Crawl Budget and Should SEOs Worry about It?

Index your site beyond Google

Now you know how to index your site for visibility in Google—but what about all the other ways you can get in front of your audience? Here are a couple of tips for getting your content featured beyond just Google.

Submit content changes to search engines via IndexNow

Have you ever made a change to a page and felt like it took forever to be picked up by search engines or SEO tools? That’s because bots are basically guessing when a page should be re-crawled.

Thankfully, there’s another way. If you’ve made major or at least meaningful changes to your site’s content, you can notify a bunch of search engines via a free open-source protocol called IndexNow, to speed up indexing.

By letting search engines know about your changes, you can make sure they’re aware of updates as soon as they happen, rather than waiting around for their bots to crawl and discover changes themselves.

Use IndexNow when you’ve made fairly significant changes to your site like adding hundreds of new product pages, or releasing a piece of breaking news—not just when you’ve added a couple of sentences to a blog. This will save on resources and reduce server load.

Search engines that participate in IndexNow include Bing, Naver, Yandex, Seznam, and Ahrefs’ own search engine: Yep.
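If you'd like to see what a submission looks like, here's a hedged Python sketch that builds an IndexNow request without sending it. It assumes the shared api.indexnow.org endpoint and the JSON fields described in the IndexNow documentation; the host, key, and URLs are placeholders, and you'd need to host your real key file on your site first.

```python
import json
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but don't send) a POST request notifying IndexNow of changed URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file sits at the site root, named after the key.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder host, key, and URLs for illustration only.
request = build_indexnow_request(
    "example.com",
    "0123456789abcdef",
    ["https://example.com/new-product-1/", "https://example.com/new-product-2/"],
)
print(request.full_url)
print(request.data.decode("utf-8"))
# To actually submit, pass the request to urllib.request.urlopen.
```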

In fact, we allow you to submit your site changes natively within the Ahrefs Site Audit tool, through either manual submission or auto-submission.

Make sure your site can be crawled by LLMs and AI chatbots

Google is one of the best ways to get your brand seen—but AI is fast becoming a serious alternative.

Large language models and AI don’t “index” content in the traditional sense (that is, by structuring and storing webpages in databases). Instead, they train on huge pools of content to understand patterns, relationships, and concepts. Then, when prompted, they generate the most likely response based on what they’ve learned, rather than the webpages they’ve retrieved.

Though they don’t index, many AI models will actively crawl your site’s content to expand their pool of training data. The more your brand makes it into that training data, the more likely it is to pop up in relevant AI conversations.

If you want your brand to be seen by as many people as possible, then you need to make sure your content can be crawled by AI.

The best way you can do this is by updating your robots.txt file to allow AI crawlers access.

Here’s an example of a robots.txt excerpt from Jed White, which permits access for AI search agents like PerplexityBot but disallows training data collection bots like GPTBot.
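As a rough illustration of the same idea (a hypothetical file, not Jed White's actual excerpt), the Python sketch below defines a robots.txt with per-bot rules and checks how the standard-library parser reads them. Whether a given bot honors these rules is up to the bot's operator.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block a training-data crawler, allow an AI search agent.
AI_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(AI_ROBOTS_TXT.splitlines())

page = "https://example.com/blog/get-indexed/"
access = {
    bot: parser.can_fetch(bot, page)
    for bot in ("GPTBot", "PerplexityBot", "Googlebot")
}
print(access)
```

If you want your content in training data as well, you'd drop the GPTBot block; the sketch only shows how per-bot rules are expressed.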

Final thoughts

Every website needs to be indexed to stand any chance of getting traffic from Google. But being indexed doesn’t automatically mean you’ll rank for anything. It simply means that you’re in the game, not that you’re anywhere close to winning.

That’s where SEO comes in—the art of optimizing your website to rank for specific keywords.

In short, SEO involves:

Finding what your customers are searching for (keyword research)
Creating content about those topics (SEO content)
Optimizing the pages (on-page SEO)
Getting backlinks from other sites (link building)
Keeping your site technically sound (technical SEO)

If you’re new to SEO, read our SEO guide for beginners or watch this video.
