How to Get Google to Index Your Website

Joshua Hardwick
Head of Content @ Ahrefs (or, in plain English, I'm the guy responsible for ensuring that every blog post we publish is EPIC).
Google’s index is a library of hundreds of billions of web pages. It’s what people search through when they use Google. Unless the pages on your website are in this index, there’s absolutely zero chance of them showing up in Google’s search results.

But how do you know if your site’s in Google’s index, and how do you get it there if not? 

If you want a rough idea of how many pages on your website Google has indexed, go to Google, search for site:yourwebsite.com and look at the number of results below the search bar. 

See roughly how many pages Google has indexed with a site: search

If you want to check whether a particular page is indexed, you’ll get the most accurate results using the URL Inspection tool in Google Search Console. 

For indexed pages, you’ll see this: 

If a page is indexed, it will say "URL is on Google" when using the URL Inspection tool

For pages that aren’t indexed, you’ll see this: 

If a page isn't indexed, it will say "URL is not on Google" when using the URL Inspection tool

If you have valuable content for searchers, getting indexed tends to be easy enough. You just need to make sure Google can find your pages, and that you’re signalling their importance. This is a four-step process. 

1. Request indexing for your homepage

Sign up for Google Search Console, add your property, plug your homepage into the URL Inspection tool, and hit “Request indexing.” As long as your site structure is sound (more on this shortly), Google will be able to find (and hopefully index) all the pages on your site. 

How to request indexing in Google Search Console

2. Create and submit a sitemap to Google

A sitemap tells Google where to find the pages you consider important on your site. 

Most website platforms like Wix, Squarespace, and Shopify create a sitemap for you automatically. You can usually find this at yourdomain.com/sitemap.xml or yourdomain.com/sitemap_index.xml. If it’s not there, see if its location is listed in your robots.txt file at yourdomain.com/robots.txt.

Sitemap URL in robots.txt file
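If you'd rather check this without eyeballing the file, here's a minimal Python sketch that pulls the "Sitemap:" lines out of a robots.txt file. The function name and the sample file contents are made up for illustration, and the sketch assumes you've already downloaded the file's text.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt file."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split on the first colon only so the
        # "https://" part of the URL stays intact.
        line = line.split("#", 1)[0].strip()
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt contents, for illustration only.
example_robots = """\
User-agent: *
Disallow: /admin/

Sitemap: https://yourdomain.com/sitemap_index.xml
"""

print(sitemap_urls_from_robots(example_robots))
# ['https://yourdomain.com/sitemap_index.xml']
```

An empty list means the file doesn't declare a sitemap, so you'll need to look elsewhere or create one.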

If you’re using WordPress, use a plugin like Yoast or Rank Math to create a sitemap. Don’t use the default sitemap created by WordPress as it tends to include a bunch of unimportant stuff you don’t want in there. 

Once you have a sitemap created, go to the Sitemaps tab in Google Search Console, enter the sitemap URL, and hit “Submit.”

How to add a sitemap in Google Search Console
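If you've never opened a sitemap, it's just an XML file that lists URLs. The Python sketch below builds a bare-bones one with the standard library, assuming you already have a list of the pages you consider important. The function name and URLs are placeholders, and your platform or plugin will normally generate a richer file than this for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Each page gets its own <url> entry with a <loc> child.
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs, for illustration only.
sitemap_xml = build_sitemap([
    "https://yourdomain.com/",
    "https://yourdomain.com/blog/",
])
print(sitemap_xml)
```

Only list pages you actually want indexed. That's the whole point of the file.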

3. Structure your site properly

Search engines should be able to reach every important page on your website through internal links. If that’s not possible for a page, it’s known as an orphan page. Google is less likely to index these as they’re harder to find and have fewer signals that they’re important. 

It’s up to you how you want to structure your site, but it’s common to use a pyramid structure: 

Example of pyramid site structure

With this kind of site structure, each page has an internal link from at least one page above it in the pyramid. As long as your website platform automatically adds links to new content, no page should end up orphaned. 

That said, mistakes can happen, so it’s worth setting up regular audits to check your site for orphan pages. Assuming that all your important pages are in your sitemap, it’s free and easy to do this with an Ahrefs Webmaster Tools (AWT) account. 

First, create a new project from your Ahrefs Dashboard: 

How to create a project in Ahrefs

Second, click the option to import from Google Search Console: 

How to add a project from Google Search Console in Ahrefs

Finally, choose your site and import the project with the default settings: 

Selecting projects to import
Sidenote. If your sitemap is at /sitemap.xml, /sitemap_index.xml, or is listed in your robots.txt file, the default settings are fine. Otherwise you’ll need to open the settings for your project in Site Audit and enter your specific sitemap URL(s).

Ahrefs will now begin crawling your website and email you the results when complete. If you see the “Orphan page (has no incoming internal links)” error in that email, there are pages on your site lacking internal links that need to be incorporated into your site structure. 

Orphan pages issue via Ahrefs' Site Audit
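To make the idea concrete, here's a rough Python sketch of what an orphan page check boils down to: compare the URLs in your sitemap against the URLs that receive at least one internal link. This is not how Site Audit is implemented. The function name, the shape of the input data, and the sample URLs are all assumptions for illustration.

```python
def find_orphan_pages(sitemap_urls, internal_links, homepage):
    """Return sitemap URLs that no other page links to.

    internal_links maps each crawled page to the set of internal URLs
    that page links out to.
    """
    linked_to = set()
    for source, targets in internal_links.items():
        # A page linking to itself doesn't count as an incoming link.
        linked_to.update(t for t in targets if t != source)
    # The homepage is the crawl's starting point, so it's never reported.
    return sorted(
        url for url in sitemap_urls
        if url not in linked_to and url != homepage
    )


# Hypothetical crawl data, for illustration only.
sitemap = ["/", "/blog/", "/blog/post-a/", "/blog/post-b/", "/old-landing-page/"]
links = {
    "/": {"/blog/"},
    "/blog/": {"/", "/blog/post-a/", "/blog/post-b/"},
    "/blog/post-a/": {"/blog/"},
}

print(find_orphan_pages(sitemap, links, homepage="/"))
# ['/old-landing-page/']
```

This only works if your important pages are in the sitemap, which is why that assumption matters.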

4. Build backlinks to your site

Backlinks signal to Google that the content on a website is valuable and deserves indexing. 

If your website is new, the best starting point is to find your competitors’ backlinks and replicate the ones you can. These are typically links from: 

  • Directories
  • Listicles that mention multiple competitors, but not you
  • Guest posts, interviews and podcasts

You can use the Link Intersect report in Ahrefs’ Site Explorer to find these. 

For example, say you’re a plumber in New York City. If you plug in your website and a few competitors, it’ll show sites linking to one or more of them but not you. You can usually spot directories on this list quite easily. 

Examples of directories found using Ahrefs' Link Intersect tool

Replicating these links is usually as straightforward as creating a free listing on the directory. 
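The logic behind this kind of report is simple set arithmetic: take the sites linking to each competitor, subtract the sites already linking to you, and count how many competitors each remaining site links to. Here's a hedged Python sketch of that idea. It is not Ahrefs' implementation, and the function name and every domain in it are invented for illustration.

```python
def link_intersect(your_referring_domains, competitor_referring_domains):
    """Return (domain, competitor_count) pairs for sites that link to at
    least one competitor but not to you, most common first."""
    yours = set(your_referring_domains)
    counts = {}
    for competitor, domains in competitor_referring_domains.items():
        for domain in set(domains) - yours:
            counts[domain] = counts.get(domain, 0) + 1
    # Sort by how many competitors the site links to, then alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data, for illustration only.
your_links = {"localnews.example"}
competitor_links = {
    "plumber-a.example": {"directory-one.example", "localnews.example"},
    "plumber-b.example": {"directory-one.example", "directory-two.example"},
}

print(link_intersect(your_links, competitor_links))
# [('directory-one.example', 2), ('directory-two.example', 1)]
```

Sites that link to several competitors tend to be the easiest wins, which is why they're sorted to the top.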

Learn more ways to replicate and build backlinks in the linked guides below. 

If you’ve done everything above and Google still isn’t indexing some or all of the pages you’d expect it to, there’s probably a bigger issue, so you need to run some checks. 

Check for rogue noindex tags

Google won’t index pages that you’ve told it not to index. You do this with a noindex robots meta tag or X-Robots-Tag HTTP header. 

Don’t worry if you have no idea what any of that means. You can check for this issue in Site Audit with a free Ahrefs Webmaster Tools (AWT) account. Just head to the “Indexability” report and look for these issues: 

  • Noindex in HTML and HTTP header 
  • Noindex follow page 
  • Noindex and nofollow page 
Finding noindexed pages in Ahrefs' Site Audit

Google won’t be able to index any pages with these issues. 

To see which pages they affect, click the number in the “Crawled” column. If you want any of these URLs indexed, you’ll need to remove the noindex directive from the HTML or HTTP header. 
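If you're curious what these directives look like under the hood, the Python sketch below checks a page's HTML and response headers for noindex. It's a simplified illustration: the class and function names are made up, it assumes you've already fetched the HTML and headers, it treats "none" as equivalent to noindex, and it ignores header values scoped to a specific crawler (such as "googlebot: noindex").

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> (or "googlebot") tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot"):
            content = attrs.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindexed(html, headers):
    """Return True if the HTML or the HTTP headers carry a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)

    header_value = ""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]

    blocking = {"noindex", "none"}
    return bool(blocking & set(parser.directives + header_directives))


# Hypothetical page, for illustration only.
page_html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindexed(page_html, {}))  # True
```

The fix is the same either way: remove the directive from wherever it's set.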

Check for manual actions and security issues

If your website or web page has a manual action or security issue, Google might not show it in the search results. You can check both of these things in Google Search Console. Just go to the Manual Actions report and Security Issues report.

Here’s what you want to see in both reports: 

What you want to see in the Manual Actions report in Search Console

If you see anything different to that, seek expert help. 

Check that your content is actually valuable for searchers

Google’s John Mueller is on record saying that the search engine never indexes all known URLs. They need to be “awesome and inspiring.” 

You can find more potentially low-quality pages that might not be indexed using Site Audit.

Here’s the process: 

  1. Go to the Page Explorer 
  2. Filter for Indexable pages 
  3. Click the Advanced filter 
  4. Filter for pages with no keyword rankings and fewer than 300 words 
Finding low-quality pages that might not be indexed in Site Audit

This will return “thin,” indexable pages that don’t rank for any keyword in the top 100. 
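Expressed as code, the filter above looks something like this Python sketch. It assumes you've exported page data with an indexability flag, a word count, and a count of ranking keywords. The Page class, its field names, and the sample rows are hypothetical and don't mirror any real export format.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    indexable: bool
    word_count: int
    ranking_keywords: int  # keywords the page ranks for in the top 100


def thin_unranked_pages(pages, max_words=300):
    """Return URLs of indexable pages with no rankings and under max_words."""
    return [
        page.url
        for page in pages
        if page.indexable
        and page.ranking_keywords == 0
        and page.word_count < max_words
    ]


# Hypothetical export rows, for illustration only.
pages = [
    Page("/guide/", indexable=True, word_count=2400, ranking_keywords=85),
    Page("/tag/misc/", indexable=True, word_count=40, ranking_keywords=0),
    Page("/thank-you/", indexable=False, word_count=25, ranking_keywords=0),
]

print(thin_unranked_pages(pages))
# ['/tag/misc/']
```

The 300-word threshold comes from the steps above. Treat it as a starting point rather than a rule.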

Although it’s possible that some of these pages are indexed, they may as well not be because they’re not ranking for anything. This means that Google clearly doesn’t see them as valuable enough to send traffic. 

If any of these pages are about topics with search volume, you’ll need to make them more valuable to get Google to show them in search results. 

Check for indexable pages not in your sitemap

Sitemaps are one of the signals Google uses to understand which pages are important. 

Excluding pages you want indexed from your sitemap is unlikely to be the only reason for Google not indexing them, but it’s still not a positive signal. You can find these pages in Site Audit. Just look for the “Indexable page not in sitemap” notice in the “All issues” report. 

Finding indexable pages that aren't in your sitemap in Ahrefs' Site Audit

Fixing this issue depends on whether you want the affected pages indexed or not. 

If you want them indexed, adjust your sitemap settings to include these pages. If you don’t want them indexed, add a noindex directive. 
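The check itself is just a comparison between two lists. Here's a minimal Python sketch that parses a sitemap and reports indexable URLs that are missing from it. The function names and sample data are assumptions for illustration, and the sketch handles a plain sitemap only, not a sitemap index file.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NAMESPACES = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_sitemap(sitemap_xml):
    """Return the set of <loc> URLs in a plain (non-index) XML sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NAMESPACES)
        if loc.text
    }


def indexable_pages_missing_from_sitemap(indexable_urls, sitemap_xml):
    """Return indexable URLs that the sitemap doesn't list."""
    return sorted(set(indexable_urls) - urls_in_sitemap(sitemap_xml))


# Hypothetical data, for illustration only.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://yourdomain.com/</loc></url>"
    "<url><loc>https://yourdomain.com/blog/</loc></url>"
    "</urlset>"
)
indexable = [
    "https://yourdomain.com/",
    "https://yourdomain.com/blog/",
    "https://yourdomain.com/pricing/",
]

print(indexable_pages_missing_from_sitemap(indexable, sample_sitemap))
# ['https://yourdomain.com/pricing/']
```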

Check for crawl blocks in your robots.txt file

Google rarely indexes pages that it can’t crawl, so if you’re blocking some in robots.txt, they probably won’t get indexed. 

To check if a page is blocked, you can use Google’s robots.txt tester. Just plug in your URL and hit “Test.” If all’s good, it’ll say “Allowed.”

Checking a page's crawlability with Google's robots.txt tester

If a page is blocked, it’ll say “Blocked” and highlight the line in your robots.txt file that’s causing the block. 

Google will highlight the rule blocking the crawl

If you want a blocked page indexed, you should edit the directive in your robots.txt file to allow Google to crawl the page. 
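You can run a similar check locally with Python's built-in robots.txt parser. Treat the sketch below as a rough approximation: the standard library's matching rules aren't guaranteed to be identical to Googlebot's (wildcard patterns in particular may be handled differently), and the function name and sample file are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents, for illustration only.
sample_robots = """\
User-agent: *
Disallow: /private/
"""

print(is_crawl_allowed(sample_robots, "https://yourdomain.com/blog/"))
# True
print(is_crawl_allowed(sample_robots, "https://yourdomain.com/private/page.html"))
# False
```

If a page you want indexed comes back as False, the Disallow rule covering its path is the one to edit.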

Check for rogue canonical tags

A canonical tag tells Google which version of a group of similar pages it should index. It looks something like this: 

<link rel="canonical" href="/page.html"/>

Most pages either have no canonical tag, or what’s called a self-referencing canonical tag. That tells Google the page itself is the preferred and probably the only version. In other words, you want this page to be indexed. 

However, if your page has a rogue canonical tag, then it could be telling Google about a preferred version that doesn’t exist or that you don’t want indexed. 

To check for a rogue canonical, use the URL Inspection tool in Google Search Console. You’ll see an “Alternative page with canonical tag” warning if the canonical points to another page. 

What you'll see if there's a rogue canonical tag

If this shouldn’t be there, and you want to index the page, remove the canonical tag. 

Another trick is to look for the “Non-canonical page in sitemap” error in Ahrefs’ Site Audit. In these cases, you’re sending conflicting signals to Google. That’s not good if you want pages indexed. 
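For a quick spot check on a single page, the Python sketch below extracts the canonical tag from a page's HTML and reports whether it's missing, self-referencing, or pointing somewhere else. The class and function names are hypothetical, the sketch assumes you've already fetched the HTML, and its URL comparison is deliberately naive (it only ignores a trailing slash).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_status(page_url, html):
    """Describe where the page's canonical tag points."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.canonical:
        return "no canonical tag"
    # Resolve relative hrefs such as "/page.html" against the page URL.
    target = urljoin(page_url, parser.canonical)
    if target.rstrip("/") == page_url.rstrip("/"):
        return "self-referencing"
    return f"points elsewhere: {target}"


# Hypothetical page, for illustration only.
html = '<html><head><link rel="canonical" href="/page.html"/></head></html>'
print(canonical_status("https://yourdomain.com/page.html", html))
# self-referencing
print(canonical_status("https://yourdomain.com/other-page.html", html))
# points elsewhere: https://yourdomain.com/page.html
```

A "points elsewhere" result isn't automatically a problem. It only is if you wanted the page you're checking to be the one that's indexed.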

You should also aim to fix issues with duplicate content because Google is unlikely to index duplicate or near-duplicate pages. Use the Duplicates report in Site Audit to check for these issues. 

Duplicate content in Ahrefs' Site Audit

Learn more about how to fix these issues in our guide to duplicate content.

Check for nofollow internal links

Nofollow links are links with a rel="nofollow" attribute. Google may or may not crawl them, so it’s best not to use them. 

Finding them is easy enough. Just go to the Links report in Site Audit and look for the “Page has nofollow incoming internal links only” warning. 

Finding nofollowed internal links in Ahrefs' Site Audit

Remove the nofollow tag from these internal links, assuming that you want Google to index the page. If not, either delete the page or noindex it. 
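To see what you're looking for in the markup, here's a small Python sketch that lists the nofollowed internal links on a single page. Note that it's narrower than the Site Audit warning, which flags pages whose only incoming internal links are nofollowed. The class and function names and the sample HTML are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class NofollowLinkParser(HTMLParser):
    """Collects internal links that carry rel="nofollow"."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.nofollow_internal = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        rels = (attrs.get("rel") or "").lower().split()
        if not href or "nofollow" not in rels:
            return
        target = urljoin(self.page_url, href)
        # "Internal" here means the link stays on the same host as the page.
        if urlparse(target).netloc == urlparse(self.page_url).netloc:
            self.nofollow_internal.append(target)


def nofollow_internal_links(page_url, html):
    """Return absolute URLs of nofollowed internal links found in html."""
    parser = NofollowLinkParser(page_url)
    parser.feed(html)
    return parser.nofollow_internal


# Hypothetical page, for illustration only.
sample_html = (
    '<a href="/pricing/" rel="nofollow">Pricing</a>'
    '<a href="/blog/">Blog</a>'
    '<a href="https://other.example/" rel="nofollow">Partner</a>'
)
print(nofollow_internal_links("https://yourdomain.com/", sample_html))
# ['https://yourdomain.com/pricing/']
```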

Check for internal link opportunities

Internal links do more than help Google discover new pages. They also help to boost their PageRank and signal their importance. By adding more relevant internal links to important pages, you may be able to improve your chances of Google indexing (and ranking) them. 

Here’s a super quick way to find opportunities: 

  1. Go to the Page Explorer in Site Audit 
  2. Click the advanced filter 
  3. Add a filter for “Internal outlinks” > “not contains” > [URL of the page you want indexed] 
  4. Add a filter for “Page text” > “contains” > [keyword you’re targeting] 

This will search for keyword mentions on pages that don’t already link to your target page. 
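The two filters above translate into a very short function. This Python sketch assumes you have each page's text and outgoing internal links available as plain data. The function name, the dictionary keys, and the sample pages are invented for illustration and aren't tied to any real export.

```python
def internal_link_opportunities(pages, target_url, keyword):
    """Return URLs of pages that mention keyword but don't link to target_url."""
    keyword = keyword.lower()
    return [
        page["url"]
        for page in pages
        if page["url"] != target_url            # skip the target page itself
        and target_url not in page["outlinks"]  # "Internal outlinks" not contains
        and keyword in page["text"].lower()     # "Page text" contains
    ]


# Hypothetical crawl data, for illustration only.
pages = [
    {"url": "/blog/seo-tips/", "outlinks": set(), "text": "Our best SEO tips."},
    {"url": "/blog/link-building/", "outlinks": {"/blog/seo-tips/"},
     "text": "See our SEO tips for more."},
    {"url": "/blog/keyword-research/", "outlinks": {"/blog/"},
     "text": "These SEO tips will help you get started."},
    {"url": "/about/", "outlinks": {"/blog/"}, "text": "About the team."},
]

print(internal_link_opportunities(pages, "/blog/seo-tips/", "SEO tips"))
# ['/blog/keyword-research/']
```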

For example, if we search for mentions of “SEO tips” on pages that don’t already link to our list of SEO tips, we get 35 results: 

Finding internal link opportunities in Ahrefs' Site Audit

If we open one of these results and search for this mention on the page, here’s what we see: 

Example internal linking opportunity for one of our articles

This is a perfect place to add an internal link to our list of SEO tips. 

If you add internal links to a page, it’s worth plugging it into the URL Inspection tool in Google Search Console and clicking “Request indexing”. This tells Google something on the page has changed and prompts them to recrawl. 

How to request indexing in Google Search Console

This may speed up the process of Google discovering the internal link and, consequently, the page you want indexed. 

Check for crawl budget issues

Crawl budget is how fast and how many pages a search engine wants to crawl on your site. If the number of pages on your site exceeds your crawl budget, some won’t get crawled or indexed. This is why you need to keep the number of low-quality pages on your site to a minimum. 

Here’s what Google says on the matter: 

Wasting server resources on [low-value-add pages] will drain crawl activity from pages that do actually have value, which may cause a significant delay in discovering great content on a site.

Google does state that “crawl budget […] is not something most publishers have to worry about,” and that “if a site has fewer than a few thousand URLs, most of the time it will be crawled efficiently.” 

Still, removing low-quality pages from your website is never a bad thing. It can only have a positive effect on your site’s crawl budget. 

You can use our content audit template to find potentially low-quality pages that can be deleted. 

Final thoughts

Every website needs to be indexed to stand any chance of getting traffic from Google. But being indexed doesn’t automatically mean you’ll rank for anything. It simply means that you’re in the game, not that you’re anywhere close to winning. 

That’s where SEO comes in—the art of optimizing your website to rank for specific keywords. 

In short, SEO involves: 

If you’re new to SEO, read our SEO guide for beginners or watch this video.