How To Optimize PDFs For SEO, But You Should Make Pages Instead

Patrick Stox
Patrick Stox is a Product Advisor, Technical SEO, & Brand Ambassador at Ahrefs. He was the lead author for the SEO chapter of the 2021 Web Almanac and a reviewer for the 2022 SEO chapter. He also co-wrote the SEO Book For Beginners by Ahrefs and was the Technical Review Editor for The Art of SEO 4th Edition. He’s an organizer for several groups including the Raleigh SEO Meetup (the most successful SEO Meetup in the US), the Beer and SEO Meetup, the Raleigh SEO Conference, runs a Technical SEO Slack group, and is a moderator for /r/TechSEO on Reddit.
Google first started indexing PDFs in 2001. The format is commonly used in government, academia, and business environments.

PDFs are great for compatibility and consistency. They work on nearly any device and always maintain the same visual look. However, if you’re creating new content for the web, you should consider using web pages over PDFs.

If you still want to optimize your PDFs, I’ll show you how, but I don’t recommend it.

PDFs show in Google search results with a PDF tag.

PDFs are converted to and indexed as HTML. When a PDF contains images of text, Google uses Optical Character Recognition (OCR) to convert those images into text. Images in PDFs are also indexed in image search results.

Google chooses pages over PDFs if they’re duplicates. If you have pages and PDFs with the same content, Google tends to prefer the page version as the lead version of the duplicate cluster. This means that signals will be consolidated to the page version, and that will be the version that shows in search results.

I’m not sure if Google will index PDFs when they’re embedded in another page. Many people want to do this in order to track clicks on the PDFs. There are better ways to do that, which I’ll discuss later in the article.

I ran a couple of tests using an <object> tag and an <iframe> to embed a PDF into a webpage. At least with the URL Inspection tool in Google Search Console, I didn’t see the content in the screenshot or the rendered HTML. However, this may just be a quirk of the URL Inspection tool, which typically doesn’t work for content types other than HTML. It’s possible that the part of the renderer that processes PDFs doesn’t run for the inspection test and that Google would actually index the embedded PDF, but I’d want to test that further before relying on it.

Even though Google indexes and occasionally ranks PDFs, the format has a few disadvantages compared with web pages:

  1. Not mobile-friendly. PDFs are made to have a consistent appearance across devices. That means there is no such thing as a mobile-friendly PDF.
  2. Lack of navigation. Most PDFs do not include navigational elements, making it more difficult for people to explore other content.
  3. Lack of some SEO attributes. PDF files have equivalents for many SEO elements, but some are missing, such as individual link attributes like nofollow, UGC, and sponsored.
  4. May not be crawled often. Because PDFs rarely change, they tend to be crawled less often than pages that are updated more frequently.
  5. Tracking is more difficult. Most common trackers run JavaScript on a web page and don’t work in PDF files.

That said, I’m well aware that there are some situations where there’s no way around using a PDF for your content. If that’s the case for you, keep reading to learn how to optimize your PDFs for search.

Most on-page SEO elements that you’re used to seeing in HTML have an equivalent in PDFs and work the same way. Many are also there for accessibility reasons. So let’s discuss a few ways to optimize PDFs for SEO:

  1. Write good content
  2. Add an optimized title
  3. Add an optimized description
  4. Use a relevant filename
  5. Include image alt attributes
  6. Use headings
  7. Include links

1. Write good content

Google’s company mission is to organize the world’s information. Even if it’s not a web page, good content is good content. I’ve seen lots of great content in PDFs like technical documentation, whitepapers, etc. Some of the best information on the web is buried in PDFs.

2. Add an optimized title

Just like web pages have title tags, PDFs have titles. Note that many search engines use the title to describe the document in their search results. If a PDF does not have a title, the filename appears in the SERP instead.

Here’s how to edit a PDF’s title in Adobe Acrobat Pro:

  1. Click File > Properties
  2. Edit the Title field

3. Add an optimized description

As with meta descriptions for web pages, the description isn’t a ranking factor, but it gives you a shot at controlling the text that appears in search results.

  1. Click File > Properties
  2. Click Additional Metadata
  3. Edit Description
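
If you have a lot of PDFs, clicking through Acrobat for each one gets old quickly. Here’s a minimal sketch of setting the title and description with a script instead. It assumes the third-party pypdf library (a recent release), and it assumes the description belongs in the /Subject field of the PDF’s Info dictionary, which is the field PDF tools commonly map to the document description. I haven’t verified which fields each search engine reads, and the function names and file names are made up for illustration.

```python
from pathlib import Path


def build_metadata(title: str, description: str) -> dict[str, str]:
    """Return PDF Info-dictionary entries for a title and description.

    Assumption: the description is stored under /Subject, the field that
    PDF tools commonly map to the document description.
    """
    title = " ".join(title.split())  # collapse stray whitespace and newlines
    description = " ".join(description.split())
    if not title:
        # Without a title, the filename is what appears in the SERP.
        raise ValueError("title must not be empty")
    return {"/Title": title, "/Subject": description}


def apply_metadata(src: Path, dst: Path, metadata: dict[str, str]) -> None:
    """Write a copy of src to dst with the given metadata (needs pypdf)."""
    # Imported here so build_metadata() stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    # Clone the whole document rather than copying pages one by one.
    writer = PdfWriter(clone_from=PdfReader(src))
    writer.add_metadata(metadata)
    writer.write(dst)


if __name__ == "__main__":
    meta = build_metadata(
        "How To Optimize PDFs For SEO",
        "A walkthrough of titles, descriptions, headings, and links in PDFs.",
    )
    print(meta)
    # Hypothetical file names; uncomment to write the updated copy.
    # apply_metadata(Path("guide.pdf"), Path("guide-optimized.pdf"), meta)
```

Writing to a new file instead of overwriting the original makes it easy to open the copy in Acrobat and check File > Properties before you publish it.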

4. Use a relevant filename

The filename of the PDF will be part of the URL. This will impact the URL shown in the search results and is a small ranking factor.

  1. Click File > Save As
  2. Edit File Name
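
If you’re renaming many files, a small helper can turn each document title into a consistent filename. This is only a sketch of one common convention (lowercase words joined by hyphens). The function name and the eight-word cap are my own assumptions, not a rule from any search engine.

```python
import re
import unicodedata


def seo_filename(title: str, max_words: int = 8) -> str:
    """Turn a document title into a lowercase, hyphenated PDF filename."""
    # Strip accents so the filename stays ASCII and URL-safe.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Drop apostrophes so "Beginner's" becomes "beginners", not "beginner-s".
    ascii_title = ascii_title.replace("'", "")
    # Keep only letter/digit runs; everything else acts as a separator.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    if not words:
        raise ValueError("title has no usable characters")
    return "-".join(words[:max_words]) + ".pdf"


print(seo_filename("SEO Starter Guide (2024 Edition)"))
# prints: seo-starter-guide-2024-edition.pdf
```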

5. Include image alt attributes

To help search engines understand the content of your images, you can add alt text to the images in your PDF.

  1. Click the Tags icon in the left sidebar
  2. Find the image you want to add alt text for in the document hierarchy
  3. Right-click on the image
  4. Click Properties
  5. Add relevant alternate text to the box

6. Use headings

Just like heading tags (H1-H6) in web pages, you can mark certain text in a PDF as a heading.

  1. Click the Tags icon in the left sidebar
  2. Find the text you want to edit in the document hierarchy
  3. Right-click on the tag
  4. Click Properties
  5. Select the relevant heading level from the dropdown

7. Include links

Just like on any page, internal and external links also impact rankings. Links pass PageRank and their anchor text adds context. By including links to your PDF and links from your PDF to other pages, you are helping PageRank flow through your site rather than creating a dead end. Some PDFs get a lot of links. Larry Page once said, “It turns out, people who win the Nobel Prize have citations from 10,000 different papers.”

Check out this GDPR document. It has 77K links from 823 referring domains but does not link out at all. This is a missed opportunity. Adding some internal links from this PDF to other pages on the site might help those pages rank better.

This example from Google is better. Their SEO Starter Guide PDF has 3.37K links from 754 referring domains and they do a good job of passing that value to other pages by linking out from the PDF.

To add links in a PDF:

  1. Click the Edit PDF button on the right sidebar
  2. Click the Link dropdown on the Edit menu
  3. Click Add/Edit Web or Document Link
  4. Draw a rectangle around the text you want to link
  5. Set the Link type to Invisible Rectangle
  6. Set the Link Action to Open a web page
  7. Add your URL

Sidenote.
The screenshots and instructions above are for Acrobat Pro DC and may vary depending on the software you use.

As I mentioned previously, PDFs are more difficult to track. Because of this, many marketing teams gate PDFs, making them available only after a user fills out a form. By doing this, they shift the focus from tracking performance to lead generation. However, there are some options for tracking your PDFs, including:

Event tracking

You can track clicks on PDF links and send them to your analytics system. This allows you to see how many times people clicked on the PDF files to download or open them. You can find out how to set these up here.

Embeds

If you embed the PDF into a page using JavaScript or an iframe, you can just use the analytics data for the page itself.

Intermediate tracking script

This is a complex solution, but it’s possible to send PDF clicks through an intermediate tracking script that sends data to your analytics system before sending people to your PDF. You can find one example here.
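
To make the idea concrete, here’s a minimal sketch of an intermediate tracking script as a Python WSGI app. It isn’t the example linked above. The `doc` query parameter, the `ALLOWED_PDFS` mapping, and the file path are all hypothetical, and the log call is only a stand-in for whatever your analytics system expects.

```python
import logging
from urllib.parse import parse_qs

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pdf_clicks")

# Hypothetical allowlist: only redirect to PDFs you actually host,
# so the script cannot be abused as an open redirect.
ALLOWED_PDFS = {
    "seo-guide": "/files/seo-guide.pdf",
}


def track_pdf_app(environ, start_response):
    """WSGI app: record the click, then redirect to the requested PDF."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    doc_id = query.get("doc", [""])[0]
    target = ALLOWED_PDFS.get(doc_id)
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown document"]

    # Stand-in for sending an event to your analytics system.
    log.info("pdf_click doc=%s referrer=%s", doc_id, environ.get("HTTP_REFERER", "-"))

    start_response("302 Found", [("Location", target)])
    return [b""]


# To try it locally, serve the app with the standard library:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, track_pdf_app).serve_forever()
# then open http://localhost:8000/?doc=seo-guide
```

Your links would then point at the script (for example, `/track?doc=seo-guide`) instead of at the PDF itself. If you do this, keep in mind that the links no longer point directly at the PDF.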

Server logs

Because PDF files are stored on a server, any access requests for the files will be recorded in your log files.
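
As a rough sketch of what that looks like, the snippet below counts successful PDF requests in access log lines. It assumes your server writes the Common or Combined Log Format, and the sample lines are made up. It only counts 200 responses, so you may need to adjust it if your server answers PDF requests with partial (206) responses.

```python
import re
from collections import Counter

# Matches the request and status fields of a Common/Combined Log Format line.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def count_pdf_hits(log_lines):
    """Count requests with a 200 response per PDF path."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines in an unexpected format
        path = match["path"].split("?", 1)[0]  # ignore query strings
        if path.lower().endswith(".pdf") and match["status"] == "200":
            hits[path] += 1
    return hits


# Made-up sample lines for illustration.
sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /files/guide.pdf HTTP/1.1" 200 51234',
    '203.0.113.8 - - [10/Oct/2023:13:56:01 +0000] "GET /blog/ HTTP/1.1" 200 8123',
    '203.0.113.9 - - [10/Oct/2023:13:57:12 +0000] "GET /files/guide.pdf?utm=x HTTP/1.1" 200 51234',
    '203.0.113.9 - - [10/Oct/2023:13:58:40 +0000] "GET /files/old.pdf HTTP/1.1" 404 162',
]
for path, count in count_pdf_hits(sample).most_common():
    print(path, count)
# prints: /files/guide.pdf 2
```

Note that log files count every request, including bots, so treat the numbers as an upper bound on human views.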

3rd-party data

Because PDFs are rarely tracked in analytics systems, sometimes the best data you have is from another source like Google Search Console or Ahrefs. Ahrefs can also give you data on which of your competitors’ PDFs get the most organic traffic. Just paste their domain into Site Explorer, then go to the Top Pages report and search for URLs containing .pdf
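
If you export page-level data to a CSV instead, you can apply the same “.pdf” filter with a few lines of Python. This is a sketch with a made-up export. The column names `url` and `traffic` and the numbers are assumptions, and real exports will differ by tool.

```python
import csv
import io
from urllib.parse import urlsplit


def pdf_rows(rows, url_field="url"):
    """Keep only rows whose URL path ends in .pdf (case-insensitive)."""
    return [r for r in rows if urlsplit(r[url_field]).path.lower().endswith(".pdf")]


# Hypothetical export with made-up numbers; real column names vary by tool.
export = io.StringIO(
    "url,traffic\n"
    "https://example.com/docs/manual.pdf,120\n"
    "https://example.com/blog/post/,300\n"
    "https://example.com/files/Report.PDF?dl=1,45\n"
)
rows = pdf_rows(csv.DictReader(export))
rows.sort(key=lambda r: int(r["traffic"]), reverse=True)
for row in rows:
    print(row["url"], row["traffic"])
```

Checking the URL path rather than the whole URL means a query string like `?dl=1` doesn’t hide a PDF, and a page that merely mentions “.pdf” in a parameter isn’t counted as one.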

Final thoughts

Hopefully I’ve convinced you that in most cases you should create new content as web pages and not as PDFs. But what about old PDFs? Should you optimize them or change them into pages? In typical SEO fashion, I’m going to go with “it depends.” I really don’t think there’s a right or wrong way to do this. Either way should show a positive impact, so do what is easier for you. Depending on the effort and resources involved, the answer could be to optimize the PDFs, change them into pages, or do something else instead.

Have questions? Let me know on Twitter.