What is Structured Data? And Why Should You Implement It?

Nate Harris

Vocalist, woolgatherer, and optimization enthusiast. Semantic Web wonk. Fan of pickled anything. I help companies improve their reach, up conversion rates, and build community.

Article stats

  • Referring domains 29
Data from Content Explorer tool.
    Search engines have made it clear: a vitally important part of the future of search is “rich results.”

    While controversial among SEOs (see Ahrefs’ “Are Google’s SERP Features Stealing Traffic From Your Site?”), it seems like every few months another box is added to Google’s Search Gallery.

    Google is pretty good at understanding the general context of a site’s content. When it comes to intuiting the specifics of a page, though (usually the most important information to a searcher), crawlers need some help. This is where structured data comes in!

    First, “structured data” is a general term that refers to any organized data that conforms to a certain format. It is hardly just an SEO thing: relational databases, a foundation of much of modern computing, rely on structured data. SQL (Structured Query Language) exists to manage it.

    When a website wants a piece of content to be representative of a “thing” – like a profile page, an event page, or a job posting – its code needs to be marked up properly. With the installation of structured data, a site converts its HTML from an unstructured, general blob to a frictionless document. The more your webpage reads like XML or a JSON object to a search engine, the cooler the things it can do with your content.

    Schema.org

    On the internet, the de facto “language” of structured data is schema.org. Schema.org is a democratic library of internet things. Take, for example, an airline flight: schema.org has a lexicon to notate the type of aircraft, the departure gate, and even a description of the meal service:
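
    As a rough sketch, a flight marked up in JSON-LD might look something like the block below. The property names (flightNumber, aircraft, departureGate, mealService and so on) come from schema.org’s Flight type; the airline, airports, and all values are made up for illustration.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Flight",
  "flightNumber": "123",
  "provider": {
    "@type": "Airline",
    "name": "Example Air"
  },
  "aircraft": "Boeing 737",
  "departureAirport": {
    "@type": "Airport",
    "name": "John F. Kennedy International Airport",
    "iataCode": "JFK"
  },
  "arrivalAirport": {
    "@type": "Airport",
    "name": "Los Angeles International Airport",
    "iataCode": "LAX"
  },
  "departureGate": "B22",
  "departureTime": "2017-07-04T09:30:00-04:00",
  "mealService": "Complimentary snack and drink service"
}
</script>
```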

    The project was originally founded as a joint effort between Google, Microsoft, Yahoo, and Yandex. It remains open source and is technically editable by anyone… but, like most anything that runs through a W3C community process, don’t expect that to be simple. If the type of schema you want to use doesn’t exist yet, there is a technical and bureaucratic process you can go through to eventually get a new type of markup democratically included in the Schema.org library.

    Schema.org was born from the spirit of “The Semantic Web,” a term coined by Tim Berners-Lee (the inventor of the World Wide Web). In the words of the W3C:

    The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.

    The 4 Ways of Structuring Data

    There are four main semantic annotation formats you can use to structure data on the internet.

    1. JSON-LD – JSON-LD is the newest player in town, and it is the format that Google regularly recommends. It is novel in how it exists on a page; instead of “tagging” individual HTML elements, think of JSON-LD as one big blob of informational code near the head that says to the crawler, “Ok! The type of aircraft is this, its departure gate is this, and the in-flight meal is this. Now, please continue on to the content.”

      JSON-LD is also cool in that it can supply information about a page without there actually being any “visual” content to represent that info (no corresponding containers are mandated).

    2. RDFa + GoodRelations – The OG counterparts of JSON-LD are “HTML extensions.” They are conceptually different: instead of having your structured data in one digestible block, HTML extension syntaxes are sprinkled throughout a page’s content, structuring your data on-the-fly. Think of this syntax as just another attribute, like a class, appended to your HTML containers.

      Whenever I work with a client that is using a backend-restrictive platform (like Shopify), I still find RDFa useful in marking up very dynamic elements, like individual “review” objects. While elements can be injected as JSON-LD asynchronously, simply writing in HTML extensions is often cleaner and quicker.

    3. Microdata – Another HTML extension syntax, microdata arrived alongside HTML5. It is still supported, but it has largely fallen out of favor now that Google recommends JSON-LD. That said, it still pops up regularly and is good to be familiar with.

    4. Microformat aka μF – Microformat is most commonly seen in the form of hAtom/hentry. This may be me projecting, but I feel like microformat is a pariah. Probably its most common appearance is as an error in Search Console: many WordPress webmasters suffer rogue microformat injections via sloppy theme development.
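
    To make the difference between the three “sprinkled in” syntaxes concrete, here is a minimal sketch of each. The review text, names, and dates are placeholders; the attribute names (vocab/typeof/property for RDFa Lite, itemscope/itemtype/itemprop for microdata) and the hAtom class names are the standard ones. In practice a review would usually sit inside the markup for the thing being reviewed.

```html
<!-- RDFa Lite: vocab / typeof / property attributes on existing elements -->
<div vocab="https://schema.org/" typeof="Review">
  <span property="name">Great pizza</span>
  by
  <span property="author" typeof="Person">
    <span property="name">Nate</span>
  </span>
  <div property="reviewRating" typeof="Rating">
    <span property="ratingValue">5</span> out of
    <span property="bestRating">5</span>
  </div>
  <p property="reviewBody">Delicious pizza.</p>
</div>

<!-- Microdata: the same review with itemscope / itemtype / itemprop -->
<div itemscope itemtype="https://schema.org/Review">
  <span itemprop="name">Great pizza</span>
  by
  <span itemprop="author" itemscope itemtype="https://schema.org/Person">
    <span itemprop="name">Nate</span>
  </span>
  <div itemprop="reviewRating" itemscope itemtype="https://schema.org/Rating">
    <span itemprop="ratingValue">5</span> out of
    <span itemprop="bestRating">5</span>
  </div>
  <p itemprop="reviewBody">Delicious pizza.</p>
</div>

<!-- Microformat (hAtom): plain class names, as many WordPress themes emit -->
<article class="hentry">
  <h2 class="entry-title">What is Structured Data?</h2>
  <time class="updated" datetime="2017-07-04">July 4, 2017</time>
  <span class="author vcard"><span class="fn">Nate Harris</span></span>
  <div class="entry-content">Post body goes here.</div>
</article>
```

    Note that the hAtom version carries no schema.org vocabulary at all; a theme that outputs class="hentry" without the matching “updated” or “author” pieces is the sort of thing that tends to surface as a Search Console error.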

    Honorable Mention: Data Highlighter

    For sites “with just a few things to mark up,” Google also offers a tool within Search Console that allows a site owner to quickly click-and-drag to apply structured data. There are a couple big reasons to not use the Data Highlighter, though:

    1. Your data highlighter markup will break when anything in your pages’ formatting changes
    2. Your highlighting will only apply to Google and will be invisible to other search engines

    How Structured Data Helps Your SEO

    • Rich Snippets – Rich snippets, like the coveted gold stars, come from proper implementations of structured data. They tend to boost click-through rate (CTR).

    • Knowledge Graph – A brand or individual’s Knowledge Graph card can be influenced by the inclusion of structured data. The sameAs property can help your social profiles be displayed here.

    • AMP, Google News, etc. – For successful inclusion in AMP or other Google programs like Google News, a site must include many different types of structured data. If your site is marked up well, you’ll also be in line for beta releases, like the new “events” card.

    • Contextual Understanding – Search engines state they are better able to “understand” the context and intent of your page if you include strong structured data, even if there’s not a direct visible result. This affects how your site shows up in search and in what indexes you are included.

    • Other search engines – Every search engine treats structured data differently. Yandex requires some fields for successful processing that Google doesn’t. Baidu’s first-page results rely heavily on structured data.
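
    For the Knowledge Graph point, a minimal sketch of Organization markup with sameAs might look like this. The social profile URLs and logo path are hypothetical placeholders; swap in your own.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Tim's Pizzeria",
  "url": "https://timspizzeria.com/",
  "logo": "https://timspizzeria.com/logo.png",
  "sameAs": [
    "https://www.facebook.com/timspizzeria",
    "https://twitter.com/timspizzeria",
    "https://www.instagram.com/timspizzeria"
  ]
}
</script>
```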

    The Ranking Factor Myth

    Structured data is not a ranking factor, full stop. This must be clearly understood before bringing it up to clients.

    What we have seen in the past, though, is the “cheating” of search results due in part to structured data. Google will pull branded SERPs to the top of the stack when it thinks a searcher is querying for that business directly. So, say you own Tim’s Pizzeria in Brooklyn, and I search “tims pizza brooklyn” — your site usually appears first even if your backlink profile is crappy, content is light, etc.

    If Google does not yet understand that your site equals the site of Tim’s Pizzeria, local structured data can help with that. And as I mentioned above, Organization markup can help with the Knowledge Graph, which is kind of a SERP in its own right.
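
    A bare-bones sketch of that local markup, assuming Tim’s Pizzeria is best described by the Restaurant type (a subtype of LocalBusiness). The address, phone number, and price range are invented placeholders.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Restaurant",
  "name": "Tim's Pizzeria",
  "url": "https://timspizzeria.com/",
  "telephone": "+1-718-555-0100",
  "servesCuisine": "Pizza",
  "priceRange": "$$",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Example St",
    "addressLocality": "Brooklyn",
    "addressRegion": "NY",
    "postalCode": "11201",
    "addressCountry": "US"
  }
}
</script>
```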

    Structured data is not magic and it doesn’t add to a site’s ‘quality’ in the eyes of Google. It’s important SEOs understand its usefulness and impact.

    On the Other Hand…

    That said… there is an indirect way that structured data could in theory help with rankings: if dwell time and CTR are ranking factors, as many SEOs have suggested, rich snippets can significantly improve both of those metrics, which would hypothetically have a positive effect on rankings.

    I have repeatedly seen search traffic increase with the implementation of Schema for clients because CTR pops up a few percent.

    Let’s Try It!

    Probably the easiest piece of JSON-LD that any site can install is “WebSite” structured data. This markup tells the world that your site “is a set of related web pages and other items typically served from a single web domain and accessible via URLs.” …AKA pretty much worthless information, but a good starting point!

    Paste this into your site’s <head>, just like you would Google Analytics code, and replace ahrefs.com with your site’s root canonical URL:

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "WebSite",
      "url": "https://ahrefs.com/"
    }
    </script>

    Once installed, head on over to the Structured Data Testing Tool, input your URL, and “Run Test”. You should get something like this back:

    And that’s it! It’s the SEO’s job to take it from here – think about all the different things your website represents, check if a lexicon exists in Schema.org, and – if it does – figure out the best way to install that structured data.

    An important point is that every page of every site is going to be a bit different once it gets down to more granular schema. A BlogPosting might have an articleSection or a pageEnd. It might reference certain fictional characters that the author would like to specify. It might be part of a weird syndication deal that needs you to specify copyright markup!

    For this Ahrefs blog post, I’d include this JSON-LD block:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "BlogPosting",
      "url": "https://ahrefs.com/blog/bla-bla-bla",
      "headline": "What is Structured Data? And Why Should You Implement It?",
      "alternativeHeadline": "Structured Data 101",
      "description": "Structured data is bla bla bla bla",
      "datePublished": "2017-07-04",
      "dateModified": "2017-07-05",
      "mainEntityOfPage": {
        "@type": "WebPage",
        "url": "https://ahrefs.com/blog/bla-bla-bla"
      },
      "image": {
        "@type": "ImageObject",
        "url": "http://example.com/images/image.png",
        "height": "600",
        "width": "800"
      },
      "publisher": {
        "@type": "Organization",
        "name": "ahrefs",
        "logo": {
          "@type": "ImageObject",
          "url": "http://example.com/images/logo.png"
        }
      },
      "author": {
        "@type": "Person",
        "name": "Nate Harris"
      },
      "editor": {
        "@type": "Person",
        "name": "Tim Soulo"
      },
      "award": "The Best ahrefs Guest Post Ever Award, 2017",
      "genre": "Technical SEO",
      "accessMode": ["textual", "visual"],
      "accessModeSufficient": ["textual", "visual"],
      "discussionUrl": "https://ahrefs.com/blog/bla-bla-bla/#disqus_thread",
      "inLanguage": "en",
      "articleBody": "Search engines have made it clear: a vitally important part of the future of search is rich results. While controversial..."
    }
    </script>

    Many readers might also be wondering how they apply this for eCommerce – here’s an over-expanded product JSON-LD block to steal:

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "Product",
      "url": "https://timspizzeria.com/goat-cheese-pizza",
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "3",
        "reviewCount": "2",
        "bestRating": "5",
        "worstRating": "1"
      },
      "description": "Tim's pizzeria's most delicious cheesiest cheese pizza. Made with 100% goat cheese turned blue.",
      "name": "Tim's Goat Cheese Pizza",
      "image": [
        "https://timspizzeria.com/goat-cheese-pizza-hero.jpg",
        "https://timspizzeria.com/goat-cheese-pizza-olives.jpg",
        "https://timspizzeria.com/goat-cheese-pizza-pineapple.jpg"
      ],
      "offers": {
        "@type": "Offer",
        "availability": "http://schema.org/InStock",
        "image": "https://timspizzeria.com/goat-cheese-pizza-hero.jpg",
        "price": "26.00",
        "priceCurrency": "USD",
        "sku": "1959014",
        "seller": {
          "@type": "Organization",
          "name": "Tim's Pizzeria"
        }
      },
      "review": [
        {
          "@type": "Review",
          "author": {
            "@type": "Person",
            "name": "Nate"
          },
          "datePublished": "2017-07-04",
          "reviewBody": "Dope lit funkytown! Delicious pizza.",
          "name": "n8 h",
          "reviewRating": {
            "@type": "Rating",
            "bestRating": "5",
            "ratingValue": "5",
            "worstRating": "1"
          }
        },
        {
          "@type": "Review",
          "author": {
            "@type": "Person",
            "name": "Dmitry"
          },
          "datePublished": "2016-05-22",
          "reviewBody": "This is the grossest thing I've witnessed, let alone tasted.",
          "name": "OMG this pizza is abhorrent",
          "reviewRating": {
            "@type": "Rating",
            "bestRating": "5",
            "ratingValue": "1",
            "worstRating": "1"
          }
        }
      ]
    }
    </script>

    One cool thing: Google can understand JSON-LD even when rendered asynchronously, so you can inject it into the page via a data layer (GTM), AJAX, etc. Here’s a great guide for that.
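
    A minimal sketch of that kind of injection, assuming plain JavaScript running in the browser (the same few lines could also live inside a tag manager’s custom HTML tag). The data object here just reuses the WebSite example from earlier; in a real setup it would more likely be built from your data layer or an AJAX response.

```html
<script>
  // Illustrative data only: in practice this object would be assembled
  // from a data layer, an API response, or values already on the page.
  var structuredData = {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "url": "https://ahrefs.com/"
  };

  // Create a JSON-LD script element and fill it with the serialized object.
  var jsonLdScript = document.createElement("script");
  jsonLdScript.type = "application/ld+json";
  jsonLdScript.textContent = JSON.stringify(structuredData);

  // Append it to the document head so crawlers that render JavaScript can read it.
  document.head.appendChild(jsonLdScript);
</script>
```

    Because the markup only exists after the script runs, it is worth checking the rendered page (not just the raw source) in a testing tool to confirm the block actually shows up.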

    Structured Data Tools

    For WordPress users, I’ll tentatively recommend “Schema” for a quick fix to the most crucial structured data needs. My disclaimer: a lot of SEO plugins’ structured data output is thin, garbage, or accidentally damaging.

    What I mean by thin: Basically, because they are plugins, these tools often work by what is “inferred” from the page rather than what is specified directly. That means that they are beholden to WordPress hooks (author, datePublished, Featured Image, etc.)… which in turn makes their usefulness dependent on the theme’s developer. And when a site’s SEO is strictly dependent on developers, things usually get missed!

    Also, schema via plugins is never expansive. Google understands much, much more structured data than the search engine actively uses at any given time (otherwise the unused properties would throw errors in the Testing Tool). This pool of understanding follows the expansion of the Schema.org library, though with a lag: I have had times where I’ve implemented super niche markup that is in Schema.org but that is not yet recognized by Google.

    Sites that implement “experimental” schema find themselves winning immediately when Google rolls out a new card because they covered all their bases. For example, look at how incredibly expansive Sephora’s product markup is: only half of those items are actively used in rich snippets, but others have been toyed with in the past (and will be in the future). Take a peek at all the strange markup items that the NYT employs.

    Here’s an example of granular experimental event markup I’ve implemented for a client:

    This puts my client’s site in a few very exclusive clubs (for example, suggestedMinAge is used by just 100 to 1000 domains per Schema.org).
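
    To give a feel for what granular event markup can look like, here is a hypothetical sketch (not the client’s actual markup). The event, venue, and performer are invented; the properties used (doorTime, typicalAgeRange, isAccessibleForFree, and audience with a PeopleAudience carrying suggestedMinAge) all exist in Schema.org.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "MusicEvent",
  "name": "Example Band Live",
  "startDate": "2017-09-15T20:00:00-04:00",
  "doorTime": "2017-09-15T19:00:00-04:00",
  "isAccessibleForFree": false,
  "typicalAgeRange": "18-",
  "audience": {
    "@type": "PeopleAudience",
    "suggestedMinAge": 18
  },
  "location": {
    "@type": "MusicVenue",
    "name": "Example Hall",
    "address": {
      "@type": "PostalAddress",
      "addressLocality": "Brooklyn",
      "addressRegion": "NY",
      "addressCountry": "US"
    }
  },
  "performer": {
    "@type": "MusicGroup",
    "name": "Example Band"
  }
}
</script>
```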

    Another big problem with SEO plugins and schema… they are all trigger happy to implement basic schema, which can lead to structured data duplicates. Most of the time this isn’t a problem, but for some types of pages, like products, Google might assume that you have more than one product on that page rather than the same product with two different markups.

    It’s an issue I’m working through with another client at the moment: Shopify injects its own basic product schema, which duplicates our expansive and rich product schema that has aggregateRating and reviews inline.

    Some might also suggest https://www.schemaapp.com/… I’ve never used it so I can’t really vouch one way or another! But I see:

    “Schema App is a suite of tools that allows digital marketers to create and manage Schema markup without requiring them to be an expert in the Schema.org language or writing code.”

    Which leaves me optimistically cautious for all the reasons listed above.

    This Seems Overly Complicated

    For immediate impact, just the baseline-level stuff will float most SEOs’ boats. The basics can usually be safely handled by plugins or add-ons. Expect to deal with the issues we covered earlier if you go that route!

    Incoming bias (I’ve been on this train since the infamous 1.4.8 update debacle): please consider the beautifully transparent and lightweight The SEO Framework (TSF) as your SEO plugin solution over Yoast!

    For those of us working in-house or on a bigger site, I feel the SEO industry should give more attention than it does to expansive markup. Think about it: a strong understanding of structured data is like a golden ticket into beta search engine experiments. It goes a long way toward making sure your organization is understood. And it’s not really something that needs to be actively maintained. If you get it right once, then (barring redesigns) it’s pretty much done forever.

    Because it is code-driven, structured data is very much a boogeyman that SEOs love to hate and ignore. I fully expect the “Technical SEO is makeup” crowd to push back here. But a good bit of SEO is covering our bases, and Schema is under-served in that sense.

    Conclusion

    There is a complicated and infinitely vast underbelly to technical SEO, and a strong understanding of structured data is foundational. In the end, The Semantic Web might well be our own undoing; the more data we spoon-feed Google, the more Google can create cool modules and sap traffic away from site owners.

    It’s also worth noting that whenever we structure our data well, we’re training search engines to better do it without us in the future. Don the tinfoil hats: Data Highlighter, while helpful, is a big ol’ machine learning ploy ☺.

    In the meantime, though, the benefits of structured data are too huge to ignore. Despite the potential for traffic loss, good markup puts your site on the bleeding edge of new rich feature tech that Google is developing. I encourage all SEOs to dive in!


    • Structured data needs to be done absolutely for every site. Nate Harris You’ve done a great job, my friend, I’ve enjoyed it with great pleasure.

      • The Semantic Web might well be our own undoing; the more data we spoon-feed Google, the more Google can create cool modules and sap traffic away from site owners.

    • Well, lots of things to do then…

    • interesting information 😀

    • Interesting & informative 🙂

    • it’s useful info, thanks

    • Interesting 🙂 Thanks for sharing with us.

    • Bilc Vlad Calin

      I see you briefly mentioned “Contextual Understanding”, as a personal curiosity have you tried using Schema mark-up that does not affect in any way how your website is displayed in SERPs? And if so have you seen any improvements?

    • You actually went ahead and added “award”: “The Best ahrefs Guest Post Ever Award, 2017”

      Nice xD you slick bastard 😛

      And yeah, I’m from facebook 😛 Nicely executed curiosity inducing ad (instead of some hard-sell…)

      Liked the post @nateonawalk:disqus
      Good job. Keep it up 🙂


    • Yes, I agree with this article. Structured data is a piece of code that you can put on your website. It’s code in a specific format, written in such a way that search engines understand it. Search engines read the code and use it to display search results in a specific way.

    • Jerry

      Great points. After attending #connecttech, and getting “educated” about Microformat, I appreciate that you give a clear context on how each can be used to strengthen the content the builder is attempting to deliver. As a side note, it is critical to know that the web has so many levels of users, from the early adopters (tech-savvy) down to the grandparents and beauticians. We as early adopters, in hoping that standards and tools will be used going forward, must improve how we build UX, and educate the people who will eventually use these without even knowing it. So, as people read this article, think of how you can explain to or create for your daughter/son/grandmother/or local facebook company the ability to easily structure data for their own sites.

      Again,
      Great Article

    • Thanks for sharing, it’s very useful

    • Thanks a lot for sharing these great insights.
      My question is, can AMP influence mobile ranking?

    • I’ve experimented with SchemaApp for approximately 3 months. I can say that this tool is just overcomplicated. It will take you at least 1 additional hour per different “thing” to fix all the errors you will eventually receive after running any page through Google’s structured data testing tool (if you are lucky enough to figure out the issues).

      There have been issues like massively losing keywords I already ranked for (3,000+ KW lost during my 3-month experiment). While I cannot claim that this was the reason for the lost KW, I can definitely say that there must be some correlation at least.

      Cheers,
      Mila

    • @nateonawalk:disqus Yep, absolutely. I do prefer writing my own mark-up (usually using JSON-LD). In fact, the upcoming rainy days are perfect for marking up a website with 300+ pages (this requires a little bit of warm cognac too 😉

    • No problem!

    • Johan CHOUQUET

      Hi there, pretty interesting post! Just to be clear, what’s behind the term AMP? So many acronyms these days.
      Thanks!

      • accelerated mobile pages?

        • Johan CHOUQUET

          That’s what I thought, but even on the main AMP website, they don’t say it ^^! The first place I saw it written out was the FAQ page! Thanks anyway