Although not technically a penalty, duplicate content can negatively impact your organic search engine rankings. It’s a surprisingly common occurrence online. According to California digital marketing platform Raven, almost 30% of all content online is duplicate content!
So, what is the correlation between duplicate content and SEO, and how much can it affect your rankings? Learn more in our informative guide below.
What is Duplicate Content?
Duplicate content generally refers to substantial blocks of content on one URL that are either identical or nearly identical to another page’s content in the same language.
When multiple pieces of what Google calls ‘appreciably similar’ content appear in more than one location on the internet, search engines can struggle to decide which version is more relevant to a given search query. In most cases, website owners do not intentionally create duplicate content.
How Does Duplicate Content Affect SEO?
Duplicate content can hurt your rankings. At the very least, search engines will not know which page to suggest to users. As a result, all the pages search engines see as duplicates are at risk of ranking lower or being discounted entirely.
To provide the best search experience, search engines avoid showing multiple versions of the same content, so they choose the version they judge most likely to be the best. This process dilutes the visibility of every duplicate page. If you have copied content from a more authoritative page, you can expect to see their page in the results instead of yours.
Duplicate content will also dilute link equity because other sites must choose between duplicates. Instead of all inbound links pointing to one piece of content, they may link to many, spreading the link equity among the various versions. Because inbound links are a ranking factor, this can impact the search visibility of a piece of content.
Duplicate Content Across Multiple Domains
A page must be valuable to your visitors and have unique content to rank. Duplicate content often appears across multiple domains. Sometimes, duplicate content on different domains is unavoidable, like republishing press releases or guest blog posts. This will not hurt your website when done properly. Other instances are more damaging, like reusing product descriptions from a manufacturer or site scrapers copying and reposting content.
Duplicate Content Within Your Domain
Duplicate content can also occur within a single domain. Here are a few common examples of duplicate content across a website:
With faceted navigation, users can filter and sort items on the page. E-commerce websites use it a lot. This kind of navigation adds parameters to the end of a URL. Because there are usually many combinations of these filters, faceted navigation often results in many instances of duplicate or near-duplicate content.
Almost every website on the internet has boilerplate content. These are pieces of text that may appear on every page on the website. Boilerplate content can confuse search engines and be identified as duplicate content, causing them to ignore the page completely.
If your site contains multiple pages with mostly identical content, there are several ways you can indicate your preferred URL to Google. This is called ‘canonicalization’.
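To make the faceted navigation problem concrete, here is a minimal Python sketch of how many filtered URLs could be mapped back to one preferred URL. The parameter names in `FACET_PARAMS` and the example shop URL are assumptions made up for illustration; your own site will use different ones.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical filter and sort parameters; replace with the ones your site uses.
FACET_PARAMS = {"color", "size", "sort"}

def strip_facets(url):
    """Return the URL without facet parameters, with remaining parameters sorted."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FACET_PARAMS
    ]
    kept.sort()  # a stable order so ?a=1&b=2 and ?b=2&a=1 map to the same URL
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Two filtered views of the same category page collapse to one preferred URL.
print(strip_facets("http://www.example.com/shoes?color=red&sort=price"))
print(strip_facets("http://www.example.com/shoes?size=9&color=blue"))
```

Both calls should yield the same category URL, which is the one you would then indicate as your preferred version.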
How to Avoid Duplicate Content
If you want to ensure that visitors see the content you want them to, there are some steps you can take to address duplicate content issues.
Use 301 Redirects
If you have restructured your site, use 301 redirects (“RedirectPermanent”) in your .htaccess file to smartly redirect users, Googlebot, and other spiders.
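If you have many moved pages, writing the redirect lines by hand gets tedious. The sketch below generates `RedirectPermanent` lines for an Apache .htaccess file from a mapping of old paths to new URLs. The paths in the example are invented, and the sketch assumes an Apache server with mod_alias enabled and paths that contain no spaces.

```python
def render_redirects(moved):
    """Build .htaccess lines from a dict of {old URL path: new URL}."""
    lines = []
    for old_path, new_url in sorted(moved.items()):
        # Assumption: paths contain no whitespace, so no quoting is needed.
        if any(ch.isspace() for ch in old_path + new_url):
            raise ValueError(f"whitespace in redirect rule: {old_path!r}")
        lines.append(f"RedirectPermanent {old_path} {new_url}")
    return "\n".join(lines)

# Hypothetical pages that moved during a site restructure.
moved_pages = {
    "/old-blog/duplicate-content.htm": "http://www.example.com/blog/duplicate-content/",
    "/about-us.htm": "http://www.example.com/about/",
}
print(render_redirects(moved_pages))
```

Review the generated lines before pasting them into your .htaccess file, since a wrong rule will send both visitors and crawlers to the wrong page.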
Use rel=canonical Attributes
The rel=canonical attribute tells search engines to treat a given page as though it were a copy of a specified URL, and to credit all the links, content metrics, and “ranking power” they would apply to this page to the specified URL.
When you place this piece of code on pages containing duplicate content, all versions of your page remain online, but search engines will generally index only the page marked as the canonical URL, and any link value passes through to that page. This piece of code is called the rel="canonical" attribute, or ‘canonical tag’. It ensures that future traffic to the site from search (and any link value) will point to the most relevant page instead of being divided across the (almost) identical pages that the search engine bot found.
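For reference, the canonical tag is a single `<link>` element placed in the `<head>` of each duplicate page. Here is a small Python sketch that builds one; the function name `canonical_tag` is our own and the URL is only an example.

```python
from html import escape

def canonical_tag(preferred_url):
    """Return the <link> element to place in the <head> of a duplicate page."""
    # Escape characters such as & and " so the URL is safe inside an attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'

print(canonical_tag("http://www.example.com/page/"))
```

Every duplicate version of the page would carry the same tag, each pointing at the one URL you want indexed.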
Be Consistent
Try to keep your internal linking consistent. For example, you should not mix internal links to variants of the same page like the ones below:
- http://www.example.com/page/
- http://www.example.com/page
- http://www.example.com/page/index.htm
While each of these URLs appears to users to lead to the same page, search engines will view the three distinct URLs as three distinct pages.
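One way to keep internal links consistent is to run every link through a single helper before it is written into a page. The sketch below is one possible approach, assuming your preferred form ends in a trailing slash and that `index.htm` and `index.html` are your directory index files; both are assumptions you should adjust to your own setup.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names serve the same content as the directory itself.
INDEX_FILES = ("index.htm", "index.html")

def preferred_url(url):
    """Rewrite a page URL to the trailing-slash form, dropping index files."""
    parts = urlsplit(url)
    path = parts.path
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # "/page/index.htm" becomes "/page/"
            break
    if not path.endswith("/"):
        path += "/"  # "/page" becomes "/page/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

for link in (
    "http://www.example.com/page/",
    "http://www.example.com/page",
    "http://www.example.com/page/index.htm",
):
    print(preferred_url(link))
```

Note that this helper is only meant for page URLs. It would wrongly add a slash to file URLs such as PDFs or images, so those should be left out.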
Syndicate Carefully
If you syndicate your content on other sites, Google will always show the version it thinks is most appropriate for users in each given search, which may or may not be the version you prefer. However, it is helpful to ensure that each site syndicating your content includes a link back to your original article. You can also ask those who use your syndicated material to use the noindex tag to prevent search engines from indexing their version of the content, or to set a canonical tag pointing to your page.
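If you want to check whether a partner has followed through, you can inspect the HTML of their copy. The following sketch uses Python's built-in HTML parser to look for either a robots noindex tag or a canonical tag pointing at your original. The class and function names are our own, the sample HTML is invented, and the check assumes the canonical URL is written exactly as you expect it.

```python
from html.parser import HTMLParser

class HeadSignals(HTMLParser):
    """Collect the canonical URL and any robots noindex directive from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attributes.get("rel", "").lower().split():
            self.canonical = attributes.get("href")
        elif tag == "meta" and attributes.get("name", "").lower() == "robots":
            if "noindex" in attributes.get("content", "").lower():
                self.noindex = True

def syndication_ok(html_text, original_url):
    """True if the copy is noindexed or names the original as its canonical URL."""
    signals = HeadSignals()
    signals.feed(html_text)
    return signals.noindex or signals.canonical == original_url

# Invented example of a syndicated copy that points back to the original.
copy_html = '<html><head><link rel="canonical" href="http://www.example.com/article/"></head></html>'
print(syndication_ok(copy_html, "http://www.example.com/article/"))
```

Fetching the partner's page is left out on purpose; how you download it will depend on your own tooling.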
Understand Your Content Management System
Familiarize yourself with how you display content on your website. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog entry may appear on a blog page, an archive page, and a page of other entries with the same label.
Minimize Similar Content
If you have many similar pages, consider expanding each page or consolidating the pages into one.
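To find candidates for expanding or consolidating, it can help to score how much two pages overlap. Here is a rough Python sketch that compares the three-word sequences ("shingles") of two texts. This is a common text-similarity technique, not the method Google uses, and any cutoff you choose for "too similar" is your own judgment call.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)

# Invented page snippets that differ only in the last word.
page_one = "the quick brown fox jumps"
page_two = "the quick brown fox sleeps"
print(similarity(page_one, page_two))
```

A score close to 1.0 suggests the pages say nearly the same thing and may be better merged, while a low score suggests each page already stands on its own.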