
Noindex Directives: The Ultimate Guide


Noindex directives instruct search engines not to index certain pages, preventing them from appearing in search engine results pages.

Read our noindex directive guide below to understand why people use this feature and how it can benefit your website.

 

What is a noindex directive?

When search engine bots find and ‘crawl’ a page (reading the text and following the links to contextualise the purpose of that page), that page is then ‘indexed’. Assuming the page follows the search engine’s webmaster guidelines, it will be filed in the search engine’s database so that it can appear in SERPs (Search Engine Results Pages).

We want our website to be crawled and indexed by Google and other search engines, but not all pages are destined for the SERPs. Each page on your site serves a purpose, but that purpose is not always to rank in search engines or even draw traffic to your site. These pages need to exist as glue for other pages or simply because regulations require them to be accessible on your website, but you don’t necessarily want or need them to appear in search results. These are the pages for which a noindex directive is used.

Noindex is a robots meta directive—a piece of code added to a web page’s source code—that instructs search engine crawlers not to index that page or its content, preventing it from appearing in search engine results. This directive may be part of the page’s HTML (a meta robots tag) or sent by the web server as an HTTP header (an X-Robots-Tag), but more on that later. Like other robots meta directives, noindex directives are a suggestion; they don’t inherently prevent bots from crawling your site. While some search engines engage with these directives slightly differently, benign web bots generally heed them.

 

How noindex Directives Affect Your SEO

When to Use noindex Directives

Duplicate Content

If you have pages on your site that could be considered duplicate content, you can remove them entirely, or, if for some reason you’d like to keep them on your site but out of the search results, you can noindex them.

Thank you pages

A thank-you page serves no purpose other than to thank your customer, newsletter subscriber, or first-time commenter. These pages are usually thin content pages with upsell and social share options, but no value for someone using Google to find useful information.

Admin and login pages

Most login pages shouldn’t be in Google. Use a noindex directive to keep yours out of search engine indexes. Exceptions include login pages that serve a wider community, like Dropbox or similar services. Ask yourself whether you would search for one of your login pages if you didn’t work at your company. If not, it’s probably safe to assume that Google doesn’t need to index them.

Internal search results

Internal search results are pretty much the last pages Google would want to send its visitors to. If you want to ruin a search experience, link to other search pages instead of actual results.

 

SEO Best Practices for Noindex

If you don’t want certain pages of your site indexed, you can add robots meta directives (sometimes called “meta tags”) to the page. These pieces of code give crawlers instructions for how to crawl or index web page content. Whereas robots.txt directives offer bots suggestions for how to crawl a website’s pages, robots meta directives provide firmer instructions on how to crawl and index a page’s content.

There are two types of robots meta directives: those that are part of the HTML page (like the meta robots tag) and those that the web server sends as HTTP headers (such as the X-Robots-Tag).

You can prevent a page or other resource from appearing in Google Search by including a noindex meta tag in the page or a noindex header in the HTTP response. When Googlebot next crawls that page and sees the tag or header, it will drop that page entirely from Google Search results, regardless of whether other sites link to it.

Implementing noindex Directives

There are two ways to implement noindex: as a meta tag and as an HTTP response header. They have the same effect, so it comes down to choosing the method that is more convenient for your site and appropriate for the content type.

meta tag

To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the head section of your page: 

<meta name="robots" content="noindex">

Be aware that some search engine web crawlers might interpret the noindex directive differently. As a result, it is possible that your page might still appear in results from other search engines.
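In practice, you rarely hand-edit every page; the tag is usually emitted by a template. The sketch below is our own illustration (the page-type names and helper function are hypothetical, not tied to any particular CMS) of template logic that only adds the tag for the kinds of pages discussed earlier:

# Hypothetical page types we want kept out of search indexes
# (matching the use cases above).
NOINDEX_PAGE_TYPES = {"internal_search", "thank_you", "login"}

def robots_meta_tag(page_type: str) -> str:
    """Return a robots meta tag for the page's <head>, or an empty string."""
    if page_type in NOINDEX_PAGE_TYPES:
        return '<meta name="robots" content="noindex">'
    return ""

# Usage in a template: only noindex the pages that shouldn't rank.
print(robots_meta_tag("thank_you"))  # -> <meta name="robots" content="noindex">
print(robots_meta_tag("blog_post"))  # -> "" (page stays indexable)

However the page is generated, the result in the served HTML should be the same tag shown above, placed in the head section.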

HTTP response header

Instead of a meta tag, you can also return an X-Robots-Tag header with a value of either noindex or none in your response. A response header can be used for non-HTML resources, such as PDFs, video files, and image files. Here’s an example of an HTTP response with an X-Robots-Tag instructing crawlers not to index a page:

HTTP/1.1 200 OK
(…)
X-Robots-Tag: noindex
(…)
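The header itself is set by the web server or the application serving the resource. As a minimal sketch (our own example using only Python’s standard library, with a made-up rule about which paths to exclude), an application could attach the header like this:

from http.server import BaseHTTPRequestHandler, HTTPServer

def should_noindex(path: str) -> bool:
    # Hypothetical rule: keep thank-you pages and PDFs out of search indexes.
    return path.startswith("/thank-you") or path.endswith(".pdf")

class NoindexHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        if should_noindex(self.path):
            # Same directive as the response header shown above.
            self.send_header("X-Robots-Tag", "noindex")
        self.end_headers()
        self.wfile.write(b"<html><body>Example page</body></html>")

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), NoindexHandler).serve_forever()

On most web servers the equivalent is a short configuration directive; check your server’s documentation for how to add the X-Robots-Tag header to matching responses.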

 

Once You’ve Implemented Your noindex Directives

With noindex directives in place, you’ll have more control over how search engines engage with and index your website. Search engine users will find only what you want to be available. This simple SEO practice is a great way to improve user experience and ensure your ranking pages are relevant and helpful.
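It’s also worth spot-checking that the directive is actually being served. The sketch below is our own illustration (standard-library Python only, with a placeholder URL and a deliberately simplified meta-tag check): it fetches a page and reports whether noindex appears in the X-Robots-Tag header or in a robots meta tag.

import re
import urllib.request

# Simplified check: assumes the name attribute appears before content in the tag.
META_NOINDEX = re.compile(
    r"<meta[^>]+name=[\"']robots[\"'][^>]+content=[\"'][^\"']*noindex",
    re.IGNORECASE,
)

def check_noindex(url: str) -> dict:
    """Fetch a URL and report where, if anywhere, a noindex directive appears."""
    with urllib.request.urlopen(url) as resp:
        header = resp.headers.get("X-Robots-Tag") or ""
        body = resp.read().decode("utf-8", errors="replace")
    return {
        "x_robots_tag_noindex": "noindex" in header.lower(),
        "meta_robots_noindex": META_NOINDEX.search(body) is not None,
    }

# Example with a placeholder URL:
# print(check_noindex("https://www.example.com/thank-you"))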

Ruby Garner

Ruby joined the content team in August 2021. An avid reader and writer since she was young, Ruby always knew she wanted to work with words. After leaving high school, she studied a Bachelor of Communications, majoring in journalism, at Massey University. She spent a few years working as a journalist for a news app in the area she grew up in, Matakana, before joining the team at PureSEO.

Ruby also worked part-time as a preschool teacher to save money for travelling. So far she has ticked Vietnam, Cambodia, and Bali off her list, and she hopes to be able to travel again soon.
