XML & HTML Sitemaps: The Ultimate Guide

What are sitemaps and why are they important for SEO? The experts at Pure SEO have put together a guide to XML and HTML sitemaps, providing everything you need to know to get started. Read this ultimate guide below to learn everything about XML and HTML sitemaps and SEO best practices.

What Are Sitemaps?

A sitemap is a tool that makes your website easier to navigate for search engines and people. Sitemaps let search engines explore your site efficiently, offering clues to search engine crawlers (i.e., spiders and bots), so they can find, crawl, and index your website’s content. This, in turn, helps them accurately represent your site in search rankings.

There are two types of sitemaps to be aware of: XML sitemaps and HTML sitemaps. Both are beneficial for websites. XML sitemaps are made to help search engines find and prioritise pages on your website. HTML sitemaps are made to help visitors find relevant pages on your website.

Below, we explore the differences between these two sitemaps.

What are XML sitemaps?

An XML sitemap is a complete list of a website’s URLs. These can help search engines and spiders discover crawlable pages on your website. In general, web crawlers discover pages through internal links from the website and external links from other sites—with or without a sitemap.

A sitemap doesn’t guarantee that a site’s pages will all be indexed by search engines, but it does help crawlers navigate your site more easily and helps to inform them of which pages to prioritise.

XML sitemaps are particularly useful for large websites with many landing pages (e.g., e-commerce websites) that might otherwise take spiders a long time to explore. This is because every site has a ‘crawl budget’, and search engines are unlikely to crawl every URL they encounter the first time. An XML sitemap can help search engines build a queue of pages to crawl.

Without a sitemap, search engines would need to crawl every page before they could build a full list of all URLs within a site. With a sitemap, search engines can learn which pages exist before crawling them.

Types of XML Sitemaps

A sitemap in its simplest form is an XML file with a complete list of a website’s URLs as well as metadata about each URL. Metadata helps search engines crawl a site more intelligently and can include information about a page such as when it was last updated, how frequently it changes, and how important the URL is in comparison to other pages on the website. However, there are other types of sitemaps to consider, for example:

  • Normal XML Sitemaps – The most common sitemap type. It lists the URLs of the pages on a website.
  • News Sitemaps – These help Google find content on sites that are approved for Google News.
  • Image Sitemaps – These help Google identify all images hosted on a website.
  • Video Sitemaps – These help Google understand video content on a page.
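
The normal XML sitemap described above, a list of URLs with optional metadata, can be sketched in a few lines of Python. This is a minimal illustration rather than a full generator: the function name `build_sitemap`, the dictionary keys, and the example.com URLs are assumptions made for the example, while the element names (`loc`, `lastmod`, `changefreq`, `priority`) come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for a list of page dicts (hypothetical helper).

    Each dict needs "loc"; "lastmod", "changefreq" and "priority" are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in page:
                # ElementTree escapes characters such as "&" when serialising.
                ET.SubElement(url, field).text = str(page[field])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Example data only; replace with your own pages.
    print(build_sitemap([
        {"loc": "https://example.com/", "lastmod": "2024-01-15", "priority": 1.0},
        {"loc": "https://example.com/blog/", "changefreq": "weekly"},
    ]))
```

Only `loc` is required for each URL, so pages without reliable metadata can simply omit the other fields.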

When do you need an XML sitemap?

You may need an XML sitemap if:

  • Your site is very large. Crawlers are more likely to overlook some of your new or updated pages.
  • Your site has a large archive of isolated content pages that don’t reference each other. Listing these pages in your sitemap increases the likelihood of them being indexed.
  • Your site is new. New websites with few external links referencing them may not be discovered by Googlebot or other web crawlers.
  • Your site has a lot of media content, such as videos and images, or is shown in Google News.

You might not need an XML sitemap if:

  • Your site is small, with fewer than 500 pages.
  • Your site is well-linked internally so that Google can find all important pages by navigating links on your site.
  • You have no news pages or media files (images, videos) that you want to be featured in search results.

While these are general guidelines, every website can benefit from a sitemap, as websites need Google to find their highest-priority pages easily and to know when they are updated.

Which pages should you include in your XML sitemap?

Your XML sitemap should include any page on your website that you want visitors to land on, primarily those that elicit conversions and engagement. The pages most essential for your XML sitemap are:

  • Pages with keyword-rich, unique content
  • Pages with optimised media, including video and images
  • Pages that encourage engagement through reviews and comments

Sitemaps for media are a good SEO practice. A separate sitemap may be appropriate for images, which can help search engines discover images they may otherwise have overlooked.
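
As a rough sketch of what a separate image sitemap could look like, the Python below lists each page together with the images that appear on it, using Google's image sitemap extension namespace. The function name `build_image_sitemap`, the input mapping, and the example.com URLs are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Google's image sitemap extension namespace.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(page_images):
    """Return image sitemap XML from a {page_url: [image_url, ...]} mapping."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:image": IMAGE_NS})
    for page_url, image_urls in page_images.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            # One <image:image> entry per image found on the page.
            image = ET.SubElement(url, "image:image")
            ET.SubElement(image, "image:loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Example data only.
    print(build_image_sitemap({
        "https://example.com/products/boots": [
            "https://example.com/img/boots-front.jpg",
            "https://example.com/img/boots-side.jpg",
        ],
    }))
```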

What are HTML sitemaps?

HTML sitemaps contain every page a website has, from the main priority pages to the lower-level pages. Fundamentally, HTML sitemaps are just a clickable list of pages on a website that exist to serve website visitors. They can be considered a well-thought-out table of contents. For users who can’t find the content they’re looking for, especially on websites with confusing layouts, an HTML sitemap can be a useful directory to help them get to where they want to go.

While you may already use an XML sitemap, you may wonder about the necessity of HTML sitemaps. There are several reasons why you should keep or add an updated HTML sitemap to your repertoire, including the following:

  • HTML sitemaps can help organise large websites. An HTML sitemap works similarly to a shopping mall map; it makes large websites easier to navigate for visitors. In its rawest form, it can be an unordered list of every page on a site, but an ordered list is far more visitor-friendly and satisfying to view. Giving your HTML sitemap a clear order is the best way to help visitors navigate your site. The person maintaining the sitemap can also use it to take stock of every new page and make sure it has a rightful place somewhere on the site.
  • HTML sitemaps essentially offer you an architectural blueprint of your site. This can be used as a project management tool, revealing the structure of your site and the connection between pages and subpages. A clear structure with well-ordered pages ensures your website has a clear hierarchy, and that every page serves a purpose and has a place within it.
  • HTML sitemaps give visitors a natural route to every page. Not all pages are findable in the header or footer of a page. Sitemaps can help drive conversions by getting visitors to where they want to go quickly.
  • HTML sitemaps can increase the organic search visibility of linked pages, ensuring that no pages are orphaned (cut off without any links, and likely unindexed).
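
To make the idea of an organised HTML sitemap concrete, here is a minimal Python sketch that groups page links under section titles and renders them as a nested list. The function name `build_html_sitemap`, the section titles, and the URLs are hypothetical; a real site would more likely generate this markup from its CMS or a plugin.

```python
from html import escape


def build_html_sitemap(sections):
    """Return an HTML fragment with one list item per section.

    `sections` maps a section title to a list of (link_text, url) pairs.
    """
    lines = ["<ul>"]
    for title, pages in sections.items():
        lines.append(f"  <li>{escape(title)}")
        lines.append("    <ul>")
        for text, url in pages:
            # Escape both the URL and the anchor text so the markup stays valid.
            lines.append(f'      <li><a href="{escape(url)}">{escape(text)}</a></li>')
        lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example data only.
    print(build_html_sitemap({
        "Shop": [("Running shoes", "/shop/running-shoes"), ("Sale", "/shop/sale")],
        "About": [("Our team", "/about/team")],
    }))
```

Grouping pages by section mirrors the site hierarchy described above, and the anchor text is where descriptive, relevant wording for each page would go.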

How a Sitemap Affects Your SEO

There are multiple ways that sitemaps affect SEO on your website. For example, by incorporating keywords in your sitemap, you can help search engines like Google, Yahoo, and Bing find different pages on your site. Sitemaps can also:

Highlight the Website’s Purpose

An HTML sitemap is content-based and can be used to highlight a website’s purpose to both visitors and search engines. By incorporating unique and relevant keywords into your sitemap anchor text, you can promote keyword relevancy for specific pages on your site. Anchor text in a sitemap can also be useful for pages that don’t have many cross-links.

Help Crawlers Index Your Pages Faster

Sitemaps speed up the work of search engine bots crawling your site. They help bots find your content and prioritise it in the crawl queue. While XML sitemaps just offer a list of links, search crawlers prefer the format of HTML links when exploring the web. HTML sitemaps help spotlight your important content and pages, and the text version of your sitemaps can be submitted to Google.

Increase Search Engine Visibility

By facilitating easy navigation, a sitemap can help spiders better understand a website’s taxonomy. Crawlers exploring your site may pass over pages in favour of other ones; for example, search bots may choose to follow a link on one page to verify that the link makes sense. In doing so, they may never return to continue indexing the pages left unindexed. Because it links to every page from one place, an HTML sitemap can help bots get the entire picture of your website more efficiently.

Improve Navigation and Layout Transparency

HTML sitemaps can help identify areas on a site where navigation could be improved. Mapping out all the pages available minimises the chance of creating duplicate content and helps you find any that already exists.

SEO Best Practices for Making the Perfect XML Sitemap

Your first step is to craft your sitemap. If you use WordPress, you can use the Yoast SEO plugin to make a sitemap for you. A benefit of Yoast is that it updates your XML sitemap automatically when you add new pages. For those who don’t use Yoast, there is a range of other plugins available for WordPress sites that can help you create a sitemap. For websites that don’t use WordPress, no worries! There are still plenty of third-party XML sitemap generator tools to use.

Once your sitemap is created, you should review it to make sure it displays all the pages on your site. If everything looks right, you can submit your sitemap to Google by logging in to your Google Search Console account:

  • Go to “Index” and then click “Sitemaps” which you can find in the sidebar.
  • If you have previously submitted sitemaps, you will be able to see them in a list of “Submitted sitemaps”.
  • To submit a new sitemap, simply enter your sitemap URL into the field labelled “Add a new sitemap” and press “submit”.
  • If everything is set up, your new sitemap should become visible under the “Submitted sitemaps” section.
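
Before submitting, the review step (checking that the sitemap lists all the pages on your site) can be partly automated. The sketch below assumes you already have a list of the URLs you expect to be included, for example from a CMS export or a crawl; the function names and sample URLs are illustrative, not part of any particular tool.

```python
import xml.etree.ElementTree as ET

# Prefix mapping used when searching the parsed sitemap.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_bytes):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def review_sitemap(xml_bytes, expected_urls):
    """Compare a sitemap with the URLs you expect it to contain."""
    listed = sitemap_urls(xml_bytes)
    expected = set(expected_urls)
    return {
        "missing": sorted(expected - listed),     # expected but not in the sitemap
        "unexpected": sorted(listed - expected),  # in the sitemap but not expected
    }


if __name__ == "__main__":
    # Example sitemap and URL list only.
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://example.com/</loc></url>"
        b"</urlset>"
    )
    print(review_sitemap(sample, ["https://example.com/", "https://example.com/contact"]))
```

An empty "missing" list suggests the sitemap covers the pages you care about; anything under "unexpected" is worth checking before you submit.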

Final Sitemap Tips:

  • Your XML sitemap should be consistent with the rest of your website, especially your robots.txt file. In other words, your sitemap shouldn’t include pages that are blocked in robots.txt or tagged with “noindex” or “nofollow”.
  • You can use your sitemap to direct search engines to your content-rich landing pages, and away from utility pages that are necessary but lacking in content. This can help ensure that search engines don’t penalise your site for thin content or bad UX.
  • If you have a huge site, you can break up your sitemap into several smaller sitemaps and list them in a sitemap index file. The sitemap protocol limits each file to 50,000 URLs and 50MB uncompressed.
  • URLs in your sitemap have a ‘last modified’ date. Only update these dates when you make significant changes or add new content. Otherwise, changing the date on pages that haven’t meaningfully changed can be viewed as spammy by Google.
  • By clicking on the “See Index Coverage” icon in the sidebar, you can review your sitemap coverage report. This can let you know how many of your pages have been indexed, how many errors there were, and how many URLs were excluded. You can use your sitemap to determine how many pages you WANT indexed, compared to how many pages ARE indexed.
  • Your sitemap can also help you spot errors on your website. For example, if you have 4,000 pages on your site but your sitemap links to only 2,000 pages, that’s a clear sign that something is wrong. Your website may have duplicate content or maybe the number of pages on the website exceeds your crawl budget.
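
The tip about breaking up a large sitemap can be sketched as follows. This is one possible approach, assuming a flat list of URLs and a hypothetical file naming scheme (`sitemap-1.xml`, `sitemap-2.xml`, and so on); the 50,000-URL limit per file comes from the sitemaps.org protocol, and the resulting index file is what you would submit to search engines.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol


def _to_xml(root):
    """Serialise an element tree with an XML declaration."""
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(root, encoding="unicode")


def split_sitemap(urls, base_url, chunk_size=MAX_URLS_PER_SITEMAP):
    """Return {filename: xml} for the child sitemaps plus a sitemap index.

    File names and `base_url` are illustrative assumptions.
    """
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for start in range(0, len(urls), chunk_size):
        name = f"sitemap-{start // chunk_size + 1}.xml"
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for page_url in urls[start:start + chunk_size]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page_url
        files[name] = _to_xml(urlset)
        # The index points to each child sitemap by its full URL.
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = f"{base_url}/{name}"
    files["sitemap-index.xml"] = _to_xml(index)
    return files


if __name__ == "__main__":
    # Example data only: 120,000 URLs end up in three child sitemaps.
    pages = [f"https://example.com/product/{i}" for i in range(120_000)]
    for filename in split_sitemap(pages, "https://example.com"):
        print(filename)
```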

SEO Best Practices for Making the Perfect HTML Sitemap

If you use WordPress, there are many sitemap plugins available to help you create an HTML sitemap. As with XML sitemaps, plugins for HTML sitemaps automate most of the sitemap development and management process.

For large websites, you might need to run a web crawler such as Screaming Frog or SiteBulb (both on desktop), or OnCrawl or DeepCrawl (in the cloud). The output of this crawl should serve as the foundation for organising a site’s pages around themes.

After the HTML sitemap is created, make sure to put a link to it on your website where visitors can easily find it. Choose somewhere that will continue to be accessible to users as they click through your website, such as a sidebar, the top navigation, or a footer menu.

The Pure SEO Team
