How to create and customize an XML sitemap

Team TeachWiki

An XML sitemap (usually sitemap.xml) is an XML file that lists all the important pages of a site for a crawler.

sitemap.xml is read only by search robots, which use it to index the site correctly in search engines.

When do you need a Sitemap?

You can read more about this in the official Yandex and Google articles. In short, an XML sitemap is needed if the site:

For a crawler to successfully crawl all links, it is necessary to list all links in the XML sitemap.

When is a Sitemap Not Required?

Even when a sitemap is not strictly required, search engines still recommend creating one.

Why you need an XML sitemap

A sitemap is needed for the correct crawling of the site by search robots and subsequent indexing of pages in search engines.

An XML sitemap gives the search robot:

  1. A list of site pages;
  2. The page priority for scanning. The search robot may crawl the highest-priority pages first;
  3. The date the pages were modified. The search robot can skip pages that have not changed since its previous crawl;
  4. The likely frequency of page changes. This is a hint that helps search engines estimate how often to re-crawl a page.

This way, the search engine understands when and how to crawl your site.

      Description of XML sitemap tags

      Tag | Required | Description
      <urlset> | Yes | Encapsulates the sitemap file and specifies the current protocol standard.
      <url> | Yes | The parent tag for each URL entry. The rest of the tags are children of this tag.
      <loc> | Yes | Contains a full link to the site page. The link must be canonical and must refer to the main mirror of the site.
      <lastmod> | No | Contains the date the page was last updated in W3C Datetime format (YYYY-MM-DD, optionally followed by a time and time zone). For example, 2020-05-12, where 2020 is the year, 05 is the month, and 12 is the day.
      <changefreq> | No | Contains the page change frequency. Can take the values always, hourly, daily, weekly, monthly, yearly, never. The value of this tag is used as a hint for the crawler, not as a command. The value weekly is a common choice, because many sites change about once a week.
      <priority> | No | The priority of the URL relative to other URLs on the site. It can take values from 0.0 to 1.0. Keep in mind that assigning high priority to all URLs doesn't make sense: priority is a relative value, used to determine the order in which URLs are processed within the site. Priority does not affect positions in search engines.
      <xhtml:link> | No | Used to indicate alternative pages in other languages. For example, <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"/>. Read more on the official Google page.
      <image:loc> | No | A child element of the <image:image> tag. Used to specify a full link to an image.
      <image:title> | No | A child element of the <image:image> tag. Used to describe what is shown in the picture.
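
The <lastmod> value is the tag most often written in the wrong format. As a minimal sketch, assuming the dates are available as Python date or datetime objects, the standard library can produce both the date-only form and the full form with a time zone. The helper names lastmod_date and lastmod_datetime are hypothetical, chosen only for this illustration.

```python
from datetime import date, datetime, timezone


def lastmod_date(d: date) -> str:
    """Return the date-only form, YYYY-MM-DD."""
    return d.isoformat()


def lastmod_datetime(dt: datetime) -> str:
    """Return the full form with a UTC offset, to the second."""
    if dt.tzinfo is None:
        # A naive datetime has no time zone, so the offset would be a guess.
        raise ValueError("use a timezone-aware datetime")
    return dt.astimezone(timezone.utc).isoformat(timespec="seconds")


print(lastmod_date(date(2020, 5, 12)))  # 2020-05-12
print(lastmod_datetime(datetime(2023, 8, 27, 14, 30, tzinfo=timezone.utc)))
# 2023-08-27T14:30:00+00:00
```

The date-only form is usually enough unless pages change several times a day.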

      Types of XML Sitemap

      There are 3 types of XML sitemap.

      1. A single file that stores a list of pages. All site links are listed in one file.

        Example sitemap.xml file:

        
          
        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
          <url>
            <loc>http://www.example.com/page1.html</loc>
            <lastmod>2023-08-27</lastmod>
            <changefreq>monthly</changefreq>
            <priority>0.8</priority>
          </url>
          <url>
            <loc>http://www.example.com/page2.html</loc>
            <lastmod>2023-08-27</lastmod>
            <changefreq>monthly</changefreq>
            <priority>0.6</priority>
          </url>
          ...
        </urlset>

        An example of such a sitemap can be seen at the link: https://teachwiki.com/sitemap.xml.
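
A single-file sitemap like this is easy to generate from a list of pages. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module. The function name build_sitemap and the page dictionaries are hypothetical, not part of any sitemap tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a single-file sitemap.

    pages: iterable of dicts with a required 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys (illustrative structure).
    Returns the XML document as UTF-8 bytes.
    """
    # Emit the sitemap namespace as the default xmlns, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Keep the tag order used in the example above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(page[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


pages = [
    {"loc": "http://www.example.com/page1.html", "lastmod": "2023-08-27",
     "changefreq": "monthly", "priority": 0.8},
    {"loc": "http://www.example.com/page2.html", "lastmod": "2023-08-27",
     "changefreq": "monthly", "priority": 0.6},
]
print(build_sitemap(pages).decode("utf-8"))
```

Building the tree with an XML library rather than string concatenation means special characters in URLs, such as &, are escaped automatically.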

      2. Index file of other XML sitemaps. All site links are listed in several XML files.

        Example sitemap.xml file:

        
        <?xml version="1.0" encoding="UTF-8"?>
        <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
          <sitemap>
            <loc>http://www.example.com/sitemap_pages1.xml</loc>
            <lastmod>2023-08-27</lastmod>
          </sitemap>
          <sitemap>
            <loc>http://www.example.com/sitemap_pages2.xml</loc>
            <lastmod>2023-08-27</lastmod>
          </sitemap>
          ...
        </sitemapindex>

        That is, the sitemap.xml file itself lists other XML sitemaps, and those files contain the actual site links (as in type 1). This layout is used to split a large list of page URLs: the maximum number of links in one XML file is 50,000. It is relevant for online stores that have a large assortment.

        An example of such a sitemap can be seen at the link: https://teachwiki.com/sitemap.xml.
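
The 50,000-link limit is what usually forces the switch to an index file. As a rough sketch, assuming Python with the standard xml.etree.ElementTree module, a long URL list can be cut into chunks and an index built over the resulting files. The names chunk_urls and build_index and the sitemap_pagesN.xml naming are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # maximum number of links in one sitemap file


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_locs, lastmod=None):
    """Build a sitemap index that points at the given child sitemap URLs."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_locs:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod is not None:
            ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical store with 120,000 product pages.
urls = [f"http://www.example.com/page{i}.html" for i in range(120_000)]
chunks = chunk_urls(urls)
child_locs = [f"http://www.example.com/sitemap_pages{n}.xml"
              for n in range(1, len(chunks) + 1)]
print(len(chunks), [len(c) for c in chunks])
print(build_index(child_locs, lastmod="2023-08-27").decode("utf-8"))
```

Each chunk would then be written out as its own type 1 file under the name listed in the index.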

      3. A single file that stores a list of pages and the images placed on them. The file is built in the same way as type 1, but image markup tags are added.

        Example sitemap.xml file:

        
        <?xml version="1.0" encoding="utf-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:xhtml="http://www.w3.org/1999/xhtml">
          <url>
            <loc>http://www.example.com/</loc>
            <lastmod>2023-08-27</lastmod>
            <changefreq>monthly</changefreq>
            <priority>1.0</priority>
            <image:image>
              <image:loc>http://www.example.com/image1.png</image:loc>
              <image:title>Love and Peace Portrait</image:title>
            </image:image>
            <image:image>
              <image:loc>http://www.example.com/image2.png</image:loc>
              <image:title>Portrait "Eternity and Infinity"</image:title>
            </image:image>
            ...
          </url>
        </urlset>

        This type is the most detailed of the three. It can be combined with type 2, the sitemap index.

        An example of such a sitemap can be seen at the link: https://teachwiki.com/sitemap.xml.
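
Because the image tags live in their own namespace, reading them back requires a prefix mapping. The sketch below, assuming Python's standard xml.etree.ElementTree module, lists the images declared for each page. The function name images_by_page and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used in the XPath-like queries below.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def images_by_page(xml_text):
    """Return {page URL: [image URLs]} for a sitemap with image tags."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        page = url.findtext("sm:loc", default="", namespaces=NS).strip()
        result[page] = [
            (loc.text or "").strip()
            for loc in url.findall("image:image/image:loc", NS)
        ]
    return result


sample = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://www.example.com/</loc>
    <image:image><image:loc>http://www.example.com/image1.png</image:loc></image:image>
    <image:image><image:loc>http://www.example.com/image2.png</image:loc></image:image>
  </url>
</urlset>
"""
print(images_by_page(sample))
```

A tag written with a space after the prefix, such as `<image: loc>`, is not well-formed XML, so ET.fromstring raises a ParseError for such a file instead of returning a result.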

      How to create an XML sitemap

      You can create a sitemap:

      XML sitemap requirements

      How to find out if there are errors in the XML sitemap?

      A relatively small sitemap can be checked for errors manually. If the XML sitemap is too large to check by hand, you can use Screaming Frog to find problems in it.
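
Some of the manual checks can also be scripted. The following is a rough sketch, assuming Python's standard library only. It is not a replacement for a crawler-based tool: it only tests well-formedness, the 50,000-link limit, and the tag values described in the table above. The function name check_sitemap and the wording of the messages are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}
# W3C Datetime: YYYY, YYYY-MM, YYYY-MM-DD, or a date with time and time zone.
W3C_DATE = re.compile(
    r"\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?"
)


def check_sitemap(xml_text, max_urls=50_000):
    """Return a list of human-readable problems found in a type 1 sitemap."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    urls = root.findall("sm:url", NS)
    if len(urls) > max_urls:
        problems.append(f"{len(urls)} links, the limit is {max_urls}")

    hosts = set()
    for n, url in enumerate(urls, start=1):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        parts = urlsplit(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"entry {n}: <loc> is not a full link: {loc!r}")
        else:
            hosts.add(parts.netloc)

        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"entry {n}: bad <priority> {priority!r}")

        changefreq = url.findtext("sm:changefreq", namespaces=NS)
        if changefreq is not None and changefreq.strip() not in CHANGEFREQ:
            problems.append(f"entry {n}: bad <changefreq> {changefreq!r}")

        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is not None and not W3C_DATE.fullmatch(lastmod.strip()):
            problems.append(f"entry {n}: bad <lastmod> {lastmod!r}")

    if len(hosts) > 1:
        problems.append(f"links point to several hosts: {sorted(hosts)}")
    return problems
```

An empty list means none of these particular checks failed. It does not prove that the listed pages exist or return a 200 status, which is what a crawler-based check adds.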

      Finding errors with Screaming Frog

      Screaming Frog can scan any type of XML sitemap, including nested ones.

      1. Open the Screaming Frog program.
      2. Select the scan type Mode → List.
      3. Click the Upload → Download XML Sitemap button, enter a link to the XML sitemap and click OK.

      In earlier versions, there are two scan selection buttons: one for a single XML sitemap, the other for an XML sitemap index (type 2, with nesting).

      Adding an XML Map to Search Engines

      To notify search engines that a sitemap has appeared on your site, it is not enough to place the file in the site root. You must also specify the sitemap in the robots.txt file (example: https://teachwiki.com/robots.txt) and add it to Yandex.Webmaster (for Yandex) and Google Search Console (for Google).

      1. Adding an XML map for Yandex
      2. Adding an XML Map for Google
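
In robots.txt, the sitemap is declared with a Sitemap: line that contains the full URL of the file. To confirm that the declaration is readable, one option is Python's standard urllib.robotparser module, as in the minimal sketch below. The function name sitemaps_from_robots and the sample robots.txt content are hypothetical, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt: str):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt content for illustration.
sample = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""
print(sitemaps_from_robots(sample))  # ['https://www.example.com/sitemap.xml']
```

An empty result for a live robots.txt suggests the Sitemap: line is missing or malformed.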

      Recommendations

      Be sure to use the sitemap.xml file to make it easier for search engine crawlers to crawl your site. Even if the site contains a small number of pages, the sitemap helps the search robot see which pages are current and in what order to crawl them.
