How to create and customize an XML sitemap
An XML sitemap (usually sitemap.xml) is an XML file that lists the URLs of all the important pages of a site for crawlers.
sitemap.xml is read only by search robots and helps search engines index the site correctly.
When do you need a Sitemap?
You can read more about this in the official Yandex and Google articles. In short, an XML sitemap is needed if the site has:
- Many pages. Search engine crawlers may skip newly created or modified pages;
- Ambiguous page linking. The robot may miss a link to a page if the link sits in a hidden place or becomes available only after a certain event;
- Separate pages without navigation links. A crawler cannot discover a page that no link on the site leads to;
- Deep nesting. Each search engine limits how much of a site it crawls, so the search robot may miss important pages altogether, especially when deep nesting is combined with ambiguous page linking.
To make sure the crawler can find every page, list all page links in the XML sitemap.
When is a Sitemap Not Required?
- The site has fewer than 500 pages;
- The site has a detailed system of internal links (page relinking);
- All pages are accessible from the site navigation.
Even in these cases, search engines still recommend creating a sitemap.
Why you need an XML sitemap
A sitemap is needed so that search robots crawl the site correctly and search engines then index its pages.
An XML sitemap gives the search robot:
- A list of site pages;
- Page priority for crawling. The search robot may crawl the highest-priority pages first;
- The date each page was last modified. On its next pass, the search robot can skip pages that have not changed since the previous crawl;
- The likely frequency of page changes. This is a hint that helps search engines decide roughly how often to re-crawl a page.
This way, the search engine understands when and how to crawl your site.
Values accepted by the <changefreq> tag:
- always
- hourly
- daily
- weekly
- monthly
- yearly
- never
The 3 types of XML sitemap:
- A single file that stores a list of pages. All site links are listed in one file.
- An index file of other XML sitemaps. The site links are spread across several XML files.
- A single file that stores a list of pages and the images placed on them. The file is built on the same principle as the first type, but image markup tags are added.
Ways to create an XML sitemap:
- Manually;
- Using a ready-made online service (for example, https://www.xml-sitemaps.com);
- Using the built-in tools of the CMS the site runs on;
- Using ready-made modules for that CMS. If your site runs on WordPress, use the Google XML Sitemaps plugin;
- Using a separate script. This method is typically used to include pages that the CMS, plugins, and online services do not pick up.
XML sitemap requirements:
- The file must be encoded in UTF-8;
- The maximum number of links in one XML file is 50,000;
- The maximum size of an uncompressed file is 50 MB;
- Links must match the domain and the main site mirror;
- When the file is requested, the server should return HTTP code 200. You can check this using the Yandex service.
Steps for checking a sitemap in Screaming Frog:
- Open Screaming Frog.
- Select the scan type Mode → List.
- Click Upload → Download XML Sitemap, enter a link to the XML sitemap, and click OK.
Adding the sitemap to search engines:
- Adding an XML map for Yandex
- Adding an XML map for Google
Description of XML map tags
Tag | Required | Description |
---|---|---|
<urlset> | Yes | Encapsulates the sitemap file and specifies the current protocol standard. |
<url> | Yes | The parent tag for each URL entry. The rest of the tags are children of this tag. |
<loc> | Yes | Contains the full link to the site page. The link must be canonical and must refer to the main mirror of the site. |
<lastmod> | No | Contains the date the page was last modified, in W3C Datetime format (YYYY-MM-DD, optionally followed by a time and time zone). For example, 2020-05-12, where 2020 is the year, 05 is the month, and 12 is the day. |
<changefreq> | No | Contains the page change frequency. Can take the values always, hourly, daily, weekly, monthly, yearly, never. |
<priority> | No | URL priority relative to other URLs of the same site. It can take values from 0.0 to 1.0. Assigning high priority to all URLs makes no sense: priority is a relative value that is used to determine the order in which URLs are processed within the site. Priority does not affect positions in search engines. |
<xhtml:link> | No | Used to indicate alternative versions of the page in other languages. For example, <xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"/>. Read more on the official Google page. |
<image:loc> | No | A child element of the <image:image> tag. Used to specify the full link to an image. |
<image:title> | No | A child element of the <image:image> tag. Used to describe what is shown in the picture. |
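The <lastmod> and <changefreq> values are easy to get wrong when they are produced by hand. The following is a minimal Python sketch, not a definitive implementation, of formatting a date for <lastmod> and checking a <changefreq> value. The function names format_lastmod and is_valid_changefreq are hypothetical and chosen only for this illustration.

```python
from datetime import date, datetime, timezone

# The seven values the sitemap protocol accepts for <changefreq>.
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def format_lastmod(value):
    """Return a W3C Datetime string for <lastmod> (hypothetical helper)."""
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        # Convert to UTC and keep second precision.
        # Note: astimezone() treats a naive datetime as local time.
        utc_value = value.astimezone(timezone.utc)
        return utc_value.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    # A plain date uses the short YYYY-MM-DD form.
    return value.isoformat()


def is_valid_changefreq(value):
    """Check a <changefreq> value against the allowed list."""
    return value in CHANGEFREQ_VALUES


print(format_lastmod(date(2020, 5, 12)))
print(is_valid_changefreq("weekly"))
```

The date-only form is usually enough for a sitemap; the full timestamp form is shown only for pages where the time of day matters.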
Types of XML Sitemap
There are 3 types of XML sitemap.
Type 1 is a single file that stores a list of pages. Example sitemap.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/page1.html</loc>
    <lastmod>2023-08-27</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/page2.html</loc>
    <lastmod>2023-08-27</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.6</priority>
  </url>
  ...
</urlset>
An example of such a map can be seen at the link: https://teachwiki.com/sitemap.xml .
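A file of this kind can also be produced by a script. The sketch below uses Python's standard xml.etree.ElementTree module; the build_urlset function and the shape of its input (a list of dicts keyed by tag name) are assumptions made for this illustration, not part of any CMS or plugin.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(pages):
    """Build a single-file sitemap as UTF-8 bytes.

    pages is a list of dicts with a required "loc" key and optional
    "lastmod", "changefreq" and "priority" keys (hypothetical input shape).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # ElementTree escapes characters such as "&" in URLs automatically.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    sitemap = build_urlset([
        {"loc": "http://www.example.com/page1.html", "lastmod": "2023-08-27",
         "changefreq": "monthly", "priority": 0.8},
        {"loc": "http://www.example.com/page2.html", "lastmod": "2023-08-27",
         "changefreq": "monthly", "priority": 0.6},
    ])
    print(sitemap.decode("utf-8"))
```

Building the tree with a library rather than concatenating strings keeps the output well-formed and takes care of escaping.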
Type 2 is an index file of other XML sitemaps. Example sitemap.xml file:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap_pages1.xml</loc>
    <lastmod>2023-08-27</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap_pages2.xml</loc>
    <lastmod>2023-08-27</lastmod>
  </sitemap>
  ...
</sitemapindex>
In other words, the sitemap.xml file lists other XML sitemaps, and those contain the actual site links (as in the first type). This type is used to split up a large list of page URLs, because one XML file can hold at most 50,000 links. It is relevant for online stores with a large assortment.
An example of such a map can be seen at the link: https://teachwiki.com/sitemap.xml .
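Splitting a long URL list into files of at most 50,000 links and listing those files in an index can be sketched as follows. This is an illustrative Python example; chunk_urls and build_sitemap_index are hypothetical helper names, and the sitemap_pagesN.xml file names simply follow the example above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # limit on links in one XML file


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most size items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Build a sitemap index that points at the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    urls = [f"http://www.example.com/item{n}.html" for n in range(120_001)]
    chunks = chunk_urls(urls)
    # One child sitemap file per chunk, numbered from 1.
    names = [
        f"http://www.example.com/sitemap_pages{i}.xml"
        for i in range(1, len(chunks) + 1)
    ]
    print(len(chunks), "child sitemaps")
    print(build_sitemap_index(names, lastmod="2023-08-27").decode("utf-8"))
```

Each chunk would then be written to its own file using the single-file format of the first type.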
Type 3 is a single file that stores a list of pages and the images placed on them. Example sitemap.xml file:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2023-08-27</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1</priority>
    <image:image>
      <image:loc>http://www.example.com/image1.png</image:loc>
      <image:title>Love and Peace Portrait</image:title>
    </image:image>
    <image:image>
      <image:loc>http://www.example.com/image2.png</image:loc>
      <image:title>Portrait "Eternity and Infinity"</image:title>
    </image:image>
    ...
  </url>
</urlset>
This type is the most detailed of the three. It can be combined with the second type, so that an index file points to sitemaps that contain image tags.
An example of such a map can be seen at the link: https://teachwiki.com/sitemap.xml .
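Image tags live in their own XML namespace, which is the main difference when generating this type from a script. The sketch below is one possible way to do it with Python's standard library; build_image_sitemap and its input shape are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Tell ElementTree which prefixes to write: none for the sitemap
# namespace and "image" for the image extension. This is module-wide state.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def qname(namespace, tag):
    """Return the {namespace}tag form that ElementTree expects."""
    return f"{{{namespace}}}{tag}"


def build_image_sitemap(pages):
    """pages maps a page URL to a list of (image_url, title) pairs."""
    urlset = ET.Element(qname(SITEMAP_NS, "urlset"))
    for page_url, images in pages.items():
        url = ET.SubElement(urlset, qname(SITEMAP_NS, "url"))
        ET.SubElement(url, qname(SITEMAP_NS, "loc")).text = page_url
        for image_url, title in images:
            image = ET.SubElement(url, qname(IMAGE_NS, "image"))
            ET.SubElement(image, qname(IMAGE_NS, "loc")).text = image_url
            ET.SubElement(image, qname(IMAGE_NS, "title")).text = title
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_image_sitemap({
        "http://www.example.com/": [
            ("http://www.example.com/image1.png", "Love and Peace Portrait"),
            ("http://www.example.com/image2.png",
             'Portrait "Eternity and Infinity"'),
        ],
    })
    print(xml_bytes.decode("utf-8"))
```

Note that the tag names are written without spaces (<image:loc>, not <image: loc>); a space after the colon makes the file invalid XML.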
How to create an XML sitemap
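You can create a sitemap manually, with a ready-made online service, with the built-in tools of the CMS, with a CMS module or plugin, or with a separate script.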
XML sitemap requirements
How to find out if there are errors in the XML sitemap?
A relatively small sitemap can be checked for errors manually. If the XML sitemap is too large to check by hand, you can use Screaming Frog to find problems in it.
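Some of the basic requirements (well-formed XML, at most 50,000 links, at most 50 MB uncompressed, all links on one host) can also be checked with a short script. The following Python sketch is only an illustration under those assumptions; check_sitemap is a hypothetical function, it does not fetch anything over the network, and it does not replace a full crawl.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, uncompressed


def check_sitemap(xml_bytes, expected_host):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # Nothing else can be checked if the file does not parse.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file is larger than 50 MB uncompressed")

    ns = {"sm": SITEMAP_NS}
    # A urlset lists pages under <url>; an index lists files under <sitemap>.
    locs = (root.findall("sm:url/sm:loc", ns)
            + root.findall("sm:sitemap/sm:loc", ns))
    if len(locs) > MAX_URLS:
        problems.append(f"{len(locs)} links, the limit is {MAX_URLS}")

    for loc in locs:
        url = (loc.text or "").strip()
        if urlparse(url).netloc != expected_host:
            problems.append(f"link is not on {expected_host}: {url}")
    return problems


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>http://www.example.com/page1.html</loc></url>"
        b"<url><loc>http://www.example.ru/page2.html</loc></url>"
        b"</urlset>"
    )
    for problem in check_sitemap(sample, "www.example.com"):
        print(problem)
```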
Finding bugs with Screaming Frog
Screaming Frog can scan any kind of XML sitemap, including nested ones.
In earlier versions, there are 2 separate scan buttons: one for a single XML sitemap, the other for an XML sitemap index (the second type, with nesting).
Adding an XML Map to Search Engines
To notify search engines that a sitemap has appeared on your site, placing the file in the site root is not enough. You must also specify the sitemap in the robots.txt file (example: https://teachwiki.com/robots.txt) and add it to Yandex.Webmaster (for Yandex) and Google Search Console (for Google).
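In robots.txt the sitemap is declared with a Sitemap: line that contains the full URL of the file. The Python sketch below shows such a line in a sample robots.txt and reads it back with the standard urllib.robotparser module (the site_maps() method requires Python 3.8 or later). The sample file content and the declared_sitemaps helper are assumptions made for this illustration.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt; the rules and the URL are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(declared_sitemaps(ROBOTS_TXT))
```

A check like this only confirms that the declaration is present; the sitemap still has to be added in Yandex.Webmaster and Google Search Console separately.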


Recommendations
Use the sitemap.xml file to make it easier for search engine crawlers to crawl your site. Even if the site has only a few pages, the sitemap helps the search robot judge how fresh your pages are and in what order to crawl them.