XML Sitemaps

An XML sitemap is an XML-formatted file containing a list of URLs on a website. It provides information that allows a search engine to index your website more intelligently and to keep its search index up to date.

Sitemaps tell search engines what URLs are on a website, and, if URLs are added as they are published, they tell the engines what new content needs to be picked up. They may also provide additional metadata about each URL, such as the last modified date, which signals to the engine to update the index record for that page.

Search.gov uses sitemaps to tell us what URLs should be in our index and when a URL has been updated. Sitemaps are used in a similar way by Google, Bing, and other search engines. Having an XML sitemap will improve your Google SEO (search engine optimization).

Example: https://search.gov/sitemap.xml

What content should be on an XML sitemap?

A single sitemap can list every URL on a site, but very large sites may need to publish several sitemaps. Each sitemap should be no larger than 50MB (uncompressed) and contain no more than 50,000 URLs; start a new sitemap when you reach either limit. You do not need to add URLs of content you want to remain unsearchable.
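
As a rough illustration of the 50,000 URL limit, the sketch below splits a list of URLs into batches, one batch per sitemap. The function name chunk_urls and the example.gov URLs are placeholders invented for this example, and the sketch only enforces the URL count, not the 50MB file size limit.

```python
from typing import Iterator, List

MAX_URLS_PER_SITEMAP = 50_000  # URL count limit per sitemap


def chunk_urls(urls: List[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` URLs, one batch per sitemap."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Hypothetical URL list, for illustration only.
urls = [f"https://www.example.gov/page-{i}.html" for i in range(120_000)]
batches = list(chunk_urls(urls))
print([len(batch) for batch in batches])  # [50000, 50000, 20000]
```

Each batch would then be written to its own sitemap file and listed in a sitemap index or in robots.txt.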

Note that an HTML-formatted file listing the pages of a site is more akin to an index page, and is not the same as an XML sitemap. HTML files are human friendly but not machine friendly, and search engines need an XML-formatted file in order to use the information for indexing.

More than one web platform? Use multiple sitemaps.

It’s common for agencies to use more than one platform to publish their websites. For instance, a CMS may have been launched while some content is still on the legacy site’s platform. In this case, use the available plugins for the CMSs in your environment to auto-generate sitemaps for that content, and manually generate a sitemap for any static content. You can publish a sitemap index file that lists the locations of all your individual sitemaps, or you can list all your sitemaps in your robots.txt file.

How do search engines find my sitemap(s)?

Sitemaps (or the sitemap index) should be listed in your site’s robots.txt file, for example:
Sitemap: https://www.example.gov/sitemap_1.xml
Sitemap: https://www.example.gov/sitemap_2.xml

List the appropriate sitemap(s) for the domain or subdomain. www.example.gov/robots.txt would list sitemaps for content in the www subdomain, while forms.example.gov/robots.txt would list sitemaps for the forms subdomain.
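
To see how a crawler might pick up these entries, here is a minimal sketch that reads the Sitemap lines from a robots.txt body using Python's standard urllib.robotparser module (site_maps() requires Python 3.8 or later). The robots.txt content and the helper name list_sitemaps are made up for illustration; this is not a description of how Search.gov or any other engine actually implements discovery.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt body; the hostname and paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Sitemap: https://www.example.gov/sitemap_1.xml
Sitemap: https://www.example.gov/sitemap_2.xml
"""


def list_sitemaps(robots_txt: str) -> list:
    """Return the sitemap URLs declared in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


print(list_sitemaps(ROBOTS_TXT))
```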

Read more about robots.txt files, and take a look at ours: https://search.gov/robots.txt

What should my XML sitemap look like?

Please refer to the official sitemaps protocol for full information on how a sitemap should be structured.

When publishing your sitemap, be sure it begins with an XML declaration (<?xml ... ?>), and that each URL is enclosed in opening and closing <url> and <loc> tags. To take a simplified example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.gov/blog/file1.html</loc>
<lastmod>2018-03-19T00:00:00+00:00</lastmod>
</url>
<url>
<loc>https://example.gov/policy/new-policy.html</loc>
<lastmod>2018-03-27T00:00:00+00:00</lastmod>
</url>
</urlset>
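
If your platform has no sitemap plugin, a file like the one above can be produced with a short script. The following is a minimal sketch using Python's standard xml.etree.ElementTree module (Python 3.8 or later for the xml_declaration argument); the function name build_urlset and the sample entries are assumptions made for this example, not a Search.gov tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries) -> bytes:
    """Build a sitemap from (loc, lastmod) pairs; lastmod may be None."""
    # Writing xmlns as a plain attribute declares the default namespace in the output.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Sample entries taken from the example above.
sitemap = build_urlset([
    ("https://example.gov/blog/file1.html", "2018-03-19T00:00:00+00:00"),
    ("https://example.gov/policy/new-policy.html", "2018-03-27T00:00:00+00:00"),
])
print(sitemap.decode("utf-8"))
```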

If you use multiple sitemaps, then you’ll need to use a sitemap index, along these lines:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://example.gov/sitemap.xml?page=1</loc></sitemap>
<sitemap><loc>https://example.gov/sitemap.xml?page=2</loc></sitemap>
</sitemapindex>
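
To illustrate how a sitemap index is consumed, the sketch below parses an index and lists the child sitemap locations. It assumes the index follows the sitemaps protocol, with each location wrapped in a <loc> element inside <sitemap> and the protocol namespace declared on the root; the sample document and the function name child_sitemaps are invented for this example.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample index for illustration; the URLs are placeholders.
INDEX_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://example.gov/sitemap.xml?page=1</loc></sitemap>
<sitemap><loc>https://example.gov/sitemap.xml?page=2</loc></sitemap>
</sitemapindex>"""


def child_sitemaps(index_xml: bytes) -> list:
    """Return the <loc> value of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS) if loc.text]


print(child_sitemaps(INDEX_XML))
```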

Importantly, be sure that any special characters in your URLs (such as &, ', ", <, and >) are entity-escaped so that search engines will know how to read them.
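
As a small illustration of escaping, the sketch below uses Python's standard xml.sax.saxutils.escape on a URL containing an ampersand. The URL is a made-up example. If you build the sitemap with an XML library rather than by concatenating strings, the library normally performs this escaping for you.

```python
from xml.sax.saxutils import escape

# Hypothetical URL with a query string, for illustration only.
raw_url = "https://www.example.gov/search?agency=usda&year=2018"

# escape() handles &, < and > by default; the extra mapping also covers quotes.
escaped = escape(raw_url, {"'": "&apos;", '"': "&quot;"})

print(f"<loc>{escaped}</loc>")
# <loc>https://www.example.gov/search?agency=usda&amp;year=2018</loc>
```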

What metadata does Search.gov require for each XML sitemap URL?

The sitemap protocol defines required and optional XML tags for each URL. We recommend including the <lastmod> value (the date of last modification of the file) whenever possible, to indicate when a file has been updated and needs to be re-indexed.
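
For reference, here is one way to produce a <lastmod> value in the same format as the examples on this page. This is a sketch under an explicit assumption: timestamps without timezone information are treated as UTC. The function name to_lastmod is hypothetical.

```python
from datetime import datetime, timezone


def to_lastmod(moment: datetime) -> str:
    """Format a datetime as a W3C Datetime string with an explicit UTC offset."""
    if moment.tzinfo is None:
        # Assumption for this sketch: naive timestamps are already in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


print(to_lastmod(datetime(2018, 3, 19)))  # 2018-03-19T00:00:00+00:00
```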

We do not have plans to support the <priority> tag, which is no longer used by search engines like Google. We may support the <changefreq> tag in the future, but the <lastmod> tag is more accurate and supported by more search engines.

How can I create an XML sitemap?

Most content management systems provide tools to generate a sitemap and keep it updated. Below are some tools that we recommend:

Drupal

XML Sitemap Module

Simple XML Sitemap Module and patch to include static files

WordPress

Yoast SEO Plugin

Google Sitemap Plugin

Wagtail

Sitemap Generator

GitHub Pages (Jekyll)

Jekyll Sitemap gem

Online generators

(Note: free online generators often limit the number of URLs they will include, and do not always generate the most accurate sitemaps. Use them only as a last resort.)

Free Sitemap Generator

Web Sitemap

Sitemap checklist

1. One or more sitemaps have been created

2. The URLs in the sitemap have been reviewed (clean URLs, only includes URLs that should be searchable)

3. Each sitemap’s XML format has been validated

4. Each sitemap (or a sitemap index) is listed in the site’s robots.txt file
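
Parts of this checklist can be spot-checked with a script. The sketch below is a minimal example, not a full validator: it checks that a sitemap is well-formed XML, that the root element is in the sitemaps protocol namespace, that it stays within the 50,000 URL limit, and that each <loc> is an absolute http(s) URL. It does not validate against the official schema, and the function name check_sitemap is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000


def check_sitemap(xml_bytes: bytes) -> list:
    """Return a list of problems found; an empty list means these basic checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    expected_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in expected_roots:
        problems.append(f"unexpected root element: {root.tag}")

    locs = [(el.text or "").strip() for el in root.iter(f"{{{SITEMAP_NS}}}loc")]
    if len(locs) > MAX_URLS:
        problems.append(f"too many URLs: {len(locs)}")
    for loc in locs:
        if not loc.startswith(("http://", "https://")):
            problems.append(f"loc is not an absolute URL: {loc!r}")
    return problems


# Sample sitemap for illustration; the URL is a placeholder.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>https://example.gov/blog/file1.html</loc></url>
</urlset>"""
print(check_sitemap(sample))  # []
```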

Additional Resources:

Official Documentation from Sitemaps.org

Google’s guide to building a sitemap

Sitemap validator

More questions?

If you have questions that aren’t answered here, email us. We’ll also keep updating this page over time.