How a Page on a Sitemap Becomes a Search Result
We often get questions about how sitemaps control the search results for a given site. The answer is, they don’t! This page describes the relationship between sitemaps, search indexes, and the search experiences you create through the Admin Center.
A frame for the relationships described below
Imagine a big lake. There are any number of tributaries feeding into the lake. There are fishing boats out on the lake, each loaded up with the gear they need and a guide to the kinds of fish they’re trying to catch.
The Big Search.gov Index: the Lake
Like a lake with its fish, the common search index has all the content from all the sites we index, ready to be brought up by any number of different search site configurations.
The main difference in the search site setup process is the source of the web results. Like Google and Bing, we collect every site’s web pages into one big, common index when we index content. All search sites using our index draw on this same common data pool.
Sitemaps: the Tributaries
XML Sitemaps are like tributaries feeding into a lake. They do not feed into sitemap-specific indexes connected to particular search sites.
Sitemaps list the content available on websites in a machine-friendly format, so that search engines will know what to collect from the site. The content indexed from your website goes into the big index mentioned above, along with the content from all other websites. You can, in theory, pull content from any website we have indexed into your search experience. This supports portal search experiences.
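
As a rough illustration of what "machine-friendly" means here, the sketch below reads the URLs out of a small sitemap using Python's standard library. The sitemap content and the example.gov URLs are placeholders invented for this example, and the helper name urls_from_sitemap is hypothetical. This is not a description of how Search.gov's indexer is actually implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap content; the URLs are placeholders, not real pages.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.gov/about</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.gov/news/new-page</loc>
  </url>
</urlset>
"""

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the <loc> URL of every <url> entry, in document order."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


for url in urls_from_sitemap(SITEMAP_XML):
    print(url)
```

Each URL listed this way is a candidate for the indexer to visit. Nothing in the sitemap says which search site the page will appear on.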
Search Site Setup: the Fishing Boats
Like the crew of a fishing boat on the water, you’ve decided what fish you’re going after, you know which corners of the lake to go to, and you’ve collected the gear you need to catch them.
Search.gov used to rely on the Bing web index for our main search results. Customers would log in to the Admin Center and use the Domains list to include the content they wanted to pull from Bing. Now that we’re building our index in house, all this remains the same. You log in to the Admin Center and configure what you want your search to return on the results page.
Tying it all together
We use sitemaps to inform what we index into our system. You use the Admin Center to determine what results will come out of the index when people search on your website. Tributaries feed into a lake, and fishers can go out to any part of the lake to get the particular kinds of fish that they want.
Following a particular page through this cycle looks like this:
- A page is posted to a website
- Its URL is added to the sitemap
- Search.gov’s indexer reads the sitemap and picks up the URL
- Search.gov’s indexer visits the page and scrapes the content
- The content is added to the index. By this point, the search site has already been configured to include this content from the index.
- A member of the public searches on the website
- The query matches the page’s content
- The page is returned as a search result
- The searcher clicks on the URL on the results page
- The searcher is brought to the page on the website
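
The cycle above can be sketched as a toy model: one shared index that every page goes into, and a per-search-site list of domains that decides which part of that index a query draws from. Everything here is an assumption made for illustration (the function names, the example domains, and the simple substring matching). It does not reflect Search.gov's real indexing or ranking.

```python
from urllib.parse import urlparse

# One shared index for every site: URL -> page text (the "lake").
common_index = {}


def index_page(url, text):
    """Add a page's content to the shared index."""
    common_index[url] = text


def search(query, domains):
    """Return indexed URLs on the configured domains whose text contains every query term."""
    terms = query.lower().split()
    results = []
    for url, text in common_index.items():
        host = urlparse(url).hostname or ""
        # A page is in scope if its host is a configured domain or a subdomain of one.
        in_scope = any(host == d or host.endswith("." + d) for d in domains)
        if in_scope and all(term in text.lower() for term in terms):
            results.append(url)
    return sorted(results)


# Pages from two different websites land in the same index.
index_page("https://www.example.gov/news/new-page", "Agency announces new grant program")
index_page("https://other.example.org/grants", "Grant deadlines for the new year")

# Hypothetical search site configurations (the "fishing boats").
agency_site_domains = ["example.gov"]
portal_site_domains = ["example.gov", "example.org"]

print(search("grant", agency_site_domains))
print(search("grant", portal_site_domains))
```

The same query returns one page for the agency configuration and two for the portal configuration, even though both read from the same index. The sitemap determined what went in; the search site's configuration determines what comes out.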