Indexing Workflow

This page provides a step-by-step description of the Search.gov indexing process for your website.

Step 1. Define Domains and Subdomains

Who: You, the agency web team, in consultation with the Search.gov team
What: The Admin Center Domains list controls what we pull out of our index for a search on your site, but we also need to know what to put into the index in the first place. We'll work with you to confirm the domains and subdomains you want to be discoverable through search. For example, we may index all of your subdomains, or just a selection of the major sections.

If you have JavaScript-based content on your domains, please let our team know. We will work with you to ensure content on those pages is successfully indexed.

Example Domains:
www.example.gov
data.example.gov
archive.example.gov
www.subagencydomainexample.gov

Step 2. Provide a Sitemap or Feed for Each Subdomain

Who: You, the agency web team, in consultation with the Search.gov team
What: The easiest way for us to discover what URLs exist on your domain is via an XML sitemap. Each domain identified above will need a separate sitemap. Please read our detailed discussion of XML sitemaps, and let us know if you have any questions.
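As an illustration of the one-sitemap-per-domain requirement, here is a minimal sketch in Python that builds a bare-bones XML sitemap for each domain. The function name `build_sitemap` and the page lists are hypothetical, and the output contains only the required `<loc>` element for each URL; a real sitemap would normally be produced by your CMS or site generator.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page lists: one entry per domain confirmed in Step 1.
pages_by_domain = {
    "www.example.gov": [
        "https://www.example.gov/",
        "https://www.example.gov/about",
    ],
    "data.example.gov": ["https://data.example.gov/"],
}

# One separate sitemap document per domain.
sitemaps = {domain: build_sitemap(urls) for domain, urls in pages_by_domain.items()}
```

Each resulting string would then be published on its own domain, for example at a path such as `/sitemap.xml` (an assumed location, not one this page specifies).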

We also support valid RSS 2.0 and Atom 1.0 feeds for URL discovery.

We do not crawl websites by default due to the high resource demand of crawling every page on every website all the time. One of the goals of our service is to contain the costs of search government-wide, and a crawling-first model would increase costs significantly.

If you publish your site on Cloud.gov Pages, read these additional instructions.

Step 3. Index Subdomains

Who: The Search.gov team
What: Once sitemaps and/or feeds are posted to your website, our system will be able to index your content. Alert us when they are posted, and we'll add your domains to the list of domains that we monitor. Then, indexing will begin.

By default, we make 1 request per second to a domain. If a `Crawl-delay` is declared in your /robots.txt file, we will honor that delay while fetching your content for indexing. The approximate time required to index a site is `(number of items) x (crawl delay in seconds) / 3600 = hours to index`. For example, 36,000 items at the default delay of 1 second take about 10 hours.
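To estimate indexing time for your own site, the formula can be sketched in Python as follows. This is a rough illustration, not Search.gov's implementation: the function names are hypothetical, the `*` user agent is an assumption (the indexer's actual user agent is not stated here), and the standard library's `urllib.robotparser` is used only to read a `Crawl-delay` value from robots.txt text.

```python
from urllib.robotparser import RobotFileParser

# Default stated above: 1 request per second, i.e. a 1-second delay.
DEFAULT_DELAY_SECONDS = 1.0


def crawl_delay_from_robots(robots_txt, user_agent="*"):
    """Return the Crawl-delay (in seconds) that applies to user_agent,
    falling back to the default when none is declared."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else DEFAULT_DELAY_SECONDS


def hours_to_index(item_count, delay_seconds):
    """(number of items) x (crawl delay) / 3600 = hours to index."""
    return item_count * delay_seconds / 3600


# Hypothetical robots.txt content declaring a 2-second delay.
robots_txt = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""

delay = crawl_delay_from_robots(robots_txt)
print(hours_to_index(18000, delay))  # 10.0
```

Doubling the crawl delay doubles the estimate, so a large site with a long `Crawl-delay` may take days rather than hours to index.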

If you use a firewall service, it may block our indexer. We can provide our IP addresses for you to add to your firewall's allowlist.

Please note, we can only index domains that are publicly accessible. This means that if you have a password-protected staging environment, we will not be able to index it for you as part of your testing process.

Step 4. Review Index

Who: You, the agency web team
What: You will be able to test the index using your regular search site(s).

Step 5. Launch

Who: You, the agency web team, in collaboration with Search.gov
What: Your index is ready to go. You can proceed with the rest of the site launch steps and go live without any further action from our team.