What Search.gov Indexes From Your Website
When we think about indexing pages for search, we usually think about indexing the primary content of the page. But if the page doesn’t include a <main> element to tell the search engine where that content is found, we collect the contents of the <body> tag and then filter out any <footer> elements. If no <footer> elements are present, we collect the full contents of the <body> tag. Learn more in our post about aiming search engines at the content you really want to be searchable, using the <main> element.
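The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Search.gov's actual indexer: the class name `ContentSelector` and the function `select_indexable_text` are hypothetical, and skipping <script> and <style> text is an added assumption to keep the output readable.

```python
from html.parser import HTMLParser


class ContentSelector(HTMLParser):
    """Hypothetical helper: gathers text from <main>, and separately
    from <body> with <footer> content filtered out."""

    def __init__(self):
        super().__init__()
        self.depth = {"main": 0, "body": 0, "footer": 0, "script": 0, "style": 0}
        self.saw_main = False
        self.main_text = []
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.depth:
            self.depth[tag] += 1
        if tag == "main":
            self.saw_main = True

    def handle_endtag(self, tag):
        if tag in self.depth and self.depth[tag] > 0:
            self.depth[tag] -= 1

    def handle_data(self, data):
        text = data.strip()
        # Assumption: script and style text is never useful search content.
        if not text or self.depth["script"] or self.depth["style"]:
            return
        if self.depth["main"]:
            self.main_text.append(text)
        if self.depth["body"] and not self.depth["footer"]:
            self.body_text.append(text)


def select_indexable_text(html):
    """Return <main> text if the page has a <main> element; otherwise
    return <body> text with any <footer> content removed."""
    parser = ContentSelector()
    parser.feed(html)
    parser.close()
    parts = parser.main_text if parser.saw_main else parser.body_text
    return " ".join(parts)
```

For example, calling `select_indexable_text` on a page with a <main> element returns only the text inside it, while a page without one yields everything in <body> except the footer text.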
You can read more detail on each of the following elements here.
Standard metadata elements
- meta description
- meta keywords
- locale or language (from the opening <html> tag)
- dates (from <lastmod> in XML sitemaps, <pubDate> in RSS feed sitemaps, and other locations)
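To check what a page and sitemap expose for the standard metadata listed above, a rough sketch like the following may help. It is not Search.gov's implementation: `MetadataParser` and `sitemap_lastmod` are hypothetical names, and reading the language from the `lang` attribute of the opening <html> tag is an assumption about where that value comes from. The sitemap namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by sitemaps.org XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


class MetadataParser(HTMLParser):
    """Hypothetical helper: collects meta description, meta keywords,
    and the page language."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html" and attributes.get("lang"):
            # Assumption: language is declared as <html lang="...">.
            self.metadata["language"] = attributes["lang"]
        elif tag == "meta":
            name = (attributes.get("name") or "").lower()
            content = attributes.get("content")
            if name in ("description", "keywords") and content is not None:
                self.metadata[name] = content


def page_metadata(html):
    """Return a dict of the standard metadata found in an HTML page."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return parser.metadata


def sitemap_lastmod(sitemap_xml):
    """Map each <loc> URL in an XML sitemap to its <lastmod> value."""
    root = ET.fromstring(sitemap_xml)
    dates = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if loc and lastmod:
            dates[loc.strip()] = lastmod.strip()
    return dates
```

Running `page_metadata` on your own pages is a quick way to spot a missing description or language declaration before the page is indexed.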
Open Graph protocol elements
In addition to HTML pages with their various file extensions, Search.gov indexes the following file types:
- Word docs
- Excel docs
- Images can be indexed either through our Flickr integration or by sending us an MRSS feed. Note that images are not indexed during web page indexing, so you’ll need to use one of these two methods.