What We Index From Your Website


When we think about indexing pages for search, we usually think about indexing the primary content of the page. But if the page isn’t structured to tell the search engine where that content is to be found (for example, with a <main> element), we collect the contents of the <body> tag and then filter out the <nav> and <footer> elements, if present. If none of <main>, <nav>, or <footer> is present, we collect the full contents of the <body> tag. Learn more in our post about aiming search engines at the content you really want to be searchable, using the <main> element.
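The selection rule described above can be sketched roughly as follows. This is an illustration only, not our actual indexing code: it assumes that a <main> element, when present, is used as the content source, and it also skips <script> and <style> text, which is an extra assumption. The names select_indexable_text and _TextCollector are hypothetical.

```python
from html.parser import HTMLParser

# Elements whose nesting depth we track while walking the page.
TRACKED = ("body", "main", "nav", "footer", "script", "style")


class _TextCollector(HTMLParser):
    """Hypothetical helper: gathers text for <main> and for a filtered <body>."""

    def __init__(self):
        super().__init__()
        self.depth = {tag: 0 for tag in TRACKED}
        self.saw_main = False
        self.main_text = []  # text found inside <main>
        self.body_text = []  # text inside <body>, outside <nav>/<footer>

    def handle_starttag(self, tag, attrs):
        if tag in self.depth:
            self.depth[tag] += 1
        if tag == "main":
            self.saw_main = True

    def handle_endtag(self, tag):
        if self.depth.get(tag, 0) > 0:
            self.depth[tag] -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text:
            return
        # Assumption: script and style text is never treated as page content.
        if self.depth["script"] or self.depth["style"]:
            return
        if self.depth["main"]:
            self.main_text.append(text)
        if self.depth["body"] and not (self.depth["nav"] or self.depth["footer"]):
            self.body_text.append(text)


def select_indexable_text(html: str) -> str:
    """Return <main> text if the page has one, else <body> minus <nav>/<footer>."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    chosen = collector.main_text if collector.saw_main else collector.body_text
    return " ".join(chosen)
```

On a page with no <main>, <nav>, or <footer>, the function falls through to the full text of <body>, matching the last case described above.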


You can read more detail on each of the following elements here.

Standard metadata elements

  • title
  • meta description
  • meta keywords
  • locale or language (from the opening <html> tag)
  • url
  • lastmod (collected from XML sitemaps)
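
To check which lastmod values your own XML sitemap exposes, a minimal sketch like the one below may help. It assumes a standard sitemap in the sitemaps.org 0.9 namespace; the function name read_lastmod is hypothetical and is not part of any tool we provide.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Namespace used by standard XML sitemaps (sitemaps.org protocol 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_lastmod(sitemap_xml: str) -> Dict[str, Optional[str]]:
    """Map each <loc> URL in a sitemap to its <lastmod> value, or None if absent."""
    root = ET.fromstring(sitemap_xml)
    result: Dict[str, Optional[str]] = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:
            result[loc] = lastmod.strip() if lastmod else None
    return result
```

URLs that map to None have no lastmod entry in the sitemap.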

Open Graph protocol elements

  • og:description
  • og:title
  • article:published_time
  • article:modified_time
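
As a way to preview the page-level fields from the two lists above, the following sketch pulls them out of an HTML document using only the Python standard library. It is an approximation for checking your own pages, not a description of how we parse them; extract_metadata and MetadataParser are hypothetical names. The url and lastmod fields are not included because they do not come from the page markup itself.

```python
from html.parser import HTMLParser

# <meta> names and properties listed in this article.
WANTED = {
    "description", "keywords",
    "og:description", "og:title",
    "article:published_time", "article:modified_time",
}


class MetadataParser(HTMLParser):
    """Hypothetical helper that records title, language, and selected <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html" and attributes.get("lang"):
            self.meta["language"] = attributes["lang"]
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Standard tags use name="...", Open Graph tags use property="...".
            key = (attributes.get("name") or attributes.get("property") or "").lower()
            content = attributes.get("content")
            if key in WANTED and content is not None:
                self.meta[key] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data


def extract_metadata(html: str) -> dict:
    """Return a dict of the metadata fields found in the given HTML."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    if "title" in parser.meta:
        parser.meta["title"] = parser.meta["title"].strip()
    return parser.meta
```

Fields missing from the returned dict are simply absent from the page, which is a quick way to spot pages that lack a description or Open Graph tags.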

File formats

In addition to HTML pages with their various file extensions, we index the following file types:

  • PDFs
  • Word docs
  • Excel docs
  • TXT
  • Images can be indexed either through our Flickr integration or by sending us an MRSS feed. Note that images are not indexed during web page indexing, so you’ll need to use one of these two methods.

Please note that, like most search engines, we cannot index JavaScript content at this time. We recommend that your team add well-crafted, unique description text for each of your pages, or perhaps auto-generate description tag text from the first few lines of the article text. However the text is added, it should include the keywords you want the page to respond to in search, framed in plain language. This will give us, and other search engines, something to work with when we’re matching and ranking results. See our discussion of description metadata for more information.
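
If you choose to auto-generate description text from the opening of the article, one possible approach is sketched below. The 160-character limit is an assumption based on common practice, not a requirement of ours, and the function names make_description and description_meta_tag are hypothetical.

```python
import html
import re


def make_description(article_text: str, max_chars: int = 160) -> str:
    """Build description text from the start of an article.

    Whitespace is collapsed, and longer text is cut at a word boundary
    so the result, including a trailing "...", fits within max_chars.
    """
    text = re.sub(r"\s+", " ", article_text).strip()
    if len(text) <= max_chars:
        return text
    # Leave room for the ellipsis, then drop any partial trailing word.
    cut = text[: max_chars - 3].rsplit(" ", 1)[0]
    return cut.rstrip(",;:") + "..."


def description_meta_tag(article_text: str) -> str:
    """Render the description as an HTML meta tag with escaped attribute text."""
    content = html.escape(make_description(article_text), quote=True)
    return f'<meta name="description" content="{content}">'
```

Auto-generated text is only a fallback. A hand-written description that uses the plain-language keywords you want the page to match will usually serve searchers better.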