As we start 2012, we want to highlight two major, new features.
- Discovery Tag: Our on-demand indexing engine allows you to automatically add content to your search index. You may make use of our on-demand indexing by placing our Discovery Tag (available on the Get Code page in the Admin Center) on your web pages.
- Themes for results pages: Our new and improved theme-based results pages help you create a customized look and feel “out of the box.” You may view the themes by adding a new site in our admin center. We’re also in the process of migrating all customers into this new user interface.
Below are the details for January and, as a reminder, we divide our work into three categories.
- Features: Things you actually notice.
- Chores: Back-end improvements that you don’t notice.
- Fixes: Fixes to any code issues that may arise.
Features for Searchers
- Searchers get better PDF result coverage in results.
- Searchers only see indexed documents that belong to the site’s domain collection.
- Searchers see MedlinePlus (health topics) and Enhanced Agency GovBoxes, when customers opt to display them via the admin center.
- Searchers see nicer paging.
- Searchers don’t see Indexed Document GovBox.
- Searchers can see all Indexed Document results.
- Searchers on template-based results pages see a nice, matching shaded header with rounded corners.
- Mobile searchers see colors from selected theme.
- Searchers see square fade on related searches and other modules.
Features for Agency Customers
- Customers see new Domains page.
- Customers have an easy way to pass multiple parameters.
- By default, customers have a useful header on results pages during signup.
- Customers may point at a dynamic sitemap generator for hosted sitemaps.
- Customers may set up sites with keyword filtered results.
- Customers see an updated left nav in the admin center.
- Migrated USA.gov’s “enhanced results” Agency and MedlinePlus to GovBoxes.
- Refactored search classes.
- Removed website URL from the Add New Site wizard.
- Shortened unique index for type-ahead suggestions.
- Upgraded Solr from 1.4 to 3.5.
- Changed timestamp indexing for Solr to use a trie on NewsItem model.
- Modified stopword analysis to include Solr CommonGrams filtering.
- Fixed Spanish translations for indexed documents.
- Fixed display error when search results contain Bing matches and indexed documents.
- Fixed error received by customers when migrating to themes.
- Fixed RSS feed readers to handle 503 errors more gracefully.
- Disabled ActiveRecord observers during database migrations.
- Removed rule to allow customers to add top level domains (e.g., .gov and .mil) as site domains.
- Fixed parsing of PDF files.
- Fixed tests around Bing result counts.
- Preserved newlines on virtual header/footer/css fields when updated via admin center.
- Fixed doctype for nested PDF documents.
- Fixed various problems around discovery of embedded PDFs.
- Created error page for searchers when the search per-page query parameter is set to blank.
- Fixed issue where hosted sitemaps still existed for domains with no indexed documents.
- Weeded out indexed documents for anything that doesn’t work with URI.parse() method.
- Handled race condition where duplicate id/content hash pair gets past Rails but is caught in MySQL.
- Handled race condition where an IndexedDocument can’t be updated because another one was created.
- Removed assumption that pages are PDF because the URL ends in “.pdf.”
- Selected “Everything” on results page by default.
- Removed body field from the CSV download of IndexedDocument URLs
- Fixed infinite loop in PDF discovery and indexing.
- Verified IndexedDocument URL length is <= 2000 rather than accepting it and silently truncating it.
- Used document body for documents with a blank meta description tag.