Posts tagged release-notes

April 2019 Release Notes

Improvements

  • We launched updated freshness ranking for customers indexed by Search.gov. The fresher a page is, the more likely it is to be promoted. Previously, we factored in only the date a page was created. Now, we first consider the date a page was last modified, and then the date it was created. This change highlights the importance of ensuring that the metadata on your site’s pages and in your sitemap is accurate. Here is more information about how we rank your search results: https://search.gov/manual/ranking-factors.html
  • We continued to upgrade components of our service, including Elasticsearch, Rails, and Ruby, to improve the overall performance of our indexing system.
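
As a rough illustration of the freshness precedence described above (last-modified date first, created date as the fallback), here is a minimal Python sketch. The function name and its arguments are hypothetical, and this is not Search.gov's actual scoring code; it only shows which date would be consulted.

```python
from datetime import date
from typing import Optional


def freshness_date(last_modified: Optional[date], created: Optional[date]) -> Optional[date]:
    """Pick the date used to judge freshness (hypothetical helper).

    Assumption: the last-modified date wins when present; otherwise
    the created date is used; a page with neither has no freshness date.
    """
    return last_modified or created


# A page created in 2018 but modified in April 2019 counts as fresh as of 2019.
print(freshness_date(date(2019, 4, 1), date(2018, 1, 15)))
# A page with no last-modified date falls back to its created date.
print(freshness_date(None, date(2018, 1, 15)))
```

A page that exposes neither date gives the ranker nothing to work with, which is why accurate metadata matters here.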

Fixes and Chores

  • Fixed a bug that made it difficult to view sidebar items on the search results page when hovering over individual links. Now, the experience when hovering over these links is fully accessible.

Page last reviewed or updated:

March 2019 Release Notes

Improvements

  • We’ve continued to improve our recently launched click-based relevancy ranking for our indexing customers. As always, it’s important to periodically review your site’s Best Bets and keep your sitemaps up to date with the latest <lastmod> date metadata.
  • We continued to upgrade components of our service, including Elasticsearch, Rails, and Ruby, to improve the overall performance of our indexing system.
  • We launched a new Search Site Launch Guide to make it easier to go live with Search.gov:
    • https://search.gov/manual/site-launch-guide.html
  • We released new documentation to make it easier to maintain your indexed content with Search.gov:
    • https://search.gov/manual/indexing-with-searchgov.html
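
To make the <lastmod> advice above concrete, the following Python sketch builds a small XML sitemap in which every URL carries a <lastmod> date. The `build_sitemap` helper and the sample URL are illustrative assumptions; only the element names and namespace come from the public sitemap protocol.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (url, last_modified_date) pairs.

    Hypothetical helper for illustration; a real site would usually
    generate this from its CMS or build system.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # <lastmod> uses the W3C date format, e.g. 2019-03-15.
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([("https://www.sampleagency.gov/reports/", date(2019, 3, 15))]))
```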

Fixes and Chores

  • We mitigated potential security vulnerabilities associated with Ruby gems.
  • We applied high priority Ubuntu patches.

February 2019 Release Notes

Improvements

  • We’ve added Click counts as a ranking factor for customers indexed by Search.gov. We look at the URLs that represent 75% of all clicks on search results, and give those a boost. This is the “fat head” that comes before the “long tail.” As always, following best practices will help results stay relevant:
    • Use Best Bets to promote frequently visited pages that are not bubbling to the top of results on their own
    • Periodically review and update Best Bets
    • Maintain an up-to-date and complete sitemap with updated dates
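
One way to picture the "fat head" described above is the Python sketch below, which walks URLs from most to least clicked until the running total reaches 75% of all clicks. The function name, the input shape, and the handling of the URL that crosses the threshold are assumptions made for illustration, not a description of the production logic.

```python
from collections import Counter


def fat_head(click_counts, share=0.75):
    """Return the most-clicked URLs that together account for `share` of clicks.

    click_counts: mapping of URL -> number of clicks (hypothetical input).
    Assumption: the URL that crosses the threshold is included in the head.
    """
    total = sum(click_counts.values())
    head, running = [], 0
    for url, clicks in Counter(click_counts).most_common():
        if running >= share * total:
            break
        head.append(url)
        running += clicks
    return head


clicks = {"/passports": 60, "/visas": 20, "/contact": 10, "/about": 10}
print(fat_head(clicks))  # ['/passports', '/visas']
```

In this example the first two URLs cover 80 of 100 clicks, so they form the head and the remaining pages make up the long tail.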

Fixes, Upgrades, Misc

  • Fixed an issue with the type-ahead feature on customer search boxes
  • Fixed an issue with disappearing search icon on search boxes
  • Continued with Rails upgrade for our applications

January 2019 Release Notes

Improvements

  • After integrating directly with the new USAJOBS API, we worked on additional tuning of trigger words to avoid false positive job-related searches
  • We have improved search performance by caching repeat queries made to our data store
  • We updated our content parser to accept some non-standard HTML tags, and to ignore any content within <nav> and <footer> elements
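
The parser change above (tolerating non-standard tags while skipping <nav> and <footer> content) can be sketched with Python's standard `html.parser` module. The class and function names are hypothetical and this is not the parser Search.gov uses; it only demonstrates the idea of dropping text nested inside those two elements.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collect page text, ignoring anything inside <nav> or <footer>."""

    SKIPPED = {"nav", "footer"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Unknown (non-standard) tags need no special handling: their
        # text arrives here like any other text.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = "<nav><a href='/'>Home</a></nav><main><p>Annual report</p></main><footer>Contact us</footer>"
print(extract_text(page))  # Annual report
```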

Fixes, Upgrades, Misc

  • We upgraded Ruby on our search-gov repo
  • We increased the processing power on the servers that support our primary web index
  • We reindexed our primary index into more Elasticsearch shards
  • We decreased the cookie timeout for Admin Center sessions
  • We made the failed password reset alert language more ambiguous, so people will no longer be able to tell whether the email address has an account
  • We fixed a bug in our MRSS photo indexer

December 2018 Release Notes

Improvements

  • We integrated with Bing v7 and transitioned our customers to this newer version.
  • We now index content on a domain even if the root of that domain lists a different domain as its canonical domain. For example, https://publications.sampleagency.gov may list https://www.sampleagency.gov as the canonical domain, but still serve content from https://publications.sampleagency.gov/reports/first_report.pdf. We can now index https://publications.sampleagency.gov/reports/first_report.pdf.
  • We updated our job search location feature to show more job openings, and cleaned up how we send job queries to the USAJOBS API to get more results.
  • We now automatically review URLs for reindexing, checking for 404s and 301s. We’re doing this every 30 days to begin with, and will adjust that timeframe as needed.
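
The 30-day review cycle mentioned above might be sketched as follows in Python. The interval comes from the note itself, but the function names and the actions taken for each status code are assumptions for illustration only; the release note does not say what happens after a 404 or 301 is found.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=30)  # starting timeframe from the note above


def due_for_review(last_checked: date, today: date) -> bool:
    """True when a URL was last checked at least 30 days ago."""
    return today - last_checked >= REVIEW_INTERVAL


def review_action(status_code: int) -> str:
    """Map an HTTP status to a follow-up action (assumed, not documented)."""
    if status_code == 404:
        return "remove"           # assumption: drop missing pages from the index
    if status_code == 301:
        return "follow_redirect"  # assumption: index the redirect target instead
    return "keep"


print(due_for_review(date(2018, 11, 1), date(2018, 12, 1)))  # True
print(review_action(404))  # remove
```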

Fixes

  • We upgraded the Ruby version on the repo for our search.gov website and on asis, our image indexing repo.
  • We upgraded the activejob Ruby gem across repos.

October 2018 Release Notes

Improvements

  • We made several updates to our Chef cookbooks to further harden our operating system, including backend password policies, package configuration, and OS configuration.
  • We shifted our model for supporting domain masks for hosted search results pages to leverage CAA records.

Fixes

  • We fixed a gnarly bug in Elasticsearch that made queries containing very common words, like “the”, behave as if there were no results.

November 2018 Release Notes

Improvements

  • We integrated directly with the new USAJOBS API. This means that we
    • now query their system at query time, rather than building an index of their job postings within our own system and querying that.
    • have reconfigured what information our jobs searches include in the full query that we send to USAJOBS.
    • have increased the geographic radius we’ll look at when a user searches on a jobs-related term. The radius is now 75 miles from the user’s general location.
    • now always provide a link to USAJOBS.gov if someone has searched for a jobs-related term, even if there are no jobs located near the searcher.
  • We now support indexing TXT files. There are more TXT files on government websites than you would have thought!

Fixes

  • We fixed a link in the Jobs module that led searchers to a broken USAJOBS.gov page.
  • We now deduplicate sitemap URLs so we will not try to index the same content more than once.
  • We updated Ruby gems: Loofah, Rack, FFI.
  • We upgraded Ruby.
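
The sitemap deduplication fix listed above can be illustrated with a short Python sketch. The helper name is hypothetical, and treating two URLs as duplicates only when they match exactly (after trimming whitespace) is an assumption; the note does not describe any URL normalization.

```python
def dedupe_urls(urls):
    """Return URLs in their original order with exact duplicates removed."""
    seen = set()
    unique = []
    for url in urls:
        key = url.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


sitemap_urls = [
    "https://www.sampleagency.gov/a",
    "https://www.sampleagency.gov/b",
    "https://www.sampleagency.gov/a",
]
print(dedupe_urls(sitemap_urls))
```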

September 2018 Release Notes

Highlights

  • We began work on using click data in our relevancy ranking, starting with
    • Recording the domain of the clicked-on URL separately, so we can manage all the clicks for a particular domain.
    • Calculating the top N clicked-on URLs for a given domain.
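
The two steps above (recording the domain of each clicked URL, then finding the top N clicked URLs per domain) could look roughly like this in Python. The function name and the flat list of clicked URLs are assumptions for illustration; the real click data model is not described here.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def top_clicked(clicked_urls, n):
    """Group clicks by domain and return each domain's n most-clicked URLs."""
    by_domain = defaultdict(Counter)
    for url in clicked_urls:
        # Record the domain separately so clicks can be managed per domain.
        domain = urlsplit(url).hostname or ""
        by_domain[domain][url] += 1
    return {
        domain: [url for url, _ in counts.most_common(n)]
        for domain, counts in by_domain.items()
    }


clicks = [
    "https://www.sampleagency.gov/a",
    "https://www.sampleagency.gov/a",
    "https://www.sampleagency.gov/b",
    "https://publications.sampleagency.gov/x",
]
print(top_clicked(clicks, 1))
```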

Chores

  • We indexed a lot of content for agencies.
  • We got our new developers set up and ready to work on great stuff.

Bug Fixes

  • None

August 2018 Release Notes

Highlights

  • We began work on leveraging click data in our relevancy scoring. This will allow us to use the relative popularity of pages as a ranking signal.

Chores

  • We now record the domain of a URL that has been clicked in addition to recording the full click. This way we can compare the click volume of URLs within a given domain.
  • We resolved security vulnerabilities in the grape and sprockets gems.
  • We configured RSpec to run specs in random order.

Bug Fixes

  • None

June-July 2018 Release Notes

Highlights

  • We added support for XML sitemaps that are located in non-standard locations within a domain.
  • We added sort_by support to our Results API.

Chores

  • We finished migrating to CircleCI for our continuous integration monitoring.
  • We improved our internal tracking of queries to the Bing API.
  • We improved how we handle indexing domains that time out.
  • We began indexing the last-modified date of a page, if provided.
  • Our SitemapIndexer now processes one sitemap at a time, and we created an automated queue for indexing jobs and URL fetching.
  • We improved the management of Searchgov domain states. Now each Searchgov domain has an “indexing activity”. States might include: indexing sitemaps, fetching new URLs (such as after bulk import), and crawling.
  • We now follow client-side redirects.
  • We improved our ability to avoid certain crawler traps.
  • We now index documents up to 15 MB in size. The previous limit was 10 MB.
  • We finalized our compliance with BOD 18-01.
  • We cleaned up how we handle temp files during indexing.
  • We tidied up our internal errors on indexing jobs, as well as our test suite.

Bug Fixes

  • We fixed a bug that was not showing diacritics properly in non-English searches.
