We added support for XML sitemaps that are located in non-standard locations within a domain.
We added sort_by support to our Results API.
We finished migrating to CircleCI for our continuous integration monitoring.
We improved our internal tracking of queries to the Bing API.
We improved how we handle indexing domains that time out.
We began indexing the last-modified date of a page, when the page provides one.
Our SitemapIndexer now processes one sitemap at a time, and we created an automated queue for indexing jobs and URL fetching.
We improved the management of Searchgov domain states. Each Searchgov domain now has an “indexing activity”, such as indexing sitemaps, fetching new URLs (for example, after a bulk import), or crawling.
We now follow client-side redirects.
We improved our ability to avoid certain crawler traps.
We now index documents up to 15 MB in size. The previous limit was 10 MB.
We finalized our compliance with BOD 18-01.
We cleaned up how we handle temp files during indexing.
We tidied up our internal errors on indexing jobs, as well as our test suite.
We fixed a bug that prevented diacritics from displaying properly in non-English searches.
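
To illustrate what following a client-side redirect can involve, here is a minimal Python sketch that looks for a `<meta http-equiv="refresh">` tag in a fetched page and resolves its target against the page URL. It is an illustration only and not the production crawler code: the names `MetaRefreshParser` and `find_client_side_redirect` and the example URL are hypothetical, and redirects triggered by JavaScript are not covered.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Hypothetical helper: records the target of the first meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # Only the first meta refresh tag is considered.
        if tag != "meta" or self.target is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute typically looks like "0; url=/new-page".
        match = re.search(
            r"url\s*=\s*['\"]?([^'\";]+)", attr.get("content", ""), re.IGNORECASE
        )
        if match:
            self.target = match.group(1).strip()


def find_client_side_redirect(html, base_url):
    """Return the absolute redirect URL, or None if the page has no meta refresh."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    # Relative targets are resolved against the URL of the page that was fetched.
    return urljoin(base_url, parser.target)


page = '<html><head><meta http-equiv="refresh" content="0; url=/new-page"></head></html>'
print(find_client_side_redirect(page, "https://www.example.gov/old"))
```

A crawler using this approach would fetch the returned URL in place of the original page, ideally with a cap on the number of hops so that redirect loops cannot stall an indexing job.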