Warclight: A Rails Engine for Web Archive Discovery

Abstract

This paper describes the development of Warclight, a portmanteau of the open-source Blacklight platform and the ISO-standard Web ARChive file format. Warclight allows users to explore web archives that have been indexed into Apache Solr using the UK Web Archive’s Web Archive Discovery tool. Referencing previous work, we explain how the standard search engine results page is inadequate to support scholarly inquiries. Instead, Warclight provides full-text and faceted search, as well as faceted browsing, to enable exploration and discovery. Given the large sizes of many web archives, we share experiences with deploying our tool at scale using a federated architecture.

Publication
Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries
Nick Ruest
Associate Librarian