Data Dialogue with Benjamin Charles Germain Lee, University of Washington


The 16 million digitized historic newspaper pages in Chronicling America, a joint initiative of the Library of Congress and the National Endowment for the Humanities (NEH), are a rich resource for a wide audience. Historians, journalists, genealogists, students, and members of the American public regularly explore the collection via keyword search. But how do we navigate its abundant visual content? In this talk, I will present my project, Newspaper Navigator, a collaboration with LC Labs, the National Digital Newspaper Program, and Professor Daniel Weld at the University of Washington. In particular, I will discuss the project's two phases: extracting visual content from all 16 million pages in Chronicling America (resulting in the Newspaper Navigator dataset), and re-imagining how we search over that extracted content using the Newspaper Navigator search application. I will also discuss how this project, including the resulting datasets and search interface, can contribute to research in machine learning, human-computer interaction, and the digital humanities.
