Please join us for a UW Data Science Seminar featuring a research team from the Humanities Data Science Summer Institute on Tuesday, November 25th from 4:30 to 5:20 p.m. PT. The seminar will be held in IEB G109.
“Extracting Bibliographic Data from Historical Publishing Catalogues”
Abstract: Book data can reveal large-scale trends in textual production, sales, and readership, but data from within the publishing industry remains hard to come by, especially in historical contexts. The English Catalogue of Books provides a yearly record of books issued in England and Ireland from the mid-19th through the mid-20th centuries, a period when London was home to the largest English-language publishing industry in the world. The ECB thus provides invaluable, aggregate information on a century of Anglophone textual production, but until recently it has only been available in printed books or digital facsimiles, making it difficult to draw large-scale conclusions from the information it contains. We discuss our process of extracting bibliographic data from the catalogues for over 99,000 titles published between 1912 and 1922, and we consider the top titles, authors, and genres being issued during this period.
Speaker Bios:
Anna Preus is an Assistant Professor in the English department at UW, focusing on 20th-century literature in English and data science in the humanities. She leads UW’s Humanities Data Lab, co-leads the Humanities Data Science Summer Institute, and serves as core faculty in Textual and Digital Studies.
Siddharth Bhogra is a PhD student in the English department at UW. He is a computational humanist interested in tracking how literature registers, conspires with, and is disloyal to empire, broadly construed, with a specific focus on form and infrastructure in 20th-century South Asian literature.
