Making history accessible online

Tuesday 01/03/2023

In academia, accessibility plays a large part in understanding history and seeing the full picture. Without open access to resources and historical documents from around the globe, a great deal of important information remains unknown. A focus on Eurocentric history has too often left the stories of the rest of the world in the dark. Bringing these stories into the light allows for discoveries in family histories, biographical narratives, and more in-depth analysis by scholars.

Dr. Will Hanley, an associate professor in the Department of History at Florida State University, is researching how to make historical materials more accessible. In one of his ongoing projects, undergraduates in a course titled ‘Digital Microhistory Lab’ are helping him digitize The Egyptian Gazette, a daily newspaper established in 1880 and based in Alexandria, Egypt. The newspaper is significant because it provided day-to-day financial records, which was uncommon for that area at the time. By digitizing the newspaper and making it available for free online, Dr. Hanley and his students are contributing to a more inclusive record of human history and expanding its coverage.

Dr. Hanley and his students have digitized over 900 issues of The Egyptian Gazette from 1905 to 1908. The FSU Libraries originally purchased about 15 years' worth of the newspaper on microfilm from the British Library with funds from the Bradley Grant. Getting from microfilm to a digitized document takes two steps. First, usable images must be created from the microfilm. Then, optical character recognition is used to produce machine-readable text.

Because the Gazette appeared daily over a long span of time, digitizing such an expansive document requires a great deal of energy and resources. To accomplish this, Dr. Hanley is collaborating with the Research Computing Center (RCC) at Florida State University. Dr. Hanley and his students encode their files in Extensible Markup Language (XML) and publish them in a GitHub repository. The RCC provides Dr. Hanley with an instance of eXist-db, a database that reads and serves XML files, giving the public a user-friendly way to search the newspaper using faceted searches. Now, anyone can view and search The Egyptian Gazette with ease, allowing researchers and curious citizens to uncover historical information that would otherwise remain unknown.
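
For readers curious about what a faceted search over XML involves, here is a minimal sketch in Python. It is an illustration only: the element and attribute names (issue, article, section, head) and the sample text are invented for this example and are not the project's actual encoding, and eXist-db itself is queried in a different way. The idea is simply that each article carries labeled fields, and a search can count and filter on those fields.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample issue: the tag names, attributes, and text below are
# invented for illustration and do not reflect the project's real files.
SAMPLE_ISSUE = """
<issue date="1905-01-03">
  <article section="finance"><head>Cotton market</head><p>Prices rose in Alexandria.</p></article>
  <article section="shipping"><head>Arrivals</head><p>Two steamers reached Alexandria.</p></article>
  <article section="finance"><head>Share list</head><p>Bank shares were steady.</p></article>
</issue>
"""


def load_articles(xml_text):
    """Turn one XML issue into a list of plain dictionaries, one per article."""
    root = ET.fromstring(xml_text)
    date = root.get("date")
    return [
        {
            "date": date,
            "section": art.get("section"),
            "head": art.findtext("head", default=""),
            "text": " ".join(p.text or "" for p in art.findall("p")),
        }
        for art in root.findall("article")
    ]


def facet_counts(articles, field):
    """Count how many articles fall under each value of a facet field."""
    return Counter(a[field] for a in articles)


def search(articles, keyword, **facets):
    """Keep articles whose text contains the keyword and match every facet."""
    keyword = keyword.lower()
    return [
        a
        for a in articles
        if keyword in a["text"].lower()
        and all(a[k] == v for k, v in facets.items())
    ]


articles = load_articles(SAMPLE_ISSUE)
print(facet_counts(articles, "section"))
print([a["head"] for a in search(articles, "alexandria", section="finance")])
```

In this sketch, the facet counts show how many articles sit in each section, and the search narrows a keyword match to a single section, which is the kind of narrowing a faceted search offers to a reader browsing the newspaper.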

"I’m really pleased that the RCC is seeking to expand its engagement with scholars in the humanities," Hanley states. "The RCC’s new initiative (Interdisciplinary Data Humanities Initiative - IDHI) to try to reach out to scholars in the humanities is much appreciated. What RCC is offering now is precisely the sort of thing that scholars newly embarking on this undertaking will find useful in trying to advance their research."