
Experiments with Handwritten Text Recognition and Late Medieval Notarial Records: The MemoBo project and the Bolognese Memoriali (1265-1452)

Date
Tue February 25th 2025, 12:00 - 1:15pm
Event Sponsor
Center for Spatial and Textual Analysis (CESTA)
Location
Building 160, Wallenberg Hall
450 Jane Stanford Way, Building 160, Stanford, CA 94305
Room 433A

Join us for the next seminar in the workshop series "Global Approaches to Multilingual Digital Humanities and Data Practices": "Experiments with Handwritten Text Recognition and Late Medieval Notarial Records: The MemoBo project and the Bolognese Memoriali (1265-1452)" by Dr. Edward Loss of the Memoriali Project (MemoBo). This presentation examines a two-year experiment to develop tailored Handwritten Text Recognition (HTR) models for the Memoriali, a vast collection of notarial records produced in Bologna between the 13th and 15th centuries. With more than 3 million records, the series presents unique challenges for HTR workflows, from creating effective Ground Truth to preparing layouts and utilizing digitized microfilm. The talk will explore the potential and limitations of AI-based tools like Transkribus for studying late medieval Latin documentation at scale. RSVP for lunch or to receive the Zoom link here.

This workshop is part of the series "Global Approaches to Multilingual Data Practices and Digital Humanities", sponsored by the Global Research Workshops of Stanford Global Studies.

Talk Abstract

The paper describes a two-year experiment in creating tailored Handwritten Text Recognition (HTR) models for the Memoriali, a series of notarial records produced in Bologna almost without interruption between the second half of the 13th and the first half of the 15th century, using the AI-based software Transkribus. The experiment was carried out as part of the third phase of the Memoriali Project (MemoBo), and the goal of this presentation is to highlight the potential and the limits of this sort of tool for the study of large collections of late medieval Latin documentation. The Memoriali series contains more than 3 million records, an enormous volume of manuscripts that demanded careful reflection on crucial elements of the regular workflow of HTR software such as Transkribus. The elements discussed range from different methods of data input for the creation of effective Ground Truth to proper layout preparation, with some hints on how digitised microfilm could also prove useful for this sort of experimentation.

About the Speaker

Edward Loss is currently a postdoctoral researcher in Medieval History and Digital Humanities at the University of Genoa, working for the ERC Consolidator project "PatriFem. Charting Female property and patrimonial rights in law and practice across Western Europe (12th-16th centuries)" (P.I. Denise Bezzina). He holds a Ph.D. in Medieval History from the University of Bologna (2019) and has held several fellowships and postdoctoral positions at the Istituto Italiano per gli Studi Storici (2019-2021), I TATTI - The Harvard University Center for Italian Renaissance Studies (Jean-François Malle Fellow, 2021-2022), the Deutsches Historisches Institut in Rom (DHI-ROM, 2022) and the American Academy in Rome (Franco Zeffirelli Fellow in Italian Medieval Studies, 2023). Before his appointment at the University of Genoa, he was a postdoctoral researcher for the MemoBo project and the PRIN project "ON. Objects in network. The social life of things in the fifteenth century between notarial sources and semantic web" (Università di Bologna, P.I. Tommaso Duranti, 2022-2024). He is the author of Officium Spiarum. Spionaggio e gestione delle informazioni a Bologna (secoli XIII e XIV) (Viella, 2020) and co-editor of Oltre la carità. Donatori, istituzioni e comunità fra medioevo ed età contemporanea (il Mulino, 2021).