Martin Roček
From String to Meaning
Applying Semantic AI to Find Textual Similarities Between Medieval Dreambooks in Bohemia
Thursday, November 6, 2025, at 16.30
Edda auditorium (E-103)

This talk discusses the application of a custom-trained artificial intelligence model to the semantic analysis of medieval Latin texts, using a corpus of Somniale Danielis (Dreambook of Daniel) manuscripts as a case study. Traditional digital philological tools are often limited to lexical or string-level comparisons and therefore fail to identify passages with similar meaning but different wording. To address this “semantic gap,” I developed “Scribtum,” a workspace powered by a custom Sem-BERT model fine-tuned on a Latin corpus. The model generates contextual sentence embeddings, which allow passages to be compared by semantic similarity rather than by shared wording.
The methodology was tested by comparing several medieval manuscripts of the Somniale Danielis. The model performed well, correctly identifying 58 of the 59 identical dream interpretations across the manuscripts, a task that takes significantly longer by traditional manual analysis.
However, the study also highlights the inherent limitations of such models, including biases derived from training data and the difficulty of distinguishing subtle semantic nuances, both of which call for further validation. The conclusion is that while AI cannot replace the critical judgment of a researcher, it serves as a powerful and efficient assistant for the initial large-scale analysis of manuscript corpora, accelerating the identification of textual variants and potential filiations.
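The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the Scribtum implementation: the entry labels, the three-dimensional vectors, and the 0.8 threshold are all made up for illustration, and a real sentence-embedding model would produce vectors with hundreds of dimensions. The sketch only shows the general idea of pairing entries from two manuscripts by the cosine similarity of their embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_matches(embeddings_a, embeddings_b, threshold=0.8):
    """For each entry in A, find the most similar entry in B.

    Returns (label_a, label_b, score) tuples, keeping only pairs whose
    score reaches the threshold (an arbitrary value chosen for this sketch).
    """
    matches = []
    for label_a, vec_a in embeddings_a.items():
        label_b, score = max(
            ((lb, cosine_similarity(vec_a, vb)) for lb, vb in embeddings_b.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            matches.append((label_a, label_b, score))
    return matches


# Hypothetical embeddings standing in for model output. Each key is an
# invented label for one dream interpretation in one manuscript.
manuscript_a = {
    "A1": [0.9, 0.1, 0.0],
    "A2": [0.0, 0.8, 0.6],
    "A3": [0.5, 0.5, -0.7],  # has no close counterpart in manuscript B
}
manuscript_b = {
    "B1": [0.1, 0.7, 0.7],
    "B2": [0.8, 0.2, 0.1],
    "B3": [0.0, 0.0, 1.0],
}

matches = best_matches(manuscript_a, manuscript_b)
for label_a, label_b, score in matches:
    print(f"{label_a} ~ {label_b} (cosine similarity {score:.2f})")
```

Pairs that fall below the threshold are left unmatched, which is where a researcher's judgment would come in: a low score may reflect a genuinely different interpretation or simply a limitation of the embeddings.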
Martin Roček is a researcher affiliated with both the Institute for Medieval Research (IMAFO) at the Austrian Academy of Sciences and the Faculty of Arts at Charles University, where he received his PhD in 2025. His doctoral dissertation, “Unraveling the Contaminated Tradition: Medieval Dreambooks in Bohemical Manuscripts,” laid the groundwork for his current research. He now works in Digital Humanities, investigating how thoughtful design can make digital tools and applications more approachable and effective for academic use.
The talk will be delivered in English and is open to all.