Computational Humanities Research: Volume 1 - 2025

Metronome: tracing variation in poetic meters via local sequence alignment
Ben Nagy, Artjoms Šeļa, Mirella De Sisto, Petr Plecháč
Published online by Cambridge University Press: 25 June 2025, e1
All poetic forms come from somewhere. Prosodic templates can be copied for generations, altered by individuals, imported from foreign traditions, or fundamentally changed under the pressures of language evolution. Yet these relationships are notoriously difficult to trace across languages and periods. This article introduces an unsupervised method for detecting structural similarities in poems using local sequence alignment. The method encodes poetic texts as strings of prosodic features over a four-letter alphabet; these sequences are then aligned to derive a distance measure based on weighted symbol (mis)matches. Local alignment allows poems to be clustered according to emergent properties of their underlying prosodic patterns. We evaluate the method's performance on a meter recognition task against strong baselines and show its potential for cross-lingual and historical research in three short case studies: (1) mutations in quantitative meter in classical Latin, (2) the European diffusion of the Renaissance hendecasyllable and (3) comparative alignment of modern accentual-syllabic meters in 18th- and 19th-century Czech, German and Russian. We release an implementation of the algorithm as a Python package with an open license.
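To make the idea of an alignment-based distance concrete, the following is a minimal sketch of Smith-Waterman local alignment over prosodic strings. It is not the authors' released package: the function names, the scoring weights, the normalisation into a distance, and the two-symbol encoding (`w` for a weak position, `S` for a strong one) are all assumptions chosen for illustration, and the article's actual four-letter alphabet is not reproduced here.

```python
def local_alignment_score(a: str, b: str, match: float = 2.0,
                          mismatch: float = -1.0, gap: float = -2.0) -> float:
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The weights are illustrative assumptions, not values from the article.
    """
    prev = [0.0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0.0
    for ca in a:
        curr = [0.0]              # column 0 is always 0 in local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap            # gap in b
            left = curr[j - 1] + gap      # gap in a
            cell = max(0.0, diag, up, left)  # the 0 floor makes it local
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def alignment_distance(a: str, b: str, **weights: float) -> float:
    """Map the local score to [0, 1] (hypothetical normalisation).

    The score is divided by the smaller of the two self-alignment scores,
    so a sequence fully contained in the other has distance 0.
    Assumes a positive match weight.
    """
    if not a or not b:
        return 1.0
    ceiling = min(local_alignment_score(a, a, **weights),
                  local_alignment_score(b, b, **weights))
    if ceiling <= 0:
        raise ValueError("match weight must be positive")
    return 1.0 - local_alignment_score(a, b, **weights) / ceiling


# Hypothetical encodings of three metrical lines.
iambic = "wSwSwSwSwS"
trochaic = "SwSwSwSw"
dactylic = "SwwSwwSwwSww"

print(alignment_distance(iambic, trochaic))   # shared alternating core
print(alignment_distance(iambic, dactylic))   # binary vs ternary rhythm
```

Because the alignment is local, the trochaic line is found intact inside the iambic one once the initial weak syllable is skipped, which is the kind of structural kinship a global alignment would penalise. A matrix of such pairwise distances could then be passed to any standard clustering routine.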

It takes a village to write a book: Mapping anonymous contributions in Stephen Langton’s Quaestiones Theologiae
Part of:
- CHR Missing Data in the Humanities
Jan Maliszewski
Published online by Cambridge University Press: 19 June 2025, e2
Although indirect evidence suggests that literary production based on records of oral teaching (so-called reportationes) was not uncommon as early as the early scholastic period, very few sources comment on the practice. This article details the design of a study that applies stylometric techniques of authorship attribution to a collection developed from reportationes, Stephen Langton’s Quaestiones Theologiae, with the aim of uncovering layers of editorial work and thereby validating some hypotheses about the collection’s formation. Following Camps, Clérice, and Pinche (2021), I discuss the implementation of an HTR pipeline and a stylometric analysis based on the most frequent words, POS tags, and pseudo-affixes. The proposed study will offer two methodological gains relevant to computational research on the scholastic tradition: it will directly compare performance on manually composed and automatically extracted data, and it will test the validity of transformer-based OCR and automated transcription alignment for workflows applied to scholastic Latin corpora. If successful, this study will provide an easily reusable template for the exploratory analysis of collaborative literary production stemming from medieval universities.

Looking for the inner music: Probing LLMs’ understanding of literary style
Part of:
- CHR Expanding the Toolkit: Large Language Models in Humanities Research
Rebecca M. M. Hicke, David Mimno
Published online by Cambridge University Press: 19 June 2025, e3
Language models have the ability to identify the characteristics of much shorter literary passages than was thought feasible with traditional stylometry. We evaluate authorship and genre detection for a new corpus of literary novels. We find that a range of LLMs are able to distinguish authorship and genre, but that different models do so in different ways. Some models rely more on memorization, while others make greater use of author or genre characteristics learned during fine-tuning. We additionally use three methods – direct syntactic ablation of input text and two means of studying internal model values – to probe one high-performing LLM for features that characterize styles. We find that authorial style is easier to characterize than genre-level style and is more impacted by minor syntactic decisions and contextual word usage. However, some traits like pronoun usage and word order prove significant for defining both kinds of literary style.

From dictionaries to LLMs – an evaluation of sentiment analysis techniques for German language data
Part of:
- CHR Expanding the Toolkit: Large Language Models in Humanities Research
Jannis Klähn, Janos Borst-Graetz, Manuel Burghardt
Published online by Cambridge University Press: 03 July 2025, e4
In this study, we perform a comprehensive evaluation of sentiment classification for German language data using three different approaches: (1) dictionary-based methods, (2) fine-tuned transformer models such as BERT and XLM-T and (3) various large language models (LLMs) with zero-shot capabilities, including natural language inference models, Siamese models and dialog-based models. The evaluation considers a variety of German language datasets, including contemporary social media texts, product reviews and humanities datasets. Our results confirm that dictionary-based methods, while computationally efficient and interpretable, fall short in classification accuracy. Fine-tuned models offer strong performance, but require significant training data and computational resources. LLMs with zero-shot capabilities, particularly dialog-based models, demonstrate competitive performance, often rivaling fine-tuned models, while eliminating the need for task-specific training. However, challenges remain regarding non-determinism, prompt sensitivity and the high resource requirements of large LLMs. The results suggest that for sentiment analysis in the computational humanities, where non-English and historical language data are common, LLM-based zero-shot classification is a viable alternative to fine-tuned models and dictionaries. Nevertheless, model selection remains highly context-dependent, requiring careful consideration of trade-offs between accuracy, resource efficiency and transparency.
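For readers unfamiliar with the first of the three approaches, the following is a minimal sketch of what a dictionary-based sentiment classifier can look like. It is an illustration only: the toy lexicon, the negator list, the one-token negation rule and the function name `classify` are all hypothetical and are not taken from the study or from any published German sentiment dictionary.

```python
import re

# Hypothetical toy lexicon: word -> polarity weight.
POLARITY = {
    "gut": 1.0, "schön": 1.0, "wunderbar": 1.0,
    "schlecht": -1.0, "traurig": -1.0, "furchtbar": -1.0,
}
# Hypothetical negators; each flips only the next token's polarity.
NEGATORS = {"nicht", "kein", "keine"}


def classify(text: str, threshold: float = 0.0) -> str:
    """Sum lexicon weights over the tokens and map the total to a label."""
    tokens = re.findall(r"\w+", text.lower())
    score = 0.0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        if tok in POLARITY:
            weight = POLARITY[tok]
            score += -weight if negate else weight
        negate = False  # negation scope ends after one token
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(classify("Der Film war wunderbar"))
print(classify("Das ist nicht gut"))
print(classify("Das Buch liegt auf dem Tisch"))
```

The sketch shows why such methods are cheap and interpretable, since every decision can be traced to individual lexicon entries, and also why they can fall short: any word missing from the lexicon, or any negation spanning more than one token, is invisible to the classifier.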

Volume 1 - 2025

Short Article

Metronome: tracing variation in poetic meters via local sequence alignment

Registered Report Protocol

It takes a village to write a book: Mapping anonymous contributions in Stephen Langton’s Quaestiones Theologiae

Research Article

Looking for the inner music: Probing LLMs’ understanding of literary style

From dictionaries to LLMs – an evaluation of sentiment analysis techniques for German language data
