
Saved in translation? Diversity shared in French and Dutch medieval literature

Published online by Cambridge University Press: 02 February 2026

Mike Kestemont*
Affiliation:
University of Antwerp, Antwerp, Belgium
Folgert Karsdorp
Affiliation:
KNAW Meertens Institute, Amsterdam, the Netherlands
Jean-Baptiste Camps
Affiliation:
École nationale des chartes (PSL), Paris, France
Remco Sleiderink
Affiliation:
University of Antwerp, Antwerp, Belgium
Anne Chao
Affiliation:
National Tsing Hua University, Hsinchu, Taiwan
* Corresponding author: Mike Kestemont; Email: mike.kestemont@uantwerpen.be

Abstract

Empirical studies often have to work with incomplete samples, and scholars rarely account for under-registration: in cultural heritage, for example, the centuries-long loss of artefacts can lead to an underestimation of the original richness of assemblages. Recently, it has been argued that unseen species models from ecology can estimate the unobserved diversity in cultural collections. We report an extension to shared diversity, i.e. the number of types that are common to two assemblages. As a case study, we use stories in medieval French and Dutch (ca. 1150–1450), which were frequently shared between the two languages. We apply an established estimator (Chao-shared) with a novel bootstrap procedure. The estimator suggests that the surviving data underestimate the original number of shared stories: for example, when its source is no longer extant, a translation can no longer be identified as such. Interestingly, there is less evidence for the total loss of shared stories: precisely because of the redundancy created by inter-vernacular translation, shared stories were less likely to be lost in both languages simultaneously. These results go beyond previous studies in that they give more insight into the composition of the unobserved share of cultural diversity, rather than its mere size.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press.

Table 1. Diversity statistics for the chivalric subcorpora and the pooled result, including the single-assemblage Chao1 estimate (with 0.95 confidence intervals) of the total number of texts


Figure 1. Text attestation patterns (i.e. the number of witnesses) for chivalric texts in Dutch and French traditions. Log-log scatter plot showing the number of witnesses per story in each language. Colours distinguish attestation patterns: Dutch-only (orange), French-only (blue), and both languages (green). Top-ranking titles in each category are annotated.


Figure 2. Single-assemblage Chao1 estimates for Dutch, French, and ‘pooled’ (meaning that the frequencies of each row are combined), including 0.95 confidence intervals and the survival ratio for the central point estimate. Dashed horizontal lines (left) indicate observed diversity; encircled numbers reflect the debiased estimate (with CIs indicated via the horizontal line).
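
For readers unfamiliar with the single-assemblage estimator named in this caption, the following is a minimal Python sketch of the standard Chao1 lower bound and the survival ratio derived from it. It is not the authors' implementation: the function names and the toy witness counts are hypothetical, and the sketch assumes the common form with the (n − 1)/n correction factor, which may differ in detail from the debiased estimate reported in the article.

```python
from collections import Counter


def chao1(counts):
    """Chao1 lower-bound estimate of total richness.

    `counts` holds the number of witnesses per observed text
    (hypothetical input format; zeros are ignored).
    """
    counts = [c for c in counts if c > 0]
    n = sum(counts)          # total number of witnesses
    s_obs = len(counts)      # number of observed texts
    if n == 0:
        return 0.0
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]  # texts seen exactly once / exactly twice
    if f2 > 0:
        f0 = (n - 1) / n * f1 ** 2 / (2 * f2)
    else:
        # Fallback form used when no text has exactly two witnesses.
        f0 = (n - 1) / n * f1 * (f1 - 1) / 2
    return s_obs + f0


def survival_ratio(counts):
    """Observed richness divided by the Chao1 estimate."""
    estimate = chao1(counts)
    s_obs = sum(1 for c in counts if c > 0)
    return s_obs / estimate if estimate else float("nan")


# Toy data (invented for illustration, not taken from the article).
toy_counts = [1, 1, 1, 1, 2, 2, 3, 5]
toy_estimate = chao1(toy_counts)
toy_survival = survival_ratio(toy_counts)
```

With these toy counts, four texts survive in a single witness and two in exactly two witnesses, so the estimated number of unseen texts is (15/16) × 4² / (2 × 2) = 3.75, giving an estimate of 11.75 texts against 8 observed.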


Figure 3. Visualisation of the (component) Chao-shared estimates for the chivalric corpus (for 1,000 bootstrap iterations, with circled point estimates and CIs as horizontal lines). Cf. Table 2.


Table 2. Chao-shared estimates for the chivalric corpus (for 1,000 bootstrap iterations)


Figure A1. Bootstrap framework for shared species estimation showing the partition of species across two assemblages and their pooled combination. $D_{1,2}$, $D_{0,+}$, and $D_{+,0}$ represent observed shared, unique to Assemblage II, and unique to Assemblage I species, respectively. $f_{0,1}$, $f_{0,2}$, and $f_0^*$ denote undetected species estimates for each assemblage and the pooled sample.