Coding culture: challenges and recommendations for comparative cultural databases

Published online by Cambridge University Press: 01 June 2020

Edward Slingerland*
Affiliation:
Department of Asian Studies, University of British Columbia, Vancouver, Canada
Quentin D. Atkinson
Affiliation:
School of Psychology, University of Auckland, Auckland, New Zealand
Carol R. Ember
Affiliation:
Human Relations Area Files, Yale University, New Haven, USA
Oliver Sheehan
Affiliation:
School of Psychology, University of Auckland, Auckland, New Zealand
Michael Muthukrishna
Affiliation:
Department of Psychological and Behavioural Science, London School of Economics, London, UK
Joseph Bulbulia
Affiliation:
School of Humanities, University of Auckland, Auckland, New Zealand
Russell D. Gray
Affiliation:
School of Psychology, University of Auckland, Auckland, New Zealand; Max Planck Institute for the Science of Human History, Jena, Germany
*Corresponding author. E-mail: edward.slingerland@gmail.com

Abstract

Considerable progress in explaining cultural evolutionary dynamics has been made by applying rigorous models from the natural sciences to historical and ethnographic information collected and accessed using novel digital platforms. Initial results have clarified several long-standing debates in cultural evolutionary studies, such as population origins, the role of religion in the evolution of complex societies and the factors that shape global patterns of language diversity. However, future progress requires recognition of the unique challenges posed by cultural data. To address these challenges, standards for data collection, organisation and analysis must be improved and widely adopted. Here, we describe some major challenges to progress in the construction of large comparative databases of cultural history, including recognising the critical role of theory, selecting appropriate units of analysis, data gathering and sampling strategies, winning expert buy-in, achieving reliability and reproducibility in coding, and ensuring interoperability and sustainability of the resulting databases. We conclude by proposing a set of practical guidelines to meet these challenges.

Information

Type
Methods Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2020

Table 1. Representative large-scale, online cultural coding database projects

Figure 1. Steps involved in converting qualitative historical data to quantitative data, taken from the DRH. (1) historical texts or artifacts; (2) scholarly interpretations; (3) individual coding decisions (with justification or simply as expert opinion); and (4) quantitative data.
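
The four steps in Figure 1 can be illustrated with a minimal Python sketch. This is not the DRH data model: the `CodingDecision` class, its field names, the `ANSWER_CODES` mapping and the example values are all hypothetical, chosen only to show how a coding decision might carry its source and justification through to a numeric value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodingDecision:
    """Hypothetical record linking a coded answer to its evidence."""
    source: str                          # step 1: historical text or artifact
    interpretation: str                  # step 2: scholarly interpretation
    question: str                        # the coding question being answered
    answer: str                          # step 3: e.g. "yes", "no", "unknown"
    justification: Optional[str] = None  # None = given as expert opinion


# Assumed numeric scheme; answers outside this mapping are treated as missing.
ANSWER_CODES = {"yes": 1, "no": 0}


def to_quantitative(decision: CodingDecision) -> Optional[int]:
    """Step 4: convert a coding decision to a numeric value (None if missing)."""
    return ANSWER_CODES.get(decision.answer.strip().lower())


# Placeholder values, not real database content.
example = CodingDecision(
    source="Text A",
    interpretation="Scholar B's reading of Text A",
    question="Is feature X present?",
    answer="Yes",
)
value = to_quantitative(example)
```

Keeping the source and interpretation alongside the numeric value means each data point can be traced back to the qualitative evidence behind it.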

Figure 2. A comparison of the feature coverage in WALS, which employed an expert coding model, and Grambank, which employed a hybrid RA + Expert coding model. WALS coded 138 ‘core’ features (193 total), covered 2,679 languoids (2,466 unique languages) and had a mean feature coverage of only 18% per language. In contrast, Grambank currently codes 195 features for 1,478 languoids (1,456 unique languages) and has a mean feature coverage of 69% per language.

Figure 3. Rich data added to DRH Entry, ‘Arch of Titus’ (Places Poll), by Greta Rodríguez.