Quantifying and explaining the rise of fiction

Published online by Cambridge University Press: 14 July 2025

Edgar Dubourg*
Affiliation:
École normale supérieure-PSL, Institut Jean Nicod, Paris, France
Valentin Thouzeau
Affiliation:
École normale supérieure-PSL, Institut Jean Nicod, Paris, France
Quentin Borredon
Affiliation:
École normale supérieure-PSL, Institut Jean Nicod, Paris, France
Nicolas Baumard
Affiliation:
École normale supérieure-PSL, Institut Jean Nicod, Paris, France
*Corresponding author: Edgar Dubourg; Email: edgar.dubourg@gmail.com

Abstract

We present a comprehensive analysis of the rise of fictions across human narratives, using large-scale datasets that collectively span over 65,000 works across various media (movies, literary works), cultures (over 30 countries, Western and non-Western), and time periods (2000 BCE to 2020 CE). We measured fictiveness – defined as the degree of departure from reality – across three narrative dimensions: protagonists, events, and settings. We used automatic annotations from large language models (LLMs) to systematically score fictiveness and ensured the robustness and validity of our measure, specifically by demonstrating predictable variations in fictiveness across different genres, in all media. Statistical analyses of the changes in fictiveness over time revealed a steady increase, culminating in the 20th and 21st centuries, across all narrative forms. Remarkably, this trend is also evident in our data spanning ancient times: fictiveness increased gradually in narratives dating back as far as 2000 BCE, with notable peaks of fictiveness during affluent periods such as the heights of the Roman Empire, the Tang Dynasty, and the European Renaissance. We explore potential psychological explanations for the rise in fictiveness, including changing audience preferences driven by ecological and social changes.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press.
Figure 1. Data extraction and annotation process, with the minimal version of the scale (see SM for the full scales and the prompt).

Figure 2. Examples of films (from IMDb) and literary works (from Babel), alongside their fictiveness scores for each referent, the overall (averaged) fictiveness score, and a selected excerpt from GPT’s generated output for one chosen referent (indicated by the colour of the text).
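
The overall score mentioned in this caption is an average of the per-referent scores. A minimal sketch of that aggregation follows, assuming one numeric score per referent; the referent keys, the function name, and the example values are hypothetical, and the actual scale is defined in the SM.

```python
from statistics import mean

# Hypothetical keys for the three narrative dimensions named in the abstract.
REFERENTS = ("protagonists", "events", "settings")


def overall_fictiveness(scores: dict[str, float]) -> float:
    """Average the per-referent fictiveness scores into one overall score."""
    missing = [r for r in REFERENTS if r not in scores]
    if missing:
        raise ValueError(f"missing referent scores: {missing}")
    return mean(scores[r] for r in REFERENTS)


# Illustrative values only; they are not taken from the paper's data.
example = {"protagonists": 2, "events": 4, "settings": 3}
print(overall_fictiveness(example))
```

Raising an error on a missing referent, rather than averaging whatever is present, keeps overall scores comparable across works; that is a design choice of this sketch, not something stated in the article.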

Figure 3. (A) Comparisons of fictiveness scores across referents in all five datasets. (B) Comparisons of fictiveness across datasets. (C) Comparisons of fictiveness across genres in four datasets where genres were available. In each graph, genres are ordered from higher to lower average fictiveness. For displaying significance, the overall fictiveness of adjacent genres is compared using a t-test (see SM for full statistics).

Figure 4. Correlations between personality traits and socio-demographic characteristics of the audiences who ‘liked’ movies on Facebook (N = 3.5 million), as a function of the fictiveness of the movies (N = 690 movies).

Figure 5. (A) Evolution of fictiveness across time in IMDb. (B) Evolution of fictiveness across time and languages in Babel (with varying y-axis scaling).

Figure 6. (A) Forest plot of standardized regression coefficients predicting worldwide gross income from budget, duration, year, fictiveness, and the interaction of fictiveness and year. Points represent standardized effect sizes; horizontal lines show 95% confidence intervals. Model 1 includes budget, duration, and year (green); Model 2 adds fictiveness (orange); Model 3 adds the interaction between fictiveness and year (purple). (B) Plot of the interaction effect of year and fictiveness on gross income. Top: predicted values from the regression model. Bottom: actual values, displaying the distribution of log worldwide gross income over time across three bins of fictiveness (with regression lines representing linear model fits between year of release and log worldwide gross income for each bin of fictiveness across time).

Figure 7. Evolution of fictiveness across linguistic regions. Note that in our analysis, we used a linear model to capture the overall trend of increasing or decreasing fictiveness over time. However, in these graphs, we present a LOESS regression line, which provides a smoother visualization, allowing for the exploration of more fine-grained variations in the data.

Supplementary material: File

Dubourg et al. supplementary material
