Increasing the equitability of data citation in paleontology: capacity building for the big data future

Published online by Cambridge University Press: 28 December 2023

Jansen A. Smith*
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany; Paleontological Research Institution, Ithaca, New York 14850, U.S.A.; Department of Biology, University of New Mexico, Albuquerque, New Mexico 87110, U.S.A.; Department of Earth and Environmental Sciences, University of Minnesota Duluth, Duluth, Minnesota 55812, U.S.A.
Nussaïbah B. Raja
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Thomas Clements
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Danijela Dimitrijević
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Elizabeth M. Dowding
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Emma M. Dunne
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Bryan M. Gee
Affiliation:
Burke Museum and Department of Biology, University of Washington, Seattle, Washington 98195, U.S.A.
Pedro L. Godoy
Affiliation:
Department of Zoology, Institute of Biosciences, University of São Paulo, São Paulo, SP 04263, Brazil; Department of Anatomical Sciences, Stony Brook University, Stony Brook, New York 11794, U.S.A.
Elizabeth M. Lombardi
Affiliation:
Department of Biology, University of New Mexico, Albuquerque, New Mexico 87110, U.S.A.
Laura P. A. Mulvey
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Paulina S. Nätscher
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Carl J. Reddin
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany; Museum für Naturkunde, Berlin, Bayern, 10115, Germany
Bryan Shirley
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany; Department of Earth Sciences, Faculty of Geosciences, Utrecht University, Utrecht, 3584 CB, The Netherlands
Rachel C. M. Warnock
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany
Ádám T. Kocsis
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Bayern 91054, Germany; MTA-MTM-ELTE Research Group for Paleontology, Budapest 1431, Hungary
*Corresponding author: Jansen A. Smith; Email: jansen.smith@fau.de

Abstract

Data compilations expand the scope of research; however, data citation practice lags behind advances in data use. It remains uncommon for data users to credit data producers in professionally meaningful ways. In paleontology, databases like the Paleobiology Database (PBDB) enable assessment of patterns and processes spanning millions of years, up to global scale. The status quo for data citation creates an imbalance wherein publications drawing data from the PBDB receive significantly more citations (median: 4.3 ± 3.5 citations/year) than the publications producing the data (1.4 ± 1.3 citations/year). By accounting for data reuse where citations were neglected, the projected citation rate for data-provisioning publications approached parity (4.2 ± 2.2 citations/year) and the impact factor of paleontological journals (n = 55) increased by an average of 13.4% (maximum increase = 57.8%) in 2019. Without rebalancing the distribution of scientific credit, emerging “big data” research in paleontology—and science in general—is at risk of undercutting itself through a systematic devaluation of the work that is foundational to the discipline.

Information

Type
On The Record
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press on behalf of Paleontological Society

Figure 1. The current balance of credit distribution in paleontology (A) and a reimagined dynamic in which data-provisioning publications are equitably cited (B).

Figure 2. Citation rates for official Paleobiology Database (PBDB) publications and the data-provisioning publications used in those PBDB publications. Only data-provisioning publications from the same time frame (since 2001) as PBDB publications are included to standardize for temporal effects. Citations to data-provisioning publications (i.e., primary literature) are presented as the current rate (i.e., no additions for neglected citations), the projected rate when including citations from PBDB publications where data were available (k = 112; i.e., additions), and the projected rate when making those additions and extrapolating to the entire set of PBDB publications (k = 396; i.e., additions and extrapolated).

Figure 3. The effects of adding neglected citations from data reuse on journal impact factor (JIF; A, B) and general patterns in publishing trends in paleontology (C, D). A, The increase in JIF for the 55 journals categorized under paleontology by Clarivate, for the period 2010 to 2019. Note that an outlier value of 172% in 2018 for PalZ was not plotted. B, Increases in JIF for the 10 paleontological journals most affected by neglected citations, including only those with complete data for 2010 to 2019. For raw data for all 55 paleontological journals from 1997 to 2021, see “7_paleo_journal_JIFcalculation.csv” in Smith et al. (2023a). C, The number of citable items published in paleontological journals each year. D, The number of citations to items published in paleontological journals each year.
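
The JIF increases summarized in Figure 3 can be illustrated with a short arithmetic sketch. It assumes the standard two-year JIF definition (citations received in a given year to items a journal published in the two preceding years, divided by the citable items published in those two years) and is not the authors' own calculation. The function names and the input numbers are hypothetical and chosen only for illustration.

```python
def journal_impact_factor(citations: int, citable_items: int) -> float:
    """Two-year JIF: citations in year Y to items from Y-1 and Y-2,
    divided by the citable items published in Y-1 and Y-2."""
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations / citable_items


def percent_jif_increase(citations: int, neglected: int, citable_items: int) -> float:
    """Percent change in JIF after adding neglected citations.

    `neglected` is assumed to count only citations that fall inside the
    same JIF window (made in year Y to items from Y-1 and Y-2).
    """
    current = journal_impact_factor(citations, citable_items)
    if current == 0:
        raise ValueError("percent increase is undefined when the current JIF is zero")
    adjusted = journal_impact_factor(citations + neglected, citable_items)
    return (adjusted - current) / current * 100.0


if __name__ == "__main__":
    # Hypothetical journal: 200 recorded citations, 30 neglected, 100 citable items.
    increase = percent_jif_increase(citations=200, neglected=30, citable_items=100)
    print(f"JIF increase: {increase:.1f}%")
```

Because the denominator is unchanged, the percent increase reduces to the ratio of neglected to recorded citations, so journals with few recorded citations are the most sensitive to neglected ones.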