
Global Phanerozoic biodiversity—can variation be explained by spatial sampling intensity?

Published online by Cambridge University Press: 11 March 2025

Daniel B. Phillipi*
Affiliation:
Department of Earth and Environmental Sciences, Syracuse University, Syracuse, New York 13244, U.S.A.
Jesse Czekanski-Moir
Affiliation:
Department of Environmental Biology, SUNY College of Environmental Science and Forestry, Syracuse, New York 13210, U.S.A.
Linda C. Ivany*
Affiliation:
Department of Earth and Environmental Sciences, Syracuse University, Syracuse, New York 13244, U.S.A.
*Corresponding authors: Daniel B. Phillipi; Email: daniel.phillipi.137@gmail.com; Linda C. Ivany; Email: lcivany@syr.edu

Abstract

Variation in observed global generic richness over the Phanerozoic must be partly explained by changes in the numbers of fossils and their geographic spread over time. The influence of sampling intensity (i.e., the number of samples) has been well addressed, but the extent to which the geographic distribution of samples might influence recovered biodiversity is comparatively unknown. To investigate this question, we create models of genus richness through time by resampling the same occurrence dataset of modern global biodiversity using spatially explicit sampling intensities defined by the paleo-coordinates of fossil occurrences from successive time intervals. Our steady-state null model explains about half of observed change in uncorrected fossil diversity and a quarter of variation in sampling-standardized diversity estimates. The inclusion in linear models of two additional explanatory variables associated with the spatial array of fossil data (absolute latitudinal range of occurrences, percentage of occurrences from shallow environments) and a Cenozoic step increases the accuracy of steady-state models, accounting for 67% of variation in sampling-standardized estimates and more than one-third of the variation in first differences. Our results make clear that the spatial distribution of samples is at least as important as numerical sampling intensity in determining the trajectory of recovered fossil biodiversity through time and caution against the overinterpretation of both the variation and the trend that emerge from analyses of global Phanerozoic diversity.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2025. Published by Cambridge University Press on behalf of Paleontological Society

Figure 1. Comparison of Paleobiology Database (PBDB) sampling in two distinct Paleozoic time bins, illustrating the wide variation in geographic sampling completeness within the PBDB. Although the ca. 447 Ma time bin has more occurrences, those occurrences are also spread across latitude and multiple ocean basins. By contrast, occurrences in the ca. 303 Ma time bin are concentrated in a single ocean basin and within a relatively narrow latitudinal band.

Figure 2. Schematic showing an overview of the geographic subsampling process, which imposes the sampling “design” the fossil record has “given” us from each of 48 time bins (~11 Myr each) onto the Ocean Biodiversity Information System (OBIS) dataset of modern marine diversity for the same major groups on continental shelves. A, Geography and sampling intensity of a given latitudinal band from the Paleobiology Database (PBDB), with the pooled number of genera (top number) and occurrences (bottom number) within occupied equal-area cells. B, Geography of available modern data in the same latitudinal band, which are shifted together to the east (with potential wraparound) until their east–west distribution roughly matches the longitudinal distribution of PBDB occurrence data. C, Probability of sampling from each cell within a geographic cluster (e.g., if five cells approximately match the position of the PBDB data, each has a probability of 0.2 of being sampled in a given run). D, Modern OBIS cells with approximately correct geography are subsampled to match the sampling intensity (no. of occurrences) of the corresponding cells in the PBDB dataset, and the resulting number of genera is recorded for each. This process is repeated 25 times, using a different random subsample of OBIS each time. See the text for more details.
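
The subsampling steps in panels C and D can be illustrated with a minimal Python sketch. This is not the authors' code: the function name, the data layout (a dictionary of modern cells and a list of fossil-cell "designs"), and the choice to draw without replacement, capped at the number of occurrences a modern cell holds, are all assumptions made for illustration. The longitudinal shifting in panel B is taken as already done, so each fossil cell arrives with its list of candidate modern cells.

```python
import random

def resample_richness(modern_cells, fossil_design, n_reps=25, seed=0):
    """Return the pooled genus richness of each replicate run.

    modern_cells: dict mapping a cell id to a list of genus names, one entry
        per modern occurrence (hypothetical stand-in for OBIS records).
    fossil_design: list of (candidate_cell_ids, n_occurrences) pairs, one per
        occupied fossil cell (hypothetical stand-in for one PBDB time bin).
    """
    rng = random.Random(seed)
    richness = []
    for _ in range(n_reps):
        genera = set()
        for candidates, n_occ in fossil_design:
            # Panel C: each candidate cell is equally likely to be chosen,
            # e.g. five candidates give a probability of 0.2 each.
            cell = rng.choice(candidates)
            pool = modern_cells[cell]
            # Panel D: match the fossil sampling intensity.
            # Assumption: draw without replacement, capped at the pool size.
            k = min(n_occ, len(pool))
            genera.update(rng.sample(pool, k))
        richness.append(len(genera))
    return richness

# Toy data, invented for illustration only.
modern = {
    "a": ["Mytilus", "Mytilus", "Ostrea"],
    "b": ["Conus", "Arca", "Arca", "Pecten"],
}
design = [(["a", "b"], 2), (["b"], 3)]
print(resample_richness(modern, design))
```

The list of 25 richness values returned for a time bin could then be summarized by its mean and spread, which is one way to obtain the kind of error envelope a replicate-based model carries.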

Figure 3. Spatial sampling through the Phanerozoic compared with modern data. The number of equal-area grid cells sampled across time within the coarse latitudinal bins used in this study, with cells sampled in the Ocean Biodiversity Information System (OBIS) dataset shown at the farthest right. Spatial sampling through time does not show a linear increase or decrease, but latitudinal sampling shifts northward, tracking the location of North America and Europe over the Phanerozoic; in every latitudinal band, OBIS sampling is more complete than that of any Paleobiology Database (PBDB) time bin. CM, Cambrian; O, Ordovician; S, Silurian; D, Devonian; C, Carboniferous; P, Permian; Tr, Triassic; J, Jurassic; K, Cretaceous; Pg, Paleogene; Ng, Neogene.

Figure 4. Comparison between three iterations of the resampled modern Ocean Biodiversity Information System (OBIS) model, which estimates expected fossil diversity based only on the spatial distribution of fossil data, and sampling-standardized shareholder quorum subsampling (SQS) richness of fossil marine invertebrate genera drawn from the Paleobiology Database (PBDB) (dashed black line). The raw OBIS model (blue), based only on resampled OBIS data, is adjusted to reflect the step increase into the Cenozoic (purple). The four-variable OBIS model (red) further incorporates two additional variables: percent of shallow occurrences and latitudinal range of occurrences. A, Secular trends in both resampled model and SQS diversity estimates, with two-sigma error envelopes derived from replicate resampling. B, Correlation between the values of the four-variable OBIS model (best fit of all OBIS models based on adjusted R² and Akaike information criterion [AIC] values) and fossil SQS. C, Correlation between the bin-to-bin first differences of those estimates, showing that the model generally predicts the magnitude and direction of change in fossil SQS diversity through time. The model explains about 67% of the variance in the value of fossil SQS and 38% of the variance in bin-to-bin change. CM, Cambrian; O, Ordovician; S, Silurian; D, Devonian; C, Carboniferous; P, Permian; Tr, Triassic; J, Jurassic; K, Cretaceous; Pg, Paleogene; Ng, Neogene.
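
To make the distinction between panels B and C concrete, the following Python sketch computes first differences and the share of variance explained by a simple one-predictor linear fit, which equals the squared Pearson correlation. The helper names and the two short series are invented for illustration and are not values from the study; the study reports adjusted R² from multi-variable models, which this plain R² does not reproduce.

```python
def first_differences(series):
    """Change from each time bin to the next (one value fewer than the input)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Made-up series for five consecutive time bins (not data from the paper).
model_richness = [10.0, 12.0, 11.0, 15.0, 14.0]
fossil_sqs = [21.0, 23.0, 24.0, 29.0, 30.0]

# Panel B analogue: how well model values track fossil values.
print(r_squared(model_richness, fossil_sqs))

# Panel C analogue: how well bin-to-bin changes track each other.
print(r_squared(first_differences(model_richness),
                first_differences(fossil_sqs)))
```

Comparing first differences removes any shared long-term trend, so a model can score well on raw values yet poorly on differences; reporting both, as the caption does, guards against a correlation driven by trend alone.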

Table 1. Results of linear correlations between resampled Ocean Biodiversity Information System (OBIS) diversity, recovered fossil diversity, and the additional sample descriptors derived from the Paleobiology Database (PBDB) dataset. AIC, Akaike information criterion. Bold indicates significance at the 0.05 level.

Figure 5. Ocean Biodiversity Information System (OBIS) occurrences and richness in our subset of the data. A, OBIS occurrences within equal-area cells. B, OBIS generic richness within equal-area cells. Patterns in A are largely driven by spatial sampling biases (e.g., a greater number of scientific institutions in the Northern Hemisphere), but important marine invertebrate diversity hotspots can be seen in B, implying that imperfect spatial sampling has not greatly distorted spatial diversity patterns in the OBIS.

Figure 6. Latitudinal sampling in the Ocean Biodiversity Information System (OBIS) dataset (blue) versus the most recent (mid-Miocene to Recent) time bin of the Paleobiology Database (PBDB) (red). OBIS sampling is more complete in the Southern Hemisphere (A), but both exhibit the latitudinal diversity gradient (B).

Figure 7. The number of phyla per sampled cell in the Ocean Biodiversity Information System (OBIS) dataset (A) versus the most recent time bin of the Paleobiology Database (PBDB) (B).

Figure 8. The numbers of families present in cells occupied in both the Ocean Biodiversity Information System (OBIS) and most recent Paleobiology Database (PBDB) data. All diverse PBDB cells are also diverse in OBIS, but the reverse is not the case.

Figure 9. Residuals between the resampled four-variable Ocean Biodiversity Information System (OBIS) model and fossil shareholder quorum subsampling (SQS) diversity. Dashed lines show the ages of the “big five” mass extinctions. CM, Cambrian; O, Ordovician; S, Silurian; D, Devonian; C, Carboniferous; P, Permian; Tr, Triassic; J, Jurassic; K, Cretaceous; Pg, Paleogene; Ng, Neogene.