The fossil record is subject to multiple biases that can distort macroevolutionary and paleoecological inferences. Although temporal and spatial sampling biases have received substantial attention, other sources of fossil sampling heterogeneity remain less well quantified. Using the Triton database of planktonic foraminifera, we assess the influence of geographic, ecological, morphological, and methodological factors on fossil recovery rates. We first apply a temporal subsampling method to standardize fossil occurrences over geologic time, validating this approach against an expert-curated lineage-through-time trajectory. After subsampling, the occurrences remain unevenly distributed throughout species’ lifetimes and inhomogeneously distributed across species, reflecting biological signal and/or persistent sampling biases.
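As an illustration of what temporal subsampling can look like, the following minimal sketch caps the number of occurrences retained per time bin. It is a generic per-bin quota scheme and not necessarily the procedure applied to Triton here; the function name `subsample_by_time_bin`, the `(species, age_ma)` record layout, and the `bin_width` and `quota` parameters are all assumptions made for the example.

```python
import random
from collections import defaultdict


def subsample_by_time_bin(occurrences, bin_width, quota, seed=0):
    """Keep at most `quota` occurrences per time bin, drawn at random.

    `occurrences` is a list of (species, age_ma) tuples (hypothetical layout).
    `bin_width` is the bin duration in the same unit as the ages.
    """
    rng = random.Random(seed)

    # Group occurrences by the index of the time bin they fall in.
    bins = defaultdict(list)
    for species, age in occurrences:
        bins[int(age // bin_width)].append((species, age))

    # Draw without replacement from bins that exceed the quota;
    # bins at or below the quota are kept in full.
    kept = []
    for index in sorted(bins):
        records = bins[index]
        if len(records) > quota:
            records = rng.sample(records, quota)
        kept.extend(records)
    return kept


# Illustrative data only: ten occurrences in the first bin, two in the second.
example = [("sp_a", 0.1 * i) for i in range(10)] + [("sp_b", 1.2), ("sp_b", 1.7)]
subsampled = subsample_by_time_bin(example, bin_width=1.0, quota=3)
```

Equalizing the number of occurrences per bin removes the variation in sampling intensity through time, but it leaves untouched any unevenness across species or within species' lifetimes, which is the residual heterogeneity examined next.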
We then investigate this residual sampling heterogeneity with a generalized additive model incorporating relevant predictors from Triton. Our results reveal that, after correcting for temporal biases, geographic predictors (paleolatitude, paleolongitude, longitudinal spread) explain nearly a third of the variation in sampling. Species-specific ecological and morphological attributes, chief among them mean relative abundance, explain an additional fraction. Further predictors of fossil sampling rates include age-calculation methods and biostratigraphic sampling biases. Although the model accounts for multiple sources of variation, 37% of the deviance remains unexplained, suggesting unmodeled biological, stratigraphic, diagenetic, or taxonomic drivers of sampling heterogeneity.
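For readers less familiar with deviance-based summaries, the sketch below shows the usual definition of the proportion of deviance explained, namely the reduction in deviance relative to an intercept-only (null) model. It assumes this standard definition applies to the figure quoted above; the function name `deviance_explained` and the numeric inputs are illustrative and are not values from the analysis.

```python
def deviance_explained(null_deviance, residual_deviance):
    """Proportion of the null deviance removed by the fitted model."""
    if null_deviance <= 0:
        raise ValueError("null deviance must be positive")
    return (null_deviance - residual_deviance) / null_deviance


# Illustrative numbers only: a residual deviance equal to 37% of the null
# deviance corresponds to 63% of the deviance being explained.
explained = deviance_explained(null_deviance=100.0, residual_deviance=37.0)
unexplained = 1.0 - explained
```

Under this definition, the unexplained share is simply the ratio of residual to null deviance.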
Overall, the observed recovery rates call into question the homogeneous-sampling assumption used in most diversification models, and the heterogeneity they reveal cannot be reduced to a single dominant factor. This conclusion reinforces the need for integrated subsampling approaches and process-based models that explicitly account for heterogeneous fossilization rates to improve the reliability of macroevolutionary analyses.