
Accounting for uncertainty from zero inflation and overdispersion in paleoecological studies of predation using a hierarchical Bayesian framework

Published online by Cambridge University Press:  06 September 2021

Jansen A. Smith*
Affiliation:
GeoZentrum Nordbayern, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Bayern 91054, Germany; Paleontological Research Institution, Ithaca, New York 14850, U.S.A. E-mail: jansen.smith@fau.de
John C. Handley
Affiliation:
Goergen Institute for Data Science, University of Rochester, Rochester, New York 14627, U.S.A.; Paleontological Research Institution, Ithaca, New York 14850, U.S.A. E-mail: john.handley@rochester.edu
Gregory P. Dietl
Affiliation:
Paleontological Research Institution, Ithaca, New York 14850, U.S.A.; Department of Earth and Atmospheric Sciences, Cornell University, Ithaca, New York 14853, U.S.A. E-mail: gpd3@cornell.edu
*Corresponding author.

Abstract

The effects of overdispersion and zero inflation (e.g., poor model fits) can result in misinterpretation in studies using count data. These effects have not been evaluated in paleoecological studies of predation and are further complicated by preservational bias and time averaging. We develop a hierarchical Bayesian framework to account for uncertainty from overdispersion and zero inflation in estimates of specimen and predation trace counts. We demonstrate its application using published data on drilling predators and their prey in time-averaged death assemblages from the Great Barrier Reef, Australia.

Our results indicate that predation frequencies are underestimated when zero inflation is not considered, and this effect is likely compounded by removal of individuals and predation traces via preservational bias. Time averaging likely reduces zero inflation via accumulation of rare taxa and events; however, it increases the uncertainty in comparisons between assemblages by introducing variability in sampling effort. That is, there is an analytical cost with time-averaged count data, manifesting as broader confidence regions. Ecological inferences in paleoecology can be strengthened by accounting for the uncertainty inherent to paleoecological count data and the sampling processes by which they are generated.

Information

Type
Articles
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press on behalf of The Paleontological Society

Table 1. Potential types of zeros in paleoecological count data with respect to those identified in ecological studies. When taking a sample to evaluate the ecology of a community, ecological processes (i.e., true zeros) and sampling artifacts (i.e., false zeros) can distort the resulting sample relative to the true distribution of specimens or predation events. In paleoecological samples, distortions from additional factors (e.g., preservational bias, time averaging) must also be addressed. *Following Blasco-Moreno et al. (2019). The possibility of a zero count for a predation trace is predicated on the presence of the prey taxa in the assemblage (ni > 1). When testing a hypothesis on multiple prey taxa, we must also assume the sample from the assemblage is representative of the set of prey taxa that could possibly have been sampled (i.e., were present in the community over the duration of time averaging; see Table 2).


Table 2. Common assumptions for time-averaged count data in paleoecological studies of molluscan predator–prey interactions.


Figure 1. A, Example of a drilling trace made by a naticid predator in its clam prey (personal specimen, J.A.S.). B, An illustrative molluscan community, with three samples taken at two points in time, and the hypothetical time-averaged paleoecological assemblage. The predator is present in sample 3 during time 1 and in samples 1 and 3 during time 2; it is found in the paleoecological assemblage in samples 1 and 3. Time averaging obscures the absence of the predator in sample 1 during time 1. If paleoecological predation frequencies are calculated by pooling data from all samples—including sample 2, where the predator was never present—they would be underestimated because of zero inflation. Counts of individuals are also affected. For example, prey 4 is rare in the living communities but is well preserved in the paleoecological assemblage. With poor preservation, the true abundance of other taxa (e.g., prey 1) is underrepresented in paleoecological samples relative to the living samples. Mollusk drawings from thenounproject.com, with contributors in parentheses: prey 1 (public domain), prey 2 (icon 54), prey 3 (Ker'is), prey 4 (Yu luck), and predator (Juraj Sedlák).
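The pooling effect described in this caption can be illustrated with a small arithmetic sketch. The counts below are hypothetical and chosen only for illustration (they are not data from the study), and the `predator_present` flag is an assumption that would not normally be observable in a fossil assemblage.

```python
# Hypothetical counts, for illustration only (not data from the study).
# Each sample: (prey individuals, drilled individuals, predator ever present?)
samples = [
    (50, 10, True),   # sample 1
    (50, 0, False),   # sample 2: predator never present, a structural zero
    (50, 10, True),   # sample 3
]


def drilling_frequency(rows):
    """Drilled individuals divided by total individuals across the given rows."""
    total = sum(n for n, _, _ in rows)
    drilled = sum(d for _, d, _ in rows)
    return drilled / total


# Pooling all samples treats the structural zero as if predation were possible.
pooled = drilling_frequency(samples)

# Restricting to samples where the predator occurred removes that dilution.
predator_only = drilling_frequency([s for s in samples if s[2]])

print(f"pooled over all samples:       {pooled:.3f}")
print(f"predator-present samples only: {predator_only:.3f}")
```

Because sample 2 contributes individuals but could never contribute drilled individuals, the pooled frequency is lower than the frequency among samples where predation was possible.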


Table 3. Parameter estimates for the posterior predictive check assessing goodness of fit for models of counts for Pinguitellina robusta. Estimates for each parameter of the various models are given as the posterior mean, with associated standard deviation (SD) and 95% credibility regions (2.5%–97.5%).


Figure 2. The effect of zero inflation on estimates of predation frequency. Species are sorted by sample size in decreasing order from left to right, with specimen counts given above the black bars for confidence regions; species with fewer than 15 individuals were excluded (see Supplementary Fig. S2). Black circles represent predation frequencies accounting for zero inflation (p), with 95% confidence regions (black bars). Red diamonds represent predation frequencies for each species estimated with the traditional calculation (number of drilled individuals divided by the total number of individuals in each sample). Data from Martinelli et al. (2015). Asterisks on the x-axis with bolded taxon names indicate the two species, Pinguitellina robusta (far left) and Notocochlis gualtieriana (central), discussed in the main text.
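To make the contrast between the two kinds of estimate concrete, the following is a minimal sketch that compares the traditional calculation with a textbook zero-inflated binomial fitted by maximum likelihood on a coarse grid. This is not the authors' hierarchical Bayesian model: the likelihood form, the grid search, the names `zib_log_likelihood` and `fit_zib`, and the counts are all assumptions introduced here for illustration.

```python
import math


def zib_log_likelihood(data, psi, p):
    """Log-likelihood of a zero-inflated binomial (illustrative form).

    data: list of (n, d) pairs, n individuals of which d are drilled.
    psi:  probability that a sample is a structural zero (no predation possible).
    p:    predation frequency where predation is possible.
    """
    total = 0.0
    for n, d in data:
        binom = math.comb(n, d) * p**d * (1 - p) ** (n - d)
        if d == 0:
            # A zero can arise from a structural zero or from the binomial.
            lik = psi + (1 - psi) * binom
        else:
            lik = (1 - psi) * binom
        total += math.log(lik)
    return total


def fit_zib(data, steps=99):
    """Coarse grid search over psi and p in (0, 1); returns (psi, p)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: zib_log_likelihood(data, *pair),
    )


# Hypothetical (n, d) counts per sample, not data from Martinelli et al. (2015).
data = [(40, 8), (40, 0), (40, 9), (40, 0), (40, 7), (40, 8)]

# Traditional calculation: drilled individuals over all individuals.
traditional = sum(d for _, d in data) / sum(n for n, _ in data)

psi_hat, p_hat = fit_zib(data)

print(f"traditional frequency: {traditional:.3f}")
print(f"zero-inflated p:       {p_hat:.2f} (psi = {psi_hat:.2f})")
```

In this toy data set the two samples with no drilled individuals are far more plausibly structural zeros than binomial zeros, so the zero-inflated estimate of p sits above the traditional frequency, which is the direction of the difference between circles and diamonds described in the caption. A Bayesian treatment would additionally yield credible regions rather than a single point estimate.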


Table 4. Parameter estimates for the posterior predictive check assessing goodness of fit for the model of predation counts on Notocochlis gualtieriana. Predation frequency (p) is estimated using the hierarchical Bayesian model. For comparison, the predation frequency estimate using the traditional method (number of drilled individuals divided by the total number of individuals) is 0.02. SD, standard deviation.