Hostname: page-component-5db58dd55d-f6s65 Total loading time: 0 Render date: 2026-05-31T19:34:51.163Z Has data issue: false hasContentIssue false

Using state space models to understand trait evolution in fossil lineages

Published online by Cambridge University Press:  08 October 2025

Gene Hunt
Affiliation:
National Museum of Natural History, Smithsonian Institution, 10th Street and Constitution Avenue NW, Washington, D.C. 20560, U.S.A.
Wilmer Martínez-Rivera
Affiliation:
Banco de la República de Colombia, Cra. 7 #14-78, Santa Fé, Bogotá, Columbia
Melanie Jane Hopkins
Affiliation:
American Museum of Natural History , 200 Central Park West, New York, New York 10024, U.S.A.
John Fricks
Affiliation:
School of Mathematical and Statistical Sciences, Arizona State University , 901 South Palm Walk, Tempe, Arizona 85287-1804, U.S.A.
Beckett Sterner*
Affiliation:
School of Life Sciences, Arizona State University , 427 East Tyler Mall, Tempe, Arizona 85287-1804, U.S.A.
*
Corresponding author: Beckett Sterner; Email: beckett.sterner@asu.edu

Abstract

Linear state space models provide a useful framework for investigating phenotypic evolution in fossil lineages for a wide variety of models representing Brownian motion, Ornstein-Uhlenbeck processes, and the potential influence of environmental covariates. A state space framework also provides access to residuals for the predicted and observed values at each time point as well as improved numerical stability. We illustrate the value of the state space approach by reanalyzing a classic dataset of trait evolution in the diatom lineage Stephanodiscus yellowstonensis. A series of increasingly complex models were fit to these data, including a novel modification of an Ornstein-Uhlenbeck model in which a trait tracks an exogenous covariate. These model results suggest that the number of spines on the periphery of the diatom is best explained by adaptation to changing solar insolation over time.

Information

Type
Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Paleontological Society
Figure 0

Figure 1. Evolutionary time series of log spine counts for Stephanodiscus yellowstonensis. Open circles and vertical bars show mean values ±1 SE on those means. The red line is the model-predicted trajectory of the fitness optimum for the best-supported model (see text). Age is in years before present day.

Figure 1

Figure 2. Measured environmental covariates, including June solar insolation (red solid line) and proportion of pollen attributable to Pinus contortus (blue dashed line; note reversed axis).

Figure 2

Table 1. Model fits to spine counts of the Stephanodiscus yellowstonensis lineage. From left to right, columns give model abbreviations, log-likelihoods, number of parameters, corrected Akaike information criterion (AICc) scores, ∆AICc scores, and Akaike weights. Model abbreviations: RW, random walk; DT, directional trend; OU, Ornstein-Uhlenbeck; DE, decelerating evolution; RWshift, random walk with a shift in the step variance parameter at 10 ka; OUpollen, OU model in which the trait optimum tracks the proportion of the pollen comprised of Pinus contortus; OUinsol, OU model in which the trait optimum tracks solar insolation; OUinsol−shift, OU model in which the trait optimum tracks solar insolation with a shift in the step variance at 10 ka

Figure 3

Figure 3. Modeled changes in the step variance predicted by the two models for which this parameter varies over time. The decelerated evolution (DE) model has a step variance that exponentially decreases, whereas the random walk with a shift in the step variance parameter at 10 ka (RWshift) model posits a single decrease in step variance that occurs at 10 ka.

Figure 4

Figure 4. Residuals from the random walk (RW) model. Note the greater spread of residuals early in the sequence.

Figure 5

Figure 5. Residuals from the decelerated evolution (DE) model. Note they are much less structured over time compared with those from the random walk (RW) model.

Figure 6

Figure 6. Residuals from the best-fitting model (model 9 in Table 1). Note that the spread of residuals does not markedly change over time as it does in the random walk (RW) model.

Figure 7

Figure 7. Log of spine count plotted with respect to June solar insolation in W/m2. Solid line shows the relationship between the two variables predicted by the parameter estimates (Table 2) of the best-fitting model (model 9 in Table 1). Dotted line shows the fit from ordinary least-squares (OLS) regression. The close similarity between the two curves is expected with rapid adaptation, as found here.

Figure 8

Table 2. Maximum-likelihood estimates (MLE) and confidence intervals (CI) for the best-fitting model: an Ornstein-Uhlenbeck (OU) process in which the position of the optimum depends linearly on the value of summer insolation and the step variance is estimated separately before (σ12) and after (σ22) 10,000 years ago. b0 and b1 are the intercept and slope of the relationship between solar insolation and the trait optimum, α represents the force of attraction to that optimum, and anc is the estimated trait value at the start of the time series

Figure 9

Table 3. Estimates of microevolutionary parameters calculated from the Ornstein-Uhlenbeck (OU) model in which the trait optimum follows summer insolation, with a step-down in step variance at 10 ka (model 9 in Table 1). Shown are calculations assuming low (0.1) and high (0.7) plausible values of trait heritability and 10 generations per year. ω is the computed variance of the population fitness function; larger values indicate broader fitness curves and therefore weaker stabilizing selection. Fitness reduction is the resulting decrease in population mean fitness between the optimal trait value and the trait values corresponding to three population standard deviations away from the optimum. Effective population size (Ne) is computed separately before and after 10 ka

Figure 10

Figure 8. Within-sample variance in spine count decreases in the initial part of the Stephanodiscus yellowstonensis sequence.

Figure 11

Figure A2.1. Parameter estimation for random walk (RW) and directional trend (DT) models using varying ratios of stepwise and total observational timescales. The left column shows estimation of the diffusion parameter for the RW model. The middle and right columns show estimation of the diffusion and the drift parameters, respectively, for the DT model. The true parameter values are $ {\sigma}^2=25 $ and $ \mu =5 $. We use different values for the sample size $ N=\left\{\mathrm{20,40,80,160}\right\} $, the size of the increments $ \Delta =\left\{1/8,1/4,1/2,1\right\} $, and the terminal time $ T=\left\{\mathrm{2.5,5,10,20}\right\} $. Box plots show 100 replicates.

Figure 12

Figure A2.2. Parameter estimation on simulated data from the Ornstein-Uhlenbeck (OU) model for different combinations of sample size, time step, and total duration. Each of the columns shows parameter estimation for $ \alpha, \sigma, {Z}_0 $, and $ \theta $, respectively. True values are $ \alpha = $0.50, $ {\sigma}^2=20 $, $ {Z}_0=40 $, and $ \theta =50 $, as represented with dashed red lines. We varied the sample size $ N=\left\{\mathrm{20,40,80,160}\right\} $, the size of the increments $ \Delta =\left\{1/8,1/4,1/2,1\right\} $, and the terminal time $ T=\left\{\mathrm{2.5,5,10,20}\right\} $. Box plots show 100 replicates.

Figure 13

Figure A2.3. Parameter estimation for decelerated evolution (DE) model on simulated data using varying combinations of sample size, time step, and total duration. The columns show parameter estimation of $ r,\sigma $, and $ {Z}_0 $, respectively. Dashed red lines show the true values of $ r= $−1, $ {\sigma}^2=20 $, and $ {Z}_0=40 $. We used sample sizes $ N=\left\{\mathrm{20,40,80,160}\right\} $, time increments of $ \Delta =\left\{1/8,1/4,1/2,1\right\} $, and terminal times $ T=\left\{\mathrm{2.5,5,10,20}\right\} $. Box plots show 100 replicates.

Figure 14

Figure A2.4. Model selection performance using the Akaike information criterion (AIC) on simulated data for varying ratios of stepwise and total observational timescales. In each panel, the true model is labeled on the x-axis, and the stacked histogram shows the average Akaike weight for each model. Perfect model performance would show each bar as completely filled by the corresponding true model’s shade on the legend (ST as white, RW as light gray, DT as medium gray, etc.). The true model parameters are ST: $ \theta =50 $, $ \omega =20 $; RW: $ {\sigma}_{step}^2=20 $; DT: $ Drift=5 $, $ {\sigma}_{step}^2=25 $; OU: $ \theta =50 $, $ \alpha =20 $, $ {\sigma}_{step}^2=20 $; and DE: $ r=-1 $, $ {\sigma}_{step}^2=20 $. The initial condition for all the models is $ {Z}_0=40 $, the variance of the evolutionary step is $ {V}_p=5 $, and the vector of population sample size is $ m=50 $. We varied sample size $ N=\left\{\mathrm{20,40,80,160}\right\} $, the size of the increments $ \Delta =\left\{1/8,1/4,1/2,1\right\} $, and the terminal time $ T=\left\{\mathrm{2.5,5,10,20}\right\} $. ST, stasis; RW, random walk; DT, directional trend; OU, Ornstein-Uhlenbeck; DE, decelerated evolution.