The preservation of cause and effect in the rock record

Abstract
Evolutionary events may impact the geological carbon cycle via transient imbalances in silicate weathering, and such events have been implicated as causes of glaciations, mass extinctions, and oceanic anoxia. However, suggested evolutionary causes often substantially predate the environmental effects to which they are linked, which is problematic when carbon cycle perturbations must be resolved in less than a million years to maintain Earth's habitability. What is more, the geochemical signatures of such perturbations are recorded as they occur in widely distributed marine sedimentary rocks that have been densely sampled for important intervals in Earth history, whereas the fossil record, particularly on land, is governed by the availability of sedimentary basins that are patchy in both space and time, necessitating lags between the origination of an evolutionary lineage and its earliest occurrence in the fossil record. Here, we present a simple model of the impact of preservational filtering on sampling to show that an evolutionary event that causes an environmental perturbation via weathering imbalance should not appear earlier in the rock record than the perturbation itself and, if anything, should appear later rather than simultaneously. The Devonian Hangenberg glaciation provides an example of how evolutionary events might be more fruitfully considered as potential causes of environmental perturbations. Just as the last samplings of species lost in mass extinction are expected to come before the true environmental event, first appearance should be expected to postdate the geological expression of a lineage's environmental impact, with important implications for our reading of Earth history.


Introduction
A variety of evolutionary innovations in the ecophysiology of land plants and other components of the terrestrial biota are proposed to have been transformative for the carbon cycle as the relevant clade became widespread and ecologically dominant, but how the carbon cycle can be transformed is highly constrained. The reaction of silicate minerals and CO2, both introduced to Earth's surface via volcanism, forms clays and releases ions to solution to be later precipitated as marine carbonates (Berner 1991). This net chemical weathering reaction modulates atmospheric CO2 concentration and regulates Earth's climate over geological time (Urey 1952). The input of carbon to Earth's surface environment via volcanic and metamorphic outgassing is balanced by the output of carbon from Earth's surface environment via burial in rocks as carbonate or organic carbon. This balance must be maintained on ∼10^6 yr timescales to avoid deterioration of Earth's climate to either a Mars-like (carbon removal outstrips introduction) or Venus-like (carbon introduction outstrips removal) state (Walker et al. 1981; Berner and Caldeira 1997; D'Antonio et al. 2020; Isson et al. 2020). Because weathering consumes CO2, a permanent or prolonged (>10^5 yr) change in weathering rate without a corresponding proportional change in carbon inputs would have catastrophic consequences in Earth's exogenic carbon cycle (e.g., a Snowball Earth episode), consequences extreme enough that they demonstrably have not happened over the Phanerozoic.
Evolutionary events may be associated with perturbations of the carbon cycle that occur when a new equilibrium atmospheric CO2 concentration is established. These perturbations can provoke sharp changes to weathering-derived nutrient fluxes and climate until steady state is reached, but such imbalances must be resolved within ∼1 Myr (Algeo and Scheckler 2010; Bachan et al. 2017; D'Antonio et al. 2020). For example, the Frasnian/Famennian (Late Devonian) Kellwasser events each lasted ∼100 kyr, calibrated to the short eccentricity Milankovitch cycle (Schindler [1990] as cited in House 2002; De Vleeschouwer et al. 2013; Pier et al. 2021), and the end-Famennian (Devonian/Carboniferous) Hangenberg crisis lasted 50-100 kyr, calibrated to precise U-Pb zircon dates from bounding ash beds (Myrow et al. 2014). Both are appropriate durations for short-term perturbations potentially arising from transiently elevated weathering fluxes. These perturbations can involve dramatic impacts, including glaciation, marine anoxia, and mass extinction, but their transience and rapid resolution require a close coupling of cause and effect on geological timescales. This presents a problem, because the attributions of environmental events to evolutionary causes are often associations separated by tens of millions of years, such as Late Ordovician glaciation following the Middle Ordovician appearance of land plants and the Late Devonian black shale events following the Middle Devonian appearance of deep-rooting trees (Berner 1997; Algeo and Scheckler 1998; Lenton et al. 2012). (Here, we note that not all environmental events fit this <1 Myr context of carbon cycle perturbations. For example, the 10 Myr timescales of tectonics-driven changes to paleogeography and ocean circulation patterns are more relevant for explaining the longer Phanerozoic glaciations [Scher and Martin 2006; Pohl et al. 2014], and the 100 Myr timescales of the rock cycle [Bachan and Kump 2015; Boyce et al. 2022] become relevant for considering the longest events, such as the Paleoproterozoic Lomagundi-Jatuli event [Prave et al. 2021].)
For carbon cycle perturbations, cause and effect are required to be near simultaneous over million-year timescales; however, the geological expression of this cause and effect may then be distorted by differential preservation in the rock record. The global ocean is well mixed on 1-5 kyr timescales, far shorter than the 150 kyr residence time of carbon in the system, so that the potential preservation of carbon cycle perturbations in the geological record should be globally distributed. These perturbations can then be intensively sampled through the carbon isotopic composition of nearshore marine carbonates, which are abundant and temporally continuous (e.g., Caplan and Bustin 1999; Zachos et al. 2005; Hull 2015). For example, thousands of δ13C measurements have been taken within 100 kyr both of the Paleocene-Eocene thermal maximum (PETM) and of mass extinctions such as the Cretaceous/Paleogene and Permian/Triassic extinction events (e.g., Payne et al. 2004; Hull 2015; Hull et al. 2020).
The fossil record behaves differently. Just as the last appearance of a fossil will precede the actual extinction of that taxon (Signor and Lipps 1982), the first appearance of a fossil will lag the actual origination (Sepkoski 1998; Kirchner and Weil 2000), with range offset spanning anywhere from a few hundred thousand years to several million years (Holland and Patzkowsky 2002). In addition to this phenomenon inherent to the structure of stratigraphic architecture, incomplete sampling has been shown to lengthen lags between origination and first appearance, or last appearance and extinction (Kirchner and Weil 2000). Statistical analyses of the fossil record have been conducted with marine strata and taxa (Marshall 1990, 1994), but ecological gradients, stratigraphic architecture, and facies effects are all important hurdles for inferring taxon range (Holland and Patzkowsky 2002; Patzkowsky and Holland 2012; Holland 2020). The complications of incomplete preservation and sampling are compounded for terrestrial fossils due to the extreme patchiness of fossiliferous strata requiring a basin to have been present at the right place and at the right time, with elevation, relief, burial rate, and erosion rate likely playing important roles (Holland 1995, 2016, 2022; Kidwell and Holland 2002; Peters and Husson 2017). This patchiness can be seen in the outsized importance of Euramerican foreland basins for our understanding of Pennsylvanian forests (Nelsen et al. 2016), followed by the specific importance of South Africa in understanding evolution of the land biota in the Permian and Triassic (Anderson 1983, 1997; Gastaldo et al. 2005, 2015).
Together, these factors are likely to lead to an inversion of our basic expectations regarding cause and effect in the rock record. Given the resolution limits of geological time, evolutionary events should be essentially simultaneous with any resulting weathering-mediated carbon cycle perturbations they might have caused. This simultaneity, once filtered through the differential preservation potential and sampling intensity of the geochemical record of environmental perturbation versus the fossil record of potential biotic causes, should result in the earliest record of environmental effect preceding the record of its biotic cause (Fig. 1). Here, this hypothesis is explained with a simple model, and its potential implications are explored in the context of Paleozoic land plant evolution.

Methods
For illustrative purposes, land plant evolution occurring in the terrestrial realm is modeled; however, this logic would similarly apply to any other evolutionary event and an environmental effect argued to be related via causality, such as the impact of earthworm evolution on soil carbon storage, of shallow-marine burrowers on sedimentary geochemistry, or of cyanobacterial evolution and atmospheric oxygenation. The expected number of samples per unit time that will capture either a carbon cycle perturbation or an appropriate fossil of the biotic trigger of the perturbation, E(x), is approximated by the equation:
E(x) = A × B × C × D (1)
where A is the number of samples per unit time, B is the probability that appropriate sediments are available for geochemical or biological preservation, C is the probability that the geochemical or biological signal was regionally present, and D is the probability that an entirely appropriate sample will have captured the relevant geochemistry or biology. For carbon cycle perturbations, A_geo was set to 100,000 samples per 2 Myr, B_geo was set to 1 to reflect the relative abundance of marine limestones, C_geo was set to 1 to reflect the well-mixed nature of the ocean on relevant timescales, and D_geo was set to 0.95 to reflect the signal of some samples being lost to vital effects or to diagenesis and other postdepositional alteration. Fossils, especially terrestrial fossils, are preserved and distributed differently than marine carbonates and are also subject to differences in sampling intensity, leading to divergent value assignments for A_bio through D_bio. For biological fossils, A_bio was set to 1000 samples per 2 Myr, purposely 100 times fewer than A_geo for the carbon cycle sampling; B_bio was set to 0.05 to reflect the availability of floodplain, mire, and lake deposits (the relevant land plant fossil record archives) relative to the abundance of marine deposits; C_bio was set to 0.25 to reflect relevant available land surface; and D_bio was set to 0.3 to reflect the fossil needing to be of the correct lineage. These values are order-of-magnitude estimates and no precision is implied; however, it is unambiguous that each of the A-D values should be substantially lower for the terrestrial fossil record than for the shallow-marine carbonate record, and the preservational disadvantages of terrestrial fossils are magnified by the multiplication of these factors.
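
As a rough illustration of how quickly the multiplied factors diverge, the parameter values quoted above can be combined directly. This is a minimal sketch in Python, not the authors' R code; it assumes the four factors combine as a simple product (as the text's reference to "the multiplication of these factors" indicates), and the names `GEO`, `BIO`, and `expected_samples` are hypothetical.

```python
from math import prod

# Parameter values quoted in the Methods (order-of-magnitude estimates only).
GEO = {"A": 100_000, "B": 1.0, "C": 1.0, "D": 0.95}
BIO = {"A": 1_000, "B": 0.05, "C": 0.25, "D": 0.3}

def expected_samples(params):
    """Expected informative samples per 2 Myr, assuming E(x) = A * B * C * D."""
    return prod(params[key] for key in ("A", "B", "C", "D"))

e_geo = expected_samples(GEO)
e_bio = expected_samples(BIO)
print(f"geochemical: {e_geo:.0f}, fossil: {e_bio:.2f}, ratio: {e_geo / e_bio:.0f}")
```

Under these values, roughly 95,000 informative geochemical samples are expected per 2 Myr against fewer than 4 informative fossils, a gap of about four orders of magnitude even though A differs by only two.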
FIGURE 1. Schematic of how cause and effect will be presented in the rock record. An evolutionary event (i.e., origination of a clade with a trait of biogeochemical importance) occurs, and its abundance increases over time until it has broad expression over the landscape. Its rise to environmental abundance can trigger a carbon cycle perturbation, e.g., through a transient imbalance in global weathering, nutrient fluxes, or organic carbon burial rates. Depending on the environmental impact, such a carbon cycle perturbation may be expressed as an isotopic excursion, a black shale horizon, and/or a mass extinction, all of which can be preserved in the rock record without delay relative to their true timing due to their global character and wide environmental expression. A substantial carbon cycle perturbation must be resolved within 1 Myr, but the return to equilibrium will often take much less time, related to the ∼150 kyr residence time of carbon in Earth's surface reservoirs (atmosphere, ocean, and biosphere). At the same time, a significant lag is expected between the true origin and the first fossil sampling of an evolutionary lineage, with expected time lags lasting into the millions of years. This suggests that the cause of a carbon cycle perturbation will most often appear in the rock record after the geochemical perturbation itself.
A uniform distribution with min = 0 yr and max = 2 Myr was then sampled A times for both the geochemical and biological samples (Fig. 2). These graphs represent spatial distributions of geochemical data points and fossils. The simulated data were then filtered progressively through sampling, without replacement, of factors B, C, and D (Fig. 2). The perturbation is assumed to last 100 kyr, with an onset 100 kyr after the initial appearance of its evolutionary cause. The expected frequency that the earliest fossil sampling of an evolutionary event will appear either before the onset of the environmental perturbation (i.e., within 100 kyr of the evolutionary origin) or simultaneous with the environmental perturbation (i.e., between 100 and 200 kyr of the evolutionary origin) was then calculated by dividing the number of biological data points remaining after filtration through B, C, and D by the original biological data sample size (A_bio = 1000). This filtration of the original uniform distribution was then repeated 1000 times. The code used for calculations is included as an R file in the Supplementary Material.
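
The sampling and filtering procedure described here can be sketched as a short Monte Carlo routine. The following is a hedged Python illustration and not the supplementary R code. It assumes that each filter keeps a random subset whose size is truncated to a whole number of fossils, and that the reported frequency is the count of surviving fossils inside a time window divided by the original sample size. All function and constant names are hypothetical.

```python
import random

SPAN_YR = 2_000_000     # interval sampled after the true origination
ONSET_YR = 100_000      # perturbation onset (lag after origination)
DURATION_YR = 100_000   # perturbation duration

def filter_once(rng, a=1000, probs=(0.05, 0.25, 0.3)):
    """Draw `a` sampling opportunities, then thin them through B, C, and D."""
    ages = [rng.uniform(0, SPAN_YR) for _ in range(a)]
    for p in probs:
        # Assumption: the retained count is truncated to whole fossils.
        ages = rng.sample(ages, int(len(ages) * p))  # without replacement
    return ages

def expected_frequencies(n_reps=1000, a=1000, probs=(0.05, 0.25, 0.3), seed=1):
    """Mean fraction of the original `a` samples that survive filtering and
    fall before, or during, the perturbation window."""
    rng = random.Random(seed)
    before = during = 0
    for _ in range(n_reps):
        ages = filter_once(rng, a, probs)
        before += sum(t < ONSET_YR for t in ages)
        during += sum(ONSET_YR <= t < ONSET_YR + DURATION_YR for t in ages)
    return before / (a * n_reps), during / (a * n_reps)

freq_before, freq_during = expected_frequencies()
print(f"before onset: {freq_before:.4%}, during perturbation: {freq_during:.4%}")
```

Because a random subsample of a uniform sample is itself uniform, the filters change only how many fossils survive, not where they fall in time; the window fractions then follow from the 100 kyr windows each covering 5% of the 2 Myr interval.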

Results and Discussion
Given our assumptions, a new lineage of novel ecophysiological importance will be sampled as a fossil before its ecological spread and induction of a carbon cycle perturbation (i.e., before the leftmost vertical green line in Fig. 2) only 0.015% of the time. Thus, if our approximations of A-D are reasonable to a first order, then only once or twice out of 10,000 events would the fossil cause appear before the effect of a carbon cycle perturbation. Considering also the expected frequency of earliest fossil appearance simultaneous with the carbon cycle perturbation it caused (i.e., between the vertical green lines in Fig. 2) only adds an additional 0.015%.
Although crude, these simple calculations highlight that it is highly unlikely that the earliest fossil would come before an environmental perturbation it caused; rather, the opposite is the case and, most often, the earliest record of the perturbation should appear in the rock record before the earliest fossil of the relevant lineage. The assumptions made in our calculation are conservative where possible. For example, the likelihood of finding a relevant fossil is not uniform but should increase through time as the new lineage increases in abundance and geographic range; therefore, the expected frequency of early fossil finds close to the origination of the taxon or trait should be lower than coded here (Marshall 1990; Holland 2016). Furthermore, the D_bio in our formulation assumes any fossil documentation of a lineage is equally adequate when more specific information may be required. As an example, most fossils of a plant lineage may be of leaves with relatively few specimens documenting a habit of large, deep-rooting trees.
Even in an extreme parameterization of the three biological values amplifying preservation potential (B_bio = 0.25, C_bio = 0.5, and D_bio = 0.5, vs. the original B_bio = 0.05, C_bio = 0.25, and D_bio = 0.3, with A_bio held the same as in the "Methods"), the expected frequency of the earliest fossil of an evolutionary event appearing earlier than the carbon cycle perturbation it caused only rises to 0.3%. Alternate parameterizations might include a much longer lag between the origins of the relevant trait and its rise to ecological dominance and perturbation of the system (Fig. 1). If the lineage remained rare enough not to impact the system, however, then it is expected to have a more limited opportunity for fossil preservation.
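
The two parameterizations can be compared with a closed-form back-of-the-envelope check. This Python sketch rests on two assumptions that the text does not confirm: that each filtering step truncates the retained count to a whole number of fossils, and that the reported frequency is the number of surviving fossils expected in a 100 kyr window divided by A_bio. The function names are hypothetical.

```python
def retained(a, probs):
    """Fossils left after successive thinning, truncating to whole fossils (assumed)."""
    n = a
    for p in probs:
        n = int(n * p)
    return n

def window_frequency(a, probs, window_yr=100_000, span_yr=2_000_000):
    """Expected fraction of the original `a` samples that survive filtering
    and fall inside a window of `window_yr` within `span_yr`."""
    return retained(a, probs) * (window_yr / span_yr) / a

base = window_frequency(1000, (0.05, 0.25, 0.3))     # original B, C, D
extreme = window_frequency(1000, (0.25, 0.5, 0.5))   # amplified B, C, D
print(f"base: {base:.3%}, extreme: {extreme:.3%}")
```

Under these assumptions, 3 fossils survive the original filters and 62 survive the amplified ones, giving about 0.015% and about 0.3%, respectively, in line with the frequencies reported in the text.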
There is a wider parameter space to consider, but the outcome of evolutionary cause generally appearing after environmental effect in the rock record may be inescapable. Certainly, much of Earth history is poorly sampled, but there can be no lag in the sampling of a carbon cycle perturbation: either it has been sampled from the rock record before its resolution or it has not. And if it has not been sampled, then it is simply an unknown for which no explanation will be sought. A longer carbon cycle perturbation lasting >1 Myr can be constructed to allow greater likelihood of its evolutionary cause being sampled before its resolution. However, this would require small imbalances of ∼1% (Berner and Caldeira 1997; D'Antonio et al. 2020) that are unlikely to be recognized as a perturbation in need of explanation, far from the 100%-10,000% imbalances that are considered and implemented into carbon cycle models, for example, in weathering capacity increases between barren and vegetated substrates (Moulton and Berner 1998; Lenton et al. 2012).
FIGURE 2. Effect of differential taphonomic filtering on preserving cause and effect in the rock record. In all panels, an environmental perturbation lasting 100 kyr (bounded by the two green lines) lags the true origin (time 0) of its biotic cause by 100 kyr, allowing for the establishment and spread of the lineage before the resulting environmental effects. Geochemical sampling through the time interval can be expected to be effectively continuous (small gray circles, abundant enough to be individually indistinguishable in the graph), but the potential for fossil sampling (blue circles) of the biotic cause before or simultaneous with the resulting perturbation decreases drastically given the parameterization of A, B, C, and D from equation (1) in the text. Thus, the expected frequency of environmental perturbation sampling being earlier than the evolutionary event in the fossil record increases with each step in the data-filtration pipeline. a, Before filtration, 1000 paleontological vs. 100,000 geochemical sampling opportunities are available (i.e., A_bio = 0.01 A_geo) within the first 2 Myr following the true first appearance of the relevant evolutionary lineage. b-d, Fossil data points filtered successively through the probability of sampling being of an appropriate environment (B_bio = 0.05), that the relevant lineage was regionally present (C_bio = 0.25), and that an otherwise appropriate fossil is of the correct lineage (D_bio = 0.3). Geochemical sampling remains steady through this filtration of the fossil record (i.e., B_geo = C_geo = 1), other than a slight drop in sampling (D_geo = 0.95) in d, reflecting the possibility of diagenetic alteration of isotopic geochemistry.
At the other extreme, this exercise illustrates how far removed from the actual particulars are various suggestions in the literature that biotic events led to environmental perturbations millions or tens of millions of years later. These scenarios would require the routine capture as fossils of the earliest examples of a lineage when still found only in localized populations of low abundance (contrary to expectations of preservation potential), followed by prolonged suppression of any dispersal across the broader landscape so as to delay environmental impact, despite vegetation being capable of migrating thousands of kilometers on 10 kyr timescales, as documented both in the last deglaciation and during the PETM (Wing et al. 2005; Zanon et al. 2018). Such a scenario would then culminate with a rapid spread at the time of the actual perturbation. Some suggestions in the earlier literature of geobiological impact long after a first appearance can be recognized to have been reasonable in their original context of poor temporal precision regarding the events involved. In this way, it was reasonable in decades past to wonder whether Pangea formation was relevant to end-Permian extinctions, as the temporal constraints were not there to be confident the events were separated by tens of millions of years (Erwin 1993). However, these suggestions must be recognized as artifacts from the history of our science for which continuing citation as viable possibilities is not warranted.
Land Plants and Paleozoic Glaciations: A Case Study
The appearance of land plants was perhaps the most consequential geobiological event in the Phanerozoic. Today, the group represents ∼80% of total global biomass (Bar-On et al. 2018) and is responsible for ∼50% of net primary productivity (Field et al. 1998). Land plants are often implicated in changes in the Earth's surface, as they possess high potential for ecosystem and Earth system engineering. This includes their ability both to transport water deep into continental interiors via transpirational recycling (Shukla and Mintz 1982; Boyce and Lee 2017; Ibarra et al. 2019) and to impact weathering on micro- (Drever 1994) and macrospatial scales (Berner 1992; Winnick and Maher 2018), as well as ecological (Moulton and Berner 1998; Moulton et al. 2000) and geological temporal scales (Algeo and Scheckler 2010; D'Antonio et al. 2020; Boyce et al. 2022). Over the Paleozoic, the appearances of land plants, vascular plants, and deep-rooting vascular plant trees in the lowlands followed by colonization of the dry well-drained uplands have each been thought to have increased weathering capacity, with potential impacts including glaciations of varying duration, marine anoxia driven by increased nutrient fluxes, and mass extinction (Berner 1992; Algeo and Scheckler 1998, 2010; Lenton et al. 2012).
The evolution of terrestrial vegetation and successive innovations in plant physiology and how plants interact with their substrate, such as deep rooting and mycorrhizal associations, can enhance weathering capacity at any given atmospheric CO2 concentration, leading to lower equilibrium levels of atmospheric CO2, as borne out both by modeling and proxy data (Berner 1992, 2006; Royer et al. 2014; Ibarra et al. 2019). Because CO2 is a greenhouse gas, the permanent lowering of its baseline concentrations can correctly be viewed as a contributing factor in all later glaciations. In this sense, it is logically correct to view the Devonian evolution of trees as contributing to the late Paleozoic glaciations (Berner 1997); however, it would be equally correct to view Devonian tree evolution as contributing to our Cenozoic glaciation, because atmospheric CO2 has never returned to the concentrations that existed before the Devonian. In both cases, these glaciations that each spanned tens of millions of years would have been rendered more likely to occur with lower equilibrium CO2 concentrations but would also have been highly dependent on favorable continental configurations and other factors.
Where prior studies have suggested that an evolutionary innovation in the terrestrial biota, including several different land plant lineages, arbuscular mycorrhizal and ectomycorrhizal fungi, lichens, and cryptobiotic soil crusts, was the cause of a carbon cycle perturbation or long-term trend, these studies have done so either by identifying a perturbation or trend and scanning backward in time until reaching a suitable evolutionary origin or by identifying an evolutionary origin and scanning forward in time until reaching a suitable perturbation or trend (e.g., Berner 1992, 2006; Algeo et al. 2001; Heckman et al. 2001; Kennedy et al. 2006; Lenton et al. 2012, 2016; Kump 2014). When operating under this paradigm, the evolutionary origin coming before the effect in the rock record becomes an unavoidable outcome, because it is already assumed in the first place. Our findings suggest that a different, counterintuitive logic may be more accurate: one should look after the effect for the cause. In practice, this would involve recognizing the geochemically recorded timing of the perturbation to be accurate and then identifying evolutionary causes that might have been plausibly simultaneous with the perturbation while recognizing that the first record of that cause can be expected to come later in the stratigraphic record, perhaps by a few million years.
The glacial pulse associated with the Hangenberg crisis serves as a useful case study to apply this logic in Earth history. This glacial pulse was terminal-Devonian (∼359 Ma) and lasted 50-100 kyr (Myrow et al. 2014), and its timing and duration are now well understood globally (Caplan and Bustin 1999; Kaiser et al. 2015; Becker et al. 2016). The duration and directionality of climate deterioration are consistent with a pulse of globally elevated weathering fluxes relative to volcanic outgassing as atmospheric CO2 declined to a new equilibrium concentration (D'Antonio et al. 2020). Although the cause of the perturbation remains uncertain and may have been abiotic (Caplan and Bustin 1999), it has been suggested that plant evolution played a role (Pawlik et al. 2020). If the perturbation was caused by a land plant evolutionary event, then the question should be, what could have happened at the same time as the event? On its own, the evolution of seed plants is removed from possibility, because they first appear as fossils within the Famennian (specifically, Fa2c miospore biozone) (Gillespie et al. 1981; Rothwell et al. 1989), approximately 363 Ma (House and Gradstein 2005), roughly 4 Myr too early to have been a cause of the Hangenberg glacial episode. Likewise, on its own, the evolution of deep robust rooting systems is removed from possibility, because they appear as fossils in the mid-Devonian (Stein et al. 2020), 25-30 Myr before the Hangenberg glacial episode and much too early to have been relevant. If these earlier land plant evolutionary events in the Devonian (i.e., the origin of seeds and several independent originations of deep rooting systems) did represent carbon cycle perturbations, then they would most likely have been manifested as some of the Devonian black shale horizons, although with existing age constraints and the patchiness of the geological record it may be difficult to match specific evolutionary events with specific periods of widespread black shale deposition (Algeo and Scheckler 1998, 2010).
For the Hangenberg glaciation, the most realistic contender for a biotic cause might be the evolution of the first abundantly woody trees specifically among the seed plants. Large, deep-rooting trees had been present in proximal settings since the Middle Devonian among free-sporing vascular plants, but the severing of dependence on environmental water for reproduction may have allowed seed plant trees to spread inland more broadly, including to the uplands, and could have led to a meaningful pulse of elevated weathering fluxes on a global scale. Seed plants are first known earlier in the Late Devonian as smaller shrubs that would have been more shallowly rooting; the first massively woody seed plant trunks appear as fossils in the earliest Carboniferous (Galtier and Meyer-Berthaud 2006; Decombeix et al. 2011; Chen et al. 2021). Thus, it is these trees that might have plausibly spread at the time needed to have induced the global perturbation that is the Hangenberg.
For decades, we have recognized that the victims of mass extinction should last appear in the fossil record before the geological record of the environmental event (Signor and Lipps 1982; Marshall 1990; Marshall and Ward 1996). In a parallel way, the first appearance of a fossil lineage should postdate any global environmental disruption caused by the lineage, as with the carbon cycle. Of course, the evolution and spread of those seed plant trees seen in the earliest Carboniferous may instead have been a response to the climate change inherent in the Hangenberg glacial event, rather than being the cause of that event. A basic consequence of the logic advocated here is that effect and true cause should be difficult to distinguish. The record requires a complex reading that may always be ambiguous. However, there is still value in understanding the limits of what can be known and eliminating from contention all the traditional suspects that evolved millions of years too early.

Acknowledgments
We thank M. Patzkowsky, S. Holland, and A. Bush for their helpful comments during the review process, including the suggestion from A. Bush to add Fig. 1.

Data Availability Statement
The R code used in this project is available as Supplementary Material in Dryad: https://doi.org/10.5061/dryad.fbg79cnxw.