Sorting of persistent morphological polymorphisms links paleobiological pattern to population process

Abstract Biological variation fuels evolutionary change. Across longer timescales, however, polymorphisms at both the genomic and phenotypic levels often persist longer than would be expected under standard population genetic models such as positive selection or genetic drift. Explaining the maintenance of this variation within populations across long time spans via balancing selection has been a major triumph of theoretical population genetics and ecology. Although persistent polymorphisms can often be traced in fossil lineages over long periods through the rock record, paleobiology has had little to say about either the long-term maintenance of phenotypic variation or its macroevolutionary consequences. I explore the dynamics that occur when persistent polymorphisms maintained over long lineage durations are filtered into descendant lineages during periods of demographic upheaval that occur at speciation. I evaluate these patterns in two lineages: Ectocion, a genus of Eocene mammals, and botryocrinids, a Mississippian cladid crinoid family. Following origination, descendants are less variable than their ancestors. The patterns by which ancestral variation is sorted cannot be distinguished from drift. Maintained and accumulated polymorphisms in highly variable ancestral lineages such as Barycrinus rhombiferus Owen and Shumard, 1852 may fuel radiations as character states are sorted into multiple descendant lineages. Interrogating the conditions under which trans-specific polymorphism is either maintained or lost during periods of demographic and ecological upheaval can explain how population-level processes contribute to the emergent macroevolutionary dynamics that shape the history of life as preserved in the fossil record.


https://doi.org/10.1017/pab.2023.27 Published online by Cambridge University Press

Introduction
Polymorphism is ubiquitous within natural populations. It spans levels from the genome to the phenotype, and it is difficult to look far without observing variation. Positive selection, purifying selection, and genetic drift are all expected to remove variation from the population, which has appeared at odds with the vast reservoirs of variation displayed by many organisms. However, theoretical and empirical work has shown how environmental heterogeneity (McDonald and Ayala 1974; Christiansen 1975; Chakraborty and Fry 2016; Gallet et al. 2018) and negative frequency-dependent selection (NFDS) (Clarke 1979; Charlesworth 2006; Fitzpatrick et al. 2007; Takahashi et al. 2010) can protect such "balanced polymorphisms," maintaining genetic and phenotypic variation over long timescales. While variation within populations is most often examined from the standpoint of microevolution, there is increasing evidence that polymorphisms can persist over long time periods. In spanning temporal and taxonomic scales, persistent polymorphic variation might offer a window into understanding how the microevolutionary processes that promote long-term polymorphism might shape macroevolutionary patterns and processes.
Despite its omnipresence in nature and frequent treatment in population genetics, the causes and effects of intrapopulation polymorphism have been understudied at deeper timescales. Characters employed in phylogenetic studies frequently display polymorphism across multiple lineages, but researchers have frequently downplayed this variation. Nevertheless, such polymorphisms are sufficiently common to substantially impact phylogenetic inferences when not explicitly accommodated in analyses (Wiens 1995, 1999). Phenotypic polymorphisms also feature strongly in paleontological systematics. Characters chosen for cladistic analysis between fossil lineages are frequently rife with polymorphism, which can present challenges to existing algorithms (e.g., Trueman 2010; Gilbert 2013; Whiting et al. 2016; among others). Relative to the goals of systematics, these are often treated as nuisances. This is perhaps due to historical gaps in the integration between systematics, phylogenetic methods, and population genetics. Despite being frequently observed, the apparent maintenance of polymorphism across populations separated by millions of years of independent evolution remains understudied as a macroevolutionary phenomenon.
A small but growing body of work has contributed early inklings of the potential for sustained variation to explain evolutionary dynamics across species boundaries and over progressively deep timescales. One increasingly important issue in evolutionary ecology concerns how phenotypic and genetic variation evolves across species boundaries (e.g., Thompson et al. 2019). Standing genetic variation can facilitate rapid adaptation when populations are exposed to new ecological pressures, such as might occur during ecological speciation (Barrett and Schluter 2008; Sicard et al. 2016; Lai et al. 2019; McGee et al. 2020). There also has been an increasing focus in population genomics on understanding how (often balanced) polymorphisms in large ancestral populations may be "sieved" during the speciation process, becoming fixed among descendants either randomly or by selection (Pease et al. 2016; Guerrero and Hahn 2017). "Deeply persistent" polymorphisms can even be maintained by NFDS over millions of years, shaping macroevolutionary dynamics over large clades (Igic et al. 2006; Goldberg et al. 2010).
Paleobiologists have spent less time seeking process-based explanations for the evolution of phenotypic variation. However, the fossil record might be leveraged to offer unique insight into this fundamental question. Fossilized populations often display variation maintained over long intervals of time (e.g., Brothwell 1963; also see examples reviewed extensively by Van Valen 1969). Although the fossil record has been underutilized as a laboratory to investigate population genetics, the marine invertebrate record has shown promise in its ability to illuminate the processes governing the maintenance of phenotypic variation (Kermack 1954). Intraspecific variation has been hypothesized to have contributed to the explosive diversification of trilobites during the Cambrian (Webster 2007). Fossil lineages frequently display high population-level variation, both in quantitative traits and discrete character states. This variation can be maintained for long periods of time, across a range of environmental conditions (including environmental stasis; see Schopf and Gooch 1972). Despite these substantial advances, the full incorporation of persistent, trans-specific phenotypic polymorphisms in the fossil record into paleobiological theory has remained incomplete. Nevertheless, the presence of persistent intraspecific variation shared between fossilized lineages is an often-unrecognized curiosity when viewed from the lens of population genetics that may contain hidden insights when bridging micro- and macroevolutionary scales.
While not always formally recognized as such, the evolution of phenotypic variation during speciation has long factored implicitly into fundamental questions in paleobiology (Simpson 1944; Eldredge and Gould 1972; Gingerich 1974; Polly 1997). In particular, paleobiologists have often been concerned with understanding how speciation proceeds between ancestral and descendant lineages. Key neontological work has also suggested that examining the maintenance of ancestral variation and its distribution among descendant populations may explain the formation and adaptive divergence of new lineages (Wright 1982; Schluter and Conte 2009). Budding speciation, identified as asymmetric speciation between a large ancestral population and a smaller descendant population, has long served as paleobiological bread and butter (e.g., Rensch 1959; Jackson and Cheetham 1994; Aze et al. 2011; Warnock et al. 2020; Parins-Fukuchi 2021). One important characteristic of budding is the persistence of the ancestral lineage alongside its newly formed descendant. There may be a geographic component, wherein the ancestral lineage occupies a wide geographic range and the descendant lineage arises due to peripheral isolation within a smaller area, but this is not essential. Budding may also be used to describe the speciation dynamics observed in important neontological work (e.g., Schluter 2000), and so is a possible conceptual bridge by which the population processes underlying the evolution of variation studied by neontologists may be more thoroughly integrated into the paleobiological research program.
Tracing the evolution of polymorphic characters across the repeated episodes of budding speciation found in dense fossil records would provide a unique opportunity to test the predictions from population genetics surrounding the inheritance of persistent, standing variation into descendant populations and examine their effects over longer timescales. When populations bud from an ancestral lineage, the pattern by which traits and alleles become fixed in the new species may be "parallel," "divergent," or "random" (Schluter and Nagel 1995; Thompson et al. 2019). In the parallel case, new descendant lineages tend to fix the same traits as they radiate into similar environments. In the divergent case, descendants evolve in opposing phenotypic directions as they radiate into distinct environments. Finally, newly budded species may randomly sort variation present in the ancestor if demographic asymmetry in population divergences leads to bottlenecks. Each of these leads to different implications both within lineages and over deeper time.
In this paper, I attempt to explain the evolution of phenotypic variation across episodes of budding speciation in two fossil lineages: Ectocion, a genus of Eocene mammals closely related to Perissodactyla; and botryocrinids, a family of mostly Mississippian cladid crinoids. My first goal was to explain how an explicit treatment of persistent polymorphisms can clarify interpretations of biological patterns in the fossil record. I then sought to further examine the ways by which persistent ancestral variation is sorted into descendant lineages. I sought to test (1) whether descendant species tended to be less variable than ancestral species, a key prediction of budding speciation; and (2) whether ancestral variation tended to sort randomly, consistent with the fixation of polymorphisms due to population bottlenecking, or nonrandomly, consistent with either convergent or divergent selection following speciation.

Datasets
I harvested morphologic and stratigraphic datasets encompassing two lineages from the literature: Ectocion (Thewissen 1992) and Botryocrinidae (Gahn and Kammer 2002). These two lineages offer several advantages. Both possess dense fossil records, simplifying the identification of likely ancestor-descendant relationships. They also have benefited from careful work delimiting species and tracing continuous lineages through stratigraphic zones in particularly well-studied regional faunas. The operational taxonomic units (OTUs) represented within each dataset can therefore likely be regarded as trustworthy. This careful work has also provided greater certainty in trusting the persistence of polymorphisms observed in each dataset: while character state frequencies among several polymorphic characters fluctuate throughout the range of each lineage, the polymorphic variation attributed to each lineage largely remained intact overall. The maintenance of polymorphisms throughout lineage durations is further supported by the persistence of polymorphic characters across several speciations. The repeated observation of polymorphism across multiple lineages favors the persistence of polymorphisms within and between lineages as the most parsimonious explanation.
The Ectocion dataset was entirely dental. After reconstructing the phylogeny, I mapped the evolution of the P3 metacone, which was polymorphic across two lineages over long time spans. While explicitly mapping such detailed dental morphology to dietary function is a very challenging prospect, it is reasonable to imagine that the metacone, which is haphazardly distributed across Ectocion, may have provided some dimension of dietary function. The botryocrinid dataset sampled characters across the Bauplan, spanning several anatomical regions: the arms, stalk, and calyx plating, including the configuration of the posterior plates. Polymorphisms were observed across all regions. Functional interpretations for each of these traits are very challenging. However, all, or nearly all, do interact with the environment in some way. Premolar morphology undoubtedly plays a major role in food processing in terrestrial vertebrates, such as Ectocion. Functional morphology is also well characterized in crinoids. For example, crown morphology appears to contribute to hydrodynamics (Cole et al. 2019), while column morphology can correspond to the ecological niche of crinoid species via tiering (Ausich 1980; Ausich and Bottjer 1982). As a result, it is feasible that these traits could be under some form of selection, although closer biomechanical and eco-phenotypic analysis would be needed to better understand the specific functional context of each of these traits within the lineages examined.
Several of the characters observed by Gahn and Kammer (2002) were not true discrete characters, but rather, discretized descriptions of variation that is fundamentally continuous. While this challenges interpretations of these particular characters within the framework of strict allelic polymorphism, I believe they were appropriate to retain in the current study. This is because my goal in analyzing the botryocrinid dataset was to examine how ancestral variation is sorted and maintained across ancestor-descendant transitions. While coarse, even these discretized treatments offer resolution into this topic, especially if the continuous variation represented within is at least somewhat discontinuous in its frequency distribution. Such variation is also likely to correspond to higher heterozygosity among underlying loci, supporting their use in interpreting broad patterns underlying the evolution of variation.

Phylogenetic Reconstruction
I implemented a very simple method for the reconstruction of phylogeny, including ancestor-descendant relationships, for this study. It is based upon the greedy, agglomerative algorithm employed by neighbor-joining (Saitou and Nei 1987). It differs only by incorporating stratigraphic information into the distance matrix and allowing earlier-occurring lineages to serve as direct ancestors of those that occur later. The approach is also very similar to the greedy algorithm described by Alroy (1995), with some simplifications. It starts by constructing a matrix of all of the morphological distances between each OTU, calculated as the number of distinct character states between each possible OTU pair. In the case of polymorphism, an OTU pair is assigned a distance of zero for a character if the intersection between the character states displayed by each is non-empty. Stratigraphic distances are then added. These are calculated as the number of discrete gaps separating non-overlapping lineages. Lineages with overlapping or abutting ranges are assigned a stratigraphic distance of zero. The least dissimilar OTU pair is then identified and joined. If one of the OTUs first occurs before the other, it is assumed to be ancestral. If the OTUs start in the same time horizon, they are assigned a shared hypothetical ancestor that is placed in the same time bin. The algorithm then proceeds iteratively, adding OTU pairs until all pairs are joined. The tree is then rooted either by using an outgroup or by presuming the most stratigraphically basal OTU is the root. The resulting tree is one that minimizes the amount of evolutionary change and stratigraphic gaps across lineages. It thus bears similarities to both stratophenetics (Gingerich 1979) and stratocladistics (Fisher 2008).
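The distance step described above can be illustrated with a short Python sketch. It is not the code used in this study; it covers only the polymorphism-aware morphological distance, the stratigraphic gap distance, and the choice and polarization of the least dissimilar pair. It assumes that character states are stored as sets, that stratigraphic ranges are (first, last) integer zone indices with larger values being younger, and that the two distances are summed with equal weight. The function names and the toy OTUs are hypothetical.

```python
from itertools import combinations


def morph_distance(states_a, states_b):
    """Count characters whose state sets are disjoint (polymorphism-aware)."""
    return sum(1 for sa, sb in zip(states_a, states_b) if not (sa & sb))


def strat_distance(range_a, range_b):
    """Number of empty zones between two ranges; 0 if they overlap or abut."""
    (_, last_early), (first_late, _) = sorted([range_a, range_b])
    return max(0, first_late - last_early - 1)


def least_dissimilar_pair(otus):
    """Return (distance, otu_x, otu_y, ancestor) for the closest OTU pair.

    ancestor is the earlier-appearing OTU, or None when both first occur in
    the same zone (a shared hypothetical ancestor would then be invoked).
    """
    best = None
    for x, y in combinations(sorted(otus), 2):
        d = (morph_distance(otus[x]["states"], otus[y]["states"])
             + strat_distance(otus[x]["range"], otus[y]["range"]))
        if best is None or d < best[0]:
            best = (d, x, y)
    d, x, y = best
    first_x, first_y = otus[x]["range"][0], otus[y]["range"][0]
    if first_x < first_y:
        ancestor = x
    elif first_y < first_x:
        ancestor = y
    else:
        ancestor = None
    return d, x, y, ancestor


# Toy data (hypothetical): three OTUs scored for three characters.
toy_otus = {
    "A": {"states": [{0, 1}, {0}, {1}], "range": (1, 4)},
    "B": {"states": [{1}, {0}, {1}], "range": (3, 5)},
    "C": {"states": [{0}, {1}, {0}], "range": (7, 8)},
}
print(least_dissimilar_pair(toy_otus))
```

A full reconstruction would repeat this step, joining pairs until all OTUs are connected and then rooting the tree as described above; that bookkeeping is omitted here.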
The method used here is very simplistic. I do not endorse its use in large, complex datasets. Nevertheless, I believe it is adequate for my purposes. This is because the datasets employed in this study are quite small and benefit from having very dense and well-characterized stratigraphic and geographic ranges. In addition, the high degree of polymorphism displayed by both lineages supported the use of the distance approach used here over existing implementations, in that it offered a simple solution to the treatment of polymorphism when calculating evolutionary distances. This was important, given the high degree of polymorphism in the Barycrinus dataset and the inability of most existing parametric and parsimony approaches to accommodate polymorphism explicitly. Several existing Bayesian approaches entertain ancestor-descendant hypotheses, including budding speciation (Stadler et al. 2018). However, given the extensive challenges associated with searching treespace to explicitly identify such hypotheses, rather than simply integrating over them, I felt the simplicity of the method here was justified for use on the particular datasets employed. As a final note, in the absence of contrary information, it is more parsimonious to assume that taxa are related through ancestor-descendant sequences rather than invoking hypothetical ancestors (Polly 1997). This is particularly the case if budding speciation is assumed to predominate and sampling rates are high (Foote 1996). Moderate to low preservation, when paired with bifurcating speciation, may make hypothetical ancestors a safer a priori assumption, and so the extent to which this appeal to model parsimony generalizes to other cases should be gauged against these considerations. Nevertheless, given that the density of the fossil records displayed by the lineages examined here increases the odds of sampling ancestral taxa, this reasoning should lend some epistemological confidence to the method's behavior of minimizing the number of hypothetical ancestors. Nonetheless, my approach to phylogeny reconstruction, while likely appropriate for the densely sampled and carefully studied lineages employed here, is propositional, and ultimately somewhat crude, in its nature. It should therefore be accompanied by a more rigorous, evaluative approach for future larger-scale studies of budding dynamics in the fossil record, such as stratocladistics (e.g., Fisher 1991, 2008), stratolikelihood (e.g., Wagner 1998, 2000), or full-Bayesian approaches (e.g., Wright et al. 2021). On larger datasets, the simple approach implemented here may remain useful in generating a starting tree for more exhaustive heuristic searches.
I applied this algorithm to reconstruct relationships in the taxa Ectocion and botryocrinids. Stratigraphic ranges were harvested from the same references from which I sourced the morphologic information. One small adjustment was made in the botryocrinids. I allowed Barycrinus rhombiferus Owen and Shumard, 1852 to be ancestral to Barycrinus spurius Hall, 1858, despite them having originated in the same layer. This is because Gahn and Kammer (2002) made a compelling case for B. rhombiferus's likely status as an ancestor to multiple other Barycrinus lineages. In addition, while B. rhombiferus and B. spurius were both long-lasting and highly polymorphic lineages, B. rhombiferus possesses fewer derived character states than B. spurius. This makes a B. rhombiferus → B. spurius ancestor-descendant pair more parsimonious, according to stratocladistic criteria, than the reverse. In addition, allowing the possibility also achieved a result with several fewer hypothetical ancestors, making a strong appeal to model parsimony. It should be clarified that B. rhombiferus was not, a priori, constrained or assumed to be ancestral to B. spurius. This possibility was simply allowed by the analysis.

Random versus Nonrandom Sorting of Ancestral Variation
I devised a simple permutation test to identify whether ancestral variation in botryocrinids tended to sort randomly among descendant lineages, which would be suggestive of genetic drift driving fixation during speciation bottlenecks (i.e., founder events), or whether descendant lineages tended to fix the same ancestral variants, which would be consistent with natural selection during adaptation in similar environments (i.e., "ecological speciation" sensu Schluter and Conte [2009]). Using the ancestor-descendant transitions reconstructed across Barycrinus, I took the polymorphic character states displayed by each ancestral lineage. I then randomly sampled a single character state from each ancestrally polymorphic character to generate a hypothetical character sequence of inherited polymorphisms for each descendant lineage observed in the dataset. I then compared the pairwise Hamming distances between the simulated character states displayed by each descendant. For calculating distances, polymorphisms were treated as a unique character state. As a result, a comparison between a (01) polymorphism and (1) monomorphism resulted in a distance of 1. This was done according to the reasoning that, if incipient species were experiencing positive selection in new environments, they should purge the same ancestral character states. My expectation was that descendants adapting to similar environments would likely share more character states and thus have fewer differences. I repeated this sampling procedure 1000 times to generate a frequency distribution of pairwise distances between the character states displayed by each descendant that were randomly sampled from the polymorphic ancestor. I then compared the observed distances with the empirical distribution to identify any descendant taxon pairs that were more similar (indicating parallel adaptation) or different (indicating divergent adaptation) than expected under random sorting, using the 2.5% quantile as a significance threshold.
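The logic of this test can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: character states are stored as sets, each ancestral state is sampled with equal probability (the source does not specify sampling weights), and the 2.5% threshold is applied to each tail of the null distribution. The function names and the toy character data are hypothetical.

```python
import random


def hamming(seq_a, seq_b):
    """Number of positions at which two state sequences differ.

    A retained polymorphism such as {0, 1} counts as its own state, so
    {0, 1} versus {1} contributes a distance of 1.
    """
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def sorting_test(ancestor, desc_a, desc_b, n_reps=1000, seed=0):
    """Compare an observed descendant pair against random sorting.

    Returns (observed distance, (lower, upper) null quantiles, verdict).
    """
    rng = random.Random(seed)
    # Only ancestrally polymorphic characters can be "sorted".
    poly = [i for i, states in enumerate(ancestor) if len(states) > 1]
    observed = hamming([desc_a[i] for i in poly], [desc_b[i] for i in poly])

    null = []
    for _ in range(n_reps):
        # Each simulated descendant fixes one ancestral state per character.
        sim_a = [{rng.choice(sorted(ancestor[i]))} for i in poly]
        sim_b = [{rng.choice(sorted(ancestor[i]))} for i in poly]
        null.append(hamming(sim_a, sim_b))
    null.sort()
    lower = null[int(0.025 * (n_reps - 1))]
    upper = null[int(0.975 * (n_reps - 1))]

    if observed < lower:
        verdict = "parallel"
    elif observed > upper:
        verdict = "divergent"
    else:
        verdict = "random"
    return observed, (lower, upper), verdict


# Toy data (hypothetical): an ancestor polymorphic at 20 binary characters.
toy_ancestor = [{0, 1}] * 20
toy_desc_1 = [{0}] * 20
toy_desc_2 = [{0}] * 10 + [{1}] * 10
print(sorting_test(toy_ancestor, toy_desc_1, toy_desc_2))
```

With few polymorphic characters the null distribution is coarse, so only extreme similarity or dissimilarity can fall outside the quantile bounds.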

Ectocion Phylogeny
Phylogenetic relationships reconstructed within Ectocion (Fig. 1) are largely concordant with the interpretation that Thewissen (1992) derived from the same dataset based on stratocladistic criteria. The main point of contention lies in the placement of Ectocion mediotuber Thewissen, 1990 and Ectocion cedrus Thewissen, 1990 in relation to Ectocion collinus Russell, 1929. The original study suggested the three taxa form a grade, separated by hypothetical ancestors, whereas the current reconstruction posits that E. cedrus and E. mediotuber independently budded from E. collinus. This is a minor difference that will be discussed further later in the paper. Otherwise, the ancestor-descendant linking of E. mediotuber to Ectocion osbornianus Cope, 1882 and Ectocion parvus Granger, 1915 corresponds identically to the original study. Thewissen was conservative in his interpretation of Ectocion superstes Granger, 1915. However, the approach here suggested that it may have speciated from E. osbornianus. It should be noted that, while the diagram presented here gives the appearance of a direct ancestor-descendant relationship between the two, it is very possible that their true genealogical relationship is separated by one or more unobserved species. This would still make E. osbornianus an indirect ancestor.
Figure 1 (caption fragment): Further evaluation would be needed to distinguish between these two. Timescale reflects the discrete zonation used by Thewissen (1992).

Evolution of the P3 Metacone Polymorphism
Mapping the evolution of the P3 metacone to the Ectocion phylogeny illustrates the importance of considering both ancestor-descendant relationships and polymorphism when reconstructing patterns in morphological evolution in the fossil record. A standard cladogram would fail to capture how polymorphisms in E. mediotuber and E. osbornianus persisted and sorted among the respective descendants of each, instead representing the pattern of character changes through multiple character reversals. One slightly confusing pattern in P3 metacone evolution is the apparent sudden emergence of a monomorphic metacone in E. cedrus. A more parsimonious interpretation, based on P3 morphology, of the sequence of lineage divergence than that achieved using the greedy method implemented here is given in the inset to Figure 1, which implies that the metacone evolved one time and sorted among descendant lineages while being maintained across multiple species. Alternatively, it is possible that the metacone truly did evolve twice in Ectocion. A stronger understanding of the developmental programs controlling dental variation would help to distinguish whether the metacone is sufficiently evolvable to have emerged twice, but distinguishing between these two possibilities falls outside the scope of this study. While the invocation of an additional, unaccounted polymorphic hypothetical ancestor provides a better explanation for P3 morphology evolution, detailed phylogenetic work that more thoroughly explores treespace using parsimony or probabilistic criteria will be needed to confidently distinguish between these possibilities.
Distinguishing between inheritance of ancestral polymorphism and character reversal is important for deriving meaningful biological interpretations. The pattern in P3 evolution reconstructed here is consistent with a scenario in which balancing selection maintains variation across a long-lived lineage, giving rise to descendant lineages that fix that ancestral variation in different ways. If the maintenance of polymorphism by balancing selection is as common as ecological theory might suggest, many character distributions among fossil taxa (which are often interpreted as being highly homoplasious) may simply be driven by the sorting of variation among a small handful of variable ancestral lineages. In addition to potentially helping solve problems in paleontological systematics, exploration of these dynamics may shed light on the processes that drive the diversification of clades over paleobiological timescales.
The persistence of the P3 metacone polymorphism across E. mediotuber and E. osbornianus over several million years is difficult to explain by genetic drift alone. While some fluctuation was observed in the character state frequency throughout the range of E. osbornianus, the polymorphism remained at reasonably intermediate frequency, with >30% and <60% of individuals displaying the metacone throughout the duration of the lineage (Thewissen 1992). NFDS provides one possible explanation for this maintenance. Competition for food resources could provide negative, frequency-dependent feedback on the frequency of each P3 morph if metacone presence yielded different fitness outcomes related to the processing of some limiting food resource as a function of population density. Several modes of competition have long been recognized as an important cause of NFDS (Antonovics and Kareiva 1988; Dijkstra and Border 2018). The most relevant mode for P3 morphology might be resource competition associated with density dependence. Alternatively, the persistence of the P3 polymorphism might instead result from some form of bet-hedging or response to soft selection in the presence of environmental heterogeneity over either space or time. At least equally likely is the possibility that P3 morphology is selectively neutral and fluctuated in frequency due to genetic drift. Whatever maintained this polymorphism, the results here demonstrate how sorting from ancestral stock shapes the phenotypes displayed by incipient descendant lineages. The fixation of monomorphic P3 forms in the descendant lineages E. major, E. parvus, and E. superstes reflects either adaptation or drift from standing variation present in the ancestral lineages. If some form of balancing selection maintained P3 polymorphism in E. mediotuber and E. osbornianus, it is possible that demographic bottlenecks encountered during speciation of the three descendant lineages led to random fixation. More exploration into the overall levels of phenotypic diversity and geography of these descendant lineages would help to examine the feasibility of this explanation.
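One way to explore how long a neutral polymorphism would be expected to persist under drift alone is a simple Wright-Fisher simulation. The Python sketch below is illustrative only. It assumes, for the sake of argument, that the P3 morph behaves like a single neutral biallelic locus in a diploid population of constant size, which the source does not establish; the population size, starting frequency, and function name are hypothetical.

```python
import random


def generations_until_fixation(pop_size, p0, seed=0, max_gens=1_000_000):
    """Simulate neutral Wright-Fisher drift until one variant is fixed.

    pop_size is the number of diploid individuals (2 * pop_size gene copies)
    and p0 is the starting frequency of the focal variant. Returns the number
    of generations until the polymorphism is lost, or None if max_gens is
    reached first.
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size
    p = p0
    for generation in range(1, max_gens + 1):
        # Binomial sampling of the next generation's gene copies.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        if p == 0.0 or p == 1.0:
            return generation
    return None


# Hypothetical values: 50 diploid individuals, metacone variant at 45%.
print(generations_until_fixation(pop_size=50, p0=0.45))
```

Under the standard diffusion approximation, the mean time to fixation or loss of a neutral variant is about -4N[p ln p + (1 - p) ln(1 - p)] generations, so persistence under drift alone scales with population size. Whether drift can plausibly account for persistence over several million years therefore depends on the long-term effective population size, which is not known for these lineages.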

Barycrinus Phylogeny
Throughout its evolutionary history, Barycrinus displayed several long-lived lineages that gave rise to more than one descendant (Fig. 2). In particular, Barycrinus rhombiferus is identified as having been a highly prolific lineage, with four immediate descendants. This is consistent with its status in Gahn and Kammer's [...]
Barycrinus underwent a rapid radiation while drawing upon a stock of ancestral variation maintained by B. rhombiferus and, to a lesser extent, B. spurius. This dynamic provides a small-scale demonstration of how the filtering of phenotypic variation during speciation might shape patterns at deeper timescales. In the phylogenetic analysis presented here, B. rhombiferus is depicted as having given rise to four descendant lineages. Two of these, B. spectabilis and B. scitulus, displayed a small fraction of the variation displayed by their ancestor as well as significantly reduced durations. If both of these incipient lineages occupied the same ecological landscape as B. rhombiferus, they may have been at a competitive disadvantage (at the lineage level), which would explain their shorter durations. The high phenotypic variability displayed by B. rhombiferus could stem from environmental heterogeneity caused by broader niche occupancy. If so, the increased survivorship displayed by generalist crinoids (Kammer et al. 1997; Cole 2021) might be explained by increased bet-hedging capability displayed by generalists that are able to maintain pools of phenotypic variation. Alternatively, if functional interpretations for the traits that are polymorphic in B. rhombiferus are more consistent with non-adaptive explanations (as may be the case; F. Gahn personal communication 2023), it is possible that its high variability simply stems from its increased ability to acquire (and subsequently maintain) variation over its long duration (M. Foote personal communication 2023). This explanation would require a scenario wherein mutation and genetic drift cooperate to introduce and maintain neutral variation over geologic timescales.
Barycrinus magister and B. spurius experienced longer durations than their siblings. This could be explained in B. magister by its evolution of additional derived character states not present in B. rhombiferus, which may be indicative of an escape to distinct ecological conditions. However, this scenario cannot be evaluated at present due to the geographic co-occurrence of both lineages and the lack of complete stem preservation in B. magister (F. Gahn personal communication 2023). While crinoids, as generalist suspension feeders, often occupy very similar ecological niches, a spectrum of niche differentiation does exist (Ausich 1980; Ausich and Bottjer 1982). The extended duration and multiple speciations produced by B. spurius might be explained by its continued maintenance of much of the variation possessed by B. rhombiferus as well as its derivation of several new characters. These possibilities remain speculative within the context of this article; however, they may be fruitfully explored as hypotheses to be tested in future work that includes more comprehensive ecological information that can be compared against functional morphological reconstructions. Evaluating the ideas presented here in light of such additional information may help to more rigorously explain the distinct causes underlying the patterns of lineage persistence and morphological variation displayed by Barycrinus. The analysis here represents a starting point from which to draw deeper links between the evolution of phenotypic variation across species bounds and the ecological strategies employed by crinoid species.

Random versus Adaptive Sorting of Ancestral Variation
The process of budding speciation, as typically conceived, may often yield population bottlenecks. If the bottleneck is strong enough, different character states would be expected to fix when small subpopulations are drawn from polymorphic ancestral populations over repeated trials (Fig. 3A). Alternatively, sufficiently strong positive selection would fix the same character state over repeated trials if incipient lineages repeatedly move out of the ancestral niche and into the same derived niche (Fig. 3B). I leveraged the repeated budding speciations displayed across Barycrinus to identify whether ancestral variation tended to sort randomly (Fig. 3A) or adaptively (Fig. 3B). Based on the permutation tests, it was not possible to distinguish the pattern by which the descendants of B. rhombiferus and B. spurius sorted and fixed variation from their respective ancestors from the expectation under drift induced by bottlenecking (Tables 1 and 2). None of the pairs of descendants of either lineage showed statistically significant signs of parallel or divergent speciation. Nevertheless, B. spectabilis and B. magister were somewhat more similar than expected under random sorting, while B. spectabilis and B. spurius were more divergent than expected. The lack of significance could be a consequence of the small pool of traits included in this dataset, so further tests are needed to see whether more comprehensive phenotypic sampling would distinguish the observed patterns from drift.
Throughout the radiation of Barycrinus, variation maintained within ancestral lineages filtered down through a series of successively less variable descendants (Fig. 2). The only exceptions lie in the emergence of B. rhombiferus from Costalocrinus ibericus Kammer, 2001 and Costalocrinus rex McIntosh, 1984 from the hypothetical descendant of C. ibericus. The sudden and almost coincident emergence of high phenotypic variability in B. rhombiferus and C. rex from the monomorphic C. ibericus can be explained by a difference in sample size. Costalocrinus ibericus is known only from a single specimen (Kammer 2001), so the true extent of intraspecific variation that originated across this transition cannot be evaluated from the data used here. The descendants of polymorphic ancestors are universally less variable than their ancestors. In all, seven total lineages formed across Barycrinus by filtering variation maintained over millions of years by B. rhombiferus and B. spurius. While some of the variation displayed by B. rhombiferus may not have persisted over its entire range, the lineage did maintain high levels of polymorphism throughout its existence. Each descendant lineage that budded from B. rhombiferus and B. spurius might be characterized as a unique natural experiment, randomly fixing ancestral variation and either persisting or perishing in its new habitat. More ecological information would be needed to understand the specific phenotypic determinants of persistence and extinction among the varied Barycrinus offspring lineages.

A Brief Note on Speciation Mode
The phylogenetic results in both Ectocion and Barycrinus reveal a pattern of extensive budding speciation. The relative frequency of alternative speciation modes in the fossil record is still not well known. Researchers have dealt with this uncertainty through a range of approaches. Some studies have explored patterns across a range of potential modes (Foote 1996), others have assumed budding a priori (Raup and Gould 1974; Van Valen 1975; Raup 1985), and still others have resorted to the use of cladograms, which make no assumptions about mode, at the cost of evolutionary specificity. Nevertheless, several studies have examined the frequency of modes across lineages. In general, anagenesis is thought to be rare relative to budding, occurring in a small minority of cases at one extreme (2%; Bapst and Hopkins 2017) and in a larger minority at another (25%; Archibald 1993). Another important study found budding to form the predominant speciation pattern, with both anagenesis and bifurcating cladogenesis providing poor explanations for phylogenetic patterns in the fossil record (Wagner and Erwin 1995). Here, I assume that budding speciation forms the predominant pattern in both lineages analyzed, while also allowing for the potential of both bifurcating cladogenesis and anagenesis. The potential for anagenesis appears quite low across both datasets, with only three candidate anagenetic ancestor-descendant species pairs: E. osbornianus–E. superstes, Barycrinus stellatus Hall, 1858–Barycrinus punctus Feldman, 1989, and B. rhombiferus–B. spectabilis, due to their nonoverlapping ranges. Each of these three species transitions could be explained by either anagenesis or budding. The results recovered here are robust to a small handful of anagenetic transitions, with only minor modifications of the interpretation of the demographic processes underlying ancestral (pseudo-)extinction and descendant speciation. The general result that ancestors tend to be more variable than descendants and that ancestral variation tends to sort randomly into descendants would also remain intact.

Phenotypic Variation, Geographic Range, and Lineage Duration
The pattern uncovered here of long-lived, variable ancestors giving rise to multiple, shorter-lived, less variable descendants may be reflective of a more general phenomenon. Barycrinus rhombiferus, which gave rise to multiple descendant lineages over a long duration, was phenotypically variable, geographically widespread, and highly abundant (Gahn and Kammer 2002). The finding that descendant lineages displayed lower phenotypic variation is not surprising, given that they were also more geographically constrained and fewer in number than their ancestor. This is consistent with the demographic asymmetry associated with budding speciation and the peripheral geographic isolation expected under peripatric speciation (Mayr 1963). As a result, the pattern uncovered here highlights the potential role geography may play in shaping both patterns in lineage survivorship and the origin and maintenance of phenotypic variation across lineage radiations. However, while wide geographic range and high phenotypic variability have previously been linked to longer lineage durations (Liow 2007; Payne and Finnegan 2007), drawing causative links between these variables has proved challenging (Foote et al. 2008). It is therefore unclear whether B. rhombiferus's lineage persistence can be explained by its high morphological variability and/or wide geographic range, or vice versa. A stronger understanding of the links between geographic range and morphological variability is also needed. While all three variables are highly conflated, in the clades examined here, incipient species form as subsets of the geographic range and stock of variation present in the ancestor. Future work will be needed to better explain the links between morphological variation, geographic range, and lineage duration in Barycrinus and other lineages.

Budding, Phylogenetic Pattern, and Paleobiological Process
A pattern of multiple descendants arising from a single, long-lived ancestor was uncovered in both Ectocion and botryocrinids. Such complex relationships cannot be represented meaningfully in a purely bifurcating cladogram, resulting only in the appearance of "rogue" taxa and polytomies. This shortcoming of using cladograms as a proxy of evolutionary relationships in paleobiology has long been recognized (e.g., Wagner and Erwin 1995; Bapst 2013). Biological reality demands the accommodation of diverse speciation modes when reconstructing phylogeny in the fossil record. Moving forward, it is important to consider how diverse biological processes such as multiple buddings, anagenesis, and even hybridization (Ausich and Meyer 1994) might shape relationships among fossil lineages. Although substantial advances have been made recently in the evaluation and representation of a greater diversity of speciation modes (Bapst and Hopkins 2017; Stadler et al. 2018; Parins-Fukuchi et al. 2019; Wright et al. 2021, 2022), many of these still operate from the premise that most relationships tend to be bifurcating and that hypothetical ancestors predominantly link fossil lineages. This is largely a matter of implementation, stemming from the fact that most advances in phylogenetic tree searching over the past several decades have focused on inferring trees between contemporaneous lineages and perhaps also the pattern cladists' denial of the identifiability of ancestral taxa (Engelmann and Wiley 1977). In the absence of overturning information (which may also exist), it is simply more parsimonious to initially assume hypotheses of direct ancestry, because these do not demand the ad hoc construction of unobserved "ghost" lineages to explain phylogeny (Polly 1997). As a result, paleobiology might benefit from algorithms that begin from the premise that taxa originating at different times represent ancestor-descendant sequences a priori and allow character data to overturn this starting assumption when necessary. Gahn and Kammer (2002) provided an excellent example of how a cladistic analysis might be supplemented with extensive natural history knowledge to derive deep evolutionary insights. While developing a thorough understanding of the full context and nuances of specimen-based data forms the foundation for sound paleobiological research, such insights can be supported by quantitative evidence and thorough graphical explorations. Once biological patterns can be reconstructed with a reasonably high degree of confidence, it becomes possible to narrow interpretations into a smaller range of possible explanatory, process-driven scenarios (Fisher 1981). Evidence supporting alternative scenarios consistent with the reconstructed pattern might then be weighed using statistical criteria. Nevertheless, the first step is to further develop theory and methods supporting the reconstruction of ancestor-descendant relationships. While much important work has already been done in this area (see references cited in "Material and Methods"), the results here especially highlight the importance of a continued rethinking of paleobiological phylogenetics and encourage a further untethering of the field from the shackles of cladistic dogma.

Ancestral Polymorphism, Parallel Speciation, and Adaptive Radiation
A major issue in the biology of adaptive radiation has been understanding how ecology shapes the phenotypes displayed by incipient species in the process and aftermath of adaptive radiation. Several authors have highlighted the role of parallel evolution in shaping these episodes (Schluter and Nagel 1995). Under this scenario, rapidly forming species evolve similar traits as they radiate into similar ecological niches. This model of adaptive radiation has been thought to be incompatible with the classic model of peripatric speciation (Mayr 1963), wherein species form rapidly during founder events. Paleobiologists might be inclined to liken these scenarios to budding speciation. The incompatibility between these two models presumably stems from the small population sizes displayed by the descendants that result from budding speciation, which would decrease the efficiency with which natural selection within each new population fixes similar variants in parallel among newly formed species.
While many examples of parallel speciation exist in the neontological literature, the paleobiological literature remains rife with examples of budding, suggesting a role for peripatry in the formation of new lineages. Careful examination of how variation is filtered and maintained at different taxonomic levels across varying ecological contexts might provide the key to reconciliation between these apparently conflicting views. Adaptation that occurs when populations move into new habitats can be fueled by large pools of standing ancestral variation (e.g., McGee et al. 2020). If selection is strong enough among the descendant lineages, advantageous alleles may still overcome the effects of drift in small, peripherally isolated populations, allowing ancestral variation to fix in beneficial ways (Fig. 3B). This demands that the ancestral and descendant lineages exist in distinct ecological conditions, because the maintenance of variation in the ancestral population would require a different, balancing, selective regime than the positively selective descendants. This type of dynamic may be shown between large, marine, ancestral populations of sticklebacks and their smaller descendant freshwater populations (e.g., Schluter and Conte 2009). Species selection also may be able to generate a similar pattern when the effect of drift is too great for selection to operate at the population level. Variation in peripherally isolated populations might become fixed randomly, with only populations that stochastically fixed beneficial alleles able to persist. The Barycrinus analyses here, which showed that the pattern of character state sorting in a descendant lineage cannot be distinguished from drift, are more consistent with the latter scenario. Each of these alternatives would yield similar patterns to parallel speciation, with the only difference being the level at which selection filters out deleterious variation. While population-level positive selection may have driven the appearance of parallel speciation in sticklebacks, the Barycrinus analysis illustrates the possible role for species selection in generating similar patterns by differentially culling descendant lineages that each inherited a random suite of characters from its ancestor.
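The contrast between drift-dominated and selection-dominated fixation in small founder populations can be illustrated with a simple simulation. The sketch below assumes a haploid Wright-Fisher model with a constant founder population size and a fixed selective advantage; the function names, population size, and selection coefficient are hypothetical choices for illustration and are not taken from the simulations behind Figure 3.

```python
import random

def wright_fisher(p0, n, s=0.0, max_gens=10_000, rng=None):
    """Track one character state in a haploid population of size n.

    p0: starting frequency inherited from the polymorphic ancestor.
    s:  selective advantage of the focal state (0.0 means pure drift).
    Returns the final frequency (1.0 if fixed, 0.0 if lost).
    """
    rng = rng or random.Random()
    p = p0
    for _ in range(max_gens):
        if p in (0.0, 1.0):
            break
        # Selection shifts the expected frequency of the focal state.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift: the next generation is a binomial sample of n individuals.
        count = sum(rng.random() < p_sel for _ in range(n))
        p = count / n
    return p

def fixation_fraction(p0, n, s, reps, seed):
    """Fraction of replicate founder populations that fix the focal state."""
    rng = random.Random(seed)
    outcomes = [wright_fisher(p0, n, s, rng=rng) for _ in range(reps)]
    return sum(o == 1.0 for o in outcomes) / reps

# Assumed scenario values: founders of 20 individuals, state at 50% in the ancestor.
drift_only = fixation_fraction(p0=0.5, n=20, s=0.0, reps=200, seed=1)
with_selection = fixation_fraction(p0=0.5, n=20, s=0.5, reps=200, seed=1)
print(drift_only, with_selection)
```

In this toy setting, replicate founder populations under drift alone fix the focal state about half the time, so different descendants end up with different states. Under strong positive selection nearly every replicate fixes the same state, which is the signature expected under parallel speciation.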

Long-Term Maintenance of Polymorphisms
The trans-specific maintenance of polymorphic characters creates a natural link between the population processes maintaining phenotypic and genetic variation and the macroevolutionary processes shaping the evolution of lineages. Each of the datasets examined here displays polymorphisms maintained over long timescales, both within and between species. The polymorphisms present within lineages were confirmed or implied by the original studies to have been present throughout the duration of each lineage, even though trait frequencies may have oscillated over time in some cases. The true temporal persistence of these polymorphisms was also supported by their frequent appearance across species boundaries. While polymorphisms often sorted into monomorphic descendant lineages, they also persisted in several cases, such as the transition from E. mediotuber to E. osbornianus or from B. rhombiferus to B. spurius. Patterns in the maintenance and sorting of polymorphic traits across species boundaries in Barycrinus highlight how variation at the population level stochastically generates variation between species. The transformation of intrapopulation into trans-specific variation paves the way for higher-level processes, such as species selection (Stanley 1975), to operate. Because the study of evolution is fundamentally rooted in biological variation, it makes sense that understanding the persistence of variation across scales provides the indispensable link between population processes and paleobiological patterns. Further developing this framework may provide the key to developing a true understanding of macroevolutionary process and how it connects to the mechanisms of population genetics.

Linking Population and Macroevolutionary Processes
When maintained over long periods against a consistent backdrop of balancing selection, lineages that maintain polymorphic variation will be better suited to persist compared with lineages that lose diversity to drift. It has been argued that the basic prerequisite for selection to operate above the species level is that phenotypic variation be fixed within lineages but variable between lineages (Stearns 1986). However, the persistence of polymorphisms maintained over long periods within and distributed between fossil lineages adds further complexity. When balancing selection is the maintaining force, even lineages that display variation may be subject to species selection under certain conditions. These polymorphic lineages display what might be viewed as a lineage-level analogue to heterozygote advantage (Fig. 4, left). Under this scenario, bet-hedging in the face of spatial or temporal variation in environment and/or NFDS at the population level maintains stability in trait or genotype frequencies, which allows higher-level dynamics, such as species selection, to operate simultaneously. In this case, the necessary condition for species selection to operate would be ecologically induced maintenance of polymorphism (potentially both within and across lineages), rather than a total lack of intraspecific variation. On the other hand, lineages that fix variation from polymorphic ancestors may achieve greater persistence by escaping into a new ecological niche that does not impose balancing selection (Fig. 4). Overall, it is important to consider both inter- and intraspecific variation and the ecological contexts within which they are maintained and sorted among lineages when exploring higher-level evolutionary dynamics.
While the level at which variation is displayed remains a critical point in identifying the boundary conditions under which higher-level selection can operate, perhaps an even more crucial point lies in considering the selective and ecological contexts occupied by different lineages. Rather than demanding the absence of variation within species, the conditions allowing species-level selection may instead be specified by comparing the distribution of variation within species (poly- vs. monomorphic) to the ecological conditions occupied across species (e.g., does competition or predation universally impose frequency-dependent fitness effects within species?). The longer durations exhibited by B. rhombiferus and B. spurius may stem from higher species-level fitness for polymorphic taxa, which could be driven by balancing forces such as environmental variability or negative frequency dependence resulting from competition or predation. If intraspecific polymorphism is maintained by environmental heterogeneity associated with more generalist ecologies, the new niches occupied by successful monomorphic lineages should tend to be more specialized. This would yield distinct macroevolutionary dynamics, with generalist polymorphic lineages displaying high lineage survivorship and specialist monomorphic lineages displaying rapid species turnover. This dynamic would explain the persistence of polymorphism in the fossil record by the ecological dynamics displayed by Mississippian crinoids (Kammer et al. 1997). Nevertheless, moving forward, it will also be important to test such hypotheses within the context of geographic range and abundance, which may also impact lineage duration and turnover patterns (Liow 2007).
It has long been hypothesized that biased patterns of extinction among new species might shape macroevolutionary patterns observed in the fossil record by culling "ephemeral" species before they are able to fossilize (Raup and Stanley 1978: p. 105; Stanley 1979; Rosenblum et al. 2012; Rabosky 2013). Such biased extinction patterns could arise when new lineages are unable to adapt rapidly enough to new environmental conditions (De Lisle et al. 2021). Meanwhile, paleontological work has suggested that high morphological variation within species might help fuel lineage radiations (Webster 2007). If drawn from highly diverse ancestral populations, the variation displayed by rapidly radiating descendant lineages will become sorted through a mix of stochasticity and adaptation, potentially occurring at multiple levels. Incipient species that are able to draw upon a deep well of ancestral variation may stand a greater chance of inheriting "rescue" alleles that enable them to escape extinction long enough to avoid nonpreservation in the fossil record due to ephemerality.
The "natural experiments" displayed by descendant Barycrinus species illustrate how filtering of ancestral variation, a process that occurs at the population level, might form the basic conditions and raw material for macroevolutionary processes, such as species selection, to operate. Random sorting of ancestral variation into descendant lineages is a pattern predicted by classic evolutionary theory (e.g., Simpson 1944; Mayr 1963; Wright 1982). The manner by which this occurs, when compared against the range of selective conditions inhabited by incipient lineages, might explain how variation at the population level in highly diverse ancestral populations filters into descendant lineages to shape macroevolutionary dynamics. Recent work has suggested that gene tree discordance, which is at least partially caused by incomplete lineage sorting (ILS; the random sorting of ancestral variation into descendant lineages), is greater during periods of rapid phenotypic innovation, such as during the early evolution of mammals, birds, or angiosperms (Parins-Fukuchi et al. 2021). This link implies high genetic diversity (and thus perhaps high phenotypic variation as well) within lineages during the early stages of rapid clade diversification. Such variation will stand a good chance of sorting in ways that are discordant with the order of species divergences, yielding two main effects: (1) high gene-tree discordance and (2) the seeming appearance of rapid morphological variation distributed haphazardly across lineages. The resulting combination of stochastic sorting, parallel speciation, and species selection may yield the extensive gene-tree discordance and hemiplasy (Avise et al. 2008; Guerrero and Hahn 2018) that often accompany rapid radiations. Future work drawing more explicit links between coalescent expectations under ILS and the sorting of phenotypic ancestral variation across fossil lineages may help shed light on how population processes provide the fuel for large-scale macroevolutionary processes and patterns to operate.
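One standard coalescent expectation makes the link between ancestral diversity and discordance concrete. For three species under a neutral model, the probability that a gene tree disagrees with the species tree is (2/3)·exp(−T), where T is the length of the internal branch in coalescent units. The sketch below assumes a diploid, constant-size, randomly mating ancestral population, so that T equals generations divided by 2Ne; the function name and the example values are hypothetical.

```python
import math

def discordance_probability(generations, effective_size):
    """Probability of a discordant gene tree for three species under ILS.

    generations:    length of the internal species-tree branch, in generations.
    effective_size: diploid effective size of the ancestral population
                    (assumed constant along that branch).
    """
    t = generations / (2 * effective_size)  # branch length in coalescent units
    return (2 / 3) * math.exp(-t)

# Illustrative values only: the same internode under small and large ancestral Ne.
small_ne = discordance_probability(generations=100_000, effective_size=10_000)
large_ne = discordance_probability(generations=100_000, effective_size=500_000)
print(small_ne, large_ne)
```

Short intervals between speciation events and large, diverse ancestral populations both push the expected discordance toward its maximum of 2/3, which is the qualitative pattern expected when a highly variable ancestor buds off descendants in quick succession.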

Conclusion
Explaining the patterns observed by paleontologists in the fossil record in terms of population processes has been a major goal of evolutionary paleobiology since the modern synthesis (e.g., Simpson 1944; Kermack 1954; Van Valen 1963; Eldredge and Gould 1972; Fisher 1985). The persistence of phenotypic variation displayed across fossil lineages, while interesting from a population genetic perspective, may provide even more groundbreaking insights when considered from a macroevolutionary perspective. The patterns reconstructed here highlight how further advances in how we model (1) speciation dynamics, including ancestor-descendant relationships, and (2) the filtering of polymorphic phenotypic variation can vastly increase the scope of evolutionary questions we are able to evaluate in the fossil record. More work will also be needed at the population level to better understand how frequencies in polymorphic traits evolve across the stratigraphic ranges of continuous populations. This will contribute to a stronger quantitative understanding of the processes that maintain biological variation among long-lived fossil lineages, such as E. osbornianus or B. rhombiferus. Scaling up, understanding how pools of maintained phenotypic variation segregate between incipient species can provide a crucial link between the mechanisms explored by population genetics and the patterns historically explored in evolutionary paleobiology. Developing stronger links across these levels can help provide more cohesive and deeper explanations of how population-level evolutionary change scales to the luxuriant diversity of patterns observed throughout the history of life.
Acknowledgments. Several of the core themes present in this article were sparked by a series of engaging conversations with L. Rowe. J. Saulsbury offered much salient guidance on the interpretation of the evolutionary morphology and stratigraphy of fossil crinoids as well as excellent discussion around the macroevolutionary points of the article. N. Walker-Hale offered helpful criticism. I thank M. Ahmad-Gawel for being a constant sounding board and critical discussion partner for my more audacious hypotheses of evolutionary process and for offering her keen and skilled eye for editing. M. Foote provided insightful criticism of the article and ideas presented herein. The article was greatly improved by reviews from F. Gahn and D. Wright, who each provided extensive and generous insights into cladid crinoid paleobiology and evolutionary processes. Any errors or overwrought biological interpretations that remain lie solely on my own shoulders. I was supported by National Science Foundation grant DEB-2217117 throughout the course of this work.
Competing Interests. The author declares no competing interests.
Data Availability Statement. All data and code used in the analyses presented here are available on Figshare and associated with the digital object identifier https://doi.org/10.6084/m9.figshare.22584097.

Figure 1. Reconstructed ancestor-descendant relationships within Ectocion. Bars represent stratigraphic ranges. Dotted lines indicate genealogical relationship between ancestral and descendant lineages. Colors reflect presence or absence of the metacone on the P3. Inset phylogeny represents hypothetical alternative reconstruction that better accommodates pattern in the evolution of P3 polymorphism. Further evaluation would be needed to distinguish between these two. Timescale reflects the discrete zonation used by Thewissen (1992).

Figure 2. Phylogeny of Barycrinus. Bars represent stratigraphic ranges. Dotted lines indicate genealogical relationship between ancestral and descendant lineages. Width of stratigraphic lines represents scaled number of polymorphisms, a measure of genetic variation within each lineage. Shading represents the proportion of character states displayed by each lineage that were also possessed by its ancestor. Timescale approximates the discrete stratigraphic units used by Gahn and Kammer (2002).

Figure 3. Distribution of simulated allelic frequencies over 10 replicated populations while a balanced ancestral polymorphism is filtered into a budding descendant under two different adaptive scenarios: A, Polymorphism maintained by negative frequency-dependent selection (NFDS) in a large ancestral population that becomes randomly fixed in bottlenecking encountered during budding speciation. B, Polymorphism maintained by NFDS that becomes fixed due to positive selection (PS) in a budding descendant that has dispersed into a new environment. Lineage widths in budding lineage diagrams represent effective population size. Under scenario A, variation is filtered randomly by drift into the descendant lineages. Under scenario B, the new regime rapidly fixes one allele/character state according to its new selective landscape.

Figure 4. Species selection dynamics stemming from a persistently polymorphic ancestral population.In this hypothetical scenario, negative frequency-dependent selection forms the selective background in the ancestor.When descendant lineages randomly fix this ancestral variation as they originate, they demonstrate low survivorship in the ancestral niche.Only lineages that can escape and radiate into a new niche while fixing ancestral variation display high survivorship.Ecological opportunity afforded by the new niche may even facilitate enhanced lineage survivorship and proliferation if the trait fixed in the descendant is congruent with the selective demands of the new habitat.
[...] 2002) analysis, which found the placement of B. rhombiferus to be highly uncertain, leading the authors to identify the lineage as a "rogue taxon." The authors interpreted this uncertainty as possibly resulting from B. rhombiferus having itself given rise to multiple later-occurring Barycrinus lineages. Based on patterns in the inheritance of ancestral character states, Gahn and Kammer suggested that B. rhombiferus was likely the direct ancestor of Barycrinus magister Hall, 1858, Barycrinus spectabilis Meek and Worthen, 1870, and Barycrinus scitulus Meek and Worthen, 1860, an assertion that was further supported by Gahn (2003). They also suggested a possible close link between B. rhombiferus and Barycrinus spurius, going on to speculate that B. rhombiferus may be ancestral to most other Barycrinus lineages. Nevertheless, Gahn and Kammer's analysis, which relied only on traditional cladistic methods, achieved only low resolution, perhaps due to both the rampant polymorphism and the complex pattern of ancestor-descendant relationships found in the clade. The phylogenetic tree recovered here is also largely consistent with quantitative results achieved using stratocladistics (F. Gahn personal communication 2003) and Bayesian methods (D. Wright personal communication 2022).

Table 1. Pairwise phenotypic distances calculated among characters sorted from polymorphisms present in the ancestor, Barycrinus rhombiferus, between descendant lineages, relative to null expectation generated under random sorting of ancestral polymorphisms. No pairs were statistically significant at the 2.5% threshold.

Table 2. Pairwise phenotypic distances calculated among characters sorted from polymorphisms present in the ancestor, Barycrinus spurius, between descendant lineages, relative to null expectation generated under random sorting of ancestral polymorphisms. No pairs were statistically significant at the 2.5% threshold.