Introduction
Morphology is a standard biological tool for investigating the phenotypic variation that enables taxonomic differentiation of extant (Kaliontzopoulou, Reference Kaliontzopoulou2011) and fossil specimens (Wagner, Reference Wagner2000; Arratia, Reference Arratia2013). Classical morphological approaches have relied on describing the observable taxonomic traits during life, for example, ectoderm coloration (Vitt and Caldwell, Reference Vitt and Caldwell2013; Gray et al., Reference Gray, McDowell, Hutchinson and Jones2017), or skeletal variation between complete or nearly complete skeletons, for example, skull shape and features (Openshaw et al., Reference Openshaw, D’Amore, Vidal-García and Keogh2016). However, finding such traits can pose a problem for researchers who rely on recovered remains to provide sufficient observable characteristics to aid in precise taxonomic identification, but who are often faced with single bones of varying completeness (Shipman, Reference Shipman1993; Peng et al., Reference Peng, Brinkman and Russell2001; Carrano and Velez-Juarbe, Reference Carrano and Velez-Juarbe2006; DeMar and Breithaupt, Reference DeMar, Breithaupt, Lucas and Sullivan2006; Lyman, Reference Lyman2008; Brown et al., Reference Brown, Evans, Campione, O’Brien and Eberth2013; Gray et al., Reference Gray, McDowell, Hutchinson and Jones2017). This includes researchers interested in identification of present-day remains, such as ecologists, conservation biologists, zoologists, and forensic scientists, as well as researchers interested in considerably older remains, such as zooarchaeologists, paleontologists, paleoecologists, and conservation paleobiologists.
For example, paleo-conservation research focuses on identifying “near-time” fossils (temporal range from the late Pleistocene through the Holocene) to observe the biotic responses to global change factors such as climate change in the absence of anthropic influences (Tyler and Schneider, Reference Tyler, Schneider, Tyler and Schneider2018). This time frame (∼126 ka) represents a period during which recovered fossil specimens are represented by modern equivalents (Lyman, Reference Lyman2008; Dietl et al., Reference Dietl, Kidwell, Brenner, Burney, Flessa, Jackson and Koch2015; Gray et al., Reference Gray, McDowell, Hutchinson and Jones2017; Faith and Lyman, Reference Faith and Lyman2019; Kiessling et al., Reference Kiessling, Smith and Raja2023), which can be used for comparative identification.
Morphometrics is a quantitative method for addressing morphological shape differences to compare specimens of interest, with two main approaches used (Webster and Sheets, Reference Webster and Sheets2010; Zelditch et al., Reference Zelditch, Swiderski and Sheets2012). Traditional morphometrics provides biologists with quantitative measurements, for example, depth, length, proportion, and so on, commonly resulting in massive data tables that can be cumbersome to read and decipher (Zelditch et al., Reference Zelditch, Swiderski and Sheets2012). Additionally, researchers may include qualitative descriptions of morphological differences. However, these can be subjective due to the comparative nature of such terms as “robust,” “narrow,” or similar. Hence, traditional morphometrics, while providing numerous data measurements, can fail to provide a key component for morphology and specimen identification—shape (Zelditch et al., Reference Zelditch, Swiderski and Sheets2012; Richter et al., Reference Richter, Pickles and Barton2024).
Geometric morphometrics utilizes anatomical loci on a specimen of interest to aid in quantifying complex shapes (Zelditch et al., Reference Zelditch, Swiderski and Sheets2012). Such specimens can encompass single bones, for example, quadrate (Palci et al., Reference Palci, Hutchinson, Caldwell, Scanlon and Lee2018); the preserved head region of lizards (Gabelaia et al., Reference Gabelaia, Tarkhnishvili and Adriaens2018); leaf shapes in botanical studies (Viscosi and Cardini, Reference Viscosi and Cardini2012); marine shells (Bocxlaer and Schultheiß, Reference Bocxlaer and Schultheiß2010); and many more. Landmarks (homologous anatomical loci) and semilandmarks (points along a curve) strategically placed to outline the shape of interest serve as Cartesian coordinates in morphospace. These data points allow for the exploration of precise and accurate shape variation between specimens under study, which can then be examined using multivariate statistical techniques (Webster and Sheets, Reference Webster and Sheets2010; Kaliontzopoulou, Reference Kaliontzopoulou2011; Zelditch et al., Reference Zelditch, Swiderski and Sheets2012). 
Geometric morphometrics is a powerful tool for taxon differentiation (Viscosi et al., Reference Viscosi, Fortini, Slice, Loy and Blasi2009; Viscosi and Cardini, Reference Viscosi and Cardini2012; Cavalcanti, Reference Cavalcanti and Elewa2013; Marugán-Lobón and Buscalioni, Reference Marugán-Lobón, Buscalioni and Elewa2013; Pavlinov, Reference Pavlinov and Elewa2013; Openshaw et al., Reference Openshaw, D’Amore, Vidal-García and Keogh2016; Gabelaia et al., Reference Gabelaia, Tarkhnishvili and Adriaens2018; Kerschbaumer et al., Reference Kerschbaumer, Schäffer and Pfingstl2023) and holds promise for aiding in precise taxonomic identification from a single recovered fossil bone specimen (Bastir et al., Reference Bastir, Böhme and Sanchiz2014; Cornette et al., Reference Cornette, Herrel, Stoetzel, Moulin, Hutterer, Denys and Baylac2015; Dollion et al., Reference Dollion, Cornette, Tolley, Boistel, Euriat, Boller, Fernandez, Stynder and Herrel2015; Gray et al., Reference Gray, McDowell, Hutchinson and Jones2017; Rej and Mead, Reference Rej and Mead2017; Palci et al., Reference Palci, Hutchinson, Caldwell, Scanlon and Lee2018).
Paleontologists are commonly tasked with identifying fossil specimens to the most precise taxonomic level possible. Any given fossil may have experienced a wide range of ecological processes (e.g., predation) and taphonomic agents (e.g., fluvial transport) from the time of death through fossilization to its discovery, resulting in varying degrees of completeness (Badgley, Reference Badgley1986; Shipman, Reference Shipman1993; Brown et al., Reference Brown, Evans, Campione, O’Brien and Eberth2013) and fragmentation (Peng et al., Reference Peng, Brinkman and Russell2001; Carrano and Velez-Juarbe, Reference Carrano and Velez-Juarbe2006; DeMar and Breithaupt, Reference DeMar, Breithaupt, Lucas and Sullivan2006; Cornette et al., Reference Cornette, Herrel, Stoetzel, Moulin, Hutterer, Denys and Baylac2015; Dollion et al., Reference Dollion, Cornette, Tolley, Boistel, Euriat, Boller, Fernandez, Stynder and Herrel2015; Gray et al., Reference Gray, McDowell, Hutchinson and Jones2017). Commonly recovered vertebrate fossils exist as single elements such as teeth, jaws, and vertebrae (Peng et al., Reference Peng, Brinkman and Russell2001; Carrano and Velez-Juarbe, Reference Carrano and Velez-Juarbe2006; DeMar and Breithaupt, Reference DeMar, Breithaupt, Lucas and Sullivan2006). These small bones, which often go unidentified, can provide key insights into the species diversity and ecological interactions of that time, thus providing a more comprehensive view of past ecosystems (Dodson, Reference Dodson1973; Blob and Fiorillo, Reference Blob and Fiorillo1996). However, accurate identification of fragmented bones is inherently challenging due to the reduction or loss of diagnostic features.
Geometric morphometrics can extract meaningful data from isolated elements, whether complete or fragmented, due to its multivariate statistical analysis of shape, and thus shows great promise for advancing paleontological research (e.g., Bazzi et al., Reference Bazzi, Campione, Kear, Pimiento and Ahlberg2021).
Nonetheless, geometric morphometrics has its own inherent set of limitations that should be examined and highlighted in paleobiological studies, especially those dealing with underrepresented taxa resulting in smaller and/or imbalanced sample datasets. For example, principal component and canonical variate analyses, used for ordination and classification purposes, can produce inflated apparent separation between groups and exaggerated classification accuracy, respectively, when the sample size is small. Furthermore, an imbalance of group sizes can bias classification boundaries and ordination space, often favoring the larger group. Although cross-validation has been proposed as a strategy for mitigating imbalance effects (Courtenay, Reference Courtenay2023), performance estimates remain less reliable because classification accuracy can be artificially inflated for the majority class (López et al., Reference López, Fernández, García, Palade and Herrera2013; Spezia and Recamonde-Mendoza, Reference Spezia and Recamonde-Mendoza2025). Hence, the comparative sample size “restricts the inferences that can be made about paleobiology and evolutionary history” (Cardini and Elton, Reference Cardini and Elton2007, p. 121).
Paleoherpetologists are often challenged to find dry skeletal comparatives curated in museum collections, let alone multiple representatives per taxon (Bell and Mead, Reference Bell and Mead2014). When it comes to disarticulated bones, there are even fewer resources; for example, the online repository DigiMorph (http://www.digimorph.org) contains only articulated crania (199 Squamata CT images), making it difficult to view the anatomically relevant positions of isolated bones recovered as fossils, such as for the medial side of the maxilla. Finally, a key factor that must be considered is the potential lack of expertise in species identification (Dettling et al., Reference Dettling, Samadi, Ratti, Fini and Laguionie2024) resulting in misidentified species (Pfenninger and Schwenk, Reference Pfenninger and Schwenk2007; Sigwart et al., Reference Sigwart, Chen, Tilic, Vences and Riehl2023), and specimen information that is “incomplete, imprecise, or inaccurate” (Johnson et al., Reference Johnson, Brooks, Fenberg, Glover, James, Lister and Michel2011, p. 149). Here we provide a protocol to aid researchers in improving the identification of individual specimens (fossils, bones, etc.) from underrepresented taxa using geometric morphometric analysis, while also emphasizing the limitations posed by small and uneven datasets. This approach offers a conservative and cautious framework for taxonomic identification, particularly in cases where researchers must work with single, fragmentary specimens. Working with individual elements is not ideal but is a common issue for taxon identification, especially for small or delicate taxa, that paleontologists are often forced to deal with. Our protocol works by determining (1) how the degree of fragmentation, and hence shape variation, will impact taxonomic differentiation, and (2) the minimum number of comparative specimens required to identify a known fragment to the family and genus level. 
We aim to provide an inexpensive and nondestructive methodology for improving the identification of fragmented remains for underrepresented taxa. While datasets for these groups may not meet traditional standards of sampling balance, we argue that their inclusion is both statistically feasible and ecologically meaningful when interpreted with appropriate caution.
Materials and methods
Comparative extant specimens
The practice of “whole-body” collection and curation of specimens supports a multitude of research endeavors (Nachman et al., Reference Nachman, Beckman, Bowie, Cicero, Conroy, Dudley and Hayes2023); however, in our specific case, few archaeologists and paleontologists are specialized in the identification and recovery of Quaternary reptile bones, resulting in a lack of comparative skeletal specimens curated in museums (Olsen, Reference Olsen1968; Holman, Reference Holman1995; Bell and Mead, Reference Bell and Mead2014; Broughton and Miller, Reference Broughton and Miller2016). Furthermore, catching and killing reptiles for comparative purposes may pose an ethical dilemma depending on the taxa of interest, as more than a fifth of global reptiles are classified as near threatened to critically endangered (Cox et al., Reference Cox, Young, Bowles, Fernandez, Marin, Rapacciuolo and Böhm2022; Farooq et al., Reference Farooq, Harfoot, Rahbek and Geldmann2024; IUCN, 2024).
To test the proof of concept for this methodology, we focused our attention on lizard taxa within the western regions of the United States, specifically the Pacific Northwest (PNW). The University of Texas at Austin provided the specimen loans (Table 1), which included articulated and disarticulated skeletons of 16 species belonging to 9 of the 11 genera living in the PNW (St. John, Reference St. John2002; Stebbins and McGinnis, Reference Stebbins and McGinnis2018). With permission, the articulated skeletons were macerated with the trypsin enzyme following the methods described by Burns and Meadow (Reference Burns and Meadow2013) to allow the disarticulation of the maxillae from the cranium. Maxillae were the chosen bones for this project, because they are a common occurrence at paleontological sites (Holman, Reference Holman1995) due to the teeth being harder and denser in nature, thus increasing their “preservation and survivorship potential” (Broughton and Miller, Reference Broughton and Miller2016, p. 9) and recovery bias (Bell and Mead, Reference Bell and Mead2014). Additionally, maxillae have useful morphological characters for taxon identification (Holman, Reference Holman1995); indeed, the specimens we prepared have been used in published work on lizard identification by other researchers (Ledesma et al., Reference Ledesma, Scarpetta, Jacisin, Meza and Kemp2024). Many maxillae were complete and in excellent condition; however, some were fragmented to varying degrees, accounting for 5–7% of the dataset. Because we already knew their identity, these fragmented maxillae, henceforth referred to as “test fragments,” allowed us to determine whether geometric morphometric analyses of 2D images could be used to identify them to the family and genus taxonomic levels, while being cognizant of statistical limitations of a small and uneven comparative dataset, with the aim of using these methods for unknown fossil identification.
Table 1. Comparative specimens under study.

a Maxillae are in excellent condition. No fragmentation is evident.
b Various degrees of fragmentation of some maxillae allowing for the representation of one or more of the shape categories (Table 2) of a test fragment.
Microscopy
Each maxilla, whether complete or fragmented, was positioned to lie flat parallel to the mounting surface with the medial side facing upward using a small piece of non-drying modeling clay. Photographs were taken using an AmScope (Irvine, CA, USA) digital stereoscope (model SKU SM-1TSZZ-144S-10M) along with the Zerene Stacker (student edition) focus-stacking program (Richland, WA, USA). This photo-stacking program allows multiple photos taken at differing vertical depths to be aligned and superimposed to create a detailed single image. We used the medial side of the right maxilla for data collection due to the abundance of distinct and easily definable homologous features compared with the lateral side (Zelditch et al., Reference Zelditch, Swiderski and Sheets2012).
As with most squamate datasets, ours was limited in sample size (Bell and Mead, Reference Bell and Mead2014). To increase the number of specimens in the dataset, each image of a complete left maxilla was flipped (Rej and Mead, Reference Rej and Mead2017) using the GIMP photo-editing software (Kimball and Mattis, Reference Kimball and Mattis2023; https://www.gimp.org/, accessed February 2025) to appear as a right maxilla (Supplementary Fig. 1), hence doubling the number of specimens in the dataset. The numerical suffixes “-.1” and “-.2” were applied to distinguish these images for comparative specimens, for example, M-13778.1 and M-13778.2. Although we would prefer to include data points from more individuals for statistical independence, this practice of using both left and right maxillae may prove necessary depending on the taxa under study.
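The mirroring step described above was carried out on images in GIMP. For readers working programmatically, the same idea can be sketched in a few lines of Python with NumPy. The function names and the use of a pixel array are illustrative assumptions and not part of the published workflow.

```python
import numpy as np


def mirror_image(pixels: np.ndarray) -> np.ndarray:
    """Flip an image array (rows x columns [x channels]) left to right."""
    return pixels[:, ::-1].copy()


def mirror_landmarks(coords: np.ndarray, image_width: float) -> np.ndarray:
    """Reflect (x, y) coordinates across the vertical midline of an image.

    Assumption: x runs continuously from 0 to image_width. For integer pixel
    indices, image_width - 1 would be the appropriate constant instead.
    """
    mirrored = np.asarray(coords, dtype=float).copy()
    mirrored[:, 0] = image_width - mirrored[:, 0]
    return mirrored
```

Mirroring the image before digitizing, as was done here, or mirroring already digitized coordinates should give equivalent shapes. In either case the left and right elements of one individual are not statistically independent observations.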
Geometric morphometrics
Landmark and semilandmark acquisition
After a careful review of the comparative specimens’ images generated for all samples, we grouped the maxillae into six categories, henceforth referred to as “shapes,” that captured varying degrees of fragmentation commonly found in fossil reptile maxillae (Daza et al., Reference Daza, Bauer and Snively2014; Cornette et al., Reference Cornette, Herrel, Stoetzel, Moulin, Hutterer, Denys and Baylac2015; Georgalis et al., Reference Georgalis, Čerňanský and Klembara2021; Fig. 1): CMAP—a complete maxilla with a fragmented ascending process; MF—a midrange fragment along the dental shelf; AF—an anterior fragment; PF—a posterior fragment; AAP—the anterior curve along the ascending process; and VAP—the dental shelf ventral to the ascending process. We encourage researchers to align images of exemplar comparative specimens side by side to aid in identifying shape curves that are homologous in nature and highlight the subtle variations unique to each specimen. Once established, each shape curve can be assigned a unique code for the taxonomic group and element of interest.

Figure 1. Left, Outlines of a complete maxilla (COMP) and regions represented by the six categories of fragmented maxillae under study. Right, Outlines of the shapes of test fragments: complete maxilla with missing ascending process (CMAP), midrange fragment (MF), anterior curve of ascending process (AAP), anterior fragment (AF), posterior fragment (PF), and ventral midrange of the ascending process (VAP). The landmarks are represented with numbered red dots, while the semilandmark curve is represented by a blue line. Scale bar = 1 mm.
The images, which included the complete maxillae and the test fragments for the category under study, were first converted to a .tps file using the tpsUtil software (Rohlf, Reference Rohlf2023). Next, the .tps file was uploaded into the tpsDig2 software, which allows the placement of homologous landmarks and semilandmark curves along the chosen shape (Rohlf, Reference Rohlf2021). Although there is a recovery bias toward lizard maxillae (Bell and Mead, 2014), they are not wholly resistant to fragmentation. Hence, homologous landmarks were selected using points that are robust in nature to resist damage from natural processes (Gray et al., Reference Gray, McDowell, Hutchinson and Jones2017; Rej and Mead, Reference Rej and Mead2017). Semilandmarks were assigned to create curves that better capture the shape variation between taxa (Cornette et al., Reference Cornette, Herrel, Stoetzel, Moulin, Hutterer, Denys and Baylac2015; Fig. 1). Previous studies have emphasized the importance of the placement of semilandmarks to visually capture the curve shape (Gunz and Mitteroecker, Reference Gunz and Mitteroecker2013; Cardini, Reference Cardini2016). However, few, if any, have quantified the threshold at which additional semilandmarks contribute to statistical noise or risk of overfitting. To assess the influence of the number of semilandmarks on classification accuracy and potential overfitting, we ran a series of sensitivity analyses by increasing the number of semilandmarks in increments of 50 for less and more distinguishable shapes, AF and VAP, respectively (Supplementary Tables 1 and 2). While classification accuracy increased with more semilandmarks (shortest Mahalanobis distances), redundancy was observed around 150 to 250 semilandmarks, indicating a potential threshold to avoid overfitting.
A common occurrence in curated specimens and recovered fossils is missing teeth (e.g., Čerňanský and Augé, Reference Čerňanský and Augé2019; Ledesma et al., Reference Ledesma, Scarpetta, Jacisin, Meza and Kemp2024); therefore, landmarks and semilandmarks were placed along the anterior edge and posterior edge of the anterior-most tooth and posterior-most tooth or tooth groove, respectively, for each complete and fragmented maxilla category, which focuses on the dental shelf. Next, the semilandmark curves for each image were resampled by length to contain the same number of semilandmark points, and finally, the semilandmarks were appended to the curve as landmarks using tpsUtil, resulting in 104–254 landmarks depending on the specific category (see Table 2).
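Resampling a digitized curve "by length" so that every specimen carries the same number of semilandmarks can be illustrated with a minimal Python sketch. It assumes evenly spaced points along a piecewise-linear curve and is not the tpsUtil implementation; the function name is hypothetical.

```python
import numpy as np


def resample_curve(points: np.ndarray, n_points: int) -> np.ndarray:
    """Return n_points spaced evenly by arc length along a 2D polyline."""
    points = np.asarray(points, dtype=float)
    # Length of each segment between consecutive digitized points.
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    # Cumulative distance along the curve at each digitized point.
    cumulative = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    # Target distances for the new, evenly spaced points.
    targets = np.linspace(0.0, cumulative[-1], n_points)
    x = np.interp(targets, cumulative, points[:, 0])
    y = np.interp(targets, cumulative, points[:, 1])
    return np.column_stack([x, y])
```

The first and last digitized points are always retained, so the curve endpoints remain anchored at the same anatomical positions across specimens.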
Table 2. Number of specimens and landmarks (landmarks and appended semi-landmark points) in each dataset.

a Abbreviations: CMAP, complete maxilla with missing ascending process; AF, anterior fragment; MF, mid-range fragment; PF, posterior fragment; AAP, anterior curve of ascending process; and VAP, ventral mid-range of the ascending process.
Statistical analyses
All analyses were carried out using MorphoJ software (Klingenberg, Reference Klingenberg2011); however, other preferred software that is compatible with .tps files, for example, the geomorph package (Adams et al., Reference Adams, Collyer, Kaliontzopoulou and Baken2012) in R (R Core Team, 2024), would suffice.
Before analysis, metadata classifiers (specimen number, family, genus, and species) were imported into MorphoJ using a comma-delimited spreadsheet (Supplementary Table 3). For each shape category, a Procrustes fit was performed (MorphoJ only performs full Procrustes fit due to outliers having less influence; Klingenberg, Reference Klingenberg2011), whereby each coordinate was aligned along the principal axes, centered around the centroid, resized, and superimposed. Such steps eliminate differences due to orientation, position, and size between specimens, thus leaving only shape as the variable to be analyzed (Zelditch et al., Reference Zelditch, Swiderski and Sheets2012). Next, principal component analysis (PCA) was used to examine the shape variation for each maxilla within the dataset. The aim of this project was to determine whether there is enough shape variation to differentiate taxonomic groups, for example, family, genus, and species; hence, canonical variate analysis (CVA) was performed to observe differences between taxa. Next, CVA was conducted for family, genus, and species to determine which taxonomic level differentiated each taxon to the greatest degree. The aim for a given shape is to have little to no overlap of groups’ ellipses (equal frequency with a probability of 0.9) and a tighter clustering of points resulting in a smaller ellipse. These ellipses represent the multivariate shape variation encompassing 90% of the comparative specimens and provide a conservative threshold for group membership. After it was determined which category resulted in the greatest differentiation between taxa, the test fragment identification analysis was conducted.
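For readers unfamiliar with the superimposition and ordination steps, the following Python sketch illustrates the general logic of a generalized Procrustes alignment followed by PCA. It is a simplified stand-in and not MorphoJ's algorithm: it scales each configuration to unit centroid size (a partial rather than full Procrustes fit), does not align the result to principal axes, and uses hypothetical function names throughout.

```python
import numpy as np


def center_and_scale(shape):
    """Remove position and size: center on the centroid, unit centroid size."""
    centered = shape - shape.mean(axis=0)
    return centered / np.linalg.norm(centered)


def rotate_onto(shape, reference):
    """Rotate shape to best match reference (least squares, no reflection)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    if np.linalg.det(u @ vt) < 0:  # prevent a mirror-image solution
        u[:, -1] *= -1
    return shape @ (u @ vt)


def procrustes_fit(shapes, n_iter=10):
    """Iteratively align all configurations to their evolving mean shape."""
    aligned = np.array([center_and_scale(np.asarray(s, float)) for s in shapes])
    mean = aligned[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(s, mean) for s in aligned])
        new_mean = center_and_scale(aligned.mean(axis=0))
        if np.linalg.norm(new_mean - mean) < 1e-10:
            break
        mean = new_mean
    return aligned, mean


def shape_pca(aligned):
    """PCA of aligned coordinates: scores and proportion of variance per PC."""
    flat = aligned.reshape(len(aligned), -1)
    flat = flat - flat.mean(axis=0)
    _, singular, vt = np.linalg.svd(flat, full_matrices=False)
    variance = singular**2 / (len(flat) - 1)
    return flat @ vt.T, variance / variance.sum()
```

After alignment, only shape differences remain in the coordinates, which is why the PCA scores can be read as axes of shape variation.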
Due to the small and unequal sample sizes in our dataset, and because the number of variables (landmarks) exceeded the number of specimens, we used Goodall’s F-test for statistical analysis. Goodall’s F-test is a permutation-based (10,000 permutations in MorphoJ), nonparametric method that does not assume normality or equal variance, making it well suited for high-dimensional morphometric data with imbalanced group representation (Airey et al., Reference Airey, Wu, Guan and Collins2006; Klingenberg, Reference Klingenberg2011, Reference Klingenberg2016; Zelditch et al., Reference Zelditch, Swiderski and Sheets2012; Esteve et al., Reference Esteve, Zhao, Maté-González, Gómez-Heras and Peng2018). This test assesses the shape differences between groups by comparing the variance explained by group membership to the variance within groups. Greater F values signify that a shape provides a higher degree of taxon differentiation with statistical significance. In the event that Goodall’s F values are low for all shapes, none of the selected shapes differentiate between the taxa, and new shapes must be identified.
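The logic of comparing between-group to within-group shape variance by permutation can be sketched as follows. This is an illustrative Python approximation, not the MorphoJ routine: it assumes the input rows are flattened, Procrustes-aligned coordinates, it uses squared Euclidean distances in place of Procrustes distances, and the function names are hypothetical. The shape-space dimensionality that multiplies both degrees of freedom in Goodall's F cancels in the ratio and is therefore omitted.

```python
import numpy as np


def goodall_f(flat, labels):
    """Ratio of between-group to within-group mean squares."""
    flat = np.asarray(flat, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    grand_mean = flat.mean(axis=0)
    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        members = flat[labels == group]
        group_mean = members.mean(axis=0)
        ss_between += len(members) * np.sum((group_mean - grand_mean) ** 2)
        ss_within += np.sum((members - group_mean) ** 2)
    df_between = len(groups) - 1
    df_within = len(flat) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p(flat, labels, n_perm=999, seed=0):
    """Observed F and the share of label shuffles with an F at least as large."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = goodall_f(flat, labels)
    hits = sum(
        goodall_f(flat, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)
```

Because the P value is obtained by shuffling group labels rather than from a reference distribution, the smallest attainable value is 1/(n_perm + 1).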
Each test fragment’s identification code was adjusted to include the prefix “Frg”, and the taxonomic identification was deleted. The shapes CMAP and AAP provided fragmented exemplars; however, VAP did not. Therefore, we used complete specimens as test fragments, which were selected at random with the intent of providing a more diverse representation compared with the fragments available for CMAP and AAP. Additionally, before any geometric morphometric analyses for the fragmentation category under study, the dataset was adjusted to include all the complete specimens and only one test fragment. This was an intentional design choice for two reasons: (1) because we wanted to emulate the conditions of fossil recovery, where isolated fragments are typically damaged and analyzed independently rather than as known conspecific sets; and (2) because CVA is designed to assign an unclassified specimen to pre-defined groups, whereas treating multiple unknowns as a group is invalid within the framework of CVA (Webster and Sheets, Reference Webster and Sheets2010; Klingenberg, Reference Klingenberg2011). With the test fragment dataset established, we proceeded to apply statistical analyses to assess taxonomic placement.
Permutation tests with 10,000 iterations were conducted, and Mahalanobis distances were calculated. The Mahalanobis distance measures the distance between a point, in this case the test fragment, and the mean of the distribution of each identified group (Klingenberg, Reference Klingenberg2011). The shortest distance indicates that the test fragment is morphometrically most similar to that particular group. This process was repeated for each of the remaining test fragments within the category under study (Fig. 2). Accurate identification of the test fragments was defined by the smallest Mahalanobis distance to the centroid of the correct taxon in the CVA space. An identification was considered confident if, in addition to having the smallest Mahalanobis distance, the test fragment also fell within the ellipse of that taxon.
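The two criteria defined above ("accurate" as the smallest Mahalanobis distance, "confident" as additionally falling within the 90% ellipse) can be expressed as a short Python sketch. Several assumptions are made for illustration and may differ from MorphoJ: the input is a set of two-dimensional scores (e.g., CV1 and CV2), distances use the pooled within-group covariance, and the ellipse is approximated from the nearest group's own covariance under bivariate normality, for which the chi-square quantile with 2 degrees of freedom is -2 ln(1 - coverage). The nearest group must contain at least three non-collinear specimens.

```python
import math

import numpy as np


def classify_fragment(scores, labels, fragment, coverage=0.9):
    """Return (nearest group, distances to all groups, inside-ellipse flag)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    fragment = np.asarray(fragment, dtype=float)
    groups = np.unique(labels)

    # Pooled within-group covariance across all comparative groups.
    pooled = np.zeros((scores.shape[1], scores.shape[1]))
    for group in groups:
        members = scores[labels == group]
        deviations = members - members.mean(axis=0)
        pooled += deviations.T @ deviations
    pooled /= len(scores) - len(groups)
    pooled_inv = np.linalg.inv(pooled)

    # Mahalanobis distance from the fragment to each group mean.
    distances = {}
    for group in groups:
        diff = fragment - scores[labels == group].mean(axis=0)
        distances[str(group)] = float(np.sqrt(diff @ pooled_inv @ diff))
    nearest = min(distances, key=distances.get)

    # Ellipse check against the nearest group's own scatter (2D scores only).
    members = scores[labels == nearest]
    diff = fragment - members.mean(axis=0)
    own_inv = np.linalg.inv(np.cov(members, rowvar=False))
    threshold = -2.0 * math.log(1.0 - coverage)
    inside = float(diff @ own_inv @ diff) <= threshold
    return nearest, distances, inside
```

An assignment with `inside` equal to `False` corresponds to an identification that is accurate at best, but not confident, under the definitions used here.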

Figure 2. Flowchart representation of methodology divided into two key stages: microscopy and geometric morphometrics.
To investigate the minimum number of comparative specimens needed for taxonomic identification of a test fragment, we selected the genus Elgaria (Family Anguidae), because it had both the greatest number of comparative specimens and at least one test fragment representative for four of the six categories of fragmentation (Table 3). To start, the CVA included the test fragment under study with all Elgaria specimens excluded and all other genera retained in the dataset. Analyses were conducted by randomly adding one specimen of Elgaria and recording the type 1 and type 2 error occurrences for CVA using the family grouping classifier until all Elgaria specimens had been included in the dataset. This process was then repeated using the genus grouping classifier. Each taxon with suitable test fragments was tested in the same manner.
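The incremental procedure described above can be outlined in Python as follows. This is a deliberately simplified sketch under stated assumptions: specimens are represented by precomputed scores in a fixed space rather than by a CVA that is rerun at each step, assignment uses the nearest group centroid by Euclidean distance, and the "minimum number" is taken as the first count from which the assignment is correct and remains correct. All names are hypothetical.

```python
import numpy as np


def nearest_group(scores, labels, fragment):
    """Label of the group whose centroid is closest to the fragment."""
    groups = np.unique(labels)
    distances = [
        np.linalg.norm(fragment - scores[labels == g].mean(axis=0))
        for g in groups
    ]
    return str(groups[int(np.argmin(distances))])


def minimum_specimens(scores, labels, fragment, target, seed=0):
    """Add target-group specimens one at a time in random order.

    Returns (minimum count or None, list of per-step correct/incorrect flags).
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    fragment = np.asarray(fragment, dtype=float)
    others = np.flatnonzero(labels != target)
    pool = rng.permutation(np.flatnonzero(labels == target))

    outcomes = []
    for count in range(1, len(pool) + 1):
        keep = np.concatenate([others, pool[:count]])
        assigned = nearest_group(scores[keep], labels[keep], fragment)
        outcomes.append(assigned == target)

    # First count from which every later step is also correct.
    for count in range(1, len(outcomes) + 1):
        if all(outcomes[count - 1:]):
            return count, outcomes
    return None, outcomes
```

Because the order of addition is random, repeating the procedure with several seeds would indicate how sensitive the estimated minimum is to which specimens happen to be added first.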
Table 3. Number of comparative maxillae (N) belonging to the genus Elgaria and number of test fragments for each fragment category.

a Abbreviations: CMAP, complete maxilla with missing ascending process; AF, anterior fragment; MF, midrange fragment; PF, posterior fragment; AAP, anterior curve of ascending process; and VAP, ventral midrange of the ascending process.
b There were no fragmented AAP and VAP test fragments; hence, a specimen was assigned as a test fragment.
Results
Shape variation and taxonomic differentiation—excluding test fragments in the dataset
Taxonomic differentiation was achieved for all shapes to a varying degree at the family and genus levels (Table 4). For all categories, eigenvalues for principal component 1 (PC1) and PC2 accounted for >73% of variance, while canonical variate 1 (CV1) and CV2 accounted for >77% and >56% of the variance for family and genus, respectively. The categories AAP, PF, and CMAP resulted in the greatest degree of differentiation for family (16.3031, 16.0261, and 15.2460, respectively, with P < 0.0001), while AAP, CMAP, and VAP resulted in the greatest differentiation for genera (18.7141, 16.7794, and 15.2444, respectively, with P < 0.0001). CV1 and CV2 ellipses further illustrate the degree of differentiation at the family (CMAP and AAP) and genus (CMAP, AAP, VAP) levels (Supplementary Fig. 2). To visually compare the morphological and the genetic relatedness of the taxa, we used BayesTrees software (Meade and Pagel, Reference Meade and Pagel2011) to trim a phylogenetic tree for the taxa within our dataset (Zheng and Wiens, Reference Zheng and Wiens2016; Fig. 3).
Test fragment identification accuracy
The categories CMAP, VAP, and AAP resulted in the greatest differentiation at the family and genus levels (Fig. 3, Table 4); therefore, these shapes were used to determine their accuracy at identifying test fragments at the family and genus levels.

Figure 3. Canonical variate (CV1 and CV2) ellipses of the categories that resulted in the greatest differentiation at the family and genus level. The phylogenetic tree (Zheng and Wiens, Reference Zheng and Wiens2016) illustrates the genetic relationship between the species included in the dataset. The tree was trimmed using the BayesTrees software (Meade and Pagel, Reference Meade and Pagel2011).
Table 4. Family and genus principal component analysis (PCA) and canonical variate analysis (CVA) results.

a Abbreviations: CMAP, complete maxilla with missing ascending process; AF, anterior fragment; MF, midrange fragment; PF, posterior fragment; AAP, anterior curve of ascending process; and VAP, ventral midrange of the ascending process.
b Permutation tests (10,000 permutation iterations): P < 0.0001.
CMAP shape
The following specimens were confidently identified at the family level: Phrynosoma douglassii (Phrynosomatidae; Supplementary Fig. 3, Supplementary Table 4) and Elgaria kingii (Anguidae; Supplementary Fig. 3, Supplementary Table 4). Additionally, Elgaria kingii was accurately identified to the genus level (Supplementary Fig. 3, Supplementary Table 4). Uta stansburiana test fragments were not accurately identified at the family or genus level, potentially due to a lack of complete comparatives within the dataset (Supplementary Fig. 3, Supplementary Table 4).
AAP shape
Of the three AAP test fragments, Gerrhonotus infernalis and Crotaphytus collaris were confidently identified to both family and genus levels (Supplementary Fig. 3, Supplementary Table 4). The Sceloporus magister test fragment was not confidently identified; however, it was accurately identified to the family level (Supplementary Fig. 3, Supplementary Table 4).
VAP shape
The following VAP test fragments were accurately identified to the family level: Gerrhonotus liocephalus (Anguidae; Supplementary Fig. 3, Supplementary Table 4), Sceloporus occidentalis (Phrynosomatidae; Supplementary Fig. 3, Supplementary Table 4), Phrynosoma platyrhinos (Phrynosomatidae; Supplementary Fig. 3, Supplementary Table 4), and Elgaria kingii (Anguidae; Supplementary Fig. 3, Supplementary Table 4). The following test fragments were accurately identified to the genus level: Sceloporus occidentalis and Elgaria kingii (Supplementary Fig. 3, Supplementary Table 4). Confident identification at the family level was observed for Gerrhonotus liocephalus (Supplementary Fig. 3, Supplementary Table 4).
Minimum number of specimens
Using the genus Elgaria (N = 28), we determined the minimum number of comparative specimens needed for accurate and confident identification (Table 5).
Table 5. Minimum number of comparatives for accurate and confident identification.

a Abbreviations: CMAP, complete maxilla with missing ascending process; VAP, ventral midrange of the ascending process; and AAP, anterior curve of the ascending process.
The analysis of CMAP included the test fragment FrgM-8582.2 (Anguidae, Elgaria kingii). At the family level, CVA accurately identified the test fragment with a minimum of four Elgaria specimens in the dataset, and confident identification required at least 23 specimens. At the genus level, accurate and confident identification were achieved with at least 4 and 27 specimens, respectively (Table 5).
Due to the lack of an AAP test fragment representative, a complete Elgaria specimen was selected at random for analysis (M-12129.1 Elgaria multicarinata). Accurate family-level identification was achieved with a minimum of six Elgaria specimens. Confident identification was not observed, even when all 28 Elgaria specimens were included. The same pattern held at the genus level: accurate identification was achieved with six comparative specimens, and confident identification was not achieved (Table 5).
There was no Elgaria test fragment representative for the VAP category; therefore, the complete specimen M-14800.1 (Elgaria coerulea) was used. Accurate identification at the family and genus levels was achieved with one comparative specimen, while confident identification was not observed at either taxonomic level (Table 5).
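The search summarized in Table 5 can be thought of as a loop over increasingly large comparative subsets. The Python sketch below is illustrative only and is not the procedure used in this study: `minimum_comparatives` and the `passes` callback are hypothetical names, the order in which specimens are added is an assumption, and the callback merely stands in for rerunning the CVA and checking whether the identification is accurate or confident.

```python
from typing import Callable, Optional, Sequence


def minimum_comparatives(
    specimens: Sequence[str],
    passes: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Return the smallest n for which the first n specimens pass, else None."""
    for n in range(1, len(specimens) + 1):
        if passes(specimens[:n]):
            return n
    return None


# Placeholder identifiers for the 28 Elgaria comparatives.
specimens = [f"specimen_{i}" for i in range(1, 29)]


def at_least_four(subset: Sequence[str]) -> bool:
    # Toy stand-in for "rerun the CVA and check the identification".
    return len(subset) >= 4


print(minimum_comparatives(specimens, at_least_four))
print(minimum_comparatives(specimens, lambda subset: False))
```

A `None` result corresponds to the cases above in which confident identification was not reached even with all specimens included. Because the result can depend on the order in which specimens are added, repeating the search over several random orderings would show how sensitive the minimum is to that choice.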
Discussion
Our results demonstrate that geometric morphometrics can accurately identify fragmented single-bone fossil specimens belonging to underrepresented taxa, for which comparative datasets are often small and unequal. In our maxilla samples, maximizing the length of the curve and the number of semilandmarks along it, without exceeding the threshold at which statistical noise or overfitting arises, provided the most accurate taxon identification. Furthermore, our study highlights that a comprehensive comparative collection is imperative. This entails having specimen representation for each taxon in the geographic region under study. Failure to do so may result in incorrect genus identification, as was observed with U. stansburiana (CMAP). This taxon was represented in the dataset only by “test fragments,” with no comparative specimens, resulting in incorrect genus identification for the CMAP shape (Supplementary Table 4). Given that our dataset was not comprehensive for our region of study (9 of the 11 genera), our findings support a conservative approach of family-level identification only. This conservative approach is further supported by our small and imbalanced dataset, which could increase the likelihood of comparative specimens being misidentified at the species level (Pfenninger and Schwenk, 2007; Sigwart et al., 2023) when specimen information is imprecise, incomplete, or inaccurate (Johnson et al., 2011). We note that for the CMAP, VAP, and AAP shapes, test fragments from taxa with more than two comparative specimens in the dataset that were assigned to the incorrect genus were nevertheless placed in the correct family.
The exception was Aspidoscelis tigris (Teiidae), which was not correctly identified to family or genus for VAP because one of its two maxillae was treated as a test fragment (no actual fragment was available), leaving the other as the only comparative maxilla.
Importantly, when identifying a test fragment using this technique, there will always be a CVA ellipse with the shortest Mahalanobis distance to the specimen, regardless of whether this identification is accurate. While MorphoJ offers cross-validation discriminant analysis, applying it to small and unevenly distributed datasets increases the risk of biased classifications toward overrepresented groups, leading to inflated confidence estimates (Webster and Sheets, 2010; Klingenberg, 2011). For this reason, we do not report cross-validation results.
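The point that a nearest group always exists can be illustrated with a minimal Python sketch. It is not the MorphoJ implementation: the function name `nearest_group`, the toy scores, and the use of a pooled within-group covariance matrix are all assumptions made for illustration.

```python
import numpy as np


def nearest_group(scores, labels, query):
    """Return (label, distance) for the group mean closest to `query`.

    Distances are Mahalanobis distances computed with a pooled
    within-group covariance matrix (an assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    query = np.asarray(query, dtype=float)
    groups = np.unique(labels)

    means = {g: scores[labels == g].mean(axis=0) for g in groups}

    # Pooled covariance: summed group scatter matrices / (n - k).
    scatter = np.zeros((scores.shape[1], scores.shape[1]))
    for g in groups:
        centered = scores[labels == g] - means[g]
        scatter += centered.T @ centered
    pooled = scatter / (len(scores) - len(groups))
    inverse = np.linalg.pinv(pooled)

    distances = {}
    for g in groups:
        diff = query - means[g]
        distances[g] = float(np.sqrt(diff @ inverse @ diff))
    best = min(distances, key=distances.get)
    return str(best), distances[best]


# Toy two-dimensional scores for two hypothetical groups.
scores = [[0, 0], [1, 0], [0, 1], [1, 1], [5, 5], [6, 5], [5, 6], [6, 6]]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

# A query far from both groups still receives a "closest" label.
label, distance = nearest_group(scores, labels, [100, 100])
print(label, round(distance, 1))
```

Because the function always returns a label, the size of the returned distance, and not the label alone, is what signals a doubtful match.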
Furthermore, when a test fragment is placed within overlapping ellipses or far from any ellipse (Fig. 4), we recommend visually comparing the test fragment’s image with images of the closest-matching taxon (i.e., the taxon generating the shortest Mahalanobis distance) and making additional qualitative observations to support or refute the CVA-based identification. Such measures may prove necessary when dealing with specimens that are fragmented or whose features are missing, damaged, or difficult to interpret, as is the case for many fossil specimens. In these circumstances, our protocol provides an objective means of narrowing the possible taxonomic identifications or of justifying placement at a higher taxonomic level. For example, FrgM-12708.1 (Phrynosomatidae, Phrynosoma) was identified as Sceloporus on the basis of the shortest Mahalanobis distance (13.1363; Supplementary Fig. 3, Supplementary Table 4); however, when images of these genera are compared visually with that of the test fragment, the fragment’s lack of tricuspid teeth, which are present in Sceloporus yet absent in Phrynosoma, refutes the genus identification and supports a family identification only. Finally, if possible, using more than one shape category can help further support or challenge the taxonomic identification.

Figure 4. Test fragment canonical variate (CV) coordinates relative to the comparative taxa’s canonical variate analysis (CVA) ellipses. The red dot marks the test fragment’s CV coordinates. (A) FrgM-8582.2 (Anguidae, Elgaria kingii) is precisely identified at the family and genus levels. (B) FrgM-12188.2 (Phrynosomatidae, Sceloporus magister) illustrates a lack of precision in identification; the Mahalanobis distance calculations nevertheless correctly identified the fragment to the family level, but not the genus level (Supplementary Table 2).
The protocol described above will help researchers include underrepresented microvertebrate taxa, for example, squamates, which often experience disarticulation and poor preservation due to taphonomic processes (Shipman, 1993; Brown et al., 2013), thereby improving efforts to reconstruct paleoenvironments and aiding conservation paleobiologists in their efforts toward mitigating anthropic climate change drivers (Conservation Paleobiology Workshop, 2012; Tyler and Schneider, 2018).
Geometric morphometric taxon identification depending on degree of fragmentation
For our dataset, three of the six morphological shape categories (CMAP, AAP, and VAP) differentiated the taxa to a much greater degree compared with the others (AF, MF, and PF). These results could be attributed to the curve lengths of less distinctive shapes (e.g., AF and PF) being considerably shorter than those of more distinguishable shapes (e.g., CMAP and VAP), with nearly half the number of semilandmarks along the teeth/tooth grooves of the former two categories (five for AF and four for PF) compared with the latter (total teeth for CMAP and 10 for VAP).
Another requirement for identifying fossils or test fragments is that the shapes contain easily identifiable homologous morphological points for all specimens within the dataset (Sheets et al., 2004; Zelditch et al., 2012). Our MF shape lacked such homologous landmarks: the majority of its test fragments were incorrectly identified because of the difficulty of ascertaining each fragment’s true placement within the dental shelf.
For all three of the well-differentiated shape categories (CMAP, AAP, and VAP), the family Anguidae (genera Elgaria and Gerrhonotus) was identified with greater precision compared with other taxa. This higher accuracy may be attributed to Anguidae having the largest number of comparative representatives within the dataset (more comparative specimens result in a tighter CVA ellipse; see Fig. 3). As more comparative specimens are added to the dataset, the statistical power of the analysis improves (Cardini and Elton, 2007; Webster and Sheets, 2010). A more narrowly defined CVA ellipse indicates reduced variability within the taxon and greater differentiation between taxa (Webster and Sheets, 2010; Klingenberg, 2011). Although not tested here, these patterns may also reflect ecomorphological differences among lizard groups. For example, ecological drivers such as specialized diet, foraging methods, and substrate use have been shown to shape the cranial morphology of lizards, with specialization in feeding ecology leading to distinct cranial morphologies adapted for dietary acquisition (Metzger and Herrel, 2005; Barros et al., 2011; Ballell et al., 2024).
The minimum number of comparative specimens
Our dataset showed that VAP (the landmark and semilandmark curve running from the insertion point of the tooth/tooth groove ventral to the anterior edge of the ascending process to the insertion point of the 10th tooth along the dental shelf curve) could correctly identify a test fragment to the family and genus levels with a single comparative specimen. However, previous studies have shown that small sample sizes can bias variance estimates, reduce statistical power, and inflate classification accuracy (Cardini and Elton, 2007; Viscosi and Cardini, 2012). Given that squamate comparative collections remain limited, we recommend a more conservative approach: restricting identification to higher taxonomic levels unless supported by multiple comparative specimens.
When determining the minimum number of comparative specimens needed for genus identification, we observed a substantial overlap between the Elgaria and Gerrhonotus ellipses for two of the three most distinguishable shapes (CMAP and VAP, but not AAP). An apomorphic trait that distinguishes these two genera is a spur on the anterior edge of the ascending facial process, which is present in Elgaria but not in Gerrhonotus (Ledesma et al., 2021; Fig. 3). However, no apomorphic trait was observed along the dental shelf between the two genera. Hence, we encourage researchers utilizing these fossil identification methods to create an illustration of a single representative exemplar for each comparative taxon under study. This will aid in observing interspecific traits a priori, with the aim of selecting the shape that would be most effective for identification.
Conclusion
Bone specimens have the potential to contain a variety of quantitative shapes that can allow geometric morphometrics to successfully differentiate between taxa. Such findings can aid researchers who are tasked with identifying recovered bone specimens with a reduced number of shape curves due to natural taphonomic processes.
Furthermore, these methods are relatively inexpensive because they rely on 2D rather than 3D images, and they do not damage the fossil bone specimen under study. Although not tested here, this methodology could be expanded to maxillae belonging to other underrepresented taxa, for example, Serpentes, and to other commonly recovered bone elements that contain distinct homologous points, to determine whether geometric morphometrics holds similar promise in fossil identification.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/qua.2025.10050.
Acknowledgments
We thank J. Chris Sagebiel, the collection manager for the Jackson School Museum at the University of Texas at Austin, for his generous support in loaning the comparative specimens used in this study.
Competing interests
All authors declare there is no conflict of interest.