The evolution of herbicide resistance in weed species presents a global threat to the sustainability of crop production (Cummins et al. 2013; Yu and Powles 2014). One particularly problematic plant family, Amaranthaceae, includes many agronomically important weeds that have evolved multiple herbicide resistances. Perhaps the most problematic weeds in this family are Palmer amaranth (Amaranthus palmeri S. Watson) and waterhemp [Amaranthus tuberculatus (Moq.) J. D. Sauer].
Amaranthus tuberculatus and A. palmeri are herbaceous, outcrossing annual plants native to the American Midwest and Southwest, respectively (Sauer 1957). Both species have successfully infested a wide range of cropping systems across the United States, and their increasing ability to survive herbicidal treatments makes them two of the main targets for weed management programs in row-crop systems. Whereas most of the species of genus Amaranthus are monoecious, A. palmeri and A. tuberculatus are dioecious (Murray 1940; Waselkov and Olsen 2014). Their dioecy ensures outcrossing to exchange genetic information and maintain high genetic diversity (Adhikary and Pratt 2015), thereby increasing the potential for the evolution and spread of herbicide resistance. Unfortunately, the genetic components of dioecy and its evolution are still not well understood in Amaranthus (Gaines et al. 2012; Tranel et al. 2011; Trucco et al. 2009), although some recent studies have had success investigating dioecy in other plant species (Harkess et al. 2017; Henry et al. 2018). In A. palmeri and A. tuberculatus, it is thought that males are the heterogametic sex (Murray 1940; Trucco et al. 2009); however, cytological evaluation failed to identify heteromorphic sex chromosomes (Grant 1959).
Ultimately, a deeper understanding of dioecy in A. tuberculatus and A. palmeri could provide novel strategies to manage them. For example, a gene (or genes) controlling sex could be used in a gene-drive system to manipulate sex ratios, potentially to the point of causing local population collapse (National Academies of Sciences, Engineering, and Medicine 2016; Neve 2018; Tranel and Trucco 2009). Recent advancements in CRISPR-Cas9 technology have opened up opportunities to create such a gene drive (Chen et al. 2019). To be sure, there are significant technical and regulatory hurdles to controlling A. tuberculatus and A. palmeri via genetic manipulation of sex ratios. Nevertheless, a first step in this process is to gain a better understanding of the underlying genetic factors involved.
Over the last decade, various studies that used cytogenetic techniques, breeding experiments, and/or linkage analysis with genetic markers have been conducted to identify sex-determining regions in non-model species, including plants and animals (Charlesworth and Mank 2010). The use of linkage analysis to separate out sex-linked or sex-specific markers has recently gained popularity relative to cytogenetic and breeding techniques, due in part to the ever-decreasing cost of sequencing technologies. Restriction site–associated DNA sequencing (RAD-Seq) is one such next-generation sequencing technique that can aid in carrying out high-throughput linkage analysis (Davey and Blaxter 2010). RAD-Seq allows one to generate a massive number of short reads through parallel sequencing, wherein the regions of genomic DNA that flank restriction sites throughout the genome are selectively sequenced. These sequences can then be compared across hundreds of samples to identify abundantly placed polymorphic markers that can be associated with the trait being investigated. This strategy can be used on pooled populations to screen thousands of markers in a bulk segregant analysis. If a good reference genome is available, the markers can also be used to fine-map regions of interest. RAD-Seq has previously been used successfully in several studies of sex determination to identify genetic markers (Kafkas et al. 2015; Palaiokostas et al. 2013).
In this work, we used RAD-Seq to identify genomic segments that contribute or are linked to sex determination in A. tuberculatus and A. palmeri. The primary goals of this study were to verify that males are the heterogametic sex in both species and to develop reliable genetic markers that are linked to sex-determining regions. To address these goals, this paper focuses on the following research objectives: (1) identify the presence of male-specific regions of the genome in both species, (2) design reliable assays to genotypically predict the sex of these two species based on male-specific sequences, and (3) validate the assays across many populations and individuals to confirm the conservation of these markers.
Material and Methods
Sampling and DNA Extraction
For A. tuberculatus, 192 males and 192 females were randomly selected from the G1 generation of plants described by Wu et al. (2018). In brief, this population was obtained from intermating plants from four field collections (three in Illinois and one in Missouri) with or without various herbicide-resistance traits. Two or three leaves were taken from each plant following flower formation, and sex was noted for each individual sampled. The tissues were harvested, placed into 96-well, deep-well (1.2-ml) plates containing metal balls, and stored at −80 C for at least 24 h. DNA was extracted using a protocol described by Xin and Chen (2012). Following the extraction of DNA, the nucleic acid concentration was quantified using the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen, Carlsbad, CA), and samples were diluted in biology-grade water to a concentration of 50 ng µl−1.
For A. palmeri, 180 males and 175 females were grown from six geographically distinct populations from throughout the United States (Illinois, Georgia, Kansas, Missouri, Arkansas, and New Mexico; Davis et al. 2015). The plants were grown in the same greenhouse conditions as the A. tuberculatus plants and sampled in the same manner. Genomic DNA was extracted, quantified, and diluted as previously described.
Illumina Library Preparation and RAD-Sequencing
The library preparation and RAD-Seq protocols used in this work are explained in multiple literature sources (Baird et al. 2008; Poland et al. 2012). In summary, a combination of two restriction enzymes, HindIII (a 6-base cutter) and MseI (a 4-base cutter), was used to double digest the diluted DNA samples at 37 C for 4 h. The restriction digestion reactions were prepared using 0.5 µl of DNA (50 ng µl−1), 1.5 µl of 10X CutSmart Buffer (New England Biolabs, Ipswich, MA), 0.1 µl of rare cutter (HindIII HF; New England Biolabs), 0.2 µl of common cutter (MseI; New England Biolabs), and 8.5 µl of molecular biology-grade water. Following digestion, custom adapters (Integrated DNA Technologies, Skokie, IL) were ligated to the cut ends of the DNA fragments. The adapters contained multiplex identifier bar codes, which were used to label each sample, and complementary ends, which paired to enzyme cut sites for HindIII and MseI. Ligation reactions were performed at 25 C for 2 h, followed by 20 min at 65 C for ligation inactivation. Ligation reactions were prepared with 10 µl of digested DNA solution mix, 1 µl of 10X T4 DNA Ligase Reaction Buffer (New England Biolabs), 1.5 µl of HindIII adapter (0.1 µM; New England Biolabs), 0.5 µl of MseI adapter (10 µM; New England Biolabs), 1.5 µl of ATP (10 mM; New England Biolabs), 0.1 µl of T4 DNA Ligase (2,000 U µl−1; New England Biolabs), and 5.4 µl of biology-grade water. The ligated DNA samples were pooled and cleaned, using a 2:1 ratio of AMPure XP beads (Beckman Coulter, Brea, CA) to DNA solution and a magnetic particle concentrator (Invitrogen), with two washes in 95% ethanol and resuspension in 10 mM Tris-HCl (MilliporeSigma, St Louis, MO).
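To illustrate which fragments a HindIII/MseI double digest makes available for adapter ligation, the following is a minimal in silico sketch. It assumes the standard recognition sites (HindIII: A^AGCTT; MseI: T^TAA), tracks top-strand cut positions only, and keeps fragments flanked by one site of each enzyme. The function names and the toy sequence are hypothetical and are not part of the protocol described above.

```python
import re

# Recognition sites and top-strand cut offsets (HindIII: A^AGCTT, MseI: T^TAA).
ENZYMES = {"HindIII": ("AAGCTT", 1), "MseI": ("TTAA", 1)}

def cut_positions(seq, site, offset):
    """Return top-strand cut coordinates for every occurrence of a site."""
    # The lookahead pattern allows overlapping site occurrences to be found.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]

def double_digest(seq):
    """Return (start, end, left_enzyme, right_enzyme) for each internal fragment."""
    cuts = []
    for name, (site, offset) in ENZYMES.items():
        cuts += [(pos, name) for pos in cut_positions(seq, site, offset)]
    cuts.sort()
    # Each pair of neighboring cuts bounds one fragment.
    return [(a, b, ea, eb) for (a, ea), (b, eb) in zip(cuts, cuts[1:])]

def mixed_end_fragments(seq):
    """Keep fragments with one HindIII end and one MseI end."""
    return [f for f in double_digest(seq) if f[2] != f[3]]

# Hypothetical toy sequence with two HindIII sites and two MseI sites.
toy = "GGGAAGCTTCCCCCTTAAGGGGGTTAACCAAGCTTGG"
fragments = mixed_end_fragments(toy)
```

In this toy case, the fragment lying between the two MseI sites is dropped because it would carry the same adapter on both ends.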
Cleaned DNA pools were amplified using a 2X PhusionHF Master Mix (New England Biolabs) with thermocycler conditions as follows: 98 C for 30 s, 15 cycles (98 C for 10 s, 68 C for 30 s, 72 C for 30 s), and 72 C for 5 min. Gel electrophoresis on a 1% agarose gel was used to verify the presence of a smear before DNA samples were once again cleaned with AMPure XP beads. An Agilent Bioanalyzer 2100 was used to determine the average fragment length using the Agilent DNA1000 Kit (Agilent Technologies, Santa Clara, CA), and the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) was used to quantify concentrations of DNA. Samples with concentrations of less than 50 ng µl−1 were discarded before proceeding. These prepared sequencing libraries were pooled (within species) in equimolar concentrations and diluted to a final concentration of 10 nM in a library buffer consisting of 10 mM Tris-HCl with 0.05% v/v Tween-20 (MilliporeSigma). The pooled samples from A. tuberculatus and A. palmeri were submitted to the W. M. Keck Center at the Roy J. Carver Biotechnology Center at the University of Illinois, Urbana–Champaign, for single-end sequencing on one lane per species of an Illumina HiSeq2000. An additional qPCR assay was performed by the Keck Center on each pool to optimize nucleic acid concentrations before sequencing.
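Equimolar pooling and dilution to 10 nM require converting a mass concentration and an average fragment length into molarity. The sketch below uses the common approximation of 660 g mol−1 per base pair of double-stranded DNA and a simple C1V1 = C2V2 dilution. The function names and the example values are hypothetical and are not measurements from this study.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of dsDNA (assumption)

def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA concentration in ng/ul to nM for a given mean length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)

def dilution_volumes(stock_nM, target_nM, final_ul):
    """Return (stock_ul, buffer_ul) for a C1V1 = C2V2 dilution."""
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_nM * final_ul / stock_nM
    return stock_ul, final_ul - stock_ul

# Hypothetical library: 66 ng/ul with a 400-bp mean fragment length.
stock = ng_per_ul_to_nM(66, 400)
stock_ul, buffer_ul = dilution_volumes(stock, target_nM=10, final_ul=50)
```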
Parallel workflows were followed in the analysis of the A. tuberculatus and A. palmeri RAD-Seq data sets. The Trait Analysis by aSSociation, Evolution and Linkage (TASSEL) v. 4.0 software package was used to analyze the association of tag presence and sex. Reads originating from underrepresented individuals (those with fewer than 100 total reads) were discarded and attributed to error in library preparation or sequencing. The GBSSeqToTagDBPlugin was then used to identify all reads occurring at least 10 times in the total data set. These reads were collapsed into tags (each tag representing a unique sequence read) and retained for downstream analysis. This plug-in also used the bar-code identifiers to demultiplex the data set and input the data into a local SQLite database. A custom Python script was written to count the occurrence of each tag in both males and females and the total number of individuals in which the tag occurred. The tags were subjected to 1,000 iterations of permutation analysis with replacement using the coin package in R, with sex treated as the response variable (Hothorn et al. 2006). In each iteration, the sex of each plant was randomly assigned as male or female. Once this was accomplished, a ratio of male:female occurrence was calculated and logged. After 1,000 iterations, the actual male:female ratio of each tag was compared with the distribution of all 1,000 occurrence ratios derived from the permutations (Hothorn et al. 2008). The resulting P-value reflected the probability of observing the actual male:female ratio derived from the RAD-Seq reads within the distribution of the 1,000 ratios created during the permutation process. Male-biased tags were defined as tags that were assigned P-values less than or equal to 0.001. Additionally, male-specific tags were defined as tags appearing only in male samples and never in female samples.
Female-specific and female-biased tags were identified using corresponding criteria.
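The logic of the permutation analysis can be sketched in a few lines of Python. This is a simplified stand-in for the coin-based analysis, not the script used in the study. It shuffles the observed sex labels (keeping group sizes fixed) and uses the difference between the number of males and females carrying a tag as the test statistic rather than a ratio, which avoids division by zero for tags absent from females. With unequal group sizes, a proportion-based statistic would be more appropriate. All names are hypothetical.

```python
import random

def permutation_p_value(has_tag, is_male, n_iter=1000, seed=0):
    """One-sided permutation P-value for male bias of a single tag.

    has_tag: list of bool, True if the tag was observed in that plant.
    is_male: list of bool, True if that plant was phenotyped as male.
    """
    rng = random.Random(seed)

    def male_excess(labels):
        # Number of tag-carrying males minus number of tag-carrying females.
        males = sum(1 for t, m in zip(has_tag, labels) if t and m)
        females = sum(1 for t, m in zip(has_tag, labels) if t and not m)
        return males - females

    observed = male_excess(is_male)
    labels = list(is_male)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(labels)  # reassign sex at random, keeping group sizes
        if male_excess(labels) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)

def is_male_specific(has_tag, is_male):
    """True if the tag occurs in at least one male and in no female."""
    in_males = any(t and m for t, m in zip(has_tag, is_male))
    in_females = any(t and not m for t, m in zip(has_tag, is_male))
    return in_males and not in_females
```

With 1,000 iterations and the add-one correction, the smallest attainable P-value is 1/1,001, so a tag reaches the 0.001 threshold only if no permuted data set is at least as male-biased as the observed one.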
Marker Design and Testing
PCR primers were designed to amplify a subset of male-specific tags from each species that had the highest frequency in their respective data sets, while being completely absent from all female individuals. Primers ranged in length from 15 to 18 bp and were designed using a combination of Web-based software packages, including BatchPrimer3 (You et al. 2008), FastPCR (Kalendar et al. 2017), and OligoPerfect Designer by Life Technologies (ThermoFisher Scientific n.d.). Default settings were used for each program, and the results were compared. Optimal primer sets for each selected tag were designed with consideration of each program output to reduce self-dimerization, heterodimerization, melting temperature difference, and hairpin secondary structure. To ensure specificity, designed primers were aligned, using the Basic Local Alignment Search Tool (BLAST) software package, to all female reads of the same species, and only primers that showed no exact matches in females were examined further (Camacho et al. 2009).
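The specificity filter described above can be approximated, for illustration only, by an exact substring search in both orientations. This sketch is a simplified stand-in for the BLAST alignment (which also reports inexact matches), and the function names and sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def has_exact_match(primer, reads):
    """True if the primer or its reverse complement occurs in any read."""
    primer = primer.upper()
    rc = reverse_complement(primer)
    return any(primer in r.upper() or rc in r.upper() for r in reads)

def keep_specific_primers(primers, female_reads):
    """Keep only primers with no exact match in the female reads."""
    return [p for p in primers if not has_exact_match(p, female_reads)]

# Hypothetical toy data: two female reads and three candidate primers.
female_reads = ["ACGTACGTTTGACCA", "GGGTTTCCCAAA"]
candidates = ["TTGACC", "TTTGGG", "AGAGAG"]
specific = keep_specific_primers(candidates, female_reads)
```

Here the first candidate matches a female read directly and the second matches through its reverse complement, so only the third is retained.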
Amaranthus tuberculatus male-specific primer sets were tested on separate pools of nondigested male and female DNA samples used in the RAD-Seq protocol. To prepare each sample pool, five diluted DNA samples of the same sex were mixed in equal concentration. Testing these primer sets on pooled samples saved time and reagents while selecting primer sets and protocol conditions that differentiate males from females. Amaranthus tuberculatus primer sets that were able to distinguish male from female pools were subsequently tested on individual plants from nine geographically diverse A. tuberculatus populations (from Ohio, Missouri, Nebraska, and six different Illinois counties). They were also tested on grain amaranth (Amaranthus hypochondriacus L.), both sexes of A. palmeri, and hybrids created by crossing A. tuberculatus with smooth pigweed (Amaranthus hybridus L.). Primer testing on additional species helped to ensure that primer sets were not only sex-specific, but also species-specific. For all A. tuberculatus marker tests, the components of each PCR mix included: 12.1 µl of biology-grade water, 5 µl of 5X Green GoTaq Flexi Buffer (Promega, Madison, WI), 2 µl of 25 mM MgCl2 (Promega), 2 µl of 10 mM equal mixture of dNTPs (New England Biolabs), 1.25 µl each of forward and reverse primer diluted to a concentration of 10 μM (Integrated DNA Technologies), 1 µl of DNA template diluted to 1 to 100 ng µl−1, and 0.2 µl each of Taq polymerase (Promega) and dimethyl sulfoxide (DMSO; MilliporeSigma). Thermocycler settings were as follows: 95 C for 5 min; 35 cycles of 95 C 30 s, optimal annealing temperature 30 s, and 72 C 20 s; and 72 C for 5 min, wherein each primer set had its own optimal annealing temperature (Table 1).
One of the primer sets that was most specific in silico (lowest homology to any female sequence based on BLAST), MU_657.2, was tested further to gain a better understanding of the reliability of this marker across hundreds of samples. Amaranthus tuberculatus plants from the original population used in the RAD-Seq assay were grown and DNA was extracted before flowering. Primer set MU_657.2 was used to predict sex of these plants at the 3- to 4-leaf stage, and phenotypic data were compared with these predicted results.
Amaranthus palmeri male-specific primer sets were tested in a similar manner. The reaction mixture for both A. palmeri markers included: 13.6 µl of biology-grade water, 5 µl of 5X Green GoTaq Flexi Buffer (Promega), 1.5 µl of 25 mM MgCl2 (Promega), 1.5 µl of 10 mM equal mixture of dNTPs (New England Biolabs), 1 µl each of forward and reverse primer diluted to a concentration of 10 μM (Integrated DNA Technologies), 1 µl of DNA template diluted to 1 to 100 ng µl−1, and 0.2 µl each of Taq polymerase and DMSO. Thermocycler settings for A. palmeri male-specific assays were identical to those for A. tuberculatus male-specific assays, with annealing temperatures listed in Table 1. Primer sets that were able to distinguish male pools from female pools were tested across individuals sourced from the six geographically diverse populations of A. palmeri used in the RAD-Seq protocol. They were also tested on DNA extracted from male and female A. tuberculatus, grain amaranth, and male and female hybrids created by crossing A. tuberculatus with A. hybridus. To further validate these primer sets and better understand their reliability, A. palmeri plants from three field populations in Kansas were grown, and sex predictions determined through PCR assays were compared with phenotypic data as in A. tuberculatus.
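As an arithmetic check, the component volumes listed for the two PCR recipes each sum to 25 µl per reaction. The sketch below tallies them and scales the template-free components into a master mix. The number of reactions, the 10% pipetting overage, and the dictionary labels are hypothetical choices for illustration.

```python
# Per-reaction volumes in microliters, as listed in the text.
A_TUBERCULATUS_MIX = {
    "water": 12.1, "5X buffer": 5.0, "25 mM MgCl2": 2.0, "10 mM dNTPs": 2.0,
    "forward primer": 1.25, "reverse primer": 1.25, "DNA template": 1.0,
    "Taq polymerase": 0.2, "DMSO": 0.2,
}
A_PALMERI_MIX = {
    "water": 13.6, "5X buffer": 5.0, "25 mM MgCl2": 1.5, "10 mM dNTPs": 1.5,
    "forward primer": 1.0, "reverse primer": 1.0, "DNA template": 1.0,
    "Taq polymerase": 0.2, "DMSO": 0.2,
}

def total_volume(mix):
    """Total per-reaction volume in microliters, rounded to 0.01 ul."""
    return round(sum(mix.values()), 2)

def master_mix(mix, n_reactions, overage=0.10):
    """Scale every component except the template for n reactions plus overage."""
    factor = n_reactions * (1 + overage)
    return {k: round(v * factor, 2) for k, v in mix.items() if k != "DNA template"}

# Hypothetical batch of 10 reactions for the A. palmeri recipe.
batch = master_mix(A_PALMERI_MIX, 10)
```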
In Silico Marker Testing
Short-read, shotgun sequence data at a depth of ~10X were generated from 23 male and 19 female A. tuberculatus plants from Illinois as part of an ecological study to better understand A. tuberculatus population structure (Kreiner et al. 2018). The sequence reads produced by this protocol were 150-bp paired-end reads with an approximate insert length of 300 bp. Using BLAST under default settings, we compared all A. tuberculatus male- and female-specific tags that had yielded primer sets reliably differentiating males from females in the RAD-Seq DNA samples against all reads of each male and each female data set separately (Camacho et al. 2009). Furthermore, we used SOAPdenovo2 to assemble as much of the genome as possible for each individual after quality-trimming the reads with Trimmomatic (v. 0.38) under default paired-end settings (Bolger et al. 2014; Luo et al. 2012). Optimal k-mer size and coverage cutoff used in each assembly were identified with KmerGenie (Chikhi and Medvedev 2014). The four male-specific tags were compared by BLAST against these assemblies to find associated genic regions (Camacho et al. 2009). Genome segments showing perfect alignment were then compared using BLASTx against the National Center for Biotechnology Information’s (NCBI) nonredundant protein sequence database to search for homology to known genic regions or transcripts.
Investigation of Female-specific Sequences
To investigate tags appearing only in females, PCR primer sets were designed for two female-specific tags with the highest frequency in the data set and with no close alignments to male individuals in silico using the same software tools as indicated earlier to minimize self-dimerization, heterodimerization, melting temperature difference, and hairpin secondary structure. Reagent concentrations and thermocycler settings for these female-specific assays were similar to those for A. palmeri male-specific assays, with primers and annealing temperatures listed in Table 1. Primers were validated on DNA samples used in the RAD-Seq protocol.
One hypothesis for the female-specific markers was the presence of a cryptic Y region (Y′) in a subset of the females, likely derived from the male-specific Y (MSY) region, but nonfunctional. We further hypothesized that, in combination with a true MSY region, this genotype (YY′) would be missing key X genes, leading to a lethal phenotype, thus skewing the ratio of the population toward more females. To test for this, plants were grown from the same seed lot used to produce the plants involved in RAD-Seq and screened with the female-specific markers. All female plants were allowed to intermate with males from the same population, and seed was harvested from each female individually. Seed from 10 females testing positive for the female-specific markers (FS+) and from six females testing negative (FS−) were germinated, and 108 seedlings from each female parent were transplanted into soil and grown until sex could be identified under greenhouse conditions. Sex of all flowering plants was documented, and sex ratios were compared among populations derived from FS+ and FS− females. A chi-square goodness-of-fit test was used to determine whether any of the tested populations’ male:female ratio differed from the expected 1:1 at a confidence level of 95%.
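The chi-square goodness-of-fit test against a 1:1 ratio reduces to a short calculation with one degree of freedom. The following is a minimal sketch, and the progeny counts in the example are invented for illustration and are not data from this study.

```python
CRITICAL_95_DF1 = 3.841  # upper 5% point of the chi-square distribution, 1 df

def chi_square_1to1(n_male, n_female):
    """Chi-square statistic (1 df) for deviation from a 1:1 sex ratio."""
    expected = (n_male + n_female) / 2
    return ((n_male - expected) ** 2 + (n_female - expected) ** 2) / expected

def deviates_from_1to1(n_male, n_female):
    """True if the ratio differs from 1:1 at the 95% confidence level."""
    return chi_square_1to1(n_male, n_female) > CRITICAL_95_DF1

# Hypothetical progeny counts from two maternal families.
family_a = deviates_from_1to1(50, 58)
family_b = deviates_from_1to1(40, 68)
```

For the first hypothetical family the statistic is 32/54, or about 0.59, well below 3.841. The second gives 392/54, or about 7.26, which would be declared a significant departure from 1:1.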
Results and Discussion
The RAD-Seq protocol generated 204,658,182 total A. tuberculatus reads, each consisting of 64 bp after adapter and bar-code trimming. After these raw reads were processed with the GBSSeqToTagDBPlugin of TASSEL 4.0, they were consolidated into 13,031,934 total tags (Bradbury et al. 2007; Glaubitz et al. 2014). A total of 522,712 tags that occurred at least 10 times throughout the entire data set were retained, representing 178 males and 175 females. These tags were plotted to visualize the number of individuals in which each appeared and in what sex proportion (Figure 1). Sex-specific tags were identified as tags that were found solely in one sex and were labeled as male specific or female specific. There were 2,754 male-specific tags (found in 25 or more males and zero females) and 723 female-specific tags (found in 25 or more females and zero males). To select the most likely male-specific tags for later marker design, a final filtering step retained any tag that occurred 500 times or more in males. Twenty-two male-specific tags passed these criteria.
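The filtering criteria in this paragraph (present in 25 or more males, absent from all females, and occurring 500 times or more in males) can be expressed as two simple filters. The sketch below assumes a hypothetical per-tag summary record, and the field names and example tags are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class TagCounts:
    tag: str
    males_with_tag: int      # number of male plants carrying the tag
    females_with_tag: int    # number of female plants carrying the tag
    male_read_count: int     # total occurrences of the tag in male reads

def male_specific(tags, min_individuals=25):
    """Tags found in at least min_individuals males and in zero females."""
    return [t for t in tags
            if t.males_with_tag >= min_individuals and t.females_with_tag == 0]

def marker_candidates(tags, min_individuals=25, min_reads=500):
    """Male-specific tags that also occur at least min_reads times in males."""
    return [t for t in male_specific(tags, min_individuals)
            if t.male_read_count >= min_reads]

# Hypothetical summary records for four tags.
tags = [
    TagCounts("tag_a", 120, 0, 900),   # male specific and abundant
    TagCounts("tag_b", 30, 0, 80),     # male specific but rare
    TagCounts("tag_c", 150, 3, 2000),  # male biased, not male specific
    TagCounts("tag_d", 10, 0, 600),    # too few individuals
]
candidates = marker_candidates(tags)
```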
A similar analysis pipeline was followed with the A. palmeri RAD-Seq data. A total of 396,075,230 reads were passed into the GBSSeqToTagDBPlugin of TASSEL 4.0 and reduced to 6,209,074 unique tags. As in A. tuberculatus, any tag appearing fewer than 10 times in the data set was discarded, resulting in 1,037,755 unique tags across 140 males and 152 females to be analyzed further. These remaining tags were subjected to the same permutation analysis as the A. tuberculatus tags (Figure 2). Of the male-specific tags, 345 appeared in 25 or more male individuals and zero females. There were no female-specific tags in the A. palmeri data set.
The higher proportion of male-biased to female-biased tags in the two data sets supports the hypothesis that males are the heterogametic sex in both species. In a traditional XY sex chromosome system, there should be no sequence that is specific to females (the female-specific A. tuberculatus tags discovered in our RAD-Seq protocol are discussed under “Investigation of Female-specific Sequences”). In contrast, sequences nearby or inside a non-recombining region of the Y chromosome are liable to accumulate mutations and become specific to the Y chromosome, giving rise to male-biased and male-specific tags. The larger number of well-supported male-specific tags (those appearing solely in males and being represented by 25 or more individuals) in A. tuberculatus relative to A. palmeri suggests A. tuberculatus diverged from monoecy more recently than did A. palmeri (Bachtrog 2013; Puterova et al. 2018). It is hypothesized that a recently evolved dioecious species quickly accumulates transposable genomic segments in the non-recombining region of the Y chromosome. As evolutionary time goes on, the non-recombining region is hypothesized to degenerate, as deleterious mutations eliminate unnecessary genomic segments until the region is reduced to only the essential components for sex determination, as in the mammalian Y chromosomes. The proposition that A. tuberculatus evolved a dioecious reproduction system more recently than A. palmeri is also supported by previously conducted phylogenetic studies (Stetter and Schmid 2017; Ward et al. 2013).
Marker Design and Testing
Male-specific tags from both species that showed the highest number of occurrences throughout their respective data sets were used to design PCR primer sets. These tags were assumed to be sequences that were part of the MSY and with little homology to sequences found in the X chromosome or any autosome. Thus, amplification should only occur in male samples. With just 64 bp per tag, options for primer design were limited. Several tags were excluded due to this primer design constraint, and even the successful primers (Table 1) required stringent annealing temperatures for sex discrimination.
Four A. tuberculatus primer sets and two A. palmeri primer sets were able to accurately distinguish male pools from female pools (Table 1). Figure 3 illustrates representative results of each A. tuberculatus and A. palmeri male-specific tag when tested on one male and one female of their own species. All four A. tuberculatus primer sets and both A. palmeri primer sets successfully differentiated males from females in groups of 30 plants of geographically diverse populations with no false positives or false negatives (unpublished data). The A. palmeri primer sets were tested on populations from Illinois, Georgia, Kansas, Missouri, Arkansas, and New Mexico (the same populations used for RAD-Seq), while the A. tuberculatus primer sets were tested on populations from Missouri (also used in the original RAD-Seq), Ohio, Nebraska, and six different Illinois counties. Additionally, these primer sets did not show any amplicons in any other species, with the exception of A. tuberculatus primer sets showing amplicons in male A. tuberculatus × A. hybridus hybrids (unpublished data). Amplification in only male A. tuberculatus and male hybrid samples indicates both sex specificity and species specificity, given that male hybrid plants received the genetic component of maleness from the male A. tuberculatus parent (Trucco et al. 2009).
Primer set MU_657.2 was used to screen 327 more A. tuberculatus plants originating from Illinois and Missouri, and it accurately predicted sex for 98% of the plants (Table 2). Similarly, A. palmeri primer sets PAMS_940 and PAMS_1106 were each able to predict the sex of 48 plants from three additional field populations in Kansas with 96% accuracy (Table 2). Sequence divergence in the marker regions could explain why the markers failed to detect some males. Additionally, the marker might not be entirely within the non-recombining MSY region. If so, recombination between the marker and the sex-determining locus would result in both false positives and false negatives. In both species, the validation of male-specific markers lends further strength to the hypothesis that males are the heterogametic sex.
No significant homology was found between male-specific tags in A. tuberculatus and A. palmeri (unpublished data). This lack of homology suggests that these two species evolved dioecious reproductive systems independently. Alternatively, we might have simply failed to recover the conserved sequences between the two species that are critical for sex determination.
In Silico Marker Testing
Among the 23 male individuals, male-specific tags MU_976, 533, and 505 all aligned perfectly (no mismatches or gaps in the alignment) to 18 males in silico. Male-specific tag MU_657.2 aligned to 21 males perfectly. The two individuals in which MU_657.2 was not present also did not show the presence of any of the other male-specific tags tested. None of the four male-specific tags were found in any of the 19 female individuals. This leads us to believe that variation in the Y chromosome makes these markers imperfect, although even without perfect sequence conservation across the tag, the primers may still yield a PCR product in males, giving a correct result. It is also possible that dioecious Amaranthus species exhibit some plasticity in sex expression (i.e., sex is not under absolute genetic control), as has been observed in other plants, including spinach (Spinacia oleracea L.) (Komai and Masuda 2004).
The sequenced A. tuberculatus genomes were used to search for contigs containing the male-specific markers. After trimming, approximately 90% of read sequence was retained for assembly. Because of the short read length and low depth of coverage, the average N50 of each genome assembly was only approximately 500 bp, with a maximum contig length of nearly 20,000 bp. Male-specific tags MU_976, 533, and 505 showed no suitable alignments in any of the male assemblies, likely because reads containing these tags were either not used in the assembly or assembled to a length less than 300 bp. However, MU_657.2 aligned perfectly to a single contig in each of the 21 male genomes to which it originally aligned. Contigs that housed the tag were 3,000 to 7,000 bp in length and were highly conserved. When these contigs were compared by BLAST to NCBI’s nonredundant protein sequence database, each contig shared the greatest homology with a Ty3 retrotransposon of the gypsy subclass found in beet (Beta vulgaris L.). Furthermore, the tag used to generate primer set MU_657.2 was contained within the proposed retrotransposon. Perhaps a transposable element once jumped into the non-recombining region of the Y chromosome, where it mutated and lost function, and further mutations became unique to the Y chromosome and were captured by MU_657.2.
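For readers unfamiliar with the N50 statistic reported above, it is the contig length at which contigs of that length or longer contain at least half of the total assembled sequence. The sketch below shows the calculation on hypothetical contig lengths. It is not the tool used to summarize the assemblies in this study.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")

# Hypothetical contig lengths in bp.
example = n50([100, 200, 300, 400, 500])
```

In this example the total is 1,500 bp, and the two longest contigs (500 and 400 bp) together exceed 750 bp, so the N50 is 400 bp.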
Investigation of Female-specific Sequences
We unexpectedly found 723 tags that only appeared in a subset of female individuals. The discovery of female-specific tags in our A. tuberculatus data set brought our hypothesis of a traditional XY sex chromosome system into question. Given that males are hypothesized to be the heterogametic sex, any genetic sequence found in females should also be found in males. To investigate these tags further, primer sets were designed based on the 10 most frequent female-specific tags found in A. tuberculatus. These sets were tested on the original RAD-Seq DNA samples. After testing primer sets and optimizing PCR conditions, two of the primer sets proved to be female-specific across the RAD-Seq samples and were used for downstream testing (Table 1).
Siblings of the plants used to generate the RAD-Seq data were screened for the presence of these female-specific tags. An initial screen of 72 plants with these markers showed hits to 19 females and 2 males (unpublished data). Females testing positive for the female-specific markers (FS+) showed no discernibly different phenotype when compared with females lacking the female-specific markers (FS−), and the two female-specific markers segregated in male:female ratios of 0.96:1 and 0.90:1 in FS+ progeny, indicating the lack of true female specificity. In addition, chi-square analysis of each progeny’s sex ratio concluded that none significantly deviated from an expected 1:1 (male:female) distribution (unpublished data). This lack of lethality of a subset of the male offspring from FS+ females implies that the FS+ plants have two functional X regions, and the female-specific sequences do not belong to a cryptic Y region. Possibly the female-specific sequences we identified in the generation of plants in which we performed RAD-Seq were unique to that generation and arose from a mutational event (e.g., an insertion) in the X region of a male plant in the preceding generation.
In summary, our results are consistent with previous work that hypothesized males to be the heterogametic sex in both A. tuberculatus and A. palmeri. When high-quality reference genomes from males of the two species become available, the male-specific markers we identified should enable homing in on potential causative male-determining genes. In the long term, manipulation of such genes could form the basis of a novel genetic control strategy, perhaps through gene-drive technology, in which all plants of a population are converted to males. In the short term, the markers we identified are immediately valuable, for example, for setting up controlled crosses or for investigating environmental plasticity of sex in these two species.
Patrick J. Tranel https://orcid.org/0000-0003-0666-4564.
The USDA National Institute of Food and Agriculture (AFRI project 2018-67013-27818) provided partial funding of this research. No conflicts of interest have been declared.