Population and quantitative genomic properties of the USDA soybean germplasm collection

  • Alencar Xavier (a1) (a2), Rima Thapa (a1), William M. Muir (a3) and Katy Martin Rainey (a1)

This study is the first assessment of the entire soybean [Glycine max (L.) Merr.] collection of the United States Department of Agriculture National Plant Germplasm System (USDA) to report quantitative and population genomic parameters. It also provides new insight into the structure of soybean germplasm. Germplasm studies enable plant breeders to incorporate novel genetic resources into breeding pipelines to improve valuable agronomic traits. We conducted comprehensive analyses of the 19,652 soybean accessions in the USDA-ARS germplasm collection, genotyped with the SoySNP50K iSelect BeadChip SNP array, to elucidate the quantitative properties of existing subpopulations. Subpopulations were inferred through hierarchical clustering performed with Ward's D agglomeration method and Nei's standard genetic distance. Based on the linkage disequilibrium of unlinked loci, we estimated the effective population size to be approximately 106 individuals. The cladogram indicated the existence of eight major clusters, each displaying distinct properties with regard to major quantitative traits. Among those, cluster 3 represents the tropical and semi-tropical genetic material, cluster 5 displays large seeds and may represent food-grade germplasm, and cluster 7 represents the undomesticated material in the germplasm collection. The average FST among clusters was 0.22, and a total of 914 SNPs were exclusive to specific clusters. Our classification and characterization of the germplasm collection into major clusters provides valuable information about the genetic resources available to soybean breeders and researchers.
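To illustrate how an effective population size can be derived from the linkage disequilibrium of unlinked loci, the sketch below uses the classical approximation E[r²] ≈ 1 / (1 + 4·Ne·c), with recombination fraction c = 0.5 for unlinked loci. This is an assumption made for illustration: the abstract does not state which estimator was used, and the function names and the r² value are hypothetical rather than taken from the study.

```python
def expected_r2(ne, recomb_fraction=0.5):
    """Expected squared LD correlation under E[r^2] ~ 1 / (1 + 4 * Ne * c).

    recomb_fraction = 0.5 corresponds to unlinked loci (assumed here).
    """
    return 1.0 / (1.0 + 4.0 * ne * recomb_fraction)


def ne_from_r2(mean_r2, recomb_fraction=0.5):
    """Invert the approximation above to recover Ne from a mean r^2.

    mean_r2 is assumed to be the average r^2 over pairs of unlinked loci,
    already corrected for sampling noise (no correction is applied here).
    """
    if not 0.0 < mean_r2 < 1.0:
        raise ValueError("mean_r2 must lie strictly between 0 and 1")
    return (1.0 / mean_r2 - 1.0) / (4.0 * recomb_fraction)


# Hypothetical round trip: the r^2 implied by Ne = 106 maps back to Ne = 106.
r2 = expected_r2(106)
ne = ne_from_r2(r2)
```

Under this approximation, an effective population size near 106 corresponds to a very small mean r² between unlinked loci (about 1/213), which is why such estimates are sensitive to sampling noise in r².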

