An algorithmic model for constructing a linkage and linkage disequilibrium map in outcrossing plant populations

Published online by Cambridge University Press:  17 February 2009

JIAHAN LI*
Affiliation:
Department of Statistics, University of Florida, Gainesville, FL 32611, USA
QIN LI*
Affiliation:
Department of Statistics, University of Florida, Gainesville, FL 32611, USA
WEI HOU*
Affiliation:
Department of Epidemiology and Health Policy Research, University of Florida, Gainesville, FL 32611, USA
KUN HAN*
Affiliation:
School of Forestry and Biotechnology, Zhejiang Forestry University, Lin'an, Zhejiang 311300, People's Republic of China
YAO LI
Affiliation:
Department of Statistics, University of Florida, Gainesville, FL 32611, USA
SONG WU
Affiliation:
Department of Statistics, University of Florida, Gainesville, FL 32611, USA
YANCHUN LI
Affiliation:
School of Forestry and Biotechnology, Zhejiang Forestry University, Lin'an, Zhejiang 311300, People's Republic of China
RONGLING WU*
Affiliation:
Department of Statistics, University of Florida, Gainesville, FL 32611, USA; School of Forestry and Biotechnology, Zhejiang Forestry University, Lin'an, Zhejiang 311300, People's Republic of China; and Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA
*These authors contributed equally to this work.

Summary

A linkage–linkage disequilibrium map, which describes the pattern and extent of linkage disequilibrium (LD) decay with genomic distance, has emerged as a viable tool to unravel the genetic structure of population differentiation and to fine-map genes for complex traits. The prerequisite for constructing such a map is the simultaneous estimation of linkage and LD between different loci. Here, we develop a computational algorithm that simultaneously estimates the recombination fraction and LD in a natural outcrossing population from multilocus marker data; in most molecular genetic studies, these two parameters are estimated separately. The algorithm is founded on a commonly used progeny test with open-pollinated offspring sampled from a natural population. Information about LD is reflected in the co-segregation of alleles at different loci among parents in the population, whereas open mating of the parents reveals the genetic linkage of alleles during meiosis. The algorithm was constructed within the polynomial-based mixture framework and implemented with the Expectation–Maximization (EM) algorithm. A by-product of the derivation is an estimate of the outcrossing rate, a parameter useful for exploring the genetic diversity of the population. We performed computer simulations to investigate how different sampling strategies and different parameter values influence parameter estimation. By providing a number of testable hypotheses about population genetic parameters, this algorithmic model will open a broad gateway to understanding the genetic structure and dynamics of an outcrossing population under natural selection.
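To make the EM idea in the summary concrete, the following is a minimal sketch of a much simpler, classical problem: estimating two-marker haplotype frequencies and the LD coefficient D from unphased genotype counts in a randomly mating population. It is not the algorithm developed in this paper, which additionally models open-pollinated progeny, the recombination fraction and the outcrossing rate. All function names, the 3×3 count layout and the random-mating assumption are illustrative choices, not taken from the source.

```python
# Illustrative sketch only: classical two-marker EM under random mating,
# NOT the paper's open-pollinated progeny model.

def em_haplotype_frequencies(counts, tol=1e-10, max_iter=1000):
    """Estimate frequencies of haplotypes 11, 10, 01, 00.

    counts[a][b] is the number of individuals carrying `a` copies of
    allele 1 at marker A and `b` copies of allele 1 at marker B (a, b in 0..2).
    """
    total = sum(sum(row) for row in counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    two_n = 2.0 * total

    # Allele frequencies are observed directly; start at linkage equilibrium.
    p_a = (2 * sum(counts[2]) + sum(counts[1])) / two_n
    p_b = (2 * sum(r[2] for r in counts) + sum(r[1] for r in counts)) / two_n
    p11, p10 = p_a * p_b, p_a * (1 - p_b)
    p01, p00 = (1 - p_a) * p_b, (1 - p_a) * (1 - p_b)

    # Haplotype counts from genotypes whose phase is unambiguous.
    c11 = 2 * counts[2][2] + counts[2][1] + counts[1][2]
    c10 = 2 * counts[2][0] + counts[2][1] + counts[1][0]
    c01 = 2 * counts[0][2] + counts[1][2] + counts[0][1]
    c00 = 2 * counts[0][0] + counts[1][0] + counts[0][1]
    n_dh = counts[1][1]  # double heterozygotes: phase 11/00 or 10/01 unknown

    for _ in range(max_iter):
        # E-step: expected share of double heterozygotes with phase 11/00.
        denom = p11 * p00 + p10 * p01
        phi = p11 * p00 / denom if denom > 0 else 0.5
        # M-step: update haplotype frequencies from expected counts.
        new = (
            (c11 + phi * n_dh) / two_n,
            (c10 + (1 - phi) * n_dh) / two_n,
            (c01 + (1 - phi) * n_dh) / two_n,
            (c00 + phi * n_dh) / two_n,
        )
        change = max(abs(x - y) for x, y in zip(new, (p11, p10, p01, p00)))
        p11, p10, p01, p00 = new
        if change < tol:
            break
    return p11, p10, p01, p00


def ld_coefficient(p11, p10, p01, p00):
    """LD coefficient D = p11*p00 - p10*p01 (equals p11 - pA*pB)."""
    return p11 * p00 - p10 * p01


def expected_genotype_counts(freqs, n_individuals):
    """Expected 3x3 genotype counts under random union of haplotypes."""
    haps = [(1, 1), (1, 0), (0, 1), (0, 0)]
    counts = [[0.0] * 3 for _ in range(3)]
    for h1, f1 in zip(haps, freqs):
        for h2, f2 in zip(haps, freqs):
            counts[h1[0] + h2[0]][h1[1] + h2[1]] += n_individuals * f1 * f2
    return counts
```

As a usage example, `em_haplotype_frequencies(expected_genotype_counts((0.4, 0.1, 0.1, 0.4), 1000))` feeds the estimator expected counts generated from known haplotype frequencies, and passing the result to `ld_coefficient` gives the corresponding D. Only the double heterozygotes carry phase ambiguity here, which is why the E-step touches a single genotype class.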

Information

Type: Paper
Copyright: © 2009 Cambridge University Press

Table 1. Diplotype and genotype frequencies of two markers, A and B, in the offspring population through outcrossing and selfing pollination

Table 2. Two possible diplotypes of a double-heterozygous maternal plant and the frequencies of its four gametes for two markers

Table 3. The expected number (${}_{11}\Psi_{ii'jj'}^{ll'rr'}$) of gamete 11 within an offspring genotype derived from a maternal genotype. Note that ${}_{11}\Psi_{ii'jj'}^{ll'rr'}$ for the double heterozygote is obtained by dividing the expression in the table by both the frequency of the maternal genotype ($P_{ii'jj'}$) and the overall frequency of the corresponding offspring genotype ($P_{ii'jj'}^{ll'rr'}$), whereas ${}_{11}\Psi_{ii'jj'}^{ll'rr'}$ for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 4. The expected number (${}_{10}\Psi_{ii'jj'}^{ll'rr'}$) of gamete 10 within an offspring genotype derived from a maternal genotype. Note that ${}_{10}\Psi_{ii'jj'}^{ll'rr'}$ for the double heterozygote is obtained by dividing the expression in the table by both the frequency of the maternal genotype ($P_{ii'jj'}$) and the overall frequency of the corresponding offspring genotype ($P_{ii'jj'}^{ll'rr'}$), whereas ${}_{10}\Psi_{ii'jj'}^{ll'rr'}$ for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 5. The expected number (${}_{01}\Psi_{ii'jj'}^{ll'rr'}$) of gamete 01 within an offspring genotype derived from a maternal genotype. Note that ${}_{01}\Psi_{ii'jj'}^{ll'rr'}$ for the double heterozygote is obtained by dividing the expression in the table by both the frequency of the maternal genotype ($P_{ii'jj'}$) and the overall frequency of the corresponding offspring genotype ($P_{ii'jj'}^{ll'rr'}$), whereas ${}_{01}\Psi_{ii'jj'}^{ll'rr'}$ for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 6. The expected number (${}_{00}\Psi_{ii'jj'}^{ll'rr'}$) of gamete 00 within an offspring genotype derived from a maternal genotype. Note that ${}_{00}\Psi_{ii'jj'}^{ll'rr'}$ for the double heterozygote is obtained by dividing the expression in the table by both the frequency of the maternal genotype ($P_{ii'jj'}$) and the overall frequency of the corresponding offspring genotype ($P_{ii'jj'}^{ll'rr'}$), whereas ${}_{00}\Psi_{ii'jj'}^{ll'rr'}$ for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 7. MLEs of parameters and their standard errors (in parentheses) obtained from 100 simulation replicates with the (small family number × large family size) sampling strategy

Table 8. MLEs of parameters and their standard errors (in parentheses) obtained from 100 simulation replicates with the (moderate family number × moderate family size) sampling strategy

Table 9. MLEs of parameters and their standard errors (in parentheses) obtained from 100 simulation replicates with the (large family number × small family size) sampling strategy