Estimating the correlation of non-allele descents along chromosomes

Published online by Cambridge University Press:  14 December 2010

XIN-SHENG HU
Affiliation:
Department of Agricultural, Food and Nutritional Sciences, University of Alberta, Edmonton, AB T6J 2P5, Canada
ZHIQUAN WANG*
Affiliation:
Department of Agricultural, Food and Nutritional Sciences, University of Alberta, Edmonton, AB T6J 2P5, Canada
*Corresponding author: Department of Agricultural, Food and Nutritional Sciences, University of Alberta, Edmonton, AB T6J 2P5, Canada. Tel: 780-248-5423. Fax: -1900. e-mail: zhiquan@ualberta.ca

Summary

The pattern of correlation of non-allele descents among linked sites offers insight into genomic evolution at the population level. Here, we present a new statistical method for estimating two types of non-allele descent correlation. One is the standardized parental descent disequilibrium defined by Cockerham & Weir (1973); the other is the standardized disequilibrium between non-allele descent segments from the same chromosome. Essential to this analysis is the partitioning of the joint identity-by-state probability for a random pair of non-allele gametes into the different components of identity by descent at two or three sites. We consider samples of phased haplotypes of single nucleotide polymorphism (SNP) markers and use the weighted least squares method for fast parameter estimation. Monte Carlo simulations demonstrate that robust, unbiased estimates with appropriate precision can be obtained with sample sizes of about 100 diploids, across the allele frequency distributions and levels of linkage disequilibrium examined. This method can be used to construct maps of non-allele descent correlation blocks for a population without requiring prior knowledge of its genetic pedigree.

Information

Type
Research Papers
Copyright
Copyright © Cambridge University Press 2010

Fig. 1. The four combinations of non-allele descents for a random pair of gametes in the two-site case. Each site has two alleles, with alleles a and a′ at the A site, and alleles b and b′ at the B site. The thicker solid lines indicate the linked non-alleles that are IBD. Fi,j (i, j = 0, 1) denotes the joint IBD probability for a random pair of gametes.

Fig. 2. A subset of the 64 combinations of non-allele descents for a random pair of gametes in the three-site case. Each site has two alleles, with alleles a and a′ at the A site, alleles b and b′ at the B site and alleles c and c′ at the C site. The thicker solid lines indicate the linked non-alleles that are IBD. Fij,i′j′ (i, j, i′, j′ = 0, 1) denotes the joint IBD probability for a random pair of gametes.

Fig. 3. Effects of the sample size on F-parameter estimation in the two-site case: (a) average estimates of F1,1 and F1,0 and (b) standard deviations of the F1,1 and F1,0 estimates. The results are obtained from 10 000 independent simulation runs under a uniform distribution of allele frequencies. The parameter settings are LD between the two sites = 0·1, F1,1 = 0·1, F1,0 = 0·05, and correlation coefficient rparent = 0·6078. The dashed line represents the true F-parameter values.
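
The parameter settings in this caption can be cross-checked with a short calculation. The sketch below is an illustration, not the authors' code: it assumes the standardized parental descent disequilibrium takes the usual correlation form, (F1,1 − FA·FB) / sqrt(FA(1 − FA)·FB(1 − FB)), with marginal IBD probabilities FA = F1,1 + F1,0 and FB = F1,1 + F0,1. The caption does not give F0,1; it is assumed here to equal F1,0 = 0·05. The function name is hypothetical. Under these assumptions the calculation reproduces the quoted rparent = 0·6078.

```python
from math import sqrt


def parental_descent_correlation(f11, f10, f01):
    """Standardized two-site descent disequilibrium (assumed correlation form).

    f11: joint probability that a gamete pair is IBD at both sites
    f10: probability of IBD at site A only
    f01: probability of IBD at site B only (assumption: not given in the caption)
    """
    f_a = f11 + f10          # marginal IBD probability at site A
    f_b = f11 + f01          # marginal IBD probability at site B
    disequilibrium = f11 - f_a * f_b
    scale = sqrt(f_a * (1.0 - f_a) * f_b * (1.0 - f_b))
    if scale == 0.0:
        raise ValueError("marginal IBD probabilities must lie strictly between 0 and 1")
    return disequilibrium / scale


# Caption settings: F1,1 = 0.1, F1,0 = 0.05; F0,1 = 0.05 is an assumption.
r_parent = parental_descent_correlation(0.1, 0.05, 0.05)
print(round(r_parent, 4))  # 0.6078
```

With these inputs FA = FB = 0·15, so the disequilibrium is 0·1 − 0·0225 = 0·0775 and the scale is 0·15 × 0·85 = 0·1275, giving 0·0775 / 0·1275 ≈ 0·6078.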

Fig. 4. Effects of the sample size on estimating the parental descent correlation in the two-site case. The results are obtained from 10 000 independent simulation runs under non-uniform and uniform distributions of allele frequencies. The dashed lines with open circles and open triangles represent the average and standard deviation of the parental descent correlation estimates, respectively, with LD between the two sites = 0·05 under the non-uniform distribution of allele frequencies. The lines with closed circles and closed triangles represent the average and standard deviation of the parental descent correlation estimates, respectively, with LD between the two sites = 0·1 under the uniform distribution of allele frequencies. The common parameter settings are F1,1 = 0·1, F1,0 = 0·05, and correlation coefficient rparent = 0·6078. The dashed line represents the true parental descent correlation coefficient.

Fig. 5. Effects of the sample size on estimating correlation coefficients in the three-site case: (a) average estimates of the three correlation coefficients; (b) standard deviations of the correlation coefficient estimates. The results are obtained from 10 000 independent simulation runs under a uniform distribution of allele frequencies at each site. The parameter settings are LD between the A and B sites = 0·12, between the A and C sites = 0·10, between the B and C sites = 0·12, and LD among the three sites = 0·05, F11,11 = 0·2, F11,10 = 0·01, F11,01 = 0·01, F11,00 = 0·1, F10,10 = 0·01, F10,01 = 0·02, F10,00 = 0·01, F01,00 = 0·02 and F01,01 = 0·01. The dashed line represents the true correlation coefficients. The line with closed circles represents the estimate of rsegment = 0·7655; the line with closed squares represents the estimate of rparent(AB) = 0·39994; and the line with closed triangles represents the estimate of rparent(BC) = 0·3633.

Fig. 6. Effects of low LD on estimating correlation coefficients in the three-site case: (a) average estimates of the three correlation coefficients; (b) standard deviations of the correlation coefficient estimates. The results are obtained from 10 000 independent simulation runs under a uniform distribution of allele frequencies. The parameter settings are LD between the A and B sites = 0·05, between the A and C sites = 0·01, between the B and C sites = 0·05, and LD among the three sites = 0·005, F11,11 = 0·02, F11,10 = 0·01, F11,01 = 0·02, F11,00 = 0·04, F10,10 = 0·01, F10,01 = 0·01, F10,00 = 0·01, F01,00 = 0·02 and F01,01 = 0·02. The dashed line represents the true values. The line with closed circles represents the estimate of rsegment = 0·5613; the line with closed squares represents the estimate of rparent(AB) = 0·2927; and the line with closed triangles represents the estimate of rparent(BC) = 0·4048.