
Correlation measures for linkage disequilibrium within and between populations

Published online by Cambridge University Press:  09 July 2009

J. A. SVED*
Affiliation:
School of Biological Sciences A12, Sydney University, NSW2006, Australia
*Corresponding author: School of Biological Sciences A12, Sydney University, NSW2006, Australia. Tel: +61 (8) 8362 4853. e-mail: j.sved@usyd.edu.au

Summary

Correlation statistics can be used to measure the amount of linkage disequilibrium (LD) between two loci in subdivided populations. Within populations, the square of the correlation of gene frequencies, r², is a convenient measure of LD. Between populations, the statistic rᵢrⱼ, for populations i and j, measures the relatedness of LD. Recurrence relationships for these two parameters are derived for the island model of population subdivision, under the assumptions of the linked identity-by-descent (LIBD) model in which correlation measures are equated to probability measures. The recurrence relationships closely predict the build-up of r² and rᵢrⱼ following population subdivision in computer simulations. The LIBD model predicts that a steady state will be reached with r² equal to 1/[1+4Nₑc(1+(k−1)ρ)], where k is the number of island populations, Nₑ is the effective local population (island) size, and ρ measures the ratio of migration (m) to recombination (c) and is equal to m/[c(k−1)+m]. For low values of m/c, ρ=0, and E(r²) is equal to 1/(1+4Nₑc). For high values of m/c, ρ=1, and E(r²) is equal to 1/(1+4kNₑc). The value of rᵢrⱼ following separation eventually settles down to a steady state whose expectation, E(rᵢrⱼ), is equal to E(r²) multiplied by ρ. Equations predicting the change in rᵢrⱼ values are applied to the separation of African (Yoruba – YRI) and non-African (European – CEU) populations, using data from HapMap. The primary data lead to an estimate of separation time of less than 1000 generations if there has been no migration, which is around one-third of minimum current estimates. Ancient rather than recent migration can explain the form of the data.
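
The steady-state expressions quoted in the Summary can be checked numerically. The following is a minimal sketch in Python, not the author's code: the function names and the example parameter values (Ne = 1000, c = 0.001, k = 4) are assumptions chosen only for illustration. It evaluates ρ = m/[c(k−1)+m], the within-population expectation E(r²) = 1/[1+4Nₑc(1+(k−1)ρ)], and the between-population expectation E(rᵢrⱼ) = ρE(r²).

```python
def rho(m, c, k):
    """Migration-to-recombination measure from the Summary: m / [c(k-1) + m].

    Undefined when the denominator is zero (e.g. m = 0 together with c = 0 or k = 1).
    """
    return m / (c * (k - 1) + m)


def expected_r2(Ne, c, m, k):
    """Steady-state E(r^2) within an island: 1 / [1 + 4*Ne*c*(1 + (k-1)*rho)]."""
    return 1.0 / (1.0 + 4.0 * Ne * c * (1.0 + (k - 1) * rho(m, c, k)))


def expected_rirj(Ne, c, m, k):
    """Steady-state E(r_i r_j) between islands: rho * E(r^2)."""
    return rho(m, c, k) * expected_r2(Ne, c, m, k)


# Illustrative (hypothetical) parameter values, not taken from the paper.
Ne, c, k = 1000, 0.001, 4
for m in (0.0, 0.0001, 0.001, 0.01, 1.0):
    print(m, rho(m, c, k), expected_r2(Ne, c, m, k), expected_rirj(Ne, c, m, k))
```

With m = 0 the sketch reduces to 1/(1+4Nₑc), and as m grows large relative to c it approaches 1/(1+4kNₑc), matching the two limiting cases stated in the Summary.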

Information

Type
Paper
Copyright
Copyright © Cambridge University Press 2009

Fig. 1. Pathways for identity-by-descent for one and two loci.

Fig. 2. Test for predicted levels of LD within populations for low and high levels of migration and low- and high-MAF values. The predicted (expected) values are shown as either unbroken or broken lines, while the results from the simulations are shown as square or round symbols. Low migration values are shown with broken lines (expected) and unfilled symbols (observed), while high-migration values are shown with unbroken lines and filled symbols. High-MAF values are shown with thicker lines and larger symbols than low-MAF values. Allele frequencies and initial LD levels are from simulations of a single population infinite site mutation model with the same values of N (8192) and Nc (8) assumed in the subpopulation simulation.

Fig. 3. Values of r₁r₂ measuring the relationship of LD between populations. Symbols and conditions of the simulation are as in Fig. 2.

Fig. 4. Estimated separation time for Europe (CEU) from Africa (YRI). Filled squares show the separation time in generations calculated by applying eqn (19), derived under the assumption of no migration, to the HapMap data. The straight line shows the values of T obtained from the same equation if actual separation occurred 3000 generations ago and migration occurred throughout at rate m=0·0016. The dashed line shows the values of T obtained from the same equation if actual separation time was again 3000 generations but migration occurred at rate m=0·0128 for the first 2500 generations only.