Mapping Mendelian traits in asexual progeny using changes in marker allele frequency

Published online by Cambridge University Press:  18 May 2011

SAYANTHAN LOGESWARAN*
Affiliation:
Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Road, Edinburgh EH9 3JT, UK
NICK H. BARTON
Affiliation:
Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Road, Edinburgh EH9 3JT, UK; IST Austria, Am Campus 1, Klosterneuburg 3400, Austria
*Corresponding author: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Road, Edinburgh EH9 3JT, UK. e-mail: s.logeswaran@sms.ed.ac.uk

Summary

Linkage between markers and genes that affect a phenotype of interest may be determined by examining differences in marker allele frequency in the extreme progeny of a cross between two inbred lines. This strategy is usually employed when pooling is used to reduce genotyping costs. When the cross progeny are asexual, the extreme progeny may be selected by multiple generations of asexual reproduction and selection. We analyse this method of measuring phenotype in asexual progeny and examine the changes in marker allele frequency due to selection over many generations. Stochasticity in marker frequency in the selected population arises due to the finite initial population size. We derive the distribution of marker frequency as a result of selection at a single major locus, and show that in order to avoid spurious changes in marker allele frequency in the selected population, the initial population size should be in the low to mid hundreds.

Information

Type
Research Papers
Copyright
Copyright © Cambridge University Press 2011

Fig. 1. (a) Each line represents the typical marker composition of a single recombinant with a selected allele at position 4 on the genome represented by a circle. The black parts represent the fitter parental markers (positive markers) and the grey parts represent the less fit parental markers (negative markers). (b) Plot of the positive marker frequency in the selected population when there is just a single recombinant (the first genome in (a)) in the initial population. (c) The black and grey curves show two replicates of the positive marker frequencies in the selected population when all ten recombinants from (a) are present in the initial population. It can be seen that the two replicates do not give the same frequencies. This reflects the random number of descendants each recombinant left in each replicate. (d) This shows the frequency of the positive markers in the selected population when there are 100 recombinants with the positive allele in the initial population. In (b), (c) and (d) the dotted curve represents the deterministic expectation for the positive marker frequency, which is 1−r, where r is calculated from the Haldane map function r = (1/2)(1 − e^(−2x)), and x is the map distance between the marker and selected locus.
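The deterministic expectation described in this caption can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the map distance x is assumed to be measured in Morgans, as is usual for the Haldane map function.

```python
import math


def haldane_recombination_fraction(x_morgans: float) -> float:
    """Haldane map function: r = (1/2)(1 - exp(-2x)), with x in Morgans (assumed unit)."""
    return 0.5 * (1.0 - math.exp(-2.0 * x_morgans))


def expected_positive_marker_frequency(x_morgans: float) -> float:
    """Deterministic expectation 1 - r for a marker at map distance x from the selected locus."""
    return 1.0 - haldane_recombination_fraction(x_morgans)


if __name__ == "__main__":
    # Frequency falls from 1 at the selected locus towards 0.5 for distant markers.
    for x in (0.0, 0.05, 0.5, 2.0):
        print(x, round(expected_positive_marker_frequency(x), 4))
```

A marker at the selected locus has expected frequency 1, and the expectation approaches 0·5 as the marker becomes effectively unlinked.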

Fig. 2. Distribution of the relative numbers from a single recombinant given that its descendants have survived in the selected population. The diffusion curve represents (6) with parameters n₀=1 and s=log(μ), and the exponential curve represents (7). The number of generations of growth was t={20, 10, 10} for μ={1·2, 2, 3}. The offspring distribution per generation was Poisson.
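The kind of simulation this caption refers to can be sketched as a branching process started from a single recombinant with Poisson offspring numbers. The sketch below rests on assumptions that are not stated in the caption: "relative numbers" is taken to mean the descendant count divided by its expectation μ^t, the function names are hypothetical, and the Poisson sampler is Knuth's multiplication method rather than anything taken from the paper.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson(lam) variate by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def relative_descendants(mu: float, generations: int, rng: random.Random) -> float:
    """Descendants of one founder after `generations`, divided by mu**generations.

    Returns 0.0 if the lineage goes extinct (assumed definition of 'relative numbers').
    """
    n = 1
    for _ in range(generations):
        n = sum(poisson(mu, rng) for _ in range(n))
        if n == 0:
            return 0.0
    return n / mu ** generations


def simulate(mu: float, generations: int, replicates: int, seed: int = 1):
    """Return all replicate values and the subset conditioned on survival."""
    rng = random.Random(seed)
    values = [relative_descendants(mu, generations, rng) for _ in range(replicates)]
    survivors = [w for w in values if w > 0.0]
    return values, survivors


if __name__ == "__main__":
    values, survivors = simulate(mu=2.0, generations=8, replicates=400)
    print("fraction surviving:", len(survivors) / len(values))
    print("mean relative number given survival:", sum(survivors) / len(survivors))
```

A histogram of `survivors` would give the empirical counterpart of the conditional distribution plotted in the figure. Fewer generations than in the caption are used here only to keep the run short.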

Fig. 3. Distribution of frequency for unlinked markers for a small initial population size of N=15 and a larger initial population size of N=100. For each of the initial population sizes, the distribution of frequency is plotted for small fitness μ=1·2 and large fitness μ=3. The black curve represents (8), while the grey curve in (c) and (d) is a normal approximation using (3) and (4). The number of generations of selection was 20 for small fitness and 10 for large fitness. The offspring distribution was Poisson.

Fig. 4. The variance in frequency in the selected population for unlinked markers. The variance was calculated from 2Pm(1−Pm)(μ² + σ² − μ)/(N(μ−1)μ) (i.e. the limit of (4) as t→∞), where Pm=0·5 and μ={1·2, 1·5, 3}. The offspring distribution used was Poisson, and thus σ²=μ.
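As a worked illustration, the limiting variance can be evaluated directly. The sketch assumes the caption's expression reads 2Pm(1−Pm)(μ² + σ² − μ)/(N(μ−1)μ), which is a reconstruction of the extracted text rather than a quotation of equation (4). The function name is hypothetical.

```python
def limiting_frequency_variance(n_initial, mu, sigma2=None, p_m=0.5):
    """Limiting (t -> infinity) variance in marker frequency for unlinked markers.

    Assumed form: 2 * Pm * (1 - Pm) * (mu**2 + sigma2 - mu) / (N * (mu - 1) * mu).
    sigma2 defaults to mu, i.e. a Poisson offspring distribution.
    """
    if mu <= 1.0:
        raise ValueError("the limit requires mean fitness mu > 1")
    if sigma2 is None:
        sigma2 = mu  # Poisson offspring: variance equals mean
    return 2.0 * p_m * (1.0 - p_m) * (mu ** 2 + sigma2 - mu) / (n_initial * (mu - 1.0) * mu)


if __name__ == "__main__":
    for mu in (1.2, 1.5, 3.0):
        for n_initial in (15, 100, 500):
            print(mu, n_initial, limiting_frequency_variance(n_initial, mu))
```

Under this reading, the variance scales as 1/N and is largest for weak selection: for Poisson offspring it reduces to 2Pm(1−Pm)μ/(N(μ−1)).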

Fig. 5. The probability of getting a false positive plotted against the expected effective initial population size E(N*). The solid curves are the theoretical predictions using (9) (i.e. 1−P(u_null<u_linked)) and the two dashed curves are simulation results. The parameters used were c=20 chromosomes, each of length l=1 Morgan. The number of markers τ on each chromosome was τ=3 or τ=5. The black curves are the results for τ=3 and the grey curves are the results for τ=5. The number of generations of selection was 10 and the overall fitness of the selected allele was 3.

Fig. 6. (a) Plot of the negative marker frequency (marker every 5 cM) in a single replicate where there is a selected allele fixed at position 1, and the effective initial population size N*=20. (b) The grey curve is a plot of the corresponding log likelihood ratios and the black line is the significance level. To calculate the log likelihood ratios, the genome was split into overlapping intervals of 10 cM, where the overlap was 5 cM. For each interval, the log likelihood ratio was calculated using the two markers that define the interval. The unknown parameter V in the log likelihood functions was estimated by using all markers and assuming they are all unlinked, and then solving d log(L)/dV=0 for V. The significance levels were obtained by permutation analysis of simulated data from a null region. The simulated data were obtained by directly simulating frequencies from a multivariate normal with parameter V̂.

Fig. 7. The marker frequencies and the genotypic composition of the population at various generations of selection when there are multiple selected loci are shown. There are a total of five selected alleles at positions {2, 4, 6, 8, 10} (shown by the filled circles) with selection coefficients {0·2, 0·05, 0·01, 0·03, 0·04}. With five selected alleles there are 32 possible genotypes. The bar charts show the proportion of each of these 32 genotypes in the population at that particular generation. Genotype number 1 refers to the least fit genotype (relative fitness of 1) and genotype 32 refers to the fittest possible genotype (relative fitness of 1·36). The initial population size was 200.
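The genotype fitnesses in this caption can be enumerated directly. The caption does not say how the five effects combine; the sketch below assumes multiplicative fitness, a product of (1 + s) over the selected alleles carried, because that choice reproduces the quoted maximum of about 1·36. Ordering the genotypes by fitness is also an assumption, since the caption only fixes the two ends of the numbering.

```python
from itertools import product
from math import prod

# Selection coefficients from the caption, for the alleles at positions {2, 4, 6, 8, 10}.
SELECTION_COEFFICIENTS = (0.2, 0.05, 0.01, 0.03, 0.04)


def genotype_fitnesses(coefficients):
    """List (genotype, relative fitness) pairs sorted from least to most fit.

    A genotype is a tuple of 0/1 flags, one per selected locus.
    Fitness is assumed multiplicative: the product of (1 + s) over loci with flag 1.
    """
    table = []
    for genotype in product((0, 1), repeat=len(coefficients)):
        fitness = prod(1.0 + s for s, carried in zip(coefficients, genotype) if carried)
        table.append((genotype, fitness))
    return sorted(table, key=lambda pair: pair[1])


if __name__ == "__main__":
    for number, (genotype, fitness) in enumerate(genotype_fitnesses(SELECTION_COEFFICIENTS), start=1):
        print(number, genotype, round(fitness, 4))
```

With five biallelic selected loci this yields 2^5 = 32 genotypes, running from relative fitness 1 (no selected alleles) to roughly 1·36 (all five).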