Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates

Joseph Felsenstein

doi:10.1017/S0016672300030354

Published online by Cambridge University Press: 14 April 2009

Affiliation: Department of Genetics SK-50, University of Washington, Seattle, Washington 98195

Summary

It is known that under neutral mutation at a known mutation rate a sample of nucleotide sequences, within which there is assumed to be no recombination, allows estimation of the effective size of an isolated population. This paper investigates the case of very long sequences, where each pair of sequences allows a precise estimate of the divergence time of those two gene copies. The average divergence time of all pairs of copies estimates twice the effective population number and an estimate can also be derived from the number of segregating sites. One can alternatively estimate the genealogy of the copies. This paper shows how a maximum likelihood estimate of the effective population number can be derived from such a genealogical tree. The pairwise and the segregating sites estimates are shown to be much less efficient than this maximum likelihood estimate, and this is verified by computer simulation. The result implies that there is much to gain by explicitly taking the tree structure of these genealogies into account.
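The comparison described in the summary can be illustrated with a small simulation. What follows is a minimal sketch, not the paper's own code: it assumes a diploid population of constant size N under the standard Kingman coalescent, time measured in generations, and coalescence times known without error (the long-sequence limit). The function names are hypothetical, and the estimator forms are the usual coalescent results under those assumptions rather than formulas quoted from the paper. The segregating sites estimate is not covered here.

```python
import random


def simulate_coalescent(n, N, rng):
    """Simulate one genealogy of n gene copies (assumed model: Kingman
    coalescent, diploid population of size N, time in generations).

    Returns (intervals, mean_pair_time), where intervals[k] is the time
    during which k lineages existed and mean_pair_time is the average
    divergence time over all pairs of copies.
    """
    sizes = [1] * n          # number of sampled copies under each lineage
    t = 0.0                  # time elapsed back from the present
    intervals = {}
    pair_sum = 0.0
    k = n
    while k > 1:
        # With k lineages, each pair coalesces at rate 1/(2N).
        rate = k * (k - 1) / (4.0 * N)
        u = rng.expovariate(rate)
        intervals[k] = u
        t += u
        i, j = rng.sample(range(k), 2)
        # Every copy under lineage i diverged from every copy under j at time t.
        pair_sum += sizes[i] * sizes[j] * t
        merged = sizes[i] + sizes[j]
        sizes = [s for idx, s in enumerate(sizes) if idx not in (i, j)]
        sizes.append(merged)
        k -= 1
    n_pairs = n * (n - 1) / 2.0
    return intervals, pair_sum / n_pairs


def ml_estimate(intervals):
    """Maximum likelihood estimate of N from the coalescent intervals.

    Under the assumed model, k(k-1)u_k / 4 is exponential with mean N for
    each k, so the MLE is the average of those n-1 quantities.
    """
    n = max(intervals)
    return sum(k * (k - 1) * u for k, u in intervals.items()) / (4.0 * (n - 1))


def pairwise_estimate(mean_pair_time):
    """Pairwise estimate: the mean divergence time estimates 2N."""
    return mean_pair_time / 2.0


def compare(n=20, N=1000.0, reps=2000, seed=1):
    """Return (variance of ML estimates, variance of pairwise estimates)."""
    rng = random.Random(seed)
    ml, pw = [], []
    for _ in range(reps):
        intervals, mean_pair_time = simulate_coalescent(n, N, rng)
        ml.append(ml_estimate(intervals))
        pw.append(pairwise_estimate(mean_pair_time))

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    return variance(ml), variance(pw)


if __name__ == "__main__":
    v_ml, v_pw = compare()
    print("variance of ML estimate:      ", v_ml)
    print("variance of pairwise estimate:", v_pw)
```

Under these assumptions the maximum likelihood estimate is an average of n − 1 independent exponential quantities, so its variance is N²/(n − 1) and shrinks as more copies are sampled. The pairwise estimate reuses the deep branches of the genealogy in many pairs, which is one way to see why its variance stays comparatively large.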

Information

Type: Research Article
Information: Genetics Research, Volume 59, Issue 2, April 1992, pp. 139–147

DOI: https://doi.org/10.1017/S0016672300030354
Copyright: Copyright © Cambridge University Press 1992

