
Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates

  • Joseph Felsenstein
Abstract
Summary

It is known that, under neutral mutation at a known mutation rate, a sample of nucleotide sequences within which there is assumed to be no recombination allows estimation of the effective size of an isolated population. This paper investigates the case of very long sequences, where each pair of sequences allows a precise estimate of the divergence time of those two gene copies. The average divergence time of all pairs of copies estimates twice the effective population number, and an estimate can also be derived from the number of segregating sites. Alternatively, one can estimate the genealogy of the copies. This paper shows how a maximum likelihood estimate of the effective population number can be derived from such a genealogical tree. The pairwise and the segregating sites estimates are shown to be much less efficient than this maximum likelihood estimate, and this is verified by computer simulation. The result implies that there is much to gain by explicitly taking the tree structure of these genealogies into account.
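The comparison described in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own program, and it rests on several assumptions that are not stated in the summary: a diploid population, so that the time during which k lineages exist is exponential with mean 4N/(k(k-1)) generations; sequences long enough that all coalescence times are known exactly; and the use of total tree length as a stand-in for the number of segregating sites, to which it would be proportional in that limit. All function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def simulate_genealogy(n, N, rng):
    """Simulate one coalescent genealogy of n copies (assumed diploid scaling).

    Returns (intervals, mean_pair_time), where intervals[k] is the time in
    generations during which k lineages exist, and mean_pair_time is the
    average time back to the common ancestor over all pairs of copies.
    """
    sizes = [1] * n            # sampled copies descended from each lineage
    t = 0.0                    # time elapsed going backwards
    intervals = {}
    pair_time_sum = 0.0
    for k in range(n, 1, -1):
        u = rng.expovariate(k * (k - 1) / (4.0 * N))
        intervals[k] = u
        t += u
        i, j = rng.sample(range(k), 2)        # the two lineages that merge
        # every pair split across the two lineages coalesces at time t
        pair_time_sum += sizes[i] * sizes[j] * t
        merged = sizes[i] + sizes[j]
        sizes = [s for idx, s in enumerate(sizes) if idx not in (i, j)]
        sizes.append(merged)
    return intervals, pair_time_sum / (n * (n - 1) / 2)


def estimate_pairwise(mean_pair_time):
    # the expected pairwise time is 2N under the assumed scaling
    return mean_pair_time / 2.0


def estimate_tree_length(intervals):
    # stand-in for segregating sites: E[tree length] = 4N * sum_{i<n} 1/i
    n = max(intervals)
    total = sum(k * u for k, u in intervals.items())
    a_n = sum(1.0 / i for i in range(1, n))
    return total / (4.0 * a_n)


def estimate_ml(intervals):
    # each k(k-1)u_k has expectation 4N; the likelihood is maximised
    # by their average divided by 4
    n = max(intervals)
    return sum(k * (k - 1) * u for k, u in intervals.items()) / (4.0 * (n - 1))


def compare_estimators(n=10, N=1000.0, reps=4000, seed=1):
    """Return {name: (mean, variance)} of each estimator over reps genealogies."""
    rng = random.Random(seed)
    results = {"pairwise": [], "tree_length": [], "ml": []}
    for _ in range(reps):
        intervals, mean_pair_time = simulate_genealogy(n, N, rng)
        results["pairwise"].append(estimate_pairwise(mean_pair_time))
        results["tree_length"].append(estimate_tree_length(intervals))
        results["ml"].append(estimate_ml(intervals))
    return {name: (mean(v), pvariance(v)) for name, v in results.items()}


if __name__ == "__main__":
    for name, (m, v) in compare_estimators().items():
        print(f"{name:12s} mean={m:10.1f} variance={v:14.1f}")
```

Under these assumptions all three estimators are unbiased for N, so the comparison rests on their variances. The genealogy-based estimate should show the smallest variance and the pairwise estimate the largest, in line with the efficiency ranking the summary describes.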



Genetics Research
  • ISSN: 0016-6723
  • EISSN: 1469-5073