Dirichlet approximation of equilibrium distributions in Cannings models with mutation

  • Han L. Gan (a1), Adrian Röllin (a2) and Nathan Ross (a3)

Consider a haploid population of fixed finite size with a finite number of allele types and having Cannings exchangeable genealogy with neutral mutation. The stationary distribution of the Markov chain of allele counts in each generation is an important quantity in population genetics but has no tractable description in general. We provide upper bounds on the distributional distance between the Dirichlet distribution and this finite-population stationary distribution for the Wright–Fisher genealogy with general mutation structure and the Cannings exchangeable genealogy with parent-independent mutation structure. In the first case, the bound is small if the population is large and the mutations do not depend too much on parent type; 'too much' is naturally quantified by our bound. In the second case, the bound is small if the population is large and the chance of three-mergers in the Cannings genealogy is small relative to the chance of two-mergers; this is the same condition that ensures convergence of the genealogy to Kingman's coalescent. These results follow from a new development of Stein's method for the Dirichlet distribution based on Barbour's generator approach and a probabilistic description of the semigroup of the Wright–Fisher diffusion due to Griffiths and Li (1983) and Tavaré (1984).
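
As a rough numerical illustration of the kind of approximation described in the abstract, the sketch below computes the stationary distribution of a two-allele Wright–Fisher chain with parent-independent mutation by power iteration and compares its first two moments with those of a Beta (two-type Dirichlet) distribution. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the population size `n`, the mutation probabilities `u1` and `u2`, and the parameter scaling `a = 2 * n * u1`, `b = 2 * n * u2`. Comparing moments is also only a crude stand-in for the distributional distances the paper actually bounds.

```python
from math import comb


def wf_transition_matrix(n, u1, u2):
    """Transition matrix of the type-1 count in a two-allele Wright-Fisher chain.

    Assumed model: each of the n offspring picks a parent uniformly at random,
    then mutates to type 1 with probability u1 or to type 2 with probability u2,
    independently of the parent's type (parent-independent mutation).
    """
    rows = []
    for x in range(n + 1):
        # Probability that a single offspring ends up as type 1.
        q = (1.0 - u1 - u2) * x / n + u1
        rows.append([comb(n, y) * q**y * (1.0 - q) ** (n - y) for y in range(n + 1)])
    return rows


def stationary(rows, iters=1000):
    """Approximate the stationary law by repeatedly applying the transition matrix."""
    size = len(rows)
    pi = [1.0 / size] * size
    for _ in range(iters):
        pi = [sum(pi[x] * rows[x][y] for x in range(size)) for y in range(size)]
    return pi


def moments(pi, n):
    """Mean and variance of the type-1 frequency under the law pi."""
    mean = sum(p * x / n for x, p in enumerate(pi))
    var = sum(p * (x / n - mean) ** 2 for x, p in enumerate(pi))
    return mean, var


# Hypothetical parameter values chosen only for illustration.
n, u1, u2 = 40, 0.01, 0.03
pi = stationary(wf_transition_matrix(n, u1, u2))
wf_mean, wf_var = moments(pi, n)

# Assumed scaling of the approximating Beta(a, b) parameters.
a, b = 2 * n * u1, 2 * n * u2
beta_mean = a / (a + b)
beta_var = a * b / ((a + b) ** 2 * (a + b + 1))

print(f"Wright-Fisher: mean={wf_mean:.4f}, var={wf_var:.4f}")
print(f"Beta({a:.1f}, {b:.1f}): mean={beta_mean:.4f}, var={beta_var:.4f}")
```

In this toy setting the means agree and the variances differ by a small amount that shrinks as `n` grows with `n * u1` and `n * u2` held fixed, which is consistent in spirit with the large-population regime discussed above.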

Corresponding author
* Current address: Mathematics Department, Northwestern University, 2033 Sheridan Road, Evanston, IL 60208, USA. Email address:
** Postal address: Department of Statistics and Applied Probability, National University of Singapore, 6 Science Drive 2, 117546, Singapore. Email address:
*** Postal address: School of Mathematics and Statistics, University of Melbourne, Peter Hall Building, Melbourne, VIC 3010, Australia. Email address:
[1] Appell, Banaś and Merentes (2014). Bounded Variation and Around (De Gruyter Ser. Nonlinear Anal. Appl. 17). De Gruyter, Berlin.
[2] Barbour A. D. (1990). Stein's method for diffusion approximations. Prob. Theory Relat. Fields 84, 297–322.
[3] Barbour A. D., Ethier S. N. and Griffiths R. C. (2000). A transition function expansion for a diffusion model with selection. Ann. Appl. Prob. 10, 123–162.
[4] Bentkus V. (2003). On the dependence of the Berry–Esseen bound on dimension. J. Statist. Planning Infer. 113, 385–402.
[5] Bhaskar A., Clark A. G. and Song Y. S. (2014). Distortion of genealogical properties when the sample is very large. Proc. Nat. Acad. Sci. USA 111, 2385–2390.
[6] Bhaskar A., Kamm J. A. and Song Y. S. (2012). Approximate sampling formulae for general finite-alleles models of mutation. Adv. Appl. Prob. 44, 408–428.
[7] Cannings C. (1974). The latent roots of certain Markov chains arising in genetics: a new approach. I. Haploid models. Adv. Appl. Prob. 6, 260–290.
[8] Chatterjee S. (2014). A short survey of Stein's method. In Proceedings of the International Congress of Mathematicians, Seoul 2014, Vol. IV, Invited Lectures. Kyung Moon, Seoul, pp. 1–24.
[9] Chatterjee S. and Meckes E. (2008). Multivariate normal approximation using exchangeable pairs. ALEA Latin Amer. J. Prob. Math. Statist. 4, 257–283.
[10] Chatterjee S. and Shao Q.-M. (2011). Nonnormal approximation by Stein's method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Prob. 21, 464–483.
[11] Chatterjee S., Fulman J. and Röllin A. (2011). Exponential approximation by Stein's method and spectral graph theory. ALEA Latin Amer. J. Prob. Math. Statist. 8, 197–223.
[12] Chen L. H. Y., Goldstein L. and Shao Q.-M. (2011). Normal Approximation by Stein's Method. Springer, Heidelberg.
[13] Döbler C. (2012). A rate of convergence for the arcsine law by Stein's method. Preprint. Available at
[14] Döbler C. (2015). Stein's method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Prob. 20, 109.
[15] Ethier S. N. (1976). A class of degenerate diffusion processes occurring in population genetics. Commun. Pure Appl. Math. 29, 483–493.
[16] Ethier S. N. and Kurtz T. G. (1986). Markov Processes: Characterization and Convergence. John Wiley, New York.
[17] Ethier S. N. and Kurtz T. G. (1992). On the stationary distribution of the neutral diffusion model in population genetics. Ann. Appl. Prob. 2, 24–35.
[18] Ethier S. N. and Norman M. F. (1977). Error estimate for the diffusion approximation of the Wright–Fisher model. Proc. Nat. Acad. Sci. USA 74, 5096–5098.
[19] Fu Y.-X. (2006). Exact coalescent for the Wright–Fisher model. Theoret. Pop. Biol. 69, 385–394.
[20] Fulman J. and Ross N. (2013). Exponential approximation and Stein's method of exchangeable pairs. ALEA Latin Amer. J. Prob. Math. Statist. 10, 1–13.
[21] Goldstein L. and Reinert G. (2013). Stein's method for the beta distribution and the Pólya–Eggenberger urn. J. Appl. Prob. 50, 1187–1205.
[22] Gorham J., Duncan A. B., Vollmer S. J. and Mackey L. (2016). Measuring sample quality with diffusions. Preprint. Available at
[23] Götze F. (1991). On the rate of convergence in the multivariate CLT. Ann. Prob. 19, 724–739.
[24] Griffiths R. C. and Tavaré S. (1994). Simulating probability distributions in the coalescent. Theoret. Pop. Biol. 46, 131–159.
[25] Griffiths R. C. and Li W.-H. (1983). Simulating allele frequencies in a population and the genetic differentiation of populations under mutation pressure. Theoret. Pop. Biol. 23, 19–33.
[26] Kingman J. F. C. (1982). Exchangeability and the evolution of large populations. In Exchangeability in Probability and Statistics (Rome, 1981), North-Holland, Amsterdam, pp. 97–112.
[27] Kingman J. F. C. (1982). On the genealogy of large populations. In Essays in Statistical Science (J. Appl. Prob. Spec. Vol. 19A), Applied Probability Trust, Sheffield, pp. 27–43.
[28] Kingman J. F. C. (1982). The coalescent. Stoch. Process. Appl. 13, 235–248.
[29] Lessard S. (2007). An exact sampling formula for the Wright–Fisher model and a solution to a conjecture about the finite-island model. Genetics 177, 1249–1254.
[30] Lessard S. (2010). Recurrence equations for the probability distribution of sample configurations in exact population genetics models. J. Appl. Prob. 47, 732–751.
[31] Mahmoud H. M. (2009). Pólya Urn Models. CRC, Boca Raton, FL.
[32] Möhle M. (2000). Total variation distances and rates of convergence for ancestral coalescent processes in exchangeable population models. Adv. Appl. Prob. 32, 983–993.
[33] Möhle M. (2004). The time back to the most recent common ancestor in exchangeable population models. Adv. Appl. Prob. 36, 78–97.
[34] Möhle M. and Sagitov S. (2001). A classification of coalescent processes for haploid exchangeable population models. Ann. Prob. 29, 1547–1562.
[35] Möhle M. and Sagitov S. (2003). Coalescent patterns in diploid exchangeable population models. J. Math. Biol. 47, 337–352.
[36] Morvan J.-M. (2008). Generalized Curvatures (Geom. Comput. 2). Springer, Berlin.
[37] Mukhopadhyay S. N. (2012). Higher Order Derivatives (Chapman & Hall/CRC Monogr. Surveys Pure Appl. Math. 144). CRC, Boca Raton, FL.
[38] Peköz E. A., Röllin A. and Ross N. (2017). Joint degree distributions of preferential attachment random graphs. Adv. Appl. Prob. 49, 368–387.
[39] Reinert G. and Röllin A. (2009). Multivariate normal approximation with Stein's method of exchangeable pairs under a general linearity condition. Ann. Prob. 37, 2150–2173.
[40] Rinott Y. and Rotar V. (1997). On coupling constructions and rates in the CLT for dependent summands with applications to the antivoter model and weighted U-statistics. Ann. Appl. Prob. 7, 1080–1105.
[41] Röllin A. (2008). A note on the exchangeability condition in Stein's method. Statist. Prob. Lett. 78, 1800–1806.
[42] Ross N. (2011). Fundamentals of Stein's method. Prob. Surv. 8, 210–293.
[43] Russell A. M. (1973). Functions of bounded kth variation. Proc. London Math. Soc. (3) 26, 547–563.
[44] Shiga T. (1981). Diffusion processes in population genetics. J. Math. Kyoto Univ. 21, 133–151.
[45] Stein C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II, Probability Theory. University of California Press, Berkeley, pp. 583–602.
[46] Stein C. (1986). Approximate Computation of Expectations (Inst. Math. Statist. Lecture Notes Monogr. Ser. 7). Institute of Mathematical Statistics, Hayward, CA.
[47] Tavaré S. (1984). Line-of-descent and genealogical processes, and their applications in population genetics models. Theoret. Pop. Biol. 26, 119–164.
[48] Wright S. (1949). Adaptation and selection. In Genetics, Paleontology and Evolution, Princeton University Press, pp. 365–389.
Advances in Applied Probability
  • ISSN: 0001-8678
  • EISSN: 1475-6064
  • URL: /core/journals/advances-in-applied-probability