Dirichlet approximation of equilibrium distributions in Cannings models with mutation

Published online by Cambridge University Press: 08 September 2017

Han L. Gan*
Affiliation:
Washington University in St. Louis
Adrian Röllin**
Affiliation:
National University of Singapore
Nathan Ross***
Affiliation:
University of Melbourne
* Current address: Mathematics Department, Northwestern University, 2033 Sheridan Road, Evanston, IL 60208, USA. Email address: ganhl@math.northwestern.edu
** Postal address: Department of Statistics and Applied Probability, National University of Singapore, 6 Science Drive 2, 117546, Singapore. Email address: adrian.roellin@nus.edu.sg
*** Postal address: School of Mathematics and Statistics, University of Melbourne, Peter Hall Building, Melbourne, VIC 3010, Australia. Email address: nathan.ross@unimelb.edu.au

Abstract

Consider a haploid population of fixed finite size with a finite number of allele types and having Cannings exchangeable genealogy with neutral mutation. The stationary distribution of the Markov chain of allele counts in each generation is an important quantity in population genetics but has no tractable description in general. We provide upper bounds on the distributional distance between the Dirichlet distribution and this finite-population stationary distribution for the Wright–Fisher genealogy with general mutation structure and the Cannings exchangeable genealogy with parent independent mutation structure. In the first case, the bound is small if the population is large and the mutations do not depend too much on parent type; 'too much' is naturally quantified by our bound. In the second case, the bound is small if the population is large and the chance of three-mergers in the Cannings genealogy is small relative to the chance of two-mergers; this is the same condition to ensure convergence of the genealogy to Kingman's coalescent. These results follow from a new development of Stein's method for the Dirichlet distribution based on Barbour's generator approach and a probabilistic description of the semigroup of the Wright–Fisher diffusion due to Griffiths and Li (1983) and Tavaré (1984).
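
As a small numerical illustration of the kind of comparison the abstract describes, the sketch below computes the stationary distribution of a two-allele Wright–Fisher chain with parent independent mutation and compares its mean and variance with those of a Beta (two-type Dirichlet) distribution. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the choice N = 50 and theta = (1, 2), the mutation scaling u_j = theta_j / (2N) (the usual diffusion scaling for a haploid population), and the use of plain power iteration to find the stationary vector. It compares moments only and does not compute the distributional distance bounded in the paper.

```python
from math import comb


def wright_fisher_stationary(N, theta1, theta2, iters=1000):
    """Stationary law of the type-1 count in a two-allele Wright-Fisher chain.

    Assumed model (illustrative): each offspring copies a uniformly chosen
    parent, then mutates to type j with probability u_j = theta_j / (2N),
    regardless of the parent's type.
    """
    u1, u2 = theta1 / (2 * N), theta2 / (2 * N)
    # q[i]: chance that one offspring is type 1 when i of the N parents are type 1.
    q = [(i / N) * (1 - u1 - u2) + u1 for i in range(N + 1)]
    # P[i][j]: binomial transition probability from i to j type-1 individuals.
    P = [[comb(N, j) * q[i] ** j * (1 - q[i]) ** (N - j) for j in range(N + 1)]
         for i in range(N + 1)]
    # Power iteration from the uniform distribution: pi <- pi P.
    pi = [1.0 / (N + 1)] * (N + 1)
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(N + 1)) for j in range(N + 1)]
    return pi


def moments(pi, N):
    """Mean and variance of the type-1 frequency under the distribution pi."""
    mean = sum(p * i / N for i, p in enumerate(pi))
    var = sum(p * (i / N - mean) ** 2 for i, p in enumerate(pi))
    return mean, var


N, theta1, theta2 = 50, 1.0, 2.0
pi = wright_fisher_stationary(N, theta1, theta2)
mean, var = moments(pi, N)

# Moments of the Beta(theta1, theta2) distribution for comparison.
total = theta1 + theta2
beta_mean = theta1 / total
beta_var = theta1 * theta2 / (total ** 2 * (total + 1))

print("chain mean, Beta mean:", mean, beta_mean)
print("chain variance, Beta variance:", var, beta_var)
```

Under these assumptions the two means agree exactly in theory (the mean recursion is linear), while the variances differ by a term that shrinks as N grows, which is the qualitative behaviour the abstract's first bound describes for large populations.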

Type
Research Article
Copyright
Copyright © Applied Probability Trust 2017 

References

[1] Appell, J., Banaś, J. and Merentes, N. (2014). Bounded Variation and Around (De Gruyter Ser. Nonlinear Anal. Appl. 17). De Gruyter, Berlin.
[2] Barbour, A. D. (1990). Stein's method for diffusion approximations. Prob. Theory Relat. Fields 84, 297–322.
[3] Barbour, A. D., Ethier, S. N. and Griffiths, R. C. (2000). A transition function expansion for a diffusion model with selection. Ann. Appl. Prob. 10, 123–162.
[4] Bentkus, V. (2003). On the dependence of the Berry–Esseen bound on dimension. J. Statist. Planning Infer. 113, 385–402.
[5] Bhaskar, A., Clark, A. G. and Song, Y. S. (2014). Distortion of genealogical properties when the sample is very large. Proc. Nat. Acad. Sci. USA 111, 2385–2390.
[6] Bhaskar, A., Kamm, J. A. and Song, Y. S. (2012). Approximate sampling formulae for general finite-alleles models of mutation. Adv. Appl. Prob. 44, 408–428.
[7] Cannings, C. (1974). The latent roots of certain Markov chains arising in genetics: a new approach. I. Haploid models. Adv. Appl. Prob. 6, 260–290.
[8] Chatterjee, S. (2014). A short survey of Stein's method. In Proceedings of the International Congress of Mathematicians, Seoul 2014, Vol. IV, Invited Lectures. Kyung Moon, Seoul, pp. 1–24.
[9] Chatterjee, S. and Meckes, E. (2008). Multivariate normal approximation using exchangeable pairs. ALEA Latin Amer. J. Prob. Math. Statist. 4, 257–283.
[10] Chatterjee, S. and Shao, Q.-M. (2011). Nonnormal approximation by Stein's method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Prob. 21, 464–483.
[11] Chatterjee, S., Fulman, J. and Röllin, A. (2011). Exponential approximation by Stein's method and spectral graph theory. ALEA Latin Amer. J. Prob. Math. Statist. 8, 197–223.
[12] Chen, L. H. Y., Goldstein, L. and Shao, Q.-M. (2011). Normal Approximation by Stein's Method. Springer, Heidelberg.
[13] Döbler, C. (2012). A rate of convergence for the arcsine law by Stein's method. Preprint. Available at https://arxiv.org/abs/1207.2401.
[14] Döbler, C. (2015). Stein's method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Prob. 20, 109.
[15] Ethier, S. N. (1976). A class of degenerate diffusion processes occurring in population genetics. Commun. Pure Appl. Math. 29, 483–493.
[16] Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. John Wiley, New York.
[17] Ethier, S. N. and Kurtz, T. G. (1992). On the stationary distribution of the neutral diffusion model in population genetics. Ann. Appl. Prob. 2, 24–35.
[18] Ethier, S. N. and Norman, M. F. (1977). Error estimate for the diffusion approximation of the Wright–Fisher model. Proc. Nat. Acad. Sci. USA 74, 5096–5098.
[19] Fu, Y.-X. (2006). Exact coalescent for the Wright–Fisher model. Theoret. Pop. Biol. 69, 385–394.
[20] Fulman, J. and Ross, N. (2013). Exponential approximation and Stein's method of exchangeable pairs. ALEA Latin Amer. J. Prob. Math. Statist. 10, 1–13.
[21] Goldstein, L. and Reinert, G. (2013). Stein's method for the beta distribution and the Pólya–Eggenberger urn. J. Appl. Prob. 50, 1187–1205.
[22] Gorham, J., Duncan, A. B., Vollmer, S. J. and Mackey, L. (2016). Measuring sample quality with diffusions. Preprint. Available at https://arxiv.org/abs/1611.06972.
[23] Götze, F. (1991). On the rate of convergence in the multivariate CLT. Ann. Prob. 19, 724–739.
[24] Griffiths, R. C. and Tavaré, S. (1994). Simulating probability distributions in the coalescent. Theoret. Pop. Biol. 46, 131–159.
[25] Griffiths, R. C. and Li, W.-H. (1983). Simulating allele frequencies in a population and the genetic differentiation of populations under mutation pressure. Theoret. Pop. Biol. 23, 19–33.
[26] Kingman, J. F. C. (1982). Exchangeability and the evolution of large populations. In Exchangeability in Probability and Statistics (Rome, 1981), North-Holland, Amsterdam, pp. 97–112.
[27] Kingman, J. F. C. (1982). On the genealogy of large populations. In Essays in Statistical Science (J. Appl. Prob. Spec. Vol. 19A), Applied Probability Trust, Sheffield, pp. 27–43.
[28] Kingman, J. F. C. (1982). The coalescent. Stoch. Process. Appl. 13, 235–248.
[29] Lessard, S. (2007). An exact sampling formula for the Wright–Fisher model and a solution to a conjecture about the finite-island model. Genetics 177, 1249–1254.
[30] Lessard, S. (2010). Recurrence equations for the probability distribution of sample configurations in exact population genetics models. J. Appl. Prob. 47, 732–751.
[31] Mahmoud, H. M. (2009). Pólya Urn Models. CRC, Boca Raton, FL.
[32] Möhle, M. (2000). Total variation distances and rates of convergence for ancestral coalescent processes in exchangeable population models. Adv. Appl. Prob. 32, 983–993.
[33] Möhle, M. (2004). The time back to the most recent common ancestor in exchangeable population models. Adv. Appl. Prob. 36, 78–97.
[34] Möhle, M. and Sagitov, S. (2001). A classification of coalescent processes for haploid exchangeable population models. Ann. Prob. 29, 1547–1562.
[35] Möhle, M. and Sagitov, S. (2003). Coalescent patterns in diploid exchangeable population models. J. Math. Biol. 47, 337–352.
[36] Morvan, J.-M. (2008). Generalized Curvatures (Geom. Comput. 2). Springer, Berlin.
[37] Mukhopadhyay, S. N. (2012). Higher Order Derivatives (Chapman & Hall/CRC Monogr. Surveys Pure Appl. Math. 144). CRC, Boca Raton, FL.
[38] Peköz, E. A., Röllin, A. and Ross, N. (2017). Joint degree distributions of preferential attachment random graphs. Adv. Appl. Prob. 49, 368–387.
[39] Reinert, G. and Röllin, A. (2009). Multivariate normal approximation with Stein's method of exchangeable pairs under a general linearity condition. Ann. Prob. 37, 2150–2173.
[40] Rinott, Y. and Rotar, V. (1997). On coupling constructions and rates in the CLT for dependent summands with applications to the antivoter model and weighted U-statistics. Ann. Appl. Prob. 7, 1080–1105.
[41] Röllin, A. (2008). A note on the exchangeability condition in Stein's method. Statist. Prob. Lett. 78, 1800–1806.
[42] Ross, N. (2011). Fundamentals of Stein's method. Prob. Surv. 8, 210–293.
[43] Russell, A. M. (1973). Functions of bounded kth variation. Proc. London Math. Soc. (3) 26, 547–563.
[44] Shiga, T. (1981). Diffusion processes in population genetics. J. Math. Kyoto Univ. 21, 133–151.
[45] Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II, Probability Theory. University of California Press, Berkeley, pp. 583–602.
[46] Stein, C. (1986). Approximate Computation of Expectations (Inst. Math. Statist. Lecture Notes Monogr. Ser. 7). Institute of Mathematical Statistics, Hayward, CA.
[47] Tavaré, S. (1984). Line-of-descent and genealogical processes, and their applications in population genetics models. Theoret. Pop. Biol. 26, 119–164.
[48] Wright, S. (1949). Adaptation and selection. In Genetics, Paleontology and Evolution, Princeton University Press, pp. 365–389.