Distributions of Random Partitions and Their Applications

Methodology and Computing in Applied Probability

Abstract

Assume that a random sample of size \(m\) is selected from a population containing a countable number of classes (subpopulations) of elements (individuals). A partition of the set of sample elements into (unordered) subsets, with each subset containing the elements that belong to the same class, induces a random partition of the sample size \(m\), with part sizes \(\{Z_1, Z_2, \ldots, Z_N\}\) being positive integer-valued random variables. Alternatively, if \(N_j\) is the number of different classes that are represented in the sample by \(j\) elements, for \(j = 1, 2, \ldots, m\), then \((N_1, N_2, \ldots, N_m)\) represents the same random partition. The joint and the marginal distributions of \((N_1, N_2, \ldots, N_m)\), as well as the distribution of \(N=\sum_{j=1}^{m} N_j\), are of particular interest in statistical inference. From the inference point of view, it is desirable that all the information about the population is contained in \((N_1, N_2, \ldots, N_m)\). This requires that no physical, genetic or other kind of significance is attached to the actual labels of the population classes. In the present paper, combinatorial, probabilistic and compound sampling models are reviewed. Sampling models with population classes of random weights (proportions), in particular the Ewens and Pitman sampling models, to which many publications are devoted, are also presented extensively.
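
The two representations described in the abstract can be illustrated with a minimal Python sketch. The function names (`part_sizes`, `multiplicities`, `ewens_probability`) and the toy sample are hypothetical and chosen only for illustration. The Ewens probability is written in its standard textbook form with parameter \(\theta\); it is an assumption here and is not quoted from the article.

```python
from collections import Counter
from math import factorial


def part_sizes(sample):
    """Part sizes Z_1, ..., Z_N of the observed classes, largest first."""
    return sorted(Counter(sample).values(), reverse=True)


def multiplicities(sample):
    """Return [N_1, ..., N_m]; N_j counts classes seen exactly j times."""
    m = len(sample)
    size_counts = Counter(Counter(sample).values())
    return [size_counts.get(j, 0) for j in range(1, m + 1)]


def ewens_probability(counts, theta):
    """Ewens sampling formula (standard form, assumed here):
    m! / (theta (theta+1) ... (theta+m-1)) * prod_j theta^n_j / (j^n_j n_j!).
    """
    m = sum(j * n for j, n in enumerate(counts, start=1))
    rising = 1.0
    for i in range(m):
        rising *= theta + i  # rising factorial theta^(m)
    prob = factorial(m) / rising
    for j, n in enumerate(counts, start=1):
        prob *= theta ** n / (j ** n * factorial(n))
    return prob


# Toy sample of m = 6 elements; labels only separate the classes.
sample = ["a", "b", "a", "c", "a", "b"]
z = part_sizes(sample)       # part sizes of the induced partition of 6
n = multiplicities(sample)   # the same partition as (N_1, ..., N_6)

# Both representations describe one partition of m.
assert sum(z) == len(sample)
assert sum(j * nj for j, nj in enumerate(n, start=1)) == len(sample)
assert sum(n) == len(z)      # N = N_1 + ... + N_m
```

Relabelling the classes (for instance, swapping "a" and "c") leaves both outputs unchanged, which matches the requirement that no significance is attached to the class labels.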



Corresponding author

Correspondence to Charalambos A. Charalambides.

About this article

Cite this article

Charalambides, C.A. Distributions of Random Partitions and Their Applications. Methodol Comput Appl Probab 9, 163–193 (2007). https://doi.org/10.1007/s11009-007-9018-6

  • DOI: https://doi.org/10.1007/s11009-007-9018-6
