Robust finite mixture modeling of multivariate unrestricted skew-normal generalized hyperbolic distributions

Statistics and Computing

Abstract

In this paper, we introduce an unrestricted skew-normal generalized hyperbolic (SUNGH) distribution for use in finite mixture modeling or clustering problems. The SUNGH is a broad class of flexible distributions that includes various other well-known asymmetric and symmetric families such as the scale mixtures of skew-normal, the skew-normal generalized hyperbolic and its corresponding symmetric versions. The class provides a much-needed unified framework in which the choice of the best-fitting distribution can proceed quite naturally, either through parameter estimation or by placing constraints on specific parameters and assessing them through model choice criteria. The class has several desirable properties, including an analytically tractable density and ease of computation for simulation and estimation of parameters. We illustrate the flexibility of the proposed class of distributions in a mixture modeling context using a Bayesian framework and assess its performance using simulated and real data.

Acknowledgements

The authors would like to thank the coordinating editor and anonymous reviewers for their suggestions, corrections and encouragement, which helped us to improve earlier versions of the manuscript.

Author information

Corresponding author

Correspondence to Darren Wraith.

Appendix

A.1. Proof of Propositions 1 to 6

In this appendix, we prove Propositions 1 to 6.

Proof of Proposition 1

By considering (7),

  1. (a):
  2. (b):

\(\square \)

Proof of Proposition 2

The result follows from the stochastic representation (7) and the fact that \(\varvec{W}_{0} \) (and so \({\varvec{W}}\)) are uncorrelated.

In the case of \(\varvec{\Lambda }^{*}=\left( {{\begin{array}{cc} {\varvec{\Lambda }_{p\times q} }&{} {\mathbf{0}_{p\times m} } \\ \end{array} }} \right) \), relation (7) for \({\varvec{Y}}\sim \mathrm{SUNGH}_{p,q+m} \left( {{\varvec{\mu }} ,\varvec{\Sigma },\varvec{\Lambda }^{*},\varpi } \right) \) is equivalent to \({\varvec{Y}}={\varvec{\mu }} +\varvec{\Lambda }^{*}{\varvec{W}}+\kappa \left( U \right) ^{1/2}\varvec{\Sigma }^{1/2}{\varvec{W}}_1 ={\varvec{\mu }} +\varvec{\Lambda }{\varvec{W}}^{\left( 1 \right) }+\kappa \left( U \right) ^{1/2}\varvec{\Sigma }^{1/2}{\varvec{W}}_1 \), where \({\varvec{W}}^{\left( 1 \right) }\) consists of the first q components of \({\varvec{W}}\).

In the case of \(\varvec{\Lambda }^{*}=\left( {{\begin{array}{cc} {\mathbf{0}_{p\times m} }&{} {\varvec{\Lambda }_{p\times q} } \\ \end{array} }} \right) \), relation (7) for \({\varvec{Y}}\sim \mathrm{SUNGH}_{p,q+m} \left( {{\varvec{\mu }} ,\varvec{\Sigma },\varvec{\Lambda }^{*},{\varvec{\varpi }} } \right) \) is equivalent to \({\varvec{Y}}={\varvec{\mu }} +\varvec{\Lambda }^{*}{\varvec{W}}+\kappa \left( U \right) ^{1/2}\varvec{\Sigma }^{1/2}{\varvec{W}}_1 ={\varvec{\mu }} +\varvec{\Lambda }{\varvec{W}}^{\left( 2 \right) }+\kappa \left( U \right) ^{1/2}\varvec{\Sigma }^{1/2}{\varvec{W}}_1 \), where \({\varvec{W}}^{\left( 2 \right) }\) consists of the last q components of \({\varvec{W}}\). \(\square \)
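The zero-padding step in this proof is a plain matrix identity, so it can be checked numerically. The following is a minimal sketch, not part of the original appendix: the dimensions and the random draws are illustrative assumptions, and the vector `w` is only a stand-in for one realization of \({\varvec{W}}\) (the identity holds for any vector, whatever its distribution).

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, m = 3, 2, 4                       # illustrative dimensions (assumed)
Lam = rng.normal(size=(p, q))           # skewness matrix Lambda, p x q
w = np.abs(rng.normal(size=q + m))      # stand-in for one draw of W, length q + m

# Case 1: Lambda* = (Lambda, 0) keeps only the first q components of W.
Lam_star_first = np.hstack([Lam, np.zeros((p, m))])
first_ok = np.allclose(Lam_star_first @ w, Lam @ w[:q])

# Case 2: Lambda* = (0, Lambda) keeps only the last q components of W.
Lam_star_last = np.hstack([np.zeros((p, m)), Lam])
last_ok = np.allclose(Lam_star_last @ w, Lam @ w[-q:])
```

Both flags evaluate to true because the zero block removes the remaining m components of the vector from the product.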

Proof of Proposition 3

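By considering the stochastic representation (7), we have that \({\varvec{b}}+{\varvec{BY}}={\varvec{b}}+{\varvec{B}}{\varvec{\mu }} +{\varvec{B}}\varvec{\Lambda }{\varvec{W}}+\kappa \left( U \right) ^{1/2}\left( {{\varvec{B}}\varvec{\Sigma }{\varvec{B}}^{\top }} \right) ^{1/2}{\varvec{W}}_1 \), which is again of the form (7), with \({\varvec{\mu }}\), \(\varvec{\Lambda }\) and \(\varvec{\Sigma }\) replaced by \({\varvec{b}}+{\varvec{B}}{\varvec{\mu }}\), \({\varvec{B}}\varvec{\Lambda }\) and \({\varvec{B}}\varvec{\Sigma }{\varvec{B}}^{\top }\), respectively. \(\square \)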
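The affine closure used in this proof can be illustrated with a short numerical sketch. It rests on assumptions not stated in this appendix: \({\varvec{W}}_1\) is taken to be standard normal, `kappa_u` is a fixed positive stand-in for \(\kappa (U)\), and all dimensions and parameter values are invented for illustration. The sketch checks that the location and skewness terms transform exactly, and that \({\varvec{B}}\varvec{\Sigma }^{1/2}{\varvec{W}}_1\) has covariance \({\varvec{B}}\varvec{\Sigma }{\varvec{B}}^{\top }\), which is what allows the Gaussian term to be rewritten (in distribution) with \(\left( {\varvec{B}}\varvec{\Sigma }{\varvec{B}}^{\top }\right) ^{1/2}\).

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, r = 4, 2, 3                        # assumed dimensions; B is r x p
mu = rng.normal(size=p)
Lam = rng.normal(size=(p, q))
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)          # symmetric positive definite scale matrix
b = rng.normal(size=r)
B = rng.normal(size=(r, p))

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

# One draw of the building blocks of representation (7).
w = np.abs(rng.normal(size=q))           # stand-in for W
w1 = rng.normal(size=p)                  # W_1, assumed standard normal
kappa_u = 0.7                            # hypothetical value of kappa(U)

root = sym_sqrt(Sigma)
y = mu + Lam @ w + np.sqrt(kappa_u) * root @ w1

# b + B y has location b + B mu and skewness matrix B Lam.
lhs = b + B @ y
rhs = (b + B @ mu) + (B @ Lam) @ w + np.sqrt(kappa_u) * (B @ root) @ w1
affine_ok = np.allclose(lhs, rhs)

# Covariance of B Sigma^{1/2} W_1 equals B Sigma B^T.
G = B @ root
cov_ok = np.allclose(G @ G.T, B @ Sigma @ B.T)
```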

Proof of Proposition 4

The result follows from Proposition 3 with \({\varvec{b}}=\mathbf{0}\) and the matrix \({\varvec{B}}\) of the form \(\left( {{\begin{array}{cc} {{\varvec{I}}_{p_1 } }&{} {{\varvec{0}}_{p_1 \times p_2 } } \\ \end{array} }} \right) \) or \(\left( {{\begin{array}{cc} {{\varvec{0}}_{p_2 \times p_1 } }&{} {{\varvec{I}}_{p_2 } } \\ \end{array} }} \right) \), respectively. \(\square \)

Proof of Proposition 5

Since \({\varvec{Y}}=\left( {{\varvec{Y}}_1^\top ,{\varvec{Y}}_2^\top } \right) ^{\top }\), from part (b) of Proposition 1, we have \(\hbox {Var}\left[ {\varvec{Y}} \right] =\left( {\hbox {Cov}\left( {{\varvec{Y}}_i ,{\varvec{Y}}_j } \right) } \right) _{i,j=1,2} =\left( {\varvec{\Sigma }_{ij} +\varvec{\Lambda }_i \left[ {\left( {k_2 -k_1^2 } \right) \frac{2}{\pi }{} \mathbf{1}_q \mathbf{1}_q^\top -\frac{2}{\pi }k_2 {\varvec{I}}_q } \right] \varvec{\Lambda }_j^\top } \right) \). Thus, if \(\varvec{\Sigma }_{12} =\mathbf{0}\), then \(\hbox {Cov}\left( {{\varvec{Y}}_1 ,{\varvec{Y}}_2 } \right) =\varvec{\Lambda }_1 \big [ \left( {k_2 -k_1^2 } \right) \frac{2}{\pi }{} \mathbf{1}_q \mathbf{1}_q^\top -\frac{2}{\pi }k_2 {\varvec{I}}_q \big ]\varvec{\Lambda }_2^\top \), so either of the conditions \(\varvec{\Lambda }_1 =\mathbf{0}\) or \(\varvec{\Lambda }_2 =\mathbf{0}\) leads to \(\hbox {Cov}\left( {{\varvec{Y}}_1 ,{\varvec{Y}}_2 } \right) =\mathbf{0}\). \(\square \)
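As a small illustration of this argument, the cross-covariance expression can be evaluated directly. The sketch below simply codes the formula as it is written in the proof; the function name `cross_cov`, the dimensions, and the numerical values chosen for \(k_1\) and \(k_2\) are hypothetical and serve only to show the behaviour when one skewness block is zero.

```python
import numpy as np

def cross_cov(Sigma12, Lam1, Lam2, k1, k2):
    """Cov(Y1, Y2) evaluated from the expression in the proof of Proposition 5."""
    q = Lam1.shape[1]
    ones = np.ones((q, 1))
    core = (k2 - k1**2) * (2 / np.pi) * (ones @ ones.T) - (2 / np.pi) * k2 * np.eye(q)
    return Sigma12 + Lam1 @ core @ Lam2.T

rng = np.random.default_rng(2)
p1, p2, q = 2, 3, 2                      # assumed block dimensions
k1, k2 = 0.9, 1.1                        # hypothetical values of k_1 and k_2
Lam1 = rng.normal(size=(p1, q))
Lam2 = rng.normal(size=(p2, q))
zero12 = np.zeros((p1, p2))              # Sigma_12 = 0

both_skewed = cross_cov(zero12, Lam1, Lam2, k1, k2)
first_unskewed = cross_cov(zero12, np.zeros((p1, q)), Lam2, k1, k2)
second_unskewed = cross_cov(zero12, Lam1, np.zeros((p2, q)), k1, k2)
```

With \(\varvec{\Sigma }_{12} =\mathbf{0}\), setting either skewness block to zero returns an all-zero matrix, whereas two generic non-zero blocks leave a non-zero cross-covariance.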

Proof of Proposition 6

The first part follows by applying Proposition 2 to Proposition 4. For the second result, note from the proof of Proposition 5 that

$$\begin{aligned}&\hbox {Cov}\left( {{\varvec{Y}}_1 ,{\varvec{Y}}_2 } \right) =\left( {{\begin{array}{cc} {\varvec{\Lambda }_{11} }&{} {{\varvec{0}}_{p_1 \times q_2 } } \\ \end{array} }} \right) _{p_1 \times q}\\&\quad \left[ {\left( {k_2 -k_1^2 } \right) \frac{2}{\pi }{} \mathbf{1}_q \mathbf{1}_q^\top -\frac{2}{\pi }k_2 {\varvec{I}}_q } \right] _{q\times q} \left( {{\begin{array}{cc} {\mathbf{0}_{p_2 \times q_1 } }&{} {\varvec{\Lambda }_{22} } \\ \end{array} }} \right) _{q\times p_2 }^\top . \end{aligned}$$

Thus, using the partitions \({\varvec{I}}_q =\hbox {diag}\left( {{\varvec{I}}_{q_1 } ,{\varvec{I}}_{q_2 } } \right) \) and \(\mathbf{1}_q =\left( {\mathbf{1}_{q_1 }^\top ,\mathbf{1}_{q_2 }^\top } \right) ^{\top }\), the result follows. \(\square \)
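For readers who want the block multiplication spelled out, the following worked step is offered as a sketch; it uses only the displayed expression for \(\hbox {Cov}\left( {{\varvec{Y}}_1 ,{\varvec{Y}}_2 } \right) \) and the two partitions above, and is not taken from the original text. With \(q=q_1 +q_2 \),

$$\begin{aligned} \left( {{\begin{array}{cc} {\varvec{\Lambda }_{11} }&{} {\mathbf{0}} \\ \end{array} }} \right) \mathbf{1}_q&=\varvec{\Lambda }_{11} \mathbf{1}_{q_1 } ,\qquad \left( {{\begin{array}{cc} {\mathbf{0}}&{} {\varvec{\Lambda }_{22} } \\ \end{array} }} \right) \mathbf{1}_q =\varvec{\Lambda }_{22} \mathbf{1}_{q_2 } ,\\ \left( {{\begin{array}{cc} {\varvec{\Lambda }_{11} }&{} {\mathbf{0}} \\ \end{array} }} \right) {\varvec{I}}_q \left( {{\begin{array}{cc} {\mathbf{0}}&{} {\varvec{\Lambda }_{22} } \\ \end{array} }} \right) ^{\top }&=\varvec{\Lambda }_{11} \mathbf{0}^{\top }+\mathbf{0}\varvec{\Lambda }_{22}^\top =\mathbf{0}, \end{aligned}$$

so that the displayed covariance reduces to

$$\begin{aligned} \hbox {Cov}\left( {{\varvec{Y}}_1 ,{\varvec{Y}}_2 } \right) =\left( {k_2 -k_1^2 } \right) \frac{2}{\pi }\,\varvec{\Lambda }_{11} \mathbf{1}_{q_1 } \mathbf{1}_{q_2 }^\top \varvec{\Lambda }_{22}^\top . \end{aligned}$$

The identity-matrix term drops out because the non-zero blocks of the two skewness matrices occupy disjoint sets of columns.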

A.2. Matrix variate priors for the skewness matrix

Consider matrix variate priors of the form \(\varvec{\Lambda }_k \sim MN_{p,q} \left( {{\varvec{N}}_k ,{\varvec{S}}_k ,{\varvec{F}}_k } \right) ,k=1,\ldots ,K\), where MN denotes the matrix normal distribution. These lead to the following posteriors in place of (19):

\(\left. {\hbox {vec}(\varvec{\Lambda }_k )} \right| \varvec{\Theta }_{\left( {-\varvec{\Lambda }_k } \right) } ,{\varvec{y}},{\varvec{u}},{\varvec{w}},z_i =k\sim N_{pq} \left( {{\varvec{\mu }} ,\varvec{\Sigma }} \right) ;k=1,\ldots ,K\), where

$$\begin{aligned} {\varvec{\mu }}&=\varvec{\Sigma }\left[ {\left( {\mathbf{S}_k \otimes {\varvec{F}}_k } \right) ^{-1}\hbox {vec}\left( {{\varvec{N}}_k } \right) +\mathop \sum \limits _{B_k } \kappa \left( {u_{ik} } \right) ^{-1}\left( {{\varvec{M}}_{ik}^\top \otimes \varvec{\Sigma }_k^{-1} } \right) } \right] ,\\&\varvec{\Sigma }=\left[ {\left( {\mathbf{S}_k \otimes {\varvec{F}}_k } \right) ^{-1}+\mathop \sum \limits _{B_k } \kappa \left( {u_{ik} } \right) ^{-1}\left( {\varvec{\Sigma }_k^{-1}\otimes {\varvec{L}}_{ik} } \right) } \right] ^{-1}, \end{aligned}$$
(19a)

where \({\varvec{L}}_{ik} ={\varvec{w}}_{ik} {\varvec{w}}_{ik}^\top \) and \({\varvec{M}}_{ik} =\left( {{\varvec{y}}_i -{\varvec{\mu }} _k } \right) {\varvec{w}}_{ik}^\top \). Here \(\otimes \) denotes the Kronecker product and \(\hbox {vec}\) denotes the vectorization of a matrix (a linear transformation that converts the matrix into a column vector).
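To make the structure of such an update concrete, here is a minimal sketch of a single Gaussian draw for the skewness matrix of one component. It is not the authors' implementation and rests on several assumptions: the conditional model \({\varvec{y}}_i \mid {\varvec{w}}_{ik},u_{ik},z_i =k\sim N_p \left( {\varvec{\mu }}_k +\varvec{\Lambda }_k {\varvec{w}}_{ik} ,\kappa (u_{ik})\varvec{\Sigma }_k \right) \), which would follow from representation (7) if \({\varvec{W}}_1\) is standard normal; a column-stacking vec, under which the prior covariance of the vectorized matrix is `kron(F_k, S_k)` and the likelihood terms are derived afresh (so the ordering of the Kronecker factors may differ from the notation in (19a)); and a hypothetical function name and simulated data.

```python
import numpy as np

def draw_skewness_matrix(Y, W, kappa_u, mu_k, Sigma_k, N_k, S_k, F_k, rng):
    """One Gibbs draw of Lambda_k (p x q) from its Gaussian full conditional.

    Y: (n_k, p) observations allocated to component k
    W: (n_k, q) latent vectors w_ik; kappa_u: (n_k,) values of kappa(u_ik)
    Returns the draw and the conditional mean, both as p x q matrices.
    """
    p, q = N_k.shape
    Sigma_inv = np.linalg.inv(Sigma_k)
    prior_prec = np.linalg.inv(np.kron(F_k, S_k))    # column-stacking vec (assumed)
    prec = prior_prec.copy()
    lin = prior_prec @ N_k.reshape(-1, order="F")
    for y_i, w_i, c_i in zip(Y, W, kappa_u):
        L_i = np.outer(w_i, w_i)                     # L_ik = w_ik w_ik^T
        M_i = np.outer(y_i - mu_k, w_i)              # M_ik = (y_i - mu_k) w_ik^T
        prec += np.kron(L_i, Sigma_inv) / c_i
        lin += (Sigma_inv @ M_i).reshape(-1, order="F") / c_i
    cov = np.linalg.inv(prec)
    cov = (cov + cov.T) / 2                          # remove round-off asymmetry
    mean = cov @ lin
    draw = rng.multivariate_normal(mean, cov)
    return draw.reshape((p, q), order="F"), mean.reshape((p, q), order="F")

# Simulated check with invented values: low noise and a diffuse prior.
rng = np.random.default_rng(3)
n, p, q = 500, 2, 2
Lam_true = np.array([[1.0, -0.5], [0.3, 2.0]])
mu_k = np.array([0.5, -1.0])
Sigma_k = 0.01 * np.eye(p)
W = np.abs(rng.normal(size=(n, q)))
kappa_u = np.ones(n)
Y = mu_k + W @ Lam_true.T + rng.multivariate_normal(np.zeros(p), Sigma_k, size=n)

Lam_draw, Lam_mean = draw_skewness_matrix(
    Y, W, kappa_u, mu_k, Sigma_k,
    N_k=np.zeros((p, q)), S_k=100.0 * np.eye(p), F_k=np.eye(q), rng=rng,
)
```

In this simulated setting the conditional mean `Lam_mean` should lie close to `Lam_true`, since the data dominate the diffuse prior. Updating all \(pq\) entries jointly in one multivariate normal draw is the design choice that distinguishes this kind of update from element-wise alternatives.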

Using these forms for the Gibbs updates may improve mixing and convergence to a stationary distribution. However, they involve matrix variate distributions with which users may not be familiar; hence, a simpler (computational) update is provided in the main text.

Cite this article

Maleki, M., Wraith, D. & Arellano-Valle, R.B. Robust finite mixture modeling of multivariate unrestricted skew-normal generalized hyperbolic distributions. Stat Comput 29, 415–428 (2019). https://doi.org/10.1007/s11222-018-9815-5
