Bayesian Robust Regression with the Horseshoe+ Estimator

Conference paper. In: AI 2016: Advances in Artificial Intelligence (AI 2016).

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9992).

Abstract

The horseshoe\(+\) estimator for Gaussian linear regression models is a novel extension of the horseshoe estimator that enjoys many favourable theoretical properties. We develop the first efficient Gibbs sampling algorithm for the horseshoe\(+\) estimator for linear and logistic regression models. Importantly, our sampling algorithm incorporates robust data models that naturally handle non-Gaussian data and are less sensitive to outliers. The resulting software implementation provides a powerful, flexible and robust tool for building prediction and classification models from potentially high-dimensional data and represents the state-of-the-art in Bayesian machine learning techniques.
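
For readers who want a concrete picture of the sampling scheme the abstract refers to, the sketch below implements the inverse-gamma auxiliary-variable Gibbs sampler of Makalic and Schmidt [4] for the plain horseshoe; the horseshoe+ extends this with a second layer of local scale parameters. Everything here (the function name, the Jeffreys prior on \(\sigma^2\), the absence of an intercept) is our illustrative choice, not the authors' implementation:

```python
import numpy as np

def horseshoe_gibbs(X, y, n_iter=1000, seed=0):
    """Sketch of a Gibbs sampler for horseshoe linear regression, following
    the inverse-gamma auxiliary-variable scheme of Makalic & Schmidt [4].
    Plain horseshoe only (horseshoe+ adds a second layer of local scales),
    Jeffreys prior on sigma^2, no intercept or standardisation."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, sigma2, tau2, xi = np.zeros(p), 1.0, 1.0, 1.0
    lam2, nu = np.ones(p), np.ones(p)
    XtX, Xty = X.T @ X, X.T @ y
    samples = np.empty((n_iter, p))

    def inv_gamma(shape, scale):
        # If G ~ Gamma(shape, rate) then 1/G ~ InvGamma(shape, scale=rate).
        return 1.0 / rng.gamma(shape, 1.0 / scale)

    for t in range(n_iter):
        # beta | rest ~ N(A^-1 X'y, sigma2 A^-1), A = X'X + diag(1/(tau2 lam2))
        A = XtX + np.diag(1.0 / (tau2 * lam2))
        L = np.linalg.cholesky(A)
        beta = (np.linalg.solve(A, Xty)
                + np.sqrt(sigma2) * np.linalg.solve(L.T, rng.standard_normal(p)))
        # sigma2 | rest, under the Jeffreys prior p(sigma2) ~ 1/sigma2
        resid = y - X @ beta
        sigma2 = inv_gamma(0.5 * (n + p),
                           0.5 * (resid @ resid + beta @ (beta / (tau2 * lam2))))
        # local scales lam2_j and their inverse-gamma auxiliaries nu_j
        lam2 = inv_gamma(1.0, 1.0 / nu + beta**2 / (2.0 * tau2 * sigma2))
        nu = inv_gamma(1.0, 1.0 + 1.0 / lam2)
        # global scale tau2 and its auxiliary xi
        tau2 = inv_gamma(0.5 * (p + 1),
                         1.0 / xi + np.sum(beta**2 / lam2) / (2.0 * sigma2))
        xi = inv_gamma(1.0, 1.0 + 1.0 / tau2)
        samples[t] = beta
    return samples
```

All conditionals above are conjugate inverse-gamma or Gaussian draws, which is exactly why the auxiliary-variable representation makes the half-Cauchy hierarchy convenient to sample.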

Notes

  1. www.emakalic.org/blog and www.dschmidt.org.

References

  1. Andrews, D.F., Mallows, C.L.: Scale mixtures of normal distributions. J. R. Stat. Soc. (Ser. B) 36(1), 99–102 (1974)

  2. Bhadra, A., Datta, J., Polson, N.G., Willard, B.: The horseshoe+ estimator of ultra-sparse signals (2015). arXiv:1502.00560

  3. Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984)

  4. Makalic, E., Schmidt, D.F.: A simple sampler for the horseshoe estimator. IEEE Signal Process. Lett. 23(1), 179–182 (2016)

  5. Carvalho, C.M., Polson, N.G., Scott, J.G.: The horseshoe estimator for sparse signals. Biometrika 97(2), 465–480 (2010)

  6. Polson, N.G., Scott, J.G.: Shrink globally, act locally: sparse Bayesian regularization and prediction. In: Bayesian Statistics, vol. 9 (2010)

  7. Wand, M.P., Ormerod, J.T., Padoan, S.A., Frühwirth, R.: Mean field variational Bayes for elaborate distributions. Bayesian Anal. 6(4), 847–900 (2011)

  8. Lindley, D.V., Smith, A.F.M.: Bayes estimates for the linear model. J. R. Stat. Soc. (Ser. B) 34(1), 1–41 (1972)

  9. Rue, H.: Fast sampling of Gaussian Markov random fields. J. R. Stat. Soc. (Ser. B) 63(2), 325–338 (2001)

  10. Cong, Y., Chen, B., Zhou, M.: Fast simulation of hyperplane-truncated multivariate normal distributions (2016)

  11. Bhattacharya, A., Pati, D., Pillai, N.S., Dunson, D.B.: Dirichlet-Laplace priors for optimal shrinkage. J. Am. Stat. Assoc. 110, 1479–1490 (2015)

  12. Polson, N.G., Scott, J.G., Windle, J.: Bayesian inference for logistic models using Pólya-Gamma latent variables. J. Am. Stat. Assoc. 108(504), 1339–1349 (2013)

  13. Windle, J., Polson, N.G., Scott, J.G.: Sampling Pólya-Gamma random variates: alternate and approximate techniques (2014)

  14. Ekholm, A., Palmgren, J.: Correction for misclassification using doubly sampled data. J. Official Stat. 3(4), 419–429 (1987)

  15. Copas, J.B.: Binary regression models for contaminated data. J. R. Stat. Soc. Ser. B (Methodol.) 50(2), 225–265 (1988)

  16. Carroll, R.J., Pederson, S.: On robustness in the logistic regression model. J. R. Stat. Soc. Ser. B (Methodol.) 55(3), 693–706 (1993)

  17. Lichman, M.: UCI machine learning repository (2013)

Author information

Correspondence to Enes Makalic.

A Appendix

A.1 Inverse Gamma Distribution

The inverse gamma probability density function is given by

$$\begin{aligned} p(x | \alpha , \beta ) = \frac{\beta ^\alpha }{\varGamma (\alpha )} x^{-\alpha - 1} \exp \left( - \frac{\beta }{x} \right) , \quad (x > 0), \end{aligned}$$

with shape parameter \((\alpha >0)\) and scale parameter \((\beta > 0)\). The first two moments are

$$\begin{aligned} \mathrm{E} \left( x \right) = \frac{\beta }{\alpha - 1}, \quad \mathrm{Var} \left( x \right) = \frac{\beta ^2}{(\alpha -1)^2 (\alpha - 2)}, \end{aligned}$$

where the mean and variance only exist for \((\alpha >1)\) and \((\alpha >2)\) respectively.
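
As a quick numerical check of these formulas (our own illustration; scipy.stats.invgamma uses the same shape and scale parameterization as above):

```python
from scipy.stats import invgamma

alpha, beta = 3.0, 2.0  # shape > 2 so that both moments exist
dist = invgamma(alpha, scale=beta)
print(dist.mean(), beta / (alpha - 1))                        # both 1.0
print(dist.var(), beta**2 / ((alpha - 1)**2 * (alpha - 2)))   # both 1.0
```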

A.2 Inverse Gaussian Distribution

The inverse Gaussian probability density function is given by

$$\begin{aligned} p(x | \mu , \lambda ) = \left( \frac{\lambda }{2 \pi x^3} \right) ^{\frac{1}{2}} \exp \left( -\frac{\lambda (x - \mu )^2}{2 \mu ^2 x}\right) \!\!, \end{aligned}$$

for \((x > 0)\), where \((\mu > 0)\) is the mean and \((\lambda > 0)\) is the shape parameter. The first two moments are

$$\begin{aligned} \mathrm{E} \left( x \right) = \mu , \quad \mathrm{Var} \left( x \right) = \frac{\mu ^3}{\lambda }. \end{aligned}$$
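
NumPy exposes the inverse Gaussian as the Wald distribution, with the same mean and shape parameterization, so the moments can be checked by simulation (illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, lam = 2.0, 5.0
x = rng.wald(mu, lam, size=1_000_000)   # inverse Gaussian draws
print(x.mean(), mu)                     # ~2.0
print(x.var(), mu**3 / lam)             # ~1.6
```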

A.3 Student-t Distribution

The Student-t distribution probability density function is given by

$$\begin{aligned} p(x | \mu , \sigma ^2, \nu ) = \frac{\varGamma \left( \frac{\nu + 1}{2} \right) }{\varGamma \left( \frac{\nu }{2} \right) \sqrt{\pi \nu \sigma ^2}} \left( 1 + \frac{1}{\nu }\frac{(x - \mu )^2}{\sigma ^2} \right) ^{-\frac{\nu +1}{2}}, \end{aligned}$$

where \((x \in \mathbb {R})\), \((\mu \in \mathbb {R})\), \((\sigma ^2 > 0)\) and the degrees of freedom \((\nu > 0)\). The first two moments are

$$\begin{aligned} \mathrm{E} \left( x \right) = \mu , \quad \mathrm{Var} \left( x \right) = \sigma ^2 \left( \frac{\nu }{\nu - 2}\right) , \end{aligned}$$

where the mean and variance only exist for \((\nu > 1)\) and \((\nu > 2)\) respectively.
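
A quick numerical check of these moment formulas (our own illustration; note that scipy.stats.t takes the scale \(\sigma\), not the variance \(\sigma^2\)):

```python
from scipy.stats import t

mu, sigma2, nu = 1.0, 4.0, 5.0
dist = t(df=nu, loc=mu, scale=sigma2**0.5)
print(dist.mean(), mu)                       # both 1.0
print(dist.var(), sigma2 * nu / (nu - 2))    # both ~6.667
```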

A.4 Laplace Distribution

The probability density function of the Laplace distribution is

$$\begin{aligned} p(x | \mu , b) = \frac{1}{2b} \exp \left( - \frac{|x - \mu |}{b} \right) , \end{aligned}$$

where \((x \in \mathbb {R})\), \((\mu \in \mathbb {R})\) is the location parameter and \((b > 0)\) is the scale parameter. The first two moments are

$$\begin{aligned} \mathrm{E} \left( x \right) = \mu , \quad \mathrm{Var} \left( x \right) = 2 b^2. \end{aligned}$$
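
The same kind of Monte Carlo check for the Laplace moments (illustration only, using NumPy's built-in sampler):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, b = 0.5, 1.5
x = rng.laplace(mu, b, size=1_000_000)
print(x.mean(), mu)            # ~0.5
print(x.var(), 2 * b**2)       # ~4.5
```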

A.5 Pólya-Gamma Distribution

A random variable \(x\) follows a Pólya-Gamma distribution [12], \(x \sim \mathrm{PG}(b, c)\), if

$$\begin{aligned} x \mathop {=}\limits ^{D} \frac{1}{2\pi ^2} \sum _{k=1}^\infty \frac{g_k}{(k-1/2)^2 + c^2/(4\pi ^2)}, \end{aligned}$$

where \(g_k \sim \mathrm{Ga}(b,1)\) are independent gamma random variables, \((b > 0)\) and \((c \in \mathbb {R})\) are the parameters and \(\mathop {=}\limits ^{D}\) denotes equality in distribution. The first two moments of x are

$$\begin{aligned} \mathrm{E} \left( x \right) = \frac{b}{2c} \mathrm{tanh} \left( \frac{c}{2}\right) \!\!, \quad \mathrm{Var} \left( x \right) = \frac{b}{4c^3} \left( \mathrm{sinh}(c) - c\right) \mathrm{sech}^2 \left( \frac{c}{2} \right) \!\!. \end{aligned}$$
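
Exact samplers for this distribution are developed in [12, 13]. Purely as an illustration of the series definition above (the truncation level K is our own choice, and truncation slightly underestimates the tail), one can draw approximate variates and check the mean formula:

```python
import numpy as np

def pg_approx(b, c, size, K=500, rng=None):
    """Approximate PG(b, c) draws by truncating the infinite gamma sum at K
    terms. Illustrative only; exact samplers are given in [12, 13]."""
    rng = rng or np.random.default_rng()
    k = np.arange(1, K + 1)
    g = rng.gamma(b, 1.0, size=(size, K))              # g_k ~ Ga(b, 1)
    denom = (k - 0.5) ** 2 + c**2 / (4 * np.pi**2)
    return (g / denom).sum(axis=1) / (2 * np.pi**2)

b, c = 1.0, 2.0
x = pg_approx(b, c, size=200_000)
print(x.mean(), b / (2 * c) * np.tanh(c / 2))          # both ~0.190
```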

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Makalic, E., Schmidt, D.F., Hopper, J.L. (2016). Bayesian Robust Regression with the Horseshoe+ Estimator. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science (LNAI), vol. 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_37

  • Print ISBN: 978-3-319-50126-0

  • Online ISBN: 978-3-319-50127-7
