Robust Lasso Regression with Student-t Residuals

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10400)

Abstract

The lasso, introduced by Robert Tibshirani in 1996, has become one of the most popular techniques for estimating Gaussian linear regression models. An important reason for this popularity is that the lasso can simultaneously estimate the regression parameters and select important variables, yielding accurate regression models that are highly interpretable. This paper derives an efficient procedure for fitting robust linear regression models with the lasso in the case where the residuals are distributed according to a Student-t distribution. In contrast to Gaussian lasso regression, the proposed Student-t lasso regression procedure can be applied to data sets that contain large outlying observations. We demonstrate the utility of our Student-t lasso regression by analysing the Boston housing data set.
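
The full fitting procedure is derived in the paper itself; as a rough illustration of the general approach, the sketch below fits a lasso-penalised regression with Student-t residuals using the classical scale-mixture-of-normals representation of the t distribution (West [11]; Lange et al. [4]) inside an EM-style loop: each E-step downweights observations with large residuals, and each M-step solves a weighted lasso by coordinate descent. The function names (`t_lasso_em`, `soft_threshold`), the fixed degrees of freedom `nu`, and the penalty parameterisation `lam` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft_threshold(z, g):
    # Soft-thresholding operator: the proximal map of the l1 penalty.
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def t_lasso_em(X, y, lam, nu=4.0, em_iters=50, cd_iters=20):
    """Sketch of lasso regression with Student-t residuals via EM,
    using the scale-mixture-of-normals representation of the t
    distribution. Not the authors' exact algorithm."""
    n, p = X.shape
    beta = np.zeros(p)
    sigma2 = np.var(y)
    for _ in range(em_iters):
        # E-step: expected latent precision of each residual under the
        # t model; observations with large residuals get small weights.
        r = y - X @ beta
        w = (nu + 1.0) / (nu + r ** 2 / sigma2)
        # M-step (beta): coordinate descent on the weighted lasso
        #   0.5 * sum_i w_i (y_i - x_i' beta)^2 + lam * ||beta||_1
        for _ in range(cd_iters):
            for j in range(p):
                # Partial residual excluding coordinate j.
                r_partial = y - X @ beta + X[:, j] * beta[j]
                num = np.sum(w * X[:, j] * r_partial)
                beta[j] = soft_threshold(num, lam) / np.sum(w * X[:, j] ** 2)
        # M-step (sigma2): weighted residual variance.
        r = y - X @ beta
        sigma2 = np.sum(w * r ** 2) / n
    return beta, sigma2
```

As the degrees of freedom \(\nu \rightarrow \infty\) the weights \(w_i\) tend to one and the loop reduces to an ordinary Gaussian lasso fit, matching the intuition that the Student-t model generalises the Gaussian one.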

References

  1. Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control 19(6), 716–723 (1974)

  2. Finegold, M., Drton, M.: Robust graphical modelling with t-distributions. In: 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009) (2009)

  3. Lambert-Lacroix, S.: Robust regression through the Huber’s criterion and adaptive lasso penalty. Electron. J. Stat. 5, 1015–1053 (2011)

  4. Lange, K.L., Little, R.J.A., Taylor, J.M.G.: Robust statistical modeling using the \(t\) distribution. J. Am. Stat. Assoc. 84(408), 881–896 (1989)

  5. Li, Y., Zhu, J.: \(l_1\)-norm quantile regression. J. Comput. Graph. Stat. 17(1), 1–23 (2008)

  6. Osborne, M.R., Presnell, B., Turlach, B.A.: On the LASSO and its dual. J. Comput. Graph. Stat. 9(2), 319–337 (2000)

  7. Park, T., Casella, G.: The Bayesian lasso. J. Am. Stat. Assoc. 103(482), 681–686 (2008)

  8. Polson, N.G., Scott, J.G.: Data augmentation for non-Gaussian regression models using variance-mean mixtures. Biometrika 100(2), 459–471 (2013)

  9. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)

  10. Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. (Ser. B) 58(1), 267–288 (1996)

  11. West, M.: On scale mixtures of normal distributions. Biometrika 74(3), 646–648 (1987)

  12. Yi, N., Xu, S.: Bayesian LASSO for quantitative trait loci mapping. Genetics 179(2), 1045–1055 (2008)

Author information

Correspondence to Daniel F. Schmidt.

Appendix A

To find an appropriate maximum value \(\tau _\mathrm{max}\) of \(\tau \) for producing a regularisation path, we use the following heuristic procedure: let \(\hat{\varvec{\beta }}_\mathrm{ML}\) and \(\hat{\sigma }^2_\mathrm{ML}\) denote the maximum likelihood estimates of \(\varvec{\beta }\) and \(\sigma ^2\). The negative log-prior probability of the maximum likelihood estimates, under the Laplace prior (3), is given by

$$ p \log (\tau ) + \left( \frac{\sqrt{2}}{\tau \hat{\sigma }_\mathrm{ML}} \right) || \hat{\varvec{\beta }}_\mathrm{ML} ||_1 + \mathrm{const}, $$

where \(\mathrm{const}\) denotes terms that do not depend on either \(\tau \) or \(\hat{\varvec{\beta }}_\mathrm{ML}\). The value of \(\tau \) that minimises this expression, and hence maximises the prior probability of \(\hat{\varvec{\beta }}_\mathrm{ML}\), is given by

$$ \tilde{\tau } = \frac{\sqrt{2} \, ||\hat{\varvec{\beta }}_\mathrm{ML}||_1}{p \hat{\sigma }_\mathrm{ML}}. $$

We then choose \(\tau _\mathrm{max} = c \tilde{\tau }\), where \(c>1\) is a constant that controls how far the final point on the regularisation path lies from the maximum likelihood estimates.
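
For concreteness, a direct transcription of this heuristic, assuming the maximum likelihood estimates are available as NumPy arrays; the default `c = 10` is an illustrative choice, as the paper does not fix a value:

```python
import numpy as np

def tau_max(beta_ml, sigma_ml, c=10.0):
    # tau-tilde maximises the Laplace prior probability of the ML
    # estimates; scaling by c > 1 extends the regularisation path
    # past that point. The default c = 10 is illustrative only.
    p = beta_ml.size
    tau_tilde = np.sqrt(2.0) * np.sum(np.abs(beta_ml)) / (p * sigma_ml)
    return c * tau_tilde
```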

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Schmidt, D.F., Makalic, E. (2017). Robust Lasso Regression with Student-t Residuals. In: Peng, W., Alahakoon, D., Li, X. (eds) AI 2017: Advances in Artificial Intelligence. AI 2017. Lecture Notes in Computer Science (LNAI), vol. 10400. Springer, Cham. https://doi.org/10.1007/978-3-319-63004-5_29

  • DOI: https://doi.org/10.1007/978-3-319-63004-5_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63003-8

  • Online ISBN: 978-3-319-63004-5
