Bandwidth selection for kernel density estimation: a review of fully automatic selectors

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

Kernel density estimation has become a common tool for empirical studies in any research area, and this kind of estimator is now provided by many software packages. At the same time, the discussion on bandwidth selection has been going on for about three decades. Although a good part of that discussion concerns nonparametric regression, the choice of this parameter is by no means less problematic for density estimation, as becomes obvious when reading empirical studies in which practitioners have used kernel densities. New contributions typically provide simulations only to show that their own selector outperforms some of the existing methods. We review existing methods and compare them on a set of designs that exhibit few bumps and exponentially falling tails. We concentrate on small and moderate sample sizes because for large ones the differences between consistent methods are often negligible, at least for practitioners. As a byproduct, we find that a mixture of simple plug-in and cross-validation methods produces bandwidths with quite stable performance.
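
To make the idea of mixing a plug-in bandwidth with a cross-validation bandwidth concrete, the following is a minimal sketch only. The abstract does not say which selectors are combined or how, so everything here is an assumption for illustration: a Gaussian kernel, Silverman's normal-reference rule of thumb standing in for a "simple plug-in" selector, least-squares cross-validation minimised over a grid, and an unweighted average of the two. The function names (`rule_of_thumb_bandwidth`, `lscv_bandwidth`, `mixed_bandwidth`) are hypothetical.

```python
import numpy as np


def rule_of_thumb_bandwidth(x):
    """Silverman's normal-reference rule (assumed stand-in for a simple plug-in)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25
    # Use the more robust of the two scale estimates when the IQR is usable.
    scale = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * scale * n ** (-0.2)


def lscv_score(x, h):
    """Least-squares cross-validation criterion for a Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    n = x.size
    u = (x[:, None] - x[None, :]) / h
    # Convolution of the Gaussian kernel with itself is the N(0, 2) density.
    conv = np.exp(-0.25 * u ** 2) / np.sqrt(4.0 * np.pi)
    kern = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    integral_term = conv.sum() / (n ** 2 * h)
    # Leave-one-out term: sum over i != j only.
    loo_term = 2.0 * (kern.sum() - np.trace(kern)) / (n * (n - 1) * h)
    return integral_term - loo_term


def lscv_bandwidth(x, grid):
    """Grid point that minimises the LSCV criterion."""
    grid = np.asarray(grid, dtype=float)
    scores = np.array([lscv_score(x, h) for h in grid])
    return float(grid[int(np.argmin(scores))])


def mixed_bandwidth(x, grid=None):
    """Unweighted average of the two bandwidths (an assumed mixing rule)."""
    h_rot = rule_of_thumb_bandwidth(x)
    if grid is None:
        # Search around the rule-of-thumb value on a log-spaced grid.
        grid = h_rot * np.logspace(-1.0, 0.5, 60)
    h_cv = lscv_bandwidth(x, grid)
    return 0.5 * (h_rot + h_cv), h_rot, h_cv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=100)
    h_mix, h_rot, h_cv = mixed_bandwidth(sample)
    print(f"rule of thumb: {h_rot:.3f}, LSCV: {h_cv:.3f}, mixture: {h_mix:.3f}")
```

Averaging is only one possible way to mix the two; the intuition is that cross-validation bandwidths tend to be variable while rule-of-thumb bandwidths tend to oversmooth multimodal densities, so a combination may be more stable than either alone.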

Notes

  1. We are grateful for the comments and suggestions of one of the anonymous referees.

Author information

Corresponding author

Correspondence to Stefan Sperlich.

Additional information

The authors thank Maria Dolores Martinez-Miranda, Lijian Yang, two anonymous referees and Göran Kauermann for helpful discussion and comments.

About this article

Cite this article

Heidenreich, NB., Schindler, A. & Sperlich, S. Bandwidth selection for kernel density estimation: a review of fully automatic selectors. AStA Adv Stat Anal 97, 403–433 (2013). https://doi.org/10.1007/s10182-013-0216-y

  • DOI: https://doi.org/10.1007/s10182-013-0216-y
