Bootstrap bandwidth selection in kernel density estimation from a contaminated sample

Annals of the Institute of Statistical Mathematics

Abstract

In this paper we consider kernel estimation of a density when the data are contaminated by random noise. More specifically, we address how to choose the bandwidth parameter in practice. The theoretically optimal bandwidth is defined as the minimizer of the mean integrated squared error. We propose a bootstrap procedure to estimate this optimal bandwidth, and show its consistency. These results remain valid for the case of no measurement error, and hence also summarize part of the theory of bootstrap bandwidth selection in ordinary kernel density estimation. The finite sample performance of the proposed bootstrap selection procedure is demonstrated with a simulation study. An application to a real data example illustrates the use of the method.
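
To make the idea of a bootstrap bandwidth selector concrete, the following is a minimal sketch for the error-free case only (ordinary kernel density estimation). It is not the procedure of the paper, whose details are not given in this abstract. It assumes a Gaussian kernel, for which the bootstrap mean integrated squared error under a smoothed bootstrap from a pilot estimate has a closed form, so no resampling is needed. The function names, the normal-reference pilot bandwidth `g`, and the search grid are all illustrative assumptions.

```python
import numpy as np


def gauss(d, s):
    """Density of N(0, s^2) evaluated at d."""
    return np.exp(-0.5 * (d / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


def bootstrap_mise(h, x, g):
    """Exact bootstrap MISE of a Gaussian KDE with bandwidth h.

    Bootstrap samples are (conceptually) drawn from the pilot estimate
    f_g, a Gaussian KDE of x with bandwidth g. Because convolutions of
    Gaussians are Gaussian, all integrals reduce to sums over pairwise
    differences of the data.
    """
    n = x.size
    d = x[:, None] - x[None, :]  # all pairwise differences X_i - X_j
    # (1/n) * integral of the squared kernel
    var_term = 1.0 / (2.0 * n * h * np.sqrt(np.pi))
    # integral of (K_h * f_g)^2
    a = gauss(d, np.sqrt(2.0 * h**2 + 2.0 * g**2)).mean()
    # integral of (K_h * f_g) * f_g
    b = gauss(d, np.sqrt(h**2 + 2.0 * g**2)).mean()
    # integral of f_g^2
    c = gauss(d, np.sqrt(2.0) * g).mean()
    return var_term + (1.0 - 1.0 / n) * a - 2.0 * b + c


def select_bandwidth(x, g=None, grid=None):
    """Return the grid value of h minimizing the bootstrap MISE, and all scores."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if g is None:
        # Assumption: normal-reference rule of thumb as pilot bandwidth.
        g = 1.06 * x.std(ddof=1) * n ** (-1.0 / 5.0)
    if grid is None:
        # Assumption: search between 0.1*g and 3*g.
        grid = g * np.linspace(0.1, 3.0, 200)
    scores = np.array([bootstrap_mise(h, x, g) for h in grid])
    return grid[np.argmin(scores)], scores


# Usage on simulated, error-free data.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
h_boot, scores = select_bandwidth(x)
print("selected bandwidth:", h_boot)
```

In the contaminated case treated in the paper, the ordinary kernel would be replaced by a deconvolution kernel that depends on the error distribution, so the closed form above does not carry over as written.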

Additional information

This research was supported by ‘Projet d’Actions de Recherche Concertées’ (No. 98/03-217) from the Belgian government. Financial support from the IAP research network no. P5/24 of the Belgian State (Federal Office for Scientific, Technical and Cultural Affairs) is also gratefully acknowledged.

About this article

Cite this article

Delaigle, A., Gijbels, I. Bootstrap bandwidth selection in kernel density estimation from a contaminated sample. Ann Inst Stat Math 56, 19–47 (2004). https://doi.org/10.1007/BF02530523

  • DOI: https://doi.org/10.1007/BF02530523
