Abstract
In this paper we consider kernel estimation of a density when the data are contaminated by random noise. More specifically, we address the practical choice of the bandwidth parameter. The theoretically optimal bandwidth is defined as the minimizer of the mean integrated squared error. We propose a bootstrap procedure to estimate this optimal bandwidth and show its consistency. These results remain valid when there is no measurement error, and hence also summarize part of the theory of bootstrap bandwidth selection in ordinary kernel density estimation. The finite-sample performance of the proposed bootstrap selection procedure is demonstrated in a simulation study. An application to a real data example illustrates the use of the method.
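For orientation, the quantities named in the abstract can be written out as follows. This is only a sketch in standard deconvolution notation, which the abstract itself does not fix: the symbols $Y_j$, $X_j$, $Z_j$, $f_X$, $K$, $\varphi_K$, $\varphi_Z$ and $h$ are assumptions, and the estimator shown is the usual deconvolving kernel estimator from the literature, not necessarily the exact form used in the paper.

```latex
% Assumed setting: observed Y_j = X_j + Z_j, with X_j ~ f_X independent of the
% noise Z_j, whose characteristic function \varphi_Z is known and non-vanishing.
% K is a kernel with characteristic function \varphi_K; h > 0 is the bandwidth.
\begin{align*}
\hat f_X(x; h) &= \frac{1}{nh} \sum_{j=1}^{n} K^{Z}\!\left(\frac{x - Y_j}{h}; h\right),
&
K^{Z}(u; h) &= \frac{1}{2\pi} \int e^{-itu}\, \frac{\varphi_K(t)}{\varphi_Z(t/h)}\, dt, \\
\mathrm{MISE}(h) &= \mathbb{E} \int \bigl\{ \hat f_X(x; h) - f_X(x) \bigr\}^2 \, dx,
&
h_{\mathrm{opt}} &= \operatorname*{arg\,min}_{h > 0} \mathrm{MISE}(h).
\end{align*}
```

Under these assumptions, the error-free case corresponds to $\varphi_Z \equiv 1$, so that $K^{Z}$ reduces to $K$ and the display becomes the ordinary kernel density estimator, in line with the abstract's remark that the results also cover that case.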
Additional information
This research was supported by ‘Projet d’Actions de Recherche Concertées’ (No. 98/03-217) from the Belgian government. Financial support from the IAP research network nr P5/24 of the Belgian State (Federal Office for Scientific, Technical and Cultural Affairs) is also gratefully acknowledged.
About this article
Cite this article
Delaigle, A., Gijbels, I. Bootstrap bandwidth selection in kernel density estimation from a contaminated sample. Ann Inst Stat Math 56, 19–47 (2004). https://doi.org/10.1007/BF02530523
DOI: https://doi.org/10.1007/BF02530523