Object Detection Using the Statistics of Parts

Schneiderman, Henry; Kanade, Takeo

doi:10.1023/B:VISI.0000011202.85607.00

Published: February 2004

Volume 56, pages 151–177, (2004)

International Journal of Computer Vision

Abstract

In this paper we describe a trainable object detector and its instantiations for detecting faces and cars at any size, location, and pose. To cope with variation in object orientation, the detector uses multiple classifiers, each spanning a different range of orientation. Each of these classifiers determines whether the object is present at a specified size within a fixed-size image window. To find the object at any location and size, these classifiers scan the image exhaustively.
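The exhaustive scan described above can be sketched as follows. This is only an illustration, not the authors' implementation: the window size, stride, scale step, nearest-neighbour resampling, and the `bright` classifier are hypothetical placeholders, and each entry in `classifiers` stands in for one orientation-specific classifier.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]
Detection = Tuple[str, int, int, int]  # (classifier name, x, y, size) in original pixels


def downscale(image: Image, factor: float) -> Image:
    """Nearest-neighbour shrink of `image` by `factor` (assumed resampling scheme)."""
    h = int(len(image) / factor)
    w = int(len(image[0]) / factor)
    return [[image[int(r * factor)][int(c * factor)] for c in range(w)]
            for r in range(h)]


def scan(image: Image,
         classifiers: Dict[str, Callable[[Image], bool]],
         win: int = 2, stride: int = 1, scale_step: float = 2.0) -> List[Detection]:
    """Apply every classifier to every fixed-size window at every scale."""
    detections: List[Detection] = []
    factor = 1.0
    level = image
    # Keep shrinking the image until the fixed-size window no longer fits.
    while len(level) >= win and len(level[0]) >= win:
        for y in range(0, len(level) - win + 1, stride):
            for x in range(0, len(level[0]) - win + 1, stride):
                window = [row[x:x + win] for row in level[y:y + win]]
                for name, clf in classifiers.items():
                    if clf(window):
                        # Map the hit back to original-image coordinates.
                        detections.append((name, int(x * factor),
                                           int(y * factor), int(win * factor)))
        factor *= scale_step
        level = downscale(image, factor)
    return detections


# Hypothetical stand-in classifier: fires when every pixel in the window is 1.
def bright(window: Image) -> bool:
    return sum(sum(row) for row in window) == len(window) * len(window[0])
```

Holding the window size fixed and shrinking the image is what lets a classifier trained for one object size find larger instances; a finer `scale_step` trades speed for coverage.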

Each classifier is based on the statistics of localized parts. Each part is a transform from a subset of wavelet coefficients to a discrete set of values. Such parts are designed to capture various combinations of locality in space, frequency, and orientation. In building each classifier, we gathered the class-conditional statistics of these part values from representative samples of object and non-object images. We trained each classifier to minimize classification error on the training set by using AdaBoost with Confidence-Weighted Predictions (Schapire and Singer, 1999). In detection, each classifier computes the part values within the image window and looks up their associated class-conditional probabilities. The classifier then makes a decision by applying a likelihood ratio test. For efficiency, the classifier evaluates this likelihood ratio in stages. At each stage, the classifier compares the partial likelihood ratio to a threshold and decides whether to cease evaluation, labeling the input as non-object, or to continue. The detector orders these stages of evaluation from a low-resolution to a high-resolution search of the image. Our trainable object detector achieves reliable and efficient detection of human faces and passenger cars with out-of-plane rotation.
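A minimal sketch of the staged likelihood ratio test may help make the procedure concrete. It is written under stated assumptions rather than taken from the paper: the tables are estimated from raw counts with Laplace smoothing (the abstract says the actual classifiers are trained with AdaBoost), and the function names, thresholds, and data layout are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def log_ratio_table(obj_counts: Sequence[int], non_counts: Sequence[int],
                    smoothing: float = 1.0) -> List[float]:
    """log P(value | object) - log P(value | non-object) for each part value.

    Counts are how often each discrete part value occurred in object and
    non-object training windows; Laplace smoothing is an assumption here.
    """
    n = len(obj_counts)
    obj_total = sum(obj_counts) + smoothing * n
    non_total = sum(non_counts) + smoothing * n
    return [math.log((o + smoothing) / obj_total)
            - math.log((q + smoothing) / non_total)
            for o, q in zip(obj_counts, non_counts)]


Stage = Tuple[List[List[float]], float]  # (one table per part, rejection threshold)


def staged_decision(part_values: Sequence[Sequence[int]],
                    stages: Sequence[Stage]) -> Tuple[bool, int, float]:
    """Accumulate the log-likelihood ratio stage by stage with early rejection.

    part_values[i] holds the observed value of each part used in stage i.
    Returns (is_object, stages_evaluated, accumulated_log_ratio).
    """
    total = 0.0
    for idx, ((tables, threshold), values) in enumerate(zip(stages, part_values)):
        # Look up each part's log ratio and add it to the partial sum.
        total += sum(table[v] for table, v in zip(tables, values))
        if total < threshold:
            return False, idx + 1, total  # cease evaluation: non-object
    return True, len(stages), total
```

Working in log space turns the product of per-part ratios into a sum, so each stage only adds a few table lookups. Because most windows in an image are non-object, rejecting them in the early stages is where the efficiency comes from.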

References

Amit, Y. 2000. A neural network architecture for visual selection. Neural Computation, 12:1059–1089.
Arun, K.S., Huang, T.S., and Blostein, S.D. 1987. Least-squares fitting of two 3-D point sets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(5):698–700.
Burl, M.C. and Perona, P. 1996. Recognition of planar object classes. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 223–230.
Burl, M.C., Weber, M., and Perona, P. 1998. A probabilistic approach to object recognition using local photometry and global geometry. In Proc. of the 5th European Conf. on Computer Vision.
Chow, C.K. and Liu, C.N. 1968. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, IT-14(3).
Cortes, C. and Vapnik, V. 1995. Support-vector networks. Machine Learning, 20:273–297.
Cosman, P.C., Gray, R.M., and Vetterli, M. 1996. Vector quantization of image subbands: A survey. IEEE Transactions on Image Processing, 5(2):202–225.
Domingos, P. and Pazzani, M. 1997. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29:103–1
Field, D.J. 1999. Wavelets, vision and the statistics of natural scenes. Philosophical Transactions of the Royal Society: Mathematical, Physical and Engineering Sciences, 357(1760):2527–2542.
Freund, Y. and Schapire, R.E. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139.
Geman, D. and Fleuret, F. 2001. Coarse-to-fine face detection. International Journal of Computer Vision, 41:85–107.
Gori, M. and Scarselli, F. 1998. Are multilayer perceptrons adequate for pattern recognition and verification? IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11):1121–1132.
Kung, Y. 1993. Digital Neural Networks. Prentice-Hall.
Lades, M., Vorbruggen, J.C., Buhmann, J., Lange, J., Malsburg, C.v.d., Wurtz, R.P., and Konen, W. 1993. Distortion invariant object recognition in the dynamic link architecture. IEEE Transactions on Computers, 42(3):300–311.
Lewis II, P.M. 1959. Approximating probability distributions to reduce storage requirements. Information and Control, 2:214–225.
Osuna, E., Freund, R., and Girosi, F. 1997. Training support vector machines: An application to face detection. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 130–136.
Romdhani, S., Torr, P., Scholkopf, B., and Blake, A. 2001. Computationally efficient face detection. In International Conference on Computer Vision, pp. 695–700.
Roth, D., Yang, M.-H., and Ahuja, N. 1999. A SNoW-based face detector. NIPS-12.
Rowley, H.A., Baluja, S., and Kanade, T. 1998. Neural network-based face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1):23–38.
Rowley, H. 1999. Neural network-based face detection. Ph.D thesis. CMU-CS-99-117.
Schiele, B. and Crowley, J.L. 1996. Probabilistic object recognition using multidimensional receptive field histograms. In International Conference on Pattern Recognition.
Schiele, B. and Crowley, J.L. 2000. Recognition without correspondence using multidimensional receptive field histograms. International Journal of Computer Vision, 36(1):31–50.
Schneiderman, H. and Kanade, T. 1998. Probabilistic modeling of local appearance and spatial relationships for object recognition. In IEEE Conference on Computer Vision and Pattern Recognition.
Schapire, R.E. and Singer, Y. 1999. Improving boosting algorithms using confidence-rated predictions. Machine Learning, 37(3):297–336.
Strang, G. and Nguyen, T. 1997. Wavelets and Filter Banks. Wellesley-Cambridge Press: Wellesley, MA.
Sung, K.-K. and Poggio, T. 1998. Example-based learning for view-based human face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1):39–51.
Swain, M.J. and Ballard, D.H. 1991. Color indexing. International Journal of Computer Vision, 7(1):11–32.
Vetterli, M. and Kovacevic, J. 1995. Wavelets and Subband Coding. Prentice-Hall.
Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascade of simple features. In IEEE Conference on Computer Vision and Pattern Recognition.
Wiskott, L., Fellous, J.-M., Kruger, N., and Malsburg, C.v.d. 1997. Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):775–779.

Author information

Authors and Affiliations

Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Henry Schneiderman & Takeo Kanade

About this article

Cite this article

Schneiderman, H., Kanade, T. Object Detection Using the Statistics of Parts. International Journal of Computer Vision 56, 151–177 (2004). https://doi.org/10.1023/B:VISI.0000011202.85607.00

Issue Date: February 2004
DOI: https://doi.org/10.1023/B:VISI.0000011202.85607.00