A Trainable System for Object Detection

Papageorgiou, Constantine; Poggio, Tomaso

doi:10.1023/A:1008162616689

Published: June 2000

Volume 38, pages 15–33, (2000)

International Journal of Computer Vision

Abstract

This paper presents a general, trainable system for object detection in unconstrained, cluttered scenes. The system derives much of its power from a representation that describes an object class in terms of an overcomplete dictionary of local, oriented, multiscale intensity differences between adjacent regions, efficiently computable as a Haar wavelet transform. This example-based learning approach implicitly derives a model of an object class by training a support vector machine classifier using a large set of positive and negative examples. We present results on face, people, and car detection tasks using the same architecture. In addition, we quantify how the representation affects detection performance by considering several alternate representations including pixels and principal components. We also describe a real-time application of our person detection system as part of a driver assistance system.
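
The representation described in the abstract, intensity differences between adjacent regions at a given scale, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `scale` and `step` parameters, and the use of a summed-area table for the box sums are assumptions made for illustration. Choosing `step` smaller than `scale` makes the supports overlap, which is one simple way to obtain an overcomplete set of responses. The support vector machine that would be trained on these features is not shown.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, y, x, h, w):
    """Sum of the pixels in the h-by-w box whose top-left corner is (y, x)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_features(img, scale, step):
    """Vertical, horizontal and diagonal Haar-style responses at one scale.

    Each response is a difference of sums over adjacent quadrants of a
    scale-by-scale support. `step` is the shift between supports
    (hypothetical parameter; step < scale gives overlapping supports).
    """
    ii = integral_image(img)
    half = scale // 2
    h, w = len(img), len(img[0])
    feats = []
    for y in range(0, h - scale + 1, step):
        for x in range(0, w - scale + 1, step):
            tl = box_sum(ii, y, x, half, half)                # top-left quadrant
            tr = box_sum(ii, y, x + half, half, half)         # top-right quadrant
            bl = box_sum(ii, y + half, x, half, half)         # bottom-left quadrant
            br = box_sum(ii, y + half, x + half, half, half)  # bottom-right quadrant
            vertical = (tl + bl) - (tr + br)    # left half minus right half
            horizontal = (tl + tr) - (bl + br)  # top half minus bottom half
            diagonal = (tl + br) - (tr + bl)    # one diagonal pair minus the other
            feats.append((vertical, horizontal, diagonal))
    return feats


# A 4x4 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1] for _ in range(4)]
coarse = haar_features(image, scale=4, step=4)  # one support covering the image
dense = haar_features(image, scale=2, step=1)   # overlapping supports
```

In this toy image only the left-versus-right response is non-zero at the coarse scale, and the overlapping grid yields nine supports where a non-overlapping grid of the same scale would yield four. The resulting tuples could be flattened into a feature vector for a classifier.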

Author information

Authors and Affiliations

Center for Biological and Computational Learning, Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
Constantine Papageorgiou & Tomaso Poggio

About this article

Cite this article

Papageorgiou, C., Poggio, T. A Trainable System for Object Detection. International Journal of Computer Vision 38, 15–33 (2000). https://doi.org/10.1023/A:1008162616689

Issue Date: June 2000
DOI: https://doi.org/10.1023/A:1008162616689
