Simultaneous Object Recognition and Segmentation by Image Exploration

Ferrari, Vittorio; Tuytelaars, Tinne; Van Gool, Luc

doi:10.1007/11957959_8

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

Methods based on local, viewpoint-invariant features have proven capable of recognizing objects in spite of viewpoint changes, occlusion and clutter. However, these approaches fail when these factors are too strong, because the features have limited repeatability and discriminative power. As additional shortcomings, the objects need to be rigid and only their approximate location is found. We present an object recognition approach that overcomes these limitations. An initial set of feature correspondences is first generated. The method anchors on it and then gradually explores the surrounding area, trying to construct more and more matching features, increasingly farther from the initial ones. The resulting process covers the object with matches, and simultaneously separates the correct matches from the wrong ones. Hence, recognition and segmentation are achieved at the same time. Only very few correct initial matches suffice for reliable recognition. Experimental results on still images and television news broadcasts demonstrate the strength of the presented method in dealing with extensive clutter, dominant occlusion, and large scale and viewpoint changes. Moreover, non-rigid deformations are explicitly taken into account, and the approximate contours of the object are produced. The approach can extend any viewpoint-invariant feature extractor.
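
The exploration idea described in the abstract (anchor on a few initial matches, grow outward, and discard inconsistent matches along the way) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes features are plain 2D points, that each match carries a local 2x2 affine map, and that a purely geometric distance check stands in for the appearance comparison a real system would use. All names (`explore`, `expand`, `contract`, `reach`, `tol`) and the synthetic scene are hypothetical.

```python
import numpy as np

def predict(support, m_support, m_target):
    """Transfer a model point into the test image via a support match's local affine map."""
    t_support, A = support
    return t_support + A @ (m_target - m_support)

def nearby(i, matches, model, reach):
    """Indices of matched model features within `reach` of feature i (excluding i)."""
    return [s for s in matches
            if s != i and np.linalg.norm(model[s] - model[i]) <= reach]

def expand(matches, model, test_feats, reach, tol):
    """Try to match each unmatched model feature using an already matched neighbor."""
    added = False
    for i in range(len(model)):
        if i in matches:
            continue
        for s in nearby(i, matches, model, reach):
            guess = predict(matches[s], model[s], model[i])
            d = np.linalg.norm(test_feats - guess, axis=1)
            j = int(np.argmin(d))
            if d[j] <= tol:  # a test feature confirms the guess (toy geometric check)
                matches[i] = (test_feats[j], matches[s][1])  # inherit the local map
                added = True
                break
    return added

def contract(matches, model, reach, tol):
    """Drop matches that have matched neighbors but are predicted by none of them."""
    bad = []
    for i, (t_i, _) in matches.items():
        supports = nearby(i, matches, model, reach)
        if supports and not any(
                np.linalg.norm(predict(matches[s], model[s], model[i]) - t_i) <= tol
                for s in supports):
            bad.append(i)
    for i in bad:
        del matches[i]
    return bool(bad)

def explore(seeds, model, test_feats, reach=15.0, tol=2.0, max_rounds=20):
    """Alternate growing and pruning until the match set stops changing."""
    matches = dict(seeds)
    for _ in range(max_rounds):
        grew = expand(matches, model, test_feats, reach, tol)
        shrank = contract(matches, model, reach, tol)
        if not (grew or shrank):
            break
    return matches

# Synthetic scene (assumption): a 5x5 grid of model features seen under one
# affine map, with four features occluded and one clutter feature far away.
model = np.array([(10.0 * c, 10.0 * r) for r in range(5) for c in range(5)])
A_true = np.array([[1.2, 0.1], [-0.1, 1.1]])
shift = np.array([50.0, 30.0])
occluded = {3, 4, 8, 9}
visible = [i for i in range(len(model)) if i not in occluded]
clutter = np.array([[500.0, 500.0]])
test_feats = np.vstack([model[visible] @ A_true.T + shift, clutter])

seeds = {
    12: (A_true @ model[12] + shift, A_true),  # one correct initial match
    0: (clutter[0], np.eye(2)),                # one wrong initial match onto clutter
}
result = explore(seeds, model, test_feats)
print(sorted(result))
```

In this sketch the set of model features that end up matched plays the role of the segmentation: occluded features are never confirmed, and the wrong seed is pruned because its correctly matched neighbor predicts a different location for it.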

This research was supported by EC project VIBES, the Fund for Scientific Research Flanders, and the IST Network of Excellence PASCAL.

Author information

Authors and Affiliations

Computer Vision Group (BIWI), ETH Zürich, Switzerland
Vittorio Ferrari & Luc Van Gool
ESAT-PSI, University of Leuven, Belgium
Tinne Tuytelaars & Luc Van Gool

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Ferrari, V., Tuytelaars, T., Van Gool, L. (2006). Simultaneous Object Recognition and Segmentation by Image Exploration. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_8

DOI: https://doi.org/10.1007/11957959_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)
