Multiple Viewpoint Recognition and Localization

Helmer, Scott; Meger, David; Muja, Marius; Little, James J.; Lowe, David G.

doi:10.1007/978-3-642-19315-6_36

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6492)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

This paper presents a novel approach for labeling objects based on multiple spatially-registered images of a scene. We argue that such a multi-view labeling approach is a better fit for applications such as robotics and surveillance than traditional object recognition, where only a single image of each scene is available. To encourage further study in the area, we have collected a data set of well-registered imagery for many indoor scenes and have made this data publicly available. Our multi-view labeling approach is capable of improving the results of a wide variety of image-based classifiers, and we demonstrate this by producing scene labelings based on the output of both the Deformable Parts Model of [1] and a method for recognizing object contours that is similar to chamfer matching. Our experimental results show that labeling objects based on multiple viewpoints leads to a significant improvement in performance when compared with single-image labeling.
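
The abstract does not say how evidence from the registered views is combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each view comes with a known 3x4 projection matrix (which is what spatial registration would provide) and a list of single-image detections given as bounding boxes with scores. A candidate 3D location is scored by projecting it into every view and summing the best detection score found at the projected pixel. The names `project` and `multi_view_score`, the summation rule, and all numbers are hypothetical.

```python
def project(P, X):
    """Project 3D point X = (x, y, z) with a 3x4 camera matrix P.

    Returns pixel coordinates (u, v), or None if the point lies
    behind the camera (non-positive depth).
    """
    x, y, z = X
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:
        return None
    return h[0] / h[2], h[1] / h[2]


def multi_view_score(X, cameras, detections):
    """Sum, over all views, the best score among the detections whose
    bounding box contains the projection of X.

    cameras:    list of 3x4 projection matrices (one per view)
    detections: per-view lists of (xmin, ymin, xmax, ymax, score)
    """
    total = 0.0
    for P, dets in zip(cameras, detections):
        uv = project(P, X)
        if uv is None:
            continue  # point is not visible in this view
        u, v = uv
        hits = [s for (x0, y0, x1, y1, s) in dets
                if x0 <= u <= x1 and y0 <= v <= y1]
        if hits:
            total += max(hits)
    return total


# Two toy views of the same scene (assumed values, for illustration only).
cameras = [
    [[100.0, 0.0, 50.0, 0.0], [0.0, 100.0, 50.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[100.0, 0.0, 50.0, -20.0], [0.0, 100.0, 50.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
]
detections = [
    [(40, 40, 60, 60, 0.6), (0, 0, 10, 10, 0.9)],  # view 1: one hit, one distractor
    [(30, 40, 50, 60, 0.5)],                        # view 2: one hit
]
candidate = (0.0, 0.0, 2.0)
score = multi_view_score(candidate, cameras, detections)
```

In this toy setup the candidate projects inside one box in each view, so both views contribute to its score, while the high-scoring distractor in the first view is ignored because no other view supports it. A real system would also need to handle occlusion, scale, and calibrated score combination, none of which is attempted here.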

References

[1] Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: Proc. IEEE CVPR (2008)
[2] Su, H., Sun, M., Fei-Fei, L., Savarese, S.: Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories. In: Proc. IEEE ICCV (2009)
[3] Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Gool, L.V.: Using multi-view recognition and meta-data annotation to guide a robot’s attention. Int. J. Robotics Research (2009)
[4] Liebelt, J., Schmid, C., Schertler, K.: Viewpoint-independent object class detection using 3D feature maps. In: Proc. IEEE CVPR (2008)
[5] Whaite, P., Ferrie, F.: Autonomous exploration: Driven by uncertainty. Technical Report TR-CIM-93-17, McGill U. CIM (1994)
[6] Laporte, C., Arbel, T.: Efficient discriminant viewpoint selection for active Bayesian recognition. Int. J. Computer Vision 68, 1573–1405 (2006)
[7] Andriluka, M., Roth, S., Schiele, B.: Monocular 3D pose estimation and tracking by detection. In: Proc. IEEE CVPR (2010)
[8] Wojek, C., Roth, S., Schindler, K., Schiele, B.: Monocular 3D scene modeling and inference: Understanding multi-object traffic scenes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 467–481. Springer, Heidelberg (2010)
[9] Coates, A., Ng, A.Y.: Multi-camera object detection for robotics. In: Proc. IEEE Int. Conf. Robotics and Automation (2010)
[10] Leibe, B., Schindler, K., Cornelis, N., Gool, L.V.: Coupled object detection and tracking from static cameras and moving vehicles. IEEE Trans. Pattern Analysis Machine Intelligence (2008)
[11] Wojek, C., Walk, S., Schiele, B.: Multi-cue onboard pedestrian detection. In: CVPR, pp. 1–8 (2009)
[12] Kragic, D., Björkman, M.: Strategies for object manipulation using foveal and peripheral vision. In: Proc. IEEE ICVS (2006)
[13] Gould, S., Arfvidsson, J., Kaehler, A., Sapp, B., Meissner, M., Bradski, G., Baumstarck, P., Chung, S., Ng, A.: Peripheral-foveal vision for real-time object recognition and tracking in video. In: Proc. IJCAI (2007)
[14] Rusu, R.B., Holzbach, A., Beetz, M., Bradski, G.: Detecting and segmenting objects for mobile manipulation. In: Proc. ICCV, S3DV Workshop (2009)
[15] Ye, Y., Tsotsos, J.K.: Sensor planning for 3D object search. Computer Vision and Image Understanding 73, 145–168 (1999)
[16] Savarese, S., Fei-Fei, L.: 3D generic object categorization, localization and pose estimation. In: Proc. IEEE ICCV (2007)
[17] Viksten, F., Forssen, P.E., Johansson, B., Moe, A.: Comparison of local image descriptors for full 6 degree-of-freedom pose estimation. In: Proceedings of the IEEE International Conference on Robotics and Automation, ICRA (2009)
[18] Forssen, P.E., Meger, D., Lai, K., Helmer, S., Little, J.J., Lowe, D.G.: Informed visual search: Combining attention and object recognition. In: ICRA, pp. 935–942 (2008)
[19] LeCun, Y., Huang, F., Bottou, L.: Learning methods for generic object recognition with invariance to pose and lighting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2004)
[20] Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google’s image search. In: Proc. of the 10th IEEE International Conference on Computer Vision, ICCV (2005)
[21] Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, vol. 2, pp. 886–893 (2005)
[22] Shotton, J., Blake, A., Cipolla, R.: Multiscale categorical object recognition using contour fragments. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 1270–1281 (2008)
[23] Fiala, M.: ARTag, a fiducial marker system using digital techniques. In: CVPR 2005, vol. 1, pp. 590–596 (2005)
[24] Poupyrev, I., Kato, H., Billinghurst, M.: ARToolKit user manual, version 2.33. Human Interface Technology Lab, University of Washington (2000)
[25] Sattar, J., Bourque, E., Giguere, P., Dudek, G.: Fourier tags: Smoothly degradable fiducial markers for use in human-robot interaction. In: Fourth Canadian Conference on Computer and Robot Vision (CRV), Montreal, Quebec, Canada, pp. 165–174 (2007)

Author information

Authors and Affiliations

University of British Columbia, Canada
Scott Helmer, David Meger, Marius Muja, James J. Little & David G. Lowe

Editor information

Editors and Affiliations

Department of Computer Science, Technion, Israel Institute of Technology, 32000, Haifa, Israel
Ron Kimmel
The University of Auckland, 37 Kohimarama Road, 1071, Mission Bay, Auckland, New Zealand
Reinhard Klette
National Institute of Informatics, 1018430, Chiyoda, Tokyo, Japan
Akihiro Sugimoto

Cite this paper

Helmer, S., Meger, D., Muja, M., Little, J.J., Lowe, D.G. (2011). Multiple Viewpoint Recognition and Localization. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6492. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19315-6_36

DOI: https://doi.org/10.1007/978-3-642-19315-6_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19314-9
Online ISBN: 978-3-642-19315-6
eBook Packages: Computer Science, Computer Science (R0)
