Multiple Viewpoint Recognition and Localization

  • Conference paper
Computer Vision – ACCV 2010 (ACCV 2010)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6492)

Abstract

This paper presents a novel approach for labeling objects based on multiple spatially-registered images of a scene. We argue that such a multi-view labeling approach is a better fit for applications such as robotics and surveillance than traditional object recognition, where only a single image of each scene is available. To encourage further study in the area, we have collected a data set of well-registered imagery for many indoor scenes and have made this data publicly available. Our multi-view labeling approach is capable of improving the results of a wide variety of image-based classifiers, and we demonstrate this by producing scene labelings based on the output of both the Deformable Parts Model of [1] and a method for recognizing object contours that is similar to chamfer matching. Our experimental results show that labeling objects based on multiple viewpoints leads to a significant improvement in performance when compared with single-image labeling.
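
The abstract refers to a contour recognizer that is similar to chamfer matching. As a rough illustration of plain chamfer matching only (not the authors' method, whose details are not given here), the sketch below scores a contour template against image edge points by the mean nearest-edge distance and picks the lowest-cost translation. The function names `chamfer_distance` and `best_offset`, the point sets, and the brute-force offset search are all assumptions made for this example.

```python
import math


def chamfer_distance(template_pts, edge_pts):
    """Mean distance from each template point to its nearest edge point."""
    if not template_pts or not edge_pts:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for tx, ty in template_pts:
        # Brute-force nearest neighbour; real systems usually precompute
        # a distance transform of the edge map instead.
        total += min(math.hypot(tx - ex, ty - ey) for ex, ey in edge_pts)
    return total / len(template_pts)


def best_offset(template_pts, edge_pts, offsets):
    """Try each candidate (dx, dy) shift and return the lowest-cost one."""
    def cost(offset):
        dx, dy = offset
        shifted = [(x + dx, y + dy) for x, y in template_pts]
        return chamfer_distance(shifted, edge_pts)
    return min(offsets, key=cost)


# Hypothetical toy data: a 3-point horizontal segment as the template,
# and an edge map containing that segment at (5, 3) plus one clutter point.
template = [(0, 0), (1, 0), (2, 0)]
edges = [(5, 3), (6, 3), (7, 3), (0, 9)]
offsets = [(dx, dy) for dx in range(8) for dy in range(8)]

print(best_offset(template, edges, offsets))
```

A lower score means a better contour fit, so a detector built this way would typically convert the score into a confidence before combining evidence from several views.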

References

  1. Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: Proc. IEEE CVPR (2008)

  2. Su, H., Sun, M., Fei-Fei, L., Savarese, S.: Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories. In: Proc. IEEE ICCV (2009)

  3. Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Gool, L.V.: Using multi-view recognition and meta-data annotation to guide a robot’s attention. Int. J. Robotics Research (2009)

  4. Liebelt, J., Schmid, C., Schertler, K.: Viewpoint-independent object class detection using 3D feature maps. In: Proc. IEEE CVPR (2008)

  5. Whaite, P., Ferrie, F.: Autonomous exploration: Driven by uncertainty. Technical Report TR-CIM-93-17, McGill U. CIM (1994)

  6. Laporte, C., Arbel, T.: Efficient discriminant viewpoint selection for active Bayesian recognition. Int. J. Computer Vision 68, 1573–1405 (2006)

  7. Andriluka, M., Roth, S., Schiele, B.: Monocular 3D pose estimation and tracking by detection. In: Proc. IEEE CVPR (2010)

  8. Wojek, C., Roth, S., Schindler, K., Schiele, B.: Monocular 3D scene modeling and inference: Understanding multi-object traffic scenes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 467–481. Springer, Heidelberg (2010)

  9. Coates, A., Ng, A.Y.: Multi-camera object detection for robotics. In: Proc. IEEE Int. Conf. Robotics and Automation (2010)

  10. Leibe, B., Schindler, K., Cornelis, N., Gool, L.V.: Coupled object detection and tracking from static cameras and moving vehicles. IEEE Trans. Pattern Analysis Machine Intelligence (2008)

  11. Wojek, C., Walk, S., Schiele, B.: Multi-cue onboard pedestrian detection. In: CVPR, pp. 1–8 (2009)

  12. Kragic, D., Björkman, M.: Strategies for object manipulation using foveal and peripheral vision. In: Proc. IEEE ICVS (2006)

  13. Gould, S., Arfvidsson, J., Kaehler, A., Sapp, B., Meissner, M., Bradski, G., Baumstarck, P., Chung, S., Ng, A.: Peripheral-foveal vision for real-time object recognition and tracking in video. In: Proc. IJCAI (2007)

  14. Rusu, R.B., Holzbach, A., Beetz, M., Bradski, G.: Detecting and segmenting objects for mobile manipulation. In: Proc. ICCV, S3DV Workshop (2009)

  15. Ye, Y., Tsotsos, J.K.: Sensor planning for 3D object search. Computer Vision and Image Understanding 73, 145–168 (1999)

  16. Savarese, S., Fei-Fei, L.: 3D generic object categorization, localization and pose estimation. In: Proc. IEEE ICCV (2007)

  17. Viksten, F., Forssen, P.E., Johansson, B., Moe, A.: Comparison of local image descriptors for full 6 degree-of-freedom pose estimation. In: Proceedings of the IEEE International Conference on Robotics and Automation, ICRA (2009)

  18. Forssen, P.E., Meger, D., Lai, K., Helmer, S., Little, J.J., Lowe, D.G.: Informed visual search: Combining attention and object recognition. In: ICRA, pp. 935–942 (2008)

  19. LeCun, Y., Huang, F., Bottou, L.: Learning methods for generic object recognition with invariance to pose and lighting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2004)

  20. Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google’s image search. In: Proc. of the 10th IEEE International Conference on Computer Vision, ICCV (2005)

  21. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, vol. 2, pp. 886–893 (2005)

  22. Shotton, J., Blake, A., Cipolla, R.: Multiscale categorical object recognition using contour fragments. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 1270–1281 (2008)

  23. Fiala, M.: ARTag, a fiducial marker system using digital techniques. In: CVPR 2005, vol. 1, pp. 590–596 (2005)

  24. Poupyrev, I., Kato, H., Billinghurst, M.: ARToolKit user manual, version 2.33. Human Interface Technology Lab, University of Washington (2000)

  25. Sattar, J., Bourque, E., Giguere, P., Dudek, G.: Fourier tags: Smoothly degradable fiducial markers for use in human-robot interaction. In: Fourth Canadian Conference on Computer and Robot Vision (CRV), Montreal, Quebec, Canada, pp. 165–174 (2007)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Helmer, S., Meger, D., Muja, M., Little, J.J., Lowe, D.G. (2011). Multiple Viewpoint Recognition and Localization. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6492. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19315-6_36

  • DOI: https://doi.org/10.1007/978-3-642-19315-6_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19314-9

  • Online ISBN: 978-3-642-19315-6

  • eBook Packages: Computer Science, Computer Science (R0)
