3D Object Recognition in Cluttered Environments by Segment-Based Stereo Vision

International Journal of Computer Vision

Abstract

We propose a new method for 3D object recognition which uses segment-based stereo vision. An object is identified in a cluttered environment, and its position and orientation (six degrees of freedom) are determined accurately, enabling a robot to pick up the object and manipulate it. The object can be of any shape (planar figures, polyhedra, free-form objects) and may be partially occluded by other objects. Segment-based stereo vision is employed for 3D sensing. Both CAD-based and sensor-based object modeling subsystems are available. Matching is performed in three steps: calculating candidates for the object position and orientation using local features, verifying each candidate, and improving the accuracy of the position and orientation by an iterative method. Several experimental results are presented to demonstrate the usefulness of the proposed method.
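The candidate-verification step of the matching procedure can be illustrated with a minimal sketch. The abstract does not specify how candidates are scored, so everything below is an assumption made for illustration: the support measure (fraction of transformed model points that fall near a scene point), the tolerance `tol`, and the hypothetical helper names `rot_z`, `apply_pose`, `support`, and `best_candidate`. It is not the authors' implementation, which works on stereo segments rather than bare points.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z axis by theta radians (illustrative helper)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_pose(R, t, p):
    """Map a model point p into the scene frame: R p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def support(R, t, model_pts, scene_pts, tol=0.01):
    """Fraction of model points that land within tol of some scene point.

    Assumed scoring rule; the source does not state its verification measure.
    """
    hits = 0
    for p in model_pts:
        q = apply_pose(R, t, p)
        if min(math.dist(q, s) for s in scene_pts) <= tol:
            hits += 1
    return hits / len(model_pts)

def best_candidate(candidates, model_pts, scene_pts, tol=0.01):
    """Return (score, index) of the candidate pose with the highest support."""
    scores = [support(R, t, model_pts, scene_pts, tol) for R, t in candidates]
    i = max(range(len(scores)), key=scores.__getitem__)
    return scores[i], i

# Toy data: four model points, a known pose, and one clutter point in the scene.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
true_R, true_t = rot_z(math.pi / 2), (0.5, -0.2, 1.0)
scene = [apply_pose(true_R, true_t, p) for p in model] + [(5.0, 5.0, 5.0)]

# Three pose hypotheses: identity, the true pose, and a wrong rotation.
candidates = [
    (rot_z(0.0), (0.0, 0.0, 0.0)),
    (true_R, true_t),
    (rot_z(math.pi), true_t),
]
score, idx = best_candidate(candidates, model, scene)
```

In this toy scene the wrong rotation still explains part of the model, which is why each candidate is scored against the full model rather than accepted on a few local features. A refinement stage would then start from the best-supported candidate.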

Cite this article

Sumi, Y., Kawai, Y., Yoshimi, T. et al. 3D Object Recognition in Cluttered Environments by Segment-Based Stereo Vision. International Journal of Computer Vision 46, 5–23 (2002). https://doi.org/10.1023/A:1013240031067
