
Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-Baseline Stereo Using a Hand-Held Video Camera

International Journal of Computer Vision

Abstract

Three-dimensional (3-D) models of outdoor scenes are widely used for object recognition, navigation, mixed reality, and similar applications. Because such models are often built manually at high cost, automatic 3-D reconstruction has been widely investigated. In related work, a dense 3-D model is generated by using a stereo method. However, such approaches cannot use several hundred images together for dense depth estimation, because it is difficult to accurately calibrate a large number of cameras. In this paper, we propose a dense 3-D reconstruction method that first estimates the extrinsic camera parameters of a hand-held video camera and then reconstructs a dense 3-D model of a scene. In the first process, extrinsic camera parameters are estimated by automatically tracking a small number of predefined markers with known 3-D positions together with natural features. Then, several hundred dense depth maps obtained by multi-baseline stereo are combined in a voxel space. As a result, we can accurately acquire a dense 3-D model of an outdoor scene from several hundred input images captured by a hand-held video camera.
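The two steps named in the abstract, multi-baseline stereo and combination in a voxel space, can be illustrated with a toy sketch. It is not the authors' implementation. It assumes a 1-D scanline, cameras translated sideways by known baselines, a fronto-parallel scene, and whole-pixel disparities, and it uses the common formulation in which a sum of squared differences is accumulated over all baselines for each candidate inverse depth. All function names and parameters (`render_view`, `sssd`, `estimate_inv_depth`, `fuse_points_in_voxels`, `half_win`, `min_votes`) are hypothetical.

```python
from collections import Counter


def render_view(ref, disparity):
    """Synthesize a view in which every reference pixel x appears at x + disparity.

    Wrap-around at the borders is an assumption made only to keep the toy short.
    """
    n = len(ref)
    return [ref[(y - disparity) % n] for y in range(n)]


def sssd(ref, others, baselines, focal, inv_depth, x, half_win=2):
    """Sum of SSD scores over all baselines for one candidate inverse depth."""
    n = len(ref)
    total = 0.0
    for img, baseline in zip(others, baselines):
        # Disparity grows linearly with baseline for a fixed inverse depth.
        d = round(focal * baseline * inv_depth)
        for k in range(-half_win, half_win + 1):
            diff = ref[(x + k) % n] - img[(x + k + d) % n]
            total += diff * diff
    return total


def estimate_inv_depth(ref, others, baselines, focal, candidates, x):
    """Pick the candidate inverse depth with the smallest summed score."""
    return min(candidates,
               key=lambda z: sssd(ref, others, baselines, focal, z, x))


def fuse_points_in_voxels(points, voxel_size, min_votes):
    """Vote 3-D points into a voxel grid and keep voxels with enough votes.

    Points from many depth maps that agree fall into the same voxel, so
    isolated outliers are discarded by the vote threshold.
    """
    votes = Counter(
        tuple(int(c // voxel_size) for c in point) for point in points
    )
    return {voxel for voxel, count in votes.items() if count >= min_votes}
```

In use, `estimate_inv_depth` would be called per pixel to build a depth map for each reference frame, and the back-projected 3-D points of all depth maps would then be passed to `fuse_points_in_voxels`. Summing the scores over many baselines is what makes the minimum unambiguous: a wrong inverse depth may match by chance in one image pair, but rarely in all of them.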




About this article

Cite this article

Sato, T., Kanbara, M., Yokoya, N. et al. Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-Baseline Stereo Using a Hand-Held Video Camera. International Journal of Computer Vision 47, 119–129 (2002). https://doi.org/10.1023/A:1014537706773


  • DOI: https://doi.org/10.1023/A:1014537706773
