Visual Modeling with a Hand-Held Camera

International Journal of Computer Vision

Abstract

This paper presents a complete system for building visual models from camera images. The system handles uncalibrated image sequences acquired with a hand-held camera. The relations between multiple views are computed from tracked or matched features, and from these relations both the structure of the scene and the motion of the camera are recovered. Self-calibration restricts the ambiguity of the reconstruction from projective to metric. A flexible multi-view stereo matching scheme yields a dense estimate of the surface geometry. Different types of visual models are then constructed from the computed data. Besides the traditional geometry-based and image-based approaches, a combined approach with view-dependent geometry and texture is presented. As an application, the fusion of real and virtual scenes is also shown.
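
The abstract does not spell out how the relations between views are represented. As a hedged illustration only, the sketch below shows one standard form of such a relation, the fundamental matrix F, which links matched image points x1 and x2 in two views through x2^T F x1 = 0. It is not the paper's implementation: the system estimates this relation from features in uncalibrated images, whereas this sketch builds F from a known pose simply to show the constraint. All function names, camera parameters, and scene points are assumptions chosen for the example.

```python
import math

# Illustrative sketch: every name and number here is an assumption,
# not taken from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def intrinsics(f, cx, cy):
    """Simple pinhole calibration matrix K (square pixels, no skew)."""
    return [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]

def intrinsics_inv(f, cx, cy):
    """Closed-form inverse of the K built by intrinsics()."""
    return [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]

def project(K, R, t, X):
    """Project 3D point X with camera K[R|t]; returns homogeneous pixel coords."""
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    x = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return [x[0] / x[2], x[1] / x[2], 1.0]

def fundamental_from_pose(f, cx, cy, R, t):
    """F = K^-T [t]_x R K^-1 for cameras P1 = K[I|0] and P2 = K[R|t]."""
    Kinv = intrinsics_inv(f, cx, cy)
    E = matmul(skew(t), R)  # essential matrix
    return matmul(matmul(transpose(Kinv), E), Kinv)

def epipolar_residual(F, x1, x2):
    """Evaluate x2^T F x1; zero for a perfect correspondence."""
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Fx1[i] for i in range(3))

# Hypothetical two-view setup: second camera rotated 0.1 rad about the
# y axis and shifted sideways, as a hand-held camera might move.
f, cx, cy = 800.0, 320.0, 240.0
K = intrinsics(f, cx, cy)
a = 0.1
R = [[math.cos(a), 0.0, math.sin(a)],
     [0.0, 1.0, 0.0],
     [-math.sin(a), 0.0, math.cos(a)]]
t = [-0.5, 0.0, 0.1]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

F = fundamental_from_pose(f, cx, cy, R, t)

# Made-up scene points in front of both cameras.
points = [[0.3, -0.2, 4.0], [-0.5, 0.4, 5.0], [0.1, 0.1, 3.5], [0.8, -0.6, 6.0]]
residuals = []
for X in points:
    x1 = project(K, I3, [0.0, 0.0, 0.0], X)
    x2 = project(K, R, t, X)
    residuals.append(epipolar_residual(F, x1, x2))

# A deliberately wrong match: shift the second-view point by 20 pixels.
x1 = project(K, I3, [0.0, 0.0, 0.0], points[0])
x2 = project(K, R, t, points[0])
bad_residual = epipolar_residual(F, x1, [x2[0], x2[1] + 20.0, 1.0])

print("residuals for true matches:", residuals)
print("residual for a wrong match:", bad_residual)
```

True correspondences give residuals near zero while the shifted point does not, which is the property robust matching schemes typically use to separate inlier from outlier feature matches.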

Cite this article

Pollefeys, M., Van Gool, L., Vergauwen, M. et al. Visual Modeling with a Hand-Held Camera. International Journal of Computer Vision 59, 207–232 (2004). https://doi.org/10.1023/B:VISI.0000025798.50602.3a

  • DOI: https://doi.org/10.1023/B:VISI.0000025798.50602.3a