High-quality video view interpolation using a layered representation

Authors:
C. Lawrence Zitnick, Microsoft Research, Redmond, WA
Sing Bing Kang, Microsoft Research, Redmond, WA
Matthew Uyttendaele, Microsoft Research, Redmond, WA
Simon Winder, Microsoft Research, Redmond, WA
Richard Szeliski, Microsoft Research, Redmond, WA

ACM Transactions on Graphics, Volume 23, Issue 3, pp. 600–608
https://doi.org/10.1145/1015706.1015766

Published: 01 August 2004

Abstract

The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.

In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
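
To make the idea of a two-layer representation with mattes more concrete, here is a minimal sketch of how a matted boundary layer might be composited over a main layer, and how two camera views might be blended for an intermediate viewpoint. It is an illustration under stated assumptions, not the paper's implementation: the function names `over` and `blend_views` are hypothetical, the standard "over" compositing operator and a linear blend stand in for whatever the authors actually use, and the depth-based warping of each view to the new viewpoint is assumed to have happened already.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB, each channel in [0, 1]


def over(boundary_rgb: Pixel, alpha: float, main_rgb: Pixel) -> Pixel:
    """Composite a matted boundary-layer pixel over a main-layer pixel.

    Assumption: non-premultiplied color with the standard 'over' operator.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * b + (1.0 - alpha) * m
                 for b, m in zip(boundary_rgb, main_rgb))


def blend_views(left: List[Pixel], right: List[Pixel], t: float) -> List[Pixel]:
    """Linearly blend two views that were already warped to the new viewpoint.

    t = 0 reproduces the left camera, t = 1 the right camera.
    Assumption: a simple proximity-weighted blend, chosen for illustration.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(left) != len(right):
        raise ValueError("views must have the same number of pixels")
    return [tuple((1.0 - t) * l + t * r for l, r in zip(lp, rp))
            for lp, rp in zip(left, right)]


if __name__ == "__main__":
    # A quarter-opaque red boundary pixel over a blue main-layer pixel.
    print(over((1.0, 0.0, 0.0), 0.25, (0.0, 0.0, 1.0)))
    # A viewpoint halfway between a black and a white view.
    print(blend_views([(0.0, 0.0, 0.0)], [(1.0, 1.0, 1.0)], 0.5))
```

The sketch keeps the two steps separate so that per-view compositing near depth discontinuities can happen before the cross-view blend; a real renderer would do both per pixel on the GPU.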

Supplemental Material

pps051.mov (12.2 KB)

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Video segmentation
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
    2. Rendering

Published in
ACM Transactions on Graphics Volume 23, Issue 3
August 2004
684 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/1015706
Editor: John C. Hart
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Computer Vision
Dynamic Scenes
Image-Based Rendering
Qualifiers
- article
Article Metrics
- Total citations: 869
- Total downloads: 5,126
- Downloads (last 12 months): 109
- Downloads (last 6 weeks): 15