Abstract
The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.
In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
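As a rough illustration of how a matte and a two-layer representation might combine during view synthesis, the sketch below composites a foreground layer over a background layer with the standard matting equation C = αF + (1 − α)B, then linearly blends the results from two neighboring cameras. It is a minimal per-pixel sketch, not the paper's renderer: the function names, the dictionary keys (`main`, `boundary`, `alpha`), and the linear blend weight `t` are all assumptions made for illustration, and the depth-based warping of each view into the virtual camera is omitted.

```python
def composite_over(fg, alpha, bg):
    """Standard matting equation per channel: C = alpha * F + (1 - alpha) * B."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def blend_views(color_a, color_b, t):
    """Linearly blend colors from two neighboring cameras.

    t = 0 returns camera A's color, t = 1 returns camera B's color.
    """
    return tuple((1.0 - t) * a + t * b for a, b in zip(color_a, color_b))


def render_pixel(view_a, view_b, t):
    """Composite each view's two layers, then blend the two views.

    Each view is a dict with hypothetical keys (not taken from the paper):
      'main'     -> RGB color of the background layer
      'boundary' -> RGB color of the foreground layer near a depth discontinuity
      'alpha'    -> matte value in [0, 1] for the foreground layer
    Warping each view into the virtual camera is assumed to have happened already.
    """
    color_a = composite_over(view_a["boundary"], view_a["alpha"], view_a["main"])
    color_b = composite_over(view_b["boundary"], view_b["alpha"], view_b["main"])
    return blend_views(color_a, color_b, t)


if __name__ == "__main__":
    a = {"main": (0.0, 0.0, 0.0), "boundary": (1.0, 1.0, 1.0), "alpha": 0.5}
    b = {"main": (1.0, 1.0, 1.0), "boundary": (1.0, 1.0, 1.0), "alpha": 0.0}
    print(render_pixel(a, b, 0.5))
```

Keeping the matte as an explicit per-pixel alpha is what lets mixed foreground/background pixels at depth discontinuities be rendered without hard, aliased edges in this sketch.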
High-quality video view interpolation using a layered representation
SIGGRAPH '04: ACM SIGGRAPH 2004 Papers