Abstract
Stereoscopic 3D (S3D) movies have become widely popular in movie theaters, but the adoption of S3D at home is low even though most TV sets support it. It is widely believed that S3D with glasses is not the right approach for the home. A much more appealing approach is to use automultiscopic displays, which provide a glasses-free 3D experience to multiple viewers. The technical challenge is the lack of native multiview content, which is required to deliver a proper view of the scene for every viewpoint. Our approach takes advantage of the abundance of stereoscopic 3D movies. We propose a real-time system that converts stereoscopic video to high-quality multiview video that can be fed directly to automultiscopic displays. Our algorithm uses a wavelet-based decomposition of stereoscopic images with per-wavelet disparity estimation. The key to our solution lies in combining Lagrangian and Eulerian approaches for both disparity estimation and novel view synthesis, which leverages the complementary advantages of the two techniques. The solution preserves the features of Eulerian methods, e.g., subpixel accuracy, high performance, robustness to ambiguous depth cases, and easy integration of inter-view aliasing, while maintaining the advantages of Lagrangian approaches, e.g., robustness to large disparities and the ability to perform non-trivial disparity manipulations through both view extrapolation and interpolation. The method achieves real-time performance on current GPUs. Its design also enables an easy hardware implementation, which we demonstrate using a field-programmable gate array. We analyze the visual quality and robustness of our technique on a number of synthetic and real-world examples. We also perform a user experiment that demonstrates the benefits of the technique compared to existing solutions.
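To make the Eulerian/Lagrangian distinction concrete, the following is a minimal 1D sketch and not the system described in the abstract. It assumes a single complex Gabor filter as a stand-in for one wavelet band, a toy scanline pair, and a coarse integer shift that is simply given (in practice it would come from some matching stage). The names `gabor_response`, `eulerian_disparity`, and `hybrid_disparity` are hypothetical. The phase-only (Eulerian) estimate is subpixel accurate but wraps once the disparity exceeds pi / omega; undoing an integer (Lagrangian) shift first leaves a small residual that the phase can resolve.

```python
import numpy as np

def gabor_response(signal, omega, sigma):
    """Band-pass a 1D signal with a complex Gabor kernel tuned to omega (rad/sample)."""
    half = int(4 * sigma)
    t = np.arange(-half, half + 1)
    kernel = np.exp(-t**2 / (2.0 * sigma**2)) * np.exp(1j * omega * t)
    return np.convolve(signal, kernel, mode="same")

def eulerian_disparity(left, right, omega, sigma, margin=0):
    """Phase-only estimate; unambiguous only while |disparity| < pi / omega."""
    crop = int(4 * sigma) + margin  # drop samples affected by the borders
    rl = gabor_response(left, omega, sigma)[crop:-crop]
    rr = gabor_response(right, omega, sigma)[crop:-crop]
    # Amplitude-weighted phase difference between the two band-pass responses.
    return np.angle(np.sum(rl * np.conj(rr))) / omega

def hybrid_disparity(left, right, omega, sigma, coarse):
    """Undo an integer (Lagrangian) shift first, then refine with phase (Eulerian).

    `coarse` is assumed to be supplied by an external coarse matcher.
    """
    compensated = np.roll(right, -coarse)  # wrap-around samples are cropped below
    residual = eulerian_disparity(left, compensated, omega, sigma, margin=abs(coarse))
    return coarse + residual

# Toy stereo pair: the right scanline is the left one shifted by 10.3 samples,
# which is beyond the phase-only limit of pi / 0.5 (about 6.28 samples).
omega, sigma, true_d = 0.5, 8.0, 10.3
x = np.arange(512)
left = np.cos(omega * x)
right = np.cos(omega * (x - true_d))

print("phase only:", eulerian_disparity(left, right, omega, sigma))
print("hybrid    :", hybrid_disparity(left, right, omega, sigma, coarse=10))
```

In this toy setting the phase-only estimate is far from the true shift because the phase difference wraps, while the hybrid estimate recovers it to subpixel precision. A real system would repeat this per wavelet and per pixel rather than aggregating over a whole scanline.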
Index Terms
- 3DTV at home: Eulerian-Lagrangian stereo-to-multiview conversion