research-article

BundleFusion: Real-Time Globally Consistent 3D Reconstruction Using On-the-Fly Surface Reintegration

Authors:
Angela Dai, Stanford University
Matthias Nießner, Stanford University
Michael Zollhöfer, Max-Planck-Institute for Informatics
Shahram Izadi, Microsoft Research
Christian Theobalt, Max-Planck-Institute for Informatics

ACM Transactions on Graphics, Volume 36, Issue 4, Article No.: 76a
https://doi.org/10.1145/3072959.3054739

Published: 16 July 2017

Abstract

Real-time, high-quality 3D scanning of large-scale scenes is key to mixed reality and robotic applications. However, scalability brings challenges of drift in pose estimation, introducing significant errors in the accumulated model. Approaches often require hours of offline processing to globally correct model errors. Recent online methods demonstrate compelling results but suffer from (1) needing minutes to perform online correction, preventing true real-time use; (2) brittle frame-to-frame (or frame-to-model) pose estimation, resulting in many tracking failures; or (3) supporting only unstructured point-based representations, which limit scan quality and applicability. We systematically address these issues with a novel, real-time, end-to-end reconstruction framework. At its core is a robust pose estimation strategy, optimizing per frame for a global set of camera poses by considering the complete history of RGB-D input with an efficient hierarchical approach. We remove the heavy reliance on temporal tracking and continually localize to the globally optimized frames instead. We contribute a parallelizable optimization framework, which employs correspondences based on sparse features and dense geometric and photometric matching. Our approach estimates globally optimized (i.e., bundle adjusted) poses in real time, supports robust tracking with recovery from gross tracking failures (i.e., relocalization), and re-estimates the 3D model in real time to ensure global consistency, all within a single framework. Our approach outperforms state-of-the-art online systems with quality on par with offline methods, but with unprecedented speed and scan completeness. Our framework leads to a comprehensive online scanning solution for large indoor environments, enabling ease of use and high-quality results.
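
The abstract does not spell out how the model is re-estimated when poses change, so the following is only a toy sketch of the reintegration idea named in the title. It assumes the surface is stored as a weighted running average of signed-distance samples (a common volumetric fusion scheme), in which case a frame's contribution can be subtracted under its old pose and added back under its corrected pose. Everything here is hypothetical and reduced to 1D for readability: `Voxel`, `integrate`, `deintegrate`, `apply_frame`, `reintegrate`, the truncation value, and the scalar "pose" are illustrative names and simplifications, not the authors' implementation.

```python
from dataclasses import dataclass

TRUNC = 0.1  # truncation band in metres (illustrative value, an assumption)


@dataclass
class Voxel:
    sdf: float = 0.0     # weighted mean of signed-distance samples
    weight: float = 0.0  # sum of sample weights


def integrate(v: Voxel, d: float, w: float = 1.0) -> None:
    """Fold one signed-distance sample into the running average."""
    total = v.weight + w
    v.sdf = (v.sdf * v.weight + d * w) / total
    v.weight = total


def deintegrate(v: Voxel, d: float, w: float = 1.0) -> None:
    """Inverse of integrate(): remove a previously added sample."""
    total = v.weight - w
    if total <= 1e-9:            # nothing left: reset the voxel
        v.sdf, v.weight = 0.0, 0.0
    else:
        v.sdf = (v.sdf * v.weight - d * w) / total
        v.weight = total


def apply_frame(grid, centers, cam, depth, update) -> None:
    """Apply `update` to every voxel inside the truncation band of a frame.

    A 'frame' here is a single depth reading taken from the 1D camera
    position `cam`; a real system would project a full depth image.
    """
    surface = cam + depth
    for v, x in zip(grid, centers):
        d = surface - x          # positive in front of the surface
        if abs(d) <= TRUNC:
            update(v, d)


def reintegrate(grid, centers, depth, old_cam, new_cam) -> None:
    """Move one frame's contribution from its old pose to its corrected pose."""
    apply_frame(grid, centers, old_cam, depth, deintegrate)
    apply_frame(grid, centers, new_cam, depth, integrate)


centers = [0.8 + 0.01 * i for i in range(41)]  # voxel centres, 0.8 m to 1.2 m

# Frame A has a correct pose; frame B was integrated with 5 cm of drift.
drifted = [Voxel() for _ in centers]
apply_frame(drifted, centers, 0.00, 1.0, integrate)   # frame A
apply_frame(drifted, centers, 0.05, 1.0, integrate)   # frame B, drifted pose

# Suppose global optimization corrects B's pose to 0.0: touch only frame B.
reintegrate(drifted, centers, 1.0, old_cam=0.05, new_cam=0.00)

# Reference: both frames integrated with correct poses from the start.
reference = [Voxel() for _ in centers]
apply_frame(reference, centers, 0.0, 1.0, integrate)
apply_frame(reference, centers, 0.0, 1.0, integrate)

max_err = max(abs(a.sdf - b.sdf) for a, b in zip(drifted, reference))
same_weights = all(abs(a.weight - b.weight) < 1e-9
                   for a, b in zip(drifted, reference))
```

In this toy setting, `max_err` and `same_weights` compare the corrected grid with one fused from correct poses at the outset. The point of the design is that a pose update costs one de-integration and one integration for the affected frame, rather than a rebuild of the whole model.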

Supplemental Material

tog-13.mp4 (mp4, 436.4 MB)

dai.zip (zip, 74.7 MB)

Supplemental movie, appendix, image, and software files for BundleFusion: Real-Time Globally Consistent 3D Reconstruction Using On-the-Fly Surface Reintegration

Index Terms

Computing methodologies > Computer graphics > Shape modeling > Mesh geometry models


Published in

ACM Transactions on Graphics Volume 36, Issue 4
August 2017
2155 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/3072959

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 16 July 2017
- Accepted: 1 January 2017
- Revised: 1 December 2016
- Received: 1 April 2016
Badges
- Artifacts Available
Author Tags
RGB-D
global consistency
real-time
scalable
scan
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 211
- Total Downloads: 3,142
- Downloads (Last 12 months): 267
- Downloads (Last 6 weeks): 34