Abstract
Dense direct RGB-D registration methods are widely used in tasks ranging from localization and tracking to 3D scene reconstruction. This work addresses a key weakness that drastically limits the applicability of direct registration: its small convergence domain. First, we propose an activation function based on the conditioning of the RGB and ICP point-to-plane error terms. This function strengthens the influence of the geometric error in the first coarse iterations, while the intensity data term dominates in the finer increments. The information gathered from the geometric and photometric cost functions is considered not only for improving the system observability, but also for exploiting the different convergence properties and convexity of each data term. Next, we develop a set of complementary strategies, namely a flexible regularization and a pixel saliency selection, to further improve the quality and robustness of the approach.
The methodology is formulated for a generic warping model, and results are given using perspective and spherical sensor models. Finally, our method is validated on several spherical RGB-D datasets, including both indoor and outdoor real sequences, and on the KITTI VO/SLAM benchmark dataset. We show that the proposed techniques (weighted activation function, regularization, saliency-based pixel selection) lead to faster convergence and larger convergence domains, which are the main limitations to the use of direct methods.
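As a toy illustration of the conditioning-based activation idea described in the abstract, the two data terms can be weighted by the condition numbers of their normal matrices, so that the better-conditioned term dominates a given iteration. This is a minimal numpy sketch under that assumption; the weighting rule shown is hypothetical and is not the paper's exact activation function.

```python
import numpy as np

def activation_weights(J_I, J_D):
    """Hypothetical conditioning-based weighting of the photometric (RGB)
    and geometric (ICP point-to-plane) terms: each term receives a weight
    inversely related to the condition number of its normal matrix J^T J."""
    c_I = np.linalg.cond(J_I.T @ J_I)  # conditioning of the RGB term
    c_D = np.linalg.cond(J_D.T @ J_D)  # conditioning of the geometric term
    w_I = c_D / (c_I + c_D)            # better-conditioned term gets the larger weight
    w_D = c_I / (c_I + c_D)
    return w_I, w_D
```

The weights sum to one, so they only shift the relative influence of the two residuals rather than rescaling the total cost.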
References
Brox, T., Malik, J.: Large displacement optical flow: descriptor matching in variational motion estimation. IEEE PAMI 33, 500–513 (2011)
Braux-Zin, J., Dupont, R., Bartoli, A.: A general dense image matching framework combining direct and feature-based costs. In: IEEE ICCV (2013)
Howard, A.: Real-time stereo visual odometry for autonomous ground vehicles. In: IEEE IROS (2008)
Davison, A., Murray, D.: Simultaneous localization and map-building using active vision. IEEE TPAMI 24, 865–880 (2002)
Nistér, D., Naroditsky, O., Bergen, J.: Visual odometry. In: IEEE CVPR (2004)
Kitt, B., Geiger, A., Lategahn, H.: Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme. In: IEEE IV (2010)
Harris, C., Stephens, M.: A combined corner and edge detector. In: 4th Alvey Vision Conference (1988)
Lowe, D.: Distinctive image features from scale-invariant keypoints. IJCV 60, 91–110 (2004)
Hager, G., Belhumeur, P.: Efficient region tracking with parametric models of geometry and illumination. IEEE TPAMI 20, 1025–1039 (1998)
Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: IJCAI (1981)
Irani, M., Anandan, P.: Robust multi-sensor image alignment. In: ICCV (1998)
Baker, S., Matthews, I.: Equivalence and efficiency of image alignment algorithms. In: IEEE CVPR (2001)
Mei, C., Benhimane, S., Malis, E., Rives, P.: Constrained multiple planar template tracking for central catadioptric cameras. In: BMVC (2006)
Caron, G., Marchand, E., Mouaddib, E.: Tracking planes in omnidirectional stereovision. In: IEEE ICRA (2011)
Comport, A., Malis, E., Rives, P.: Accurate quadrifocal tracking for robust 3d visual odometry. In: IEEE ICRA (2007)
Churchill, W., Tong, C., Gurau, C., Posner, I., Newman, P.: Know your limits: Embedding localiser performance models in teach and repeat maps. In: IEEE ICRA (2015)
Furgale, P., Barfoot, T.: Visual teach and repeat for long-range rover autonomy. JFR 27, 534–560 (2010)
Gelfand, N., Ikemoto, L., Rusinkiewicz, S., Levoy, M.: Geometrically stable sampling for the icp algorithm. In: 3DIM (2003)
Comport, A., Malis, E., Rives, P.: Real-time quadrifocal visual odometry. IJRR 29, 245–266 (2010)
Tykkala, T., Audras, C., Comport, A.: Direct iterative closest point for real-time visual odometry. In: ICCV Workshops (2011)
Kerl, C., Sturm, J., Cremers, D.: Dense visual SLAM for RGB-D cameras. In: IEEE IROS (2013)
Timofte, R., Van Gool, L.: Sparse flow: sparse matching for small to large displacement optical flow. In: IEEE WACV (2015)
Morency, L., Darrell, T.: Stereo tracking using icp and normal flow constraint. In: ICPR (2002)
Martins, R., Fernandez-Moral, E., Rives, P.: Dense accurate urban mapping from spherical RGB-D images. In: IEEE IROS (2015)
Gokhool, T., Martins, R., Rives, P., Despre, N.: A compact spherical RGBD keyframe-based representation. In: IEEE ICRA (2015)
Weikersdorfer, D., Gossow, D., Beetz, M.: Depth-adaptive superpixels. In: IEEE ICPR (2013)
Fernandez-Moral, E., Mayol-Cuevas, W., Arevalo, V., Gonzalez-Jimenez, J.: Fast place recognition with plane-based maps. In: IEEE ICRA (2013)
Zhang, Z.: Parameter estimation techniques: A tutorial with application to conic fitting. Technical report 2676, Inria (1995)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: IEEE CVPR (2012)
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Susstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE TPAMI 34, 2274–2282 (2012)
Meilland, M., Comport, A., Rives, P.: Dense omnidirectional RGB-D mapping of large-scale outdoor environments for real-time localization and autonomous navigation. JFR 32, 474–503 (2015)
Fernandez-Moral, E., Gonzalez-Jimenez, J., Rives, P., Arevalo, V.: Extrinsic calibration of a set of range cameras in 5 seconds without pattern. In: IEEE IROS (2014)
Baker, S., Matthews, I.: Lucas-Kanade 20 years on: a unifying framework. IJCV 56, 221–255 (2004)
Acknowledgements
The authors thank Josh Picard and Paolo Salaris for the discussions and proofreading of the manuscript, and the reviewers for their thoughtful comments. This work is funded by CNPq of Brazil under contract number 216026/2013-0.
Appendix A: Error Jacobians and Optimization
The pose \(\mathbf {T(x)} \in \mathbb {SE}(3)\) is parametrized as a function of the linear and angular velocities \(\mathbf {x} = (\varvec{\upsilon }\delta t,\varvec{\omega }\delta t) \in \mathbb {R}^6\), and the optimization is carried out with respect to this twist parametrization. The pose is related to the twist velocities by the exponential mapping \(\mathbf {T(x)} = \exp (se3(\mathbf {x}))\), with

$$se3(\mathbf {x}) = \begin{bmatrix} \mathbf {S}(\varvec{\omega }\delta t) & \varvec{\upsilon }\delta t \\ \mathbf {0} & 0 \end{bmatrix},$$

which is the Lie algebra of \(\mathbb {SE}(3)\) at the identity element, where \(\mathbf {S(z)}\) is the skew-symmetric matrix associated to the vector \(\mathbf {z}\) and \(\delta t = 1\).
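For concreteness, the exponential mapping above can be sketched in numpy with the standard closed-form Rodrigues expressions for \(\mathbb{SE}(3)\); this is a textbook implementation, independent of the paper's code.

```python
import numpy as np

def skew(z):
    """S(z): skew-symmetric matrix such that S(z) @ y = z x y."""
    return np.array([[0.0, -z[2], z[1]],
                     [z[2], 0.0, -z[0]],
                     [-z[1], z[0], 0.0]])

def exp_se3(x):
    """T(x) = exp(se3(x)) for a twist x = (v*dt, w*dt) in R^6,
    via the closed-form Rodrigues formulas for rotation and translation."""
    v, w = x[:3], x[3:]
    th = np.linalg.norm(w)
    S = skew(w)
    if th < 1e-10:
        R, V = np.eye(3) + S, np.eye(3)  # first-order approximation near identity
    else:
        A = np.sin(th) / th
        B = (1.0 - np.cos(th)) / th**2
        C = (th - np.sin(th)) / th**3
        R = np.eye(3) + A * S + B * (S @ S)
        V = np.eye(3) + B * S + C * (S @ S)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T
```

With \(\mathbf{x} = \mathbf{0}\) the map returns the identity, and for a pure rotation twist it reduces to the usual axis-angle rotation matrix.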
The respective Jacobians are derived following this parametrization. We refer the reader to [19] for details on the photometric Jacobian \(\mathbf {J}^I\). Next, for the geometric point-to-plane direct Jacobian \(\mathbf {J}^D \in \mathbb {R}^{1\times 6}\), we denote the 3D point error by \(\mathbf {\zeta }(\mathbf {x})\):
From Eqs. (3), (12) and the product rule:
For clarity, the first term in Eq. (13) is denoted \(\mathbf {{J}_{d1}}\) and we decompose the second term into two Jacobians \(\mathbf {{J}_{d2}}\) and \(\mathbf {{J}_{d3}}\), such that \(\mathbf {J}^D(\mathbf {0}) = \lambda \mathbf {n^*}^T (\mathbf {{J}_{d1}} (\mathbf {0}) + \mathbf {{J}_{d2}}(\mathbf {0}) + \mathbf {{J}_{d3}}(\mathbf {0}))\). From the chain rule \( \frac{\partial (\mathbf {R(x)}\mathbf {\zeta })}{\partial \mathbf {x}} = \frac{\partial (\mathbf {R(x)}\mathbf {\zeta })}{\partial \mathbf {R(x)}} \frac{\partial \mathbf {R(x)}}{\partial \mathbf {x}}\), the first term is
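As a sanity check on this rotation term: at \(\mathbf{x} = \mathbf{0}\), \(\mathbf{R(x)}\mathbf{\zeta } \approx \mathbf{\zeta } + \mathbf{S}(\varvec{\omega }\delta t)\mathbf{\zeta } = \mathbf{\zeta } - \mathbf{S}(\mathbf{\zeta })\,\varvec{\omega }\delta t\), so the derivative with respect to the angular part reduces to \(-\mathbf{S}(\mathbf{\zeta })\). This standard identity can be verified numerically with finite differences (a small sketch, not the paper's code):

```python
import numpy as np

def skew(z):
    return np.array([[0.0, -z[2], z[1]],
                     [z[2], 0.0, -z[0]],
                     [-z[1], z[0], 0.0]])

def rot_exp(w):
    """Rodrigues formula for R = exp(S(w))."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    S = skew(w)
    return np.eye(3) + np.sin(th) / th * S + (1.0 - np.cos(th)) / th**2 * (S @ S)

# Finite-difference check that d(R(x) zeta)/d(omega) at x = 0 equals -S(zeta):
zeta = np.array([0.3, -1.2, 2.0])  # an arbitrary 3D point error
eps = 1e-7
J_num = np.column_stack([(rot_exp(eps * e) @ zeta - zeta) / eps for e in np.eye(3)])
assert np.allclose(J_num, -skew(zeta), atol=1e-5)
```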
The second term is decomposed into two Jacobians
Finally, the last Jacobian corresponds to \(\frac{\partial g(w(\mathbf {p},\mathbf {\hat{T}}\mathbf {T(x)}))}{\partial \mathbf {x}}\). This derivative can be seen as an extended version of the image photometric gradient \(\mathbf {J^I}\), applied to each component of \(g(w(\mathbf {p},\mathbf {\hat{T}}\mathbf {T(x)}))\). Then
Here \(\mathbf {J_g}\big |_{[g(\mathbf {p}_w)]_i}\) is the image gradient (as in the photometric term) of an image formed from the ith coordinate of \(g(w(\mathbf {p},\mathbf {\hat{T}}\mathbf {T(0)}))\). Note that this Jacobian is small for points lying on planar surfaces. We therefore neglect \(\mathbf {{J}_{d3}}\), since only a fraction of the scene lies on geometric discontinuities, and since these points are more sensitive to depth estimation errors and self-occlusion effects. Finally, we use the ESM formulation [19] to define the optimization step for the RGB cost, while a Gauss-Newton step is employed for the geometric Jacobian. We refer the reader to [33] for more details on the available optimization techniques.
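A minimal sketch of the combined optimization increment, assuming a plain Gauss-Newton step for both terms (the paper uses the ESM formulation for the RGB cost) and hypothetical scalar weights `w_I`, `w_D` coming from the activation function:

```python
import numpy as np

def gauss_newton_step(J_I, r_I, J_D, r_D, w_I, w_D):
    """One weighted Gauss-Newton increment for the combined cost
    w_I ||r_I||^2 + w_D ||r_D||^2.  Solves the normal equations
    (w_I J_I^T J_I + w_D J_D^T J_D) dx = -(w_I J_I^T r_I + w_D J_D^T r_D)."""
    H = w_I * J_I.T @ J_I + w_D * J_D.T @ J_D  # combined Gauss-Newton Hessian
    g = w_I * J_I.T @ r_I + w_D * J_D.T @ r_D  # combined gradient
    return np.linalg.solve(H, -g)              # twist increment dx in R^6
```

The increment `dx` would then be mapped back to a pose update through the exponential mapping and composed with the current estimate, iterating until convergence.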
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Martins, R., Fernandez-Moral, E., Rives, P. (2017). Adaptive Direct RGB-D Registration and Mapping for Large Motions. In: Lai, S.H., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_12
Print ISBN: 978-3-319-54189-1
Online ISBN: 978-3-319-54190-7