Abstract
Dense direct RGB-D registration methods are widely used in tasks ranging from localization and tracking to 3D scene reconstruction. This work addresses a key weakness that drastically limits the applicability of direct registration: its small convergence domain. First, we propose an activation function based on the conditioning of the RGB and ICP point-to-plane error terms. This function strengthens the influence of the geometric error in the first coarse iterations, while the intensity data term dominates in the finer increments. The information gathered from the geometric and photometric cost functions is considered not only for improving the system observability, but also for exploiting the different convergence properties and convexity of each data term. Next, we develop a set of complementary strategies, namely a flexible regularization and a pixel saliency selection, to further improve the quality and robustness of the approach.
The methodology is formulated for a generic warping model, and results are given using perspective and spherical sensor models. Finally, our method is validated on several spherical RGB-D datasets, including both indoor and outdoor real sequences, and on the KITTI VO/SLAM benchmark dataset. We show that the proposed techniques (weighted activation function, regularization, saliency-based pixel selection) lead to faster convergence and larger convergence domains, which are the main limitations to the use of direct methods.
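As a toy illustration of the conditioning-based activation idea described in the abstract, the two data terms can be weighted by the condition numbers of their normal matrices, so that the better-conditioned term dominates a given iteration. This is a minimal numpy sketch under that assumption; the weighting rule shown is hypothetical and is not the paper's exact activation function.

```python
import numpy as np

def activation_weights(J_I, J_D):
    """Hypothetical conditioning-based weighting of the photometric (RGB)
    and geometric (ICP point-to-plane) terms: each term receives a weight
    inversely related to the condition number of its normal matrix J^T J."""
    c_I = np.linalg.cond(J_I.T @ J_I)  # conditioning of the RGB term
    c_D = np.linalg.cond(J_D.T @ J_D)  # conditioning of the geometric term
    w_I = c_D / (c_I + c_D)            # better-conditioned term gets the larger weight
    w_D = c_I / (c_I + c_D)
    return w_I, w_D
```

The weights sum to one, so they only shift the relative influence of the two residuals rather than rescaling the total cost.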
References
Brox, T., Malik, J.: Large displacement optical flow: descriptor matching in variational motion estimation. IEEE PAMI 33, 500–513 (2011)
Braux-Zin, J., Dupont, R., Bartoli, A.: A general dense image matching framework combining direct and feature-based costs. In: IEEE ICCV (2013)
Howard, A.: Real-time stereo visual odometry for autonomous ground vehicles. In: IEEE IROS (2008)
Davison, A., Murray, D.: Simultaneous localization and map-building using active vision. IEEE TPAMI 24, 865–880 (2002)
Nistér, D., Naroditsky, O., Bergen, J.: Visual odometry. In: IEEE CVPR (2004)
Kitt, B., Geiger, A., Lategahn, H.: Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme. In: IEEE IV (2010)
Harris, C., Stephens, M.: A combined corner and edge detector. In: 4th Alvey Vision Conference (1988)
Lowe, D.: Distinctive image features from scale-invariant keypoints. IJCV 60, 91–110 (2004)
Hager, G., Belhumeur, P.: Efficient region tracking with parametric models of geometry and illumination. IEEE TPAMI 20, 1025–1039 (1998)
Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: IJCAI (1981)
Irani, M., Anandan, P.: Robust multi-sensor image alignment. In: ICCV (1998)
Baker, S., Matthews, I.: Equivalence and efficiency of image alignment algorithms. In: IEEE CVPR (2001)
Mei, C., Benhimane, S., Malis, E., Rives, P.: Constrained multiple planar template tracking for central catadioptric cameras. In: BMVC (2006)
Caron, G., Marchand, E., Mouaddib, E.: Tracking planes in omnidirectional stereovision. In: IEEE ICRA (2011)
Comport, A., Malis, E., Rives, P.: Accurate quadrifocal tracking for robust 3d visual odometry. In: IEEE ICRA (2007)
Churchill, W., Tong, C., Gurau, C., Posner, I., Newman, P.: Know your limits: Embedding localiser performance models in teach and repeat maps. In: IEEE ICRA (2015)
Furgale, P., Barfoot, T.: Visual teach and repeat for long-range rover autonomy. JFR 27, 534–560 (2010)
Gelfand, N., Ikemoto, L., Rusinkiewicz, S., Levoy, M.: Geometrically stable sampling for the icp algorithm. In: 3DIM (2003)
Comport, A., Malis, E., Rives, P.: Real-time quadrifocal visual odometry. IJRR 29, 245–266 (2010)
Tykkala, T., Audras, C., Comport, A.: Direct iterative closest point for real-time visual odometry. In: ICCV Workshops (2011)
Kerl, C., Sturm, J., Cremers, D.: Dense visual SLAM for RGB-D cameras. In: IEEE IROS (2013)
Timofte, R., Van Gool, L.: Sparse flow: sparse matching for small to large displacement optical flow. In: IEEE WACV (2015)
Morency, L., Darrell, T.: Stereo tracking using icp and normal flow constraint. In: ICPR (2002)
Martins, R., Fernandez-Moral, E., Rives, P.: Dense accurate urban mapping from spherical RGB-D images. In: IEEE IROS (2015)
Gokhool, T., Martins, R., Rives, P., Despre, N.: A compact spherical RGBD keyframe-based representation. In: IEEE ICRA (2015)
Weikersdorfer, D., Gossow, D., Beetz, M.: Depth-adaptive superpixels. In: IEEE ICPR (2013)
Fernandez-Moral, E., Mayol-Cuevas, W., Arevalo, V., Gonzalez-Jimenez, J.: Fast place recognition with plane-based maps. In: IEEE ICRA (2013)
Zhang, Z.: Parameter estimation techniques: A tutorial with application to conic fitting. Technical report 2676, Inria (1995)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: IEEE CVPR (2012)
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Susstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE TPAMI 34, 2274–2282 (2012)
Meilland, M., Comport, A., Rives, P.: Dense omnidirectional RGB-D mapping of large-scale outdoor environments for real-time localization and autonomous navigation. JFR 32, 474–503 (2015)
Fernandez-Moral, E., Gonzalez-Jimenez, J., Rives, P., Arevalo, V.: Extrinsic calibration of a set of range cameras in 5 seconds without pattern. In: IEEE IROS (2014)
Baker, S., Matthews, I.: Lucas-Kanade 20 years on: a unifying framework. IJCV 56, 221–255 (2004)
Acknowledgements
The authors thank Josh Picard and Paolo Salaris for the discussions and proofreading of the manuscript, and the reviewers for their thoughtful comments. This work is funded by CNPq of Brazil under contract number 216026/2013-0.
Appendix A: Error Jacobians and Optimization
The pose \(\mathbf {T(x)} \in \mathbb {SE}(3)\) is parametrized as a function of the linear and angular velocities \(\mathbf {x} = (\varvec{\upsilon }\delta t,\varvec{\omega }\delta t) \in \mathbb {R}^6\), and the optimization is carried out with respect to this twist parametrization. The pose is related to the twist velocities by the exponential mapping \(\mathbf {T(x)} = \exp (se3(\mathbf {x}))\), with

$$se3(\mathbf {x}) = \begin{bmatrix} \mathbf {S}(\varvec{\omega }\delta t) & \varvec{\upsilon }\delta t \\ \mathbf {0} & 0 \end{bmatrix},$$

which is the Lie algebra of \(\mathbb {SE}(3)\) at the identity element, where \(\mathbf {S(z)}\) is the skew-symmetric matrix associated to the vector \(\mathbf {z}\) and \(\delta t = 1\).
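For concreteness, the exponential mapping above can be sketched in numpy with the standard closed-form Rodrigues expressions for \(\mathbb{SE}(3)\); this is a textbook implementation, independent of the paper's code.

```python
import numpy as np

def skew(z):
    """S(z): skew-symmetric matrix such that S(z) @ y = z x y."""
    return np.array([[0.0, -z[2], z[1]],
                     [z[2], 0.0, -z[0]],
                     [-z[1], z[0], 0.0]])

def exp_se3(x):
    """T(x) = exp(se3(x)) for a twist x = (v*dt, w*dt) in R^6,
    via the closed-form Rodrigues formulas for rotation and translation."""
    v, w = x[:3], x[3:]
    th = np.linalg.norm(w)
    S = skew(w)
    if th < 1e-10:
        R, V = np.eye(3) + S, np.eye(3)  # first-order approximation near identity
    else:
        A = np.sin(th) / th
        B = (1.0 - np.cos(th)) / th**2
        C = (th - np.sin(th)) / th**3
        R = np.eye(3) + A * S + B * (S @ S)
        V = np.eye(3) + B * S + C * (S @ S)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T
```

With \(\mathbf{x} = \mathbf{0}\) the map returns the identity, and for a pure rotation twist it reduces to the usual axis-angle rotation matrix.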
The respective Jacobians are derived following this parametrization. We refer the reader to [19] for details on the photometric Jacobian \(\mathbf {J}^I\). Next, for the geometric point-to-plane direct Jacobian \(\mathbf {J}^D \in \mathbb {R}^{1\times 6}\), we denote the 3D point error by \(\mathbf {\zeta }(\mathbf {x})\):
From Eqs. (3), (12) and the product rule:
For clarity, the first term in Eq. (13) is denoted \(\mathbf {{J}_{d1}}\) and we decompose the second term into two Jacobians \(\mathbf {{J}_{d2}}\) and \(\mathbf {{J}_{d3}}\), such that \(\mathbf {J}^D(\mathbf {0}) = \lambda \mathbf {n^*}^T (\mathbf {{J}_{d1}} (\mathbf {0}) + \mathbf {{J}_{d2}}(\mathbf {0}) + \mathbf {{J}_{d3}}(\mathbf {0}))\). From the chain rule \( \frac{\partial (\mathbf {R(x)}\mathbf {\zeta })}{\partial \mathbf {x}} = \frac{\partial (\mathbf {R(x)}\mathbf {\zeta })}{\partial \mathbf {R(x)}} \frac{\partial \mathbf {R(x)}}{\partial \mathbf {x}}\), the first term is
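As a sanity check on this rotation term: at \(\mathbf{x} = \mathbf{0}\), \(\mathbf{R(x)}\mathbf{\zeta } \approx \mathbf{\zeta } + \mathbf{S}(\varvec{\omega }\delta t)\mathbf{\zeta } = \mathbf{\zeta } - \mathbf{S}(\mathbf{\zeta })\,\varvec{\omega }\delta t\), so the derivative with respect to the angular part reduces to \(-\mathbf{S}(\mathbf{\zeta })\). This standard identity can be verified numerically with finite differences (a small sketch, not the paper's code):

```python
import numpy as np

def skew(z):
    return np.array([[0.0, -z[2], z[1]],
                     [z[2], 0.0, -z[0]],
                     [-z[1], z[0], 0.0]])

def rot_exp(w):
    """Rodrigues formula for R = exp(S(w))."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    S = skew(w)
    return np.eye(3) + np.sin(th) / th * S + (1.0 - np.cos(th)) / th**2 * (S @ S)

# Finite-difference check that d(R(x) zeta)/d(omega) at x = 0 equals -S(zeta):
zeta = np.array([0.3, -1.2, 2.0])  # an arbitrary 3D point error
eps = 1e-7
J_num = np.column_stack([(rot_exp(eps * e) @ zeta - zeta) / eps for e in np.eye(3)])
assert np.allclose(J_num, -skew(zeta), atol=1e-5)
```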
The second term is decomposed into two Jacobians
Finally, the last Jacobian corresponds to \(\frac{\partial g(w(\mathbf {p},\mathbf {\hat{T}}\mathbf {T(x)}))}{\partial \mathbf {x}}\). This derivative can be seen as an extended version of the image photometric gradient \(\mathbf {J^I}\), applied to each component of \(g(w(\mathbf {p},\mathbf {\hat{T}}\mathbf {T(x)}))\). Then
Here \(\mathbf {J_g}\big |_{[g(\mathbf {p}_w)]_i}\) is the image gradient (as in the photometric term) of an image formed from the ith coordinate of \(g(w(\mathbf {p},\mathbf {\hat{T}}\mathbf {T(0)}))\). Note that this Jacobian is small for points lying on planar surfaces. We therefore neglect \(\mathbf {{J}_{d3}}\), since only a fraction of the scene lies on geometric discontinuities, and since these points are more sensitive to depth estimation errors and self-occlusion effects. Finally, we use the ESM formulation [19] to define the optimization step for the RGB cost, while a Gauss-Newton step is employed for the geometric Jacobian. We refer the reader to [33] for more details on the available optimization techniques.
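A minimal sketch of the combined optimization increment, assuming a plain Gauss-Newton step for both terms (the paper uses the ESM formulation for the RGB cost) and hypothetical scalar weights `w_I`, `w_D` coming from the activation function:

```python
import numpy as np

def gauss_newton_step(J_I, r_I, J_D, r_D, w_I, w_D):
    """One weighted Gauss-Newton increment for the combined cost
    w_I ||r_I||^2 + w_D ||r_D||^2.  Solves the normal equations
    (w_I J_I^T J_I + w_D J_D^T J_D) dx = -(w_I J_I^T r_I + w_D J_D^T r_D)."""
    H = w_I * J_I.T @ J_I + w_D * J_D.T @ J_D  # combined Gauss-Newton Hessian
    g = w_I * J_I.T @ r_I + w_D * J_D.T @ r_D  # combined gradient
    return np.linalg.solve(H, -g)              # twist increment dx in R^6
```

The increment `dx` would then be mapped back to a pose update through the exponential mapping and composed with the current estimate, iterating until convergence.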
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Martins, R., Fernandez-Moral, E., Rives, P. (2017). Adaptive Direct RGB-D Registration and Mapping for Large Motions. In: Lai, S.H., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_12
Print ISBN: 978-3-319-54189-1
Online ISBN: 978-3-319-54190-7