Adaptive Direct RGB-D Registration and Mapping for Large Motions

  • Conference paper

Computer Vision – ACCV 2016 (ACCV 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10114)

Abstract

Dense direct RGB-D registration methods are widely used in tasks ranging from localization and tracking to 3D scene reconstruction. This work addresses a key aspect that drastically limits the applicability of direct registration: the narrowness of the convergence domain. First, we propose an activation function based on the conditioning of the RGB and ICP point-to-plane error terms. This function strengthens the influence of the geometric error in the first coarse iterations, while the intensity data term dominates in the finer increments. The information gathered from the geometric and photometric cost functions is used not only to improve system observability, but also to exploit the different convergence properties and convexity of each data term. Next, we develop a set of complementary strategies, namely a flexible regularization and a pixel saliency selection, to further improve the quality and robustness of the approach.

The methodology is formulated for a generic warping model, and results are given using perspective and spherical sensor models. Finally, our method is validated on different RGB-D spherical datasets, including both indoor and outdoor real sequences, as well as on the KITTI VO/SLAM benchmark dataset. We show that the proposed techniques (weighted activation function, regularization, saliency pixel selection) lead to faster convergence and larger convergence domains, which are the main limitations to the use of direct methods.
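The conditioning-based activation sketched in the abstract can be illustrated as follows. This is a minimal NumPy sketch under our own assumptions (function names and the exact blending form are illustrative, not the paper's activation function): each error term is weighted by the conditioning of its normal matrix, so the better-conditioned term dominates the current iteration.

```python
import numpy as np

def conditioning(J):
    """Condition number of the normal matrix J^T J of one error term."""
    s = np.linalg.svd(J.T @ J, compute_uv=False)
    return s[0] / max(s[-1], 1e-12)

def activation_weights(J_icp, J_rgb):
    """Blend the geometric (ICP) and photometric (RGB) terms by conditioning.

    The better-conditioned term receives the larger weight, so the geometric
    term can dominate the coarse iterations while the photometric term takes
    over in the fine ones. Illustrative form only, not the paper's function.
    """
    c_icp = conditioning(J_icp)
    c_rgb = conditioning(J_rgb)
    w_icp = (1.0 / c_icp) / (1.0 / c_icp + 1.0 / c_rgb)
    return w_icp, 1.0 - w_icp
```

In this toy form the weights always sum to one, and a term whose normal matrix is nearly singular (large condition number) is automatically down-weighted.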


References

  1. Brox, T., Malik, J.: Large displacement optical flow: descriptor matching in variational motion estimation. IEEE TPAMI 33, 500–513 (2011)

  2. Braux-Zin, J., Dupont, R., Bartoli, A.: A general dense image matching framework combining direct and feature-based costs. In: IEEE ICCV (2013)

  3. Howard, A.: Real-time stereo visual odometry for autonomous ground vehicles. In: IEEE IROS (2008)

  4. Davison, A., Murray, D.: Simultaneous localization and map-building using active vision. IEEE TPAMI 24, 865–880 (2002)

  5. Nistér, D., Naroditsky, O., Bergen, J.: Visual odometry. In: IEEE CVPR (2004)

  6. Kitt, B., Geiger, A., Lategahn, H.: Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme. In: IEEE IV (2010)

  7. Harris, C., Stephens, M.: A combined corner and edge detector. In: 4th Alvey Vision Conference (1988)

  8. Lowe, D.: Distinctive image features from scale-invariant keypoints. IJCV 60, 91–110 (2004)

  9. Hager, G., Belhumeur, P.: Efficient region tracking with parametric models of geometry and illumination. IEEE TPAMI 20, 1025–1039 (1998)

  10. Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: IJCAI (1981)

  11. Irani, M., Anandan, P.: Robust multi-sensor image alignment. In: ICCV (1998)

  12. Baker, S., Matthews, I.: Equivalence and efficiency of image alignment algorithms. In: IEEE CVPR (2001)

  13. Mei, C., Benhimane, S., Malis, E., Rives, P.: Constrained multiple planar template tracking for central catadioptric cameras. In: BMVC (2006)

  14. Caron, G., Marchand, E., Mouaddib, E.: Tracking planes in omnidirectional stereovision. In: IEEE ICRA (2011)

  15. Comport, A., Malis, E., Rives, P.: Accurate quadrifocal tracking for robust 3D visual odometry. In: IEEE ICRA (2007)

  16. Churchill, W., Tong, C., Gurau, C., Posner, I., Newman, P.: Know your limits: embedding localiser performance models in teach and repeat maps. In: IEEE ICRA (2015)

  17. Furgale, P., Barfoot, T.: Visual teach and repeat for long-range rover autonomy. JFR 27, 534–560 (2010)

  18. Gelfand, N., Ikemoto, L., Rusinkiewicz, S., Levoy, M.: Geometrically stable sampling for the ICP algorithm. In: 3DIM (2003)

  19. Comport, A., Malis, E., Rives, P.: Real-time quadrifocal visual odometry. IJRR 29, 245–266 (2010)

  20. Tykkala, T., Audras, C., Comport, A.: Direct iterative closest point for real-time visual odometry. In: ICCV Workshops (2011)

  21. Kerl, C., Sturm, J., Cremers, D.: Dense visual SLAM for RGB-D cameras. In: IEEE IROS (2013)

  22. Timofte, R., Van Gool, L.: Sparse flow: sparse matching for small to large displacement optical flow. In: IEEE WACV (2015)

  23. Morency, L., Darrell, T.: Stereo tracking using ICP and normal flow constraint. In: ICPR (2002)

  24. Martins, R., Fernandez-Moral, E., Rives, P.: Dense accurate urban mapping from spherical RGB-D images. In: IEEE IROS (2015)

  25. Gokhool, T., Martins, R., Rives, P., Despre, N.: A compact spherical RGBD keyframe-based representation. In: IEEE ICRA (2015)

  26. Weikersdorfer, D., Gossow, D., Beetz, M.: Depth-adaptive superpixels. In: ICPR (2013)

  27. Fernandez-Moral, E., Mayol-Cuevas, W., Arevalo, V., Gonzalez-Jimenez, J.: Fast place recognition with plane-based maps. In: IEEE ICRA (2013)

  28. Zhang, Z.: Parameter estimation techniques: a tutorial with application to conic fitting. Technical report 2676, Inria (1995)

  29. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: IEEE CVPR (2012)

  30. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE TPAMI 34, 2274–2282 (2012)

  31. Meilland, M., Comport, A., Rives, P.: Dense omnidirectional RGB-D mapping of large-scale outdoor environments for real-time localization and autonomous navigation. JFR 32, 474–503 (2015)

  32. Fernandez-Moral, E., Gonzalez-Jimenez, J., Rives, P., Arevalo, V.: Extrinsic calibration of a set of range cameras in 5 seconds without pattern. In: IEEE IROS (2014)

  33. Baker, S., Matthews, I.: Lucas-Kanade 20 years on: a unifying framework. IJCV 56, 221–255 (2004)

Acknowledgements

The authors thank Josh Picard and Paolo Salaris for discussions and proofreading of the manuscript, and the reviewers for their thoughtful comments. This work is funded by CNPq of Brazil under contract number 216026/2013-0.

Author information

Corresponding author

Correspondence to Renato Martins.

Appendix A: Error Jacobians and Optimization

The pose \(\mathbf{T}(\mathbf{x}) \in \mathbb{SE}(3)\) is parametrized as a function of the angular and linear velocities \(\mathbf{x} = (\boldsymbol{\upsilon}\delta t, \boldsymbol{\omega}\delta t) \in \mathbb{R}^6\), and the optimization is carried out over this twist parametrization. The pose is related to the twist velocities by the exponential mapping \(\mathbf{T}(\mathbf{x}) = \exp(se3(\mathbf{x}))\), with

$$se3(\mathbf{x}) = \begin{bmatrix} \mathbf{S}(\boldsymbol{\omega})\delta t & \boldsymbol{\upsilon}\delta t \\ \mathbf{0}_{1\times 3} & 0 \end{bmatrix} \in \mathfrak{se}(3)$$
(11)

which is the Lie algebra of \(\mathbb{SE}(3)\) at the identity element, where \(\mathbf{S}(\mathbf{z})\) denotes the skew-symmetric matrix associated to the vector \(\mathbf{z}\), and \(\delta t = 1\).
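As a concrete illustration of Eq. (11), the mapping from a twist \(\mathbf{x} = (\boldsymbol{\upsilon}\delta t, \boldsymbol{\omega}\delta t)\) to a pose can be sketched as below. This is a minimal NumPy sketch (function names are ours, not from the paper) that evaluates the matrix exponential with a truncated power series:

```python
import numpy as np

def skew(z):
    """S(z): skew-symmetric matrix such that skew(z) @ v == np.cross(z, v)."""
    return np.array([[0.0, -z[2], z[1]],
                     [z[2], 0.0, -z[0]],
                     [-z[1], z[0], 0.0]])

def se3_hat(x, dt=1.0):
    """Map a twist x = (v*dt, w*dt) in R^6 to its 4x4 Lie-algebra matrix, Eq. (11)."""
    v, w = x[:3], x[3:]
    A = np.zeros((4, 4))
    A[:3, :3] = skew(w) * dt   # S(omega) * dt
    A[:3, 3] = v * dt          # upsilon * dt
    return A

def exp_se3(x):
    """T(x) = exp(se3(x)) in SE(3), via a truncated matrix-exponential series."""
    A = se3_hat(x)
    T = np.eye(4)
    term = np.eye(4)
    for k in range(1, 20):     # sum_k A^k / k!; 20 terms suffice for small twists
        term = term @ A / k
        T = T + term
    return T
```

A zero twist maps to the identity pose, and a pure-translation twist yields a pose whose rotation block is the identity; production code would typically use a closed-form Rodrigues-style expansion instead of the series.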

The respective Jacobians are derived following this parametrization. We refer the reader to [19] for details on the photometric Jacobian \(\mathbf{J}^I\). Next, for the geometric point-to-plane direct Jacobian \(\mathbf{J}^D \in \mathbb{R}^{1\times 6}\), we denote the 3D point error \(\boldsymbol{\zeta}(\mathbf{x})\):

$$\begin{aligned} \boldsymbol{\zeta}(\mathbf{x}) &= -\mathbf{\hat{T}}\mathbf{T}(\mathbf{x})\begin{bmatrix} g^*(\mathbf{p}) \\ 1 \end{bmatrix} + g(w(\mathbf{p},\mathbf{\hat{T}}\mathbf{T}(\mathbf{x}))) \\ &= -\mathbf{\hat{R}}\mathbf{R}(\mathbf{x})g^*(\mathbf{p}) - \mathbf{\hat{R}}\mathbf{t}(\mathbf{x}) - \mathbf{\hat{t}} + g(w(\mathbf{p},\mathbf{\hat{T}}\mathbf{T}(\mathbf{x}))) \end{aligned}$$
(12)

From Eqs. (3), (12) and the product rule:

$$\mathbf{J}^D(\mathbf{0}) = \lambda_D \mathbf{n}^{*T} \left( \frac{\partial (\mathbf{R}(\mathbf{x})^T\mathbf{\hat{R}}^T\boldsymbol{\zeta}(\mathbf{z}))}{\partial \mathbf{x}}\Bigg|_{\mathbf{z}=\mathbf{x}} + \mathbf{R}(\mathbf{x})^T\mathbf{\hat{R}}^T \frac{\partial \boldsymbol{\zeta}(\mathbf{x})}{\partial \mathbf{x}} \right) \Bigg|_{\mathbf{x}=\mathbf{0}}$$
(13)

For clarity, the first term in Eq. (13) is denoted \(\mathbf{J_{d1}}\) and the second term is decomposed into two Jacobians \(\mathbf{J_{d2}}\) and \(\mathbf{J_{d3}}\), so that \(\mathbf{J}^D(\mathbf{0}) = \lambda_D \mathbf{n}^{*T} (\mathbf{J_{d1}}(\mathbf{0}) + \mathbf{J_{d2}}(\mathbf{0}) + \mathbf{J_{d3}}(\mathbf{0}))\). From \(\frac{\partial (\mathbf{R}(\mathbf{x})\boldsymbol{\zeta})}{\partial \mathbf{x}} = \frac{\partial (\mathbf{R}(\mathbf{x})\boldsymbol{\zeta})}{\partial \mathbf{R}(\mathbf{x})} \frac{\partial \mathbf{R}(\mathbf{x})}{\partial \mathbf{x}}\), the first term is

$$\mathbf{J_{d1}}(\mathbf{0}) = \begin{bmatrix} \mathbf{0}_{3\times 3} & \mathbf{S}(\mathbf{\hat{R}}^T\boldsymbol{\zeta}(\mathbf{0})) \end{bmatrix}$$
(14)

The second term splits into two Jacobians; the first of these is

$$\mathbf{J_{d2}}(\mathbf{0}) = \begin{bmatrix} -\mathbf{I}_{3\times 3} & \mathbf{S}(g^*(\mathbf{p})) \end{bmatrix}$$
(15)

Finally, the last Jacobian corresponds to \(\frac{\partial g(w(\mathbf{p},\mathbf{\hat{T}}\mathbf{T}(\mathbf{x})))}{\partial \mathbf{x}}\). This derivative can be seen as an extended version of the image photometric gradient \(\mathbf{J}^I\), applied to each component of \(g(w(\mathbf{p},\mathbf{\hat{T}}\mathbf{T}(\mathbf{x})))\). Then

$$\mathbf{J_{d3}}(\mathbf{0}) = \begin{bmatrix} \mathbf{J_g}\big|_{[g(\mathbf{p}_w)]_1}^T & \mathbf{J_g}\big|_{[g(\mathbf{p}_w)]_2}^T & \mathbf{J_g}\big|_{[g(\mathbf{p}_w)]_3}^T \end{bmatrix}^T \mathbf{J_w}\mathbf{J_T}$$
(16)

Here, \(\mathbf{J_g}\big|_{[g(\mathbf{p}_w)]_i}\) is the image gradient (as in the photometric term) of an image formed from the i-th coordinate of \(g(w(\mathbf{p},\mathbf{\hat{T}}\mathbf{T}(\mathbf{0})))\). Note that this Jacobian is small for points lying on planar surfaces. \(\mathbf{J_{d3}}\) is therefore neglected, since only a fraction of the scene lies on geometric discontinuities, and since those points are more sensitive to depth estimation errors and self-occlusion effects. Finally, we use the ESM formulation [19] to define the optimization step for the RGB cost, while a Gauss-Newton step is employed for the geometric term. We refer the reader to [33] for more details on the available optimization techniques.
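Putting the pieces together, one increment of the joint RGB + ICP minimization can be sketched as below. This is a plain Gauss-Newton step on the stacked weighted system (the paper instead uses an ESM step for the RGB term, so this is an illustrative assumption, and the names are ours):

```python
import numpy as np

def gauss_newton_step(J_I, r_I, J_D, r_D, w_I=1.0, w_D=1.0):
    """One Gauss-Newton increment for the stacked RGB + ICP cost.

    Solves the weighted normal equations (J^T W J) dx = -J^T W r for the
    6-DoF twist update, stacking the photometric Jacobian/residuals
    (J_I, r_I) with the geometric ones (J_D, r_D). Sketch only: the paper
    uses an ESM step for the RGB term rather than plain Gauss-Newton.
    """
    J = np.vstack([np.sqrt(w_I) * J_I, np.sqrt(w_D) * J_D])
    r = np.concatenate([np.sqrt(w_I) * r_I, np.sqrt(w_D) * r_D])
    dx, *_ = np.linalg.lstsq(J, -r, rcond=None)  # least-squares solve
    return dx  # twist increment; the pose is updated as T_hat @ exp(se3(dx))
```

The weights `w_I` and `w_D` are where an activation function of the kind described in the paper would enter, rebalancing the two terms across the coarse-to-fine iterations.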

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Martins, R., Fernandez-Moral, E., Rives, P. (2017). Adaptive Direct RGB-D Registration and Mapping for Large Motions. In: Lai, S.H., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_12

  • DOI: https://doi.org/10.1007/978-3-319-54190-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54189-1

  • Online ISBN: 978-3-319-54190-7

  • eBook Packages: Computer Science, Computer Science (R0)
