
6D Pose Estimation Based on the Adaptive Weight of RGB-D Feature

  • Conference paper
Parallel and Distributed Computing, Applications and Technologies (PDCAT 2020)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 12606))


Abstract

In 6D pose estimation from RGB-D images, the crucial problem is how to make the most of the two types of features extracted from the RGB and depth inputs. To the best of our knowledge, prior approaches treat the two sources equally, overlooking that different combinations of the two properties can affect accuracy to varying degrees. We therefore propose a Feature Selecting Mechanism (FSM) that finds the most suitable ratio of feature dimensions drawn from the RGB image and from the point cloud (converted from the depth image), so that the 6D pose can be predicted more effectively. We first perform manual selection within the FSM to demonstrate the potential of weighting the RGB-D features. A neural network is then deployed in the FSM to pick out features from the RGB-D input adaptively. Experiments on the LINEMOD dataset, the YCB-Video dataset, and our multi-pose synthetic image dataset show that the FSM yields up to a 2% accuracy improvement over the state-of-the-art method.
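Since the paper itself is not reproduced here, the abstract's core idea, adaptively weighting the RGB and point-cloud feature streams rather than treating them equally, can be illustrated with a minimal sketch. This is a hypothetical toy, not the authors' FSM: the linear gating scorer, the softmax weighting, and all dimensions below are assumptions for illustration only.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

class ToyFeatureSelector:
    """Toy stand-in for an adaptive RGB-D feature weighting mechanism.

    A tiny linear scorer looks at the concatenated RGB and geometry
    features and produces two weights (summing to 1) that scale each
    stream before fusion. A real FSM would learn this end to end.
    """

    def __init__(self, dim_rgb, dim_geo, seed=0):
        rng = np.random.default_rng(seed)
        # Hypothetical learned parameters: one linear map to two logits.
        self.w = rng.normal(scale=0.1, size=(dim_rgb + dim_geo, 2))

    def __call__(self, f_rgb, f_geo):
        fused = np.concatenate([f_rgb, f_geo])
        weights = softmax(fused @ self.w)  # adaptive RGB-vs-geometry ratio
        reweighted = np.concatenate([weights[0] * f_rgb, weights[1] * f_geo])
        return reweighted, weights

# Usage: weight a 4-dim RGB feature against a 3-dim geometry feature.
selector = ToyFeatureSelector(dim_rgb=4, dim_geo=3)
fused, w = selector(np.ones(4), np.ones(3))
```

The design point is that the ratio between the two streams is an output of the network, conditioned on the input features, rather than a fixed 50/50 split.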

This work was supported by NSFC 12071460 and Shenzhen research grants KQJSCX20180330170311901, JCYJ20180305180840138, and GGFW2017073114031767.



Author information

Correspondence to Li Ning.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, G., Ning, L., Feng, L. (2021). 6D Pose Estimation Based on the Adaptive Weight of RGB-D Feature. In: Zhang, Y., Xu, Y., Tian, H. (eds.) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol. 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_12


  • DOI: https://doi.org/10.1007/978-3-030-69244-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69243-8

  • Online ISBN: 978-3-030-69244-5

  • eBook Packages: Computer Science (R0)
