Learning Local Shape Descriptors from Part Correspondences with Multiview Convolutional Networks

Published: 16 November 2017

Abstract

We present a new local descriptor for 3D shapes, directly applicable to a wide range of shape analysis problems such as point correspondences, semantic segmentation, affordance prediction, and shape-to-scan matching. The descriptor is produced by a convolutional network that is trained to embed geometrically and semantically similar points close to one another in descriptor space. The network processes surface neighborhoods around points on a shape that are captured at multiple scales by a succession of progressively zoomed-out views, taken from carefully selected camera positions. We leverage two extremely large sources of data to train our network. First, since our network processes rendered views in the form of 2D images, we repurpose architectures pretrained on massive image datasets. Second, we automatically generate a synthetic dense point correspondence dataset by nonrigid alignment of corresponding shape parts in a large collection of segmented 3D models. As a result of these design choices, our network effectively encodes multiscale local context and fine-grained surface detail. Our network can be trained to produce either category-specific descriptors or more generic descriptors by learning from multiple shape categories. Once trained, the network extracts local descriptors for shapes at test time without requiring any part segmentation as input. Our method can produce effective local descriptors even for shapes whose category is unknown or different from the ones used during training. We demonstrate through several experiments that our learned local descriptors are more discriminative than state-of-the-art alternatives and are effective in a variety of shape analysis applications.
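The idea of embedding similar points close together in descriptor space can be illustrated with a small sketch. The abstract does not state the training objective, so the contrastive form below, the `margin` parameter, and all function names are assumptions chosen for illustration, not the authors' implementation. Descriptors are plain Python lists here rather than network outputs.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Contrastive loss for one descriptor pair (assumed form, not from the abstract).

    Corresponding points are penalized by their squared distance;
    non-corresponding points are penalized only when closer than `margin`.
    """
    d = euclidean(desc_a, desc_b)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2


def nearest_descriptor(query, candidates):
    """Index of the candidate descriptor closest to `query`."""
    return min(range(len(candidates)),
               key=lambda i: euclidean(query, candidates[i]))
```

Under these assumptions, a point correspondence at test time would reduce to a nearest-neighbor lookup: `nearest_descriptor` takes the descriptor of a point on one shape and returns the index of the closest descriptor among the points of another shape.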

Supplemental Material

tog37-1-a6-huang.mp4 (mp4, 218.4 MB)

      • Published in

        ACM Transactions on Graphics, Volume 37, Issue 1
        February 2018
        167 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/3151031

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 16 November 2017
        • Accepted: 1 September 2017
        • Revised: 1 August 2017
        • Received: 1 May 2017
        Qualifiers

        • research-article
        • Research
        • Refereed
