DOI: 10.1145/3123266.3123417
research-article

Selective Deep Convolutional Features for Image Retrieval

Published: 23 October 2017

ABSTRACT

Convolutional neural networks (CNNs) are a powerful tool for extracting discriminative local descriptors for effective image search. Recent work adopts fine-tuning strategies to further improve the discriminative power of these descriptors. Taking a different approach, we propose a novel framework that achieves competitive retrieval performance. First, we propose three masking schemes, namely SIFT-mask, SUM-mask, and MAX-mask, that select a representative subset of local convolutional features and remove a large number of redundant ones. We demonstrate that this selection effectively addresses the burstiness issue and improves retrieval accuracy. Second, we employ recent embedding and aggregating methods to further enhance feature discriminability. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art retrieval accuracy.
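
The abstract names the masking schemes but does not define them, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that MAX-mask keeps a spatial location if it holds the largest activation of at least one feature map, and that SUM-mask keeps a location if its activation summed over all feature maps exceeds the median of those sums. The function names and the nested-list layout `fmaps[channel][y][x]` are hypothetical. SIFT-mask is omitted because it would need an external keypoint detector.

```python
from statistics import median


def max_mask(fmaps):
    """Assumed MAX-mask: keep (y, x) if it is the argmax of some channel."""
    height, width = len(fmaps[0]), len(fmaps[0][0])
    positions = [(y, x) for y in range(height) for x in range(width)]
    mask = [[False] * width for _ in range(height)]
    for channel in fmaps:
        # Location of the strongest activation in this channel (first on ties).
        y, x = max(positions, key=lambda p: channel[p[0]][p[1]])
        mask[y][x] = True
    return mask


def sum_mask(fmaps):
    """Assumed SUM-mask: keep (y, x) if its channel-wise sum exceeds the median sum."""
    height, width = len(fmaps[0]), len(fmaps[0][0])
    sums = [[sum(ch[y][x] for ch in fmaps) for x in range(width)]
            for y in range(height)]
    threshold = median(v for row in sums for v in row)
    return [[v > threshold for v in row] for row in sums]


def select_features(fmaps, mask):
    """Return the C-dimensional local descriptors at the selected locations."""
    return [
        [ch[y][x] for ch in fmaps]
        for y, row in enumerate(mask)
        for x, keep in enumerate(row)
        if keep
    ]


# Toy input: 2 channels over a 2x2 spatial grid (values are made up).
toy_fmaps = [
    [[0.0, 3.0], [1.0, 0.0]],
    [[0.0, 2.0], [0.0, 5.0]],
]
selected = select_features(toy_fmaps, max_mask(toy_fmaps))
print(len(selected), "of 4 locations kept")
```

Under these assumptions, a location that fires weakly in every channel is dropped before embedding and aggregation, which is one plausible way such a mask could thin out redundant, bursty descriptors.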


    • Published in

      MM '17: Proceedings of the 25th ACM international conference on Multimedia
      October 2017
      2028 pages
      ISBN: 9781450349062
      DOI: 10.1145/3123266

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%
      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
