research-article

Selective Deep Convolutional Features for Image Retrieval

Authors:
Tuan Hoang, Singapore University of Technology and Design, Singapore, Singapore
Thanh-Toan Do, University of Adelaide, Adelaide, Australia
Dang-Khoa Le Tan, Singapore University of Technology and Design, Singapore, Singapore
Ngai-Man Cheung, Singapore University of Technology and Design, Singapore, Singapore

MM '17: Proceedings of the 25th ACM international conference on Multimedia, October 2017, Pages 1600–1608
https://doi.org/10.1145/3123266.3123417

Published: 23 October 2017


ABSTRACT

Convolutional Neural Networks (CNNs) are a powerful means of extracting discriminative local descriptors for effective image search. Recent work adopts fine-tuning strategies to further improve the discriminative power of the descriptors. Taking a different approach, in this paper we propose a novel framework that achieves competitive retrieval performance. First, we propose three masking schemes, namely SIFT-mask, SUM-mask, and MAX-mask, to select a representative subset of local convolutional features and remove a large number of redundant features. We demonstrate that this effectively addresses the burstiness issue and improves retrieval accuracy. Second, we employ recent embedding and aggregating methods to further enhance feature discriminability. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art retrieval accuracy.
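
The abstract names the masking schemes without defining them. As a rough illustration only, the sketch below assumes one plausible reading: SUM-mask keeps spatial locations whose channel-summed activation is at least the median over all locations, and MAX-mask keeps locations where at least one feature map attains its spatial maximum. The function names (`sum_mask`, `max_mask`, `select_features`), the median threshold, and the nested-list layout `fmap[k][y][x]` are assumptions made for this example, not details taken from this page. SIFT-mask is omitted because it depends on a keypoint detector.

```python
from statistics import median


def sum_mask(fmap):
    """Assumed SUM-mask: keep locations whose summed activation >= median.

    fmap[k][y][x] is the activation of channel k at row y, column x.
    Returns an H x W mask of 0/1 values.
    """
    K, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    sums = [[sum(fmap[k][y][x] for k in range(K)) for x in range(W)]
            for y in range(H)]
    threshold = median(v for row in sums for v in row)
    return [[1 if sums[y][x] >= threshold else 0 for x in range(W)]
            for y in range(H)]


def max_mask(fmap):
    """Assumed MAX-mask: keep each channel's location of maximum activation."""
    K, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[0] * W for _ in range(H)]
    positions = [(y, x) for y in range(H) for x in range(W)]
    for k in range(K):
        # First position wins on ties.
        y, x = max(positions, key=lambda p: fmap[k][p[0]][p[1]])
        mask[y][x] = 1
    return mask


def select_features(fmap, mask):
    """Return the K-dimensional local descriptors at locations the mask keeps."""
    K = len(fmap)
    return [[fmap[k][y][x] for k in range(K)]
            for y, row in enumerate(mask)
            for x, keep in enumerate(row) if keep]


# Toy input: 2 channels on a 2 x 2 spatial grid.
fmap = [
    [[5, 1], [0, 0]],  # channel 0
    [[0, 1], [0, 4]],  # channel 1
]
selected = select_features(fmap, max_mask(fmap))
```

Under these assumptions, both masks discard low-activation locations before any embedding or aggregation step, which is one way a "representative subset" of local features could be obtained.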


Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Computer vision representations
        Image representations


Published in
MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266
General Chairs:
Qiong Liu, FXPAL, USA
Rainer Lienhart, Universität Augsburg, Germany
Haohong Wang, TCL America, USA
Program Chairs:
Sheng-Wei "Kuan-Ta" Chen, Academia Sinica, Taiwan
Susanne Boll, University of Oldenburg, Germany
Phoebe Chen, La Trobe University, Australia
Gerald Friedland, Lawrence Livermore National Lab, USA
Jia Li, Google, USA
Shuicheng Yan, Qihoo 360, China
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
aggregating
content based image retrieval
deep convolutional features
embedding
unsupervised
Acceptance Rates
MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%