Neural Codes for Image Retrieval

Babenko, Artem; Slesarev, Anton; Chigorin, Alexandr; Lempitsky, Victor

doi:10.1007/978-3-319-10590-1_38

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8689)

Included in the following conference series:

European Conference on Computer Vision

Abstract

It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) within the image retrieval application. In the experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. Image-Net). We also evaluate the improvement in the retrieval performance of neural codes, when the network is retrained on a dataset of images that are similar to images encountered at test time.

We further evaluate the performance of the compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
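As a rough illustration of the PCA compression step described above, the following minimal sketch compresses a set of descriptors and ranks a database against a query. It is not the authors' implementation: the random vectors stand in for real network activations, the dimensions (64 reduced to 16) are arbitrary, and the L2 normalization, Euclidean ranking, and all function names are assumptions made for this example.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def fit_pca(codes, n_components):
    """Return the mean and the top principal directions of `codes`."""
    mean = codes.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(codes - mean, full_matrices=False)
    return mean, vt[:n_components]

def compress(codes, mean, components):
    """Project onto the principal directions, then re-normalize (assumed step)."""
    return l2_normalize((codes - mean) @ components.T)

def rank(query, database):
    """Indices of database rows sorted by increasing Euclidean distance."""
    dists = np.linalg.norm(database - query, axis=1)
    return np.argsort(dists)

# Stand-in data: random vectors instead of real neural codes (hypothetical sizes).
rng = np.random.default_rng(0)
codes = l2_normalize(rng.standard_normal((200, 64)))

mean, components = fit_pca(codes, n_components=16)
short_codes = compress(codes, mean, components)

# Use the first image as the query and rank the whole database against it.
ranking = rank(short_codes[0], short_codes)
```

In a real setting the PCA basis would presumably be fitted on a held-out set of descriptors and then applied to both database and query codes, so that queries are compressed with the same projection.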

Author information

Authors and Affiliations

Yandex, Russia
Artem Babenko, Anton Slesarev & Alexandr Chigorin
Skolkovo Institute of Science and Technology (Skoltech), Russia
Victor Lempitsky
Moscow Institute of Physics and Technology, Russia
Artem Babenko

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
PSI, iMinds, KU Leuven, ESAT, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Babenko, A., Slesarev, A., Chigorin, A., Lempitsky, V. (2014). Neural Codes for Image Retrieval. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8689. Springer, Cham. https://doi.org/10.1007/978-3-319-10590-1_38

DOI: https://doi.org/10.1007/978-3-319-10590-1_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10589-5
Online ISBN: 978-3-319-10590-1
eBook Packages: Computer Science, Computer Science (R0)
