research-article

Mixed image-keyword query adaptive hashing over multilabel images

Published: 14 February 2014

Abstract

This article defines a new hashing task motivated by real-world applications in content-based image retrieval: effective data indexing and retrieval given a mixed query (a query image together with user-provided keywords). Our work is distinguished from state-of-the-art hashing research by two features: (1) unlike in conventional image retrieval systems, the input query is a combination of an exemplar image and several descriptive keywords, and (2) the input image data are often associated with multiple labels, an assumption that is more consistent with realistic scenarios. The mixed image-keyword query significantly extends the traditional image-based query and better expresses the user's intention. At the same time, it complicates semantics-based indexing on multilabel data. Although several existing hashing methods can be adapted to the indexing task, they all suffer from low effectiveness. To address this, we propose a novel scheme, “boosted shared hashing”. Unlike prior works that learn the hashing functions on either all image labels or a single label, we observe that a hashing function can be more effective if it is designed to index over an optimal label subset. In other words, the association between labels and hash bits is moderately sparse. This sparsity of the bit-label association greatly reduces the computation and storage complexity of indexing a new sample, since only a limited number of hashing functions become active for that sample. We develop a boosting-style algorithm that simultaneously optimizes the label subsets and the hashing functions in a unified formulation, and further propose a query-adaptive retrieval mechanism based on hash bit selection for mixed queries, whether or not the query words exist in the training data.
Moreover, we show that the proposed method extends easily to the case where data similarity is gauged by nonlinear kernel functions. Extensive experiments are conducted on standard image benchmarks such as CIFAR-10, NUS-WIDE, and a-TRECVID. The results validate both the sparsity of the bit-label association and the convergence of the proposed algorithm, and demonstrate that the proposed hashing scheme substantially outperforms state-of-the-art methods under the same hash bit budget.
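
To make the idea of query-adaptive hash bit selection over a sparse bit-label association more concrete, the following is a minimal sketch and not the article's algorithm. It assumes the association is already known and stored as a mapping from bit index to a set of labels, that bits are scored simply by how many query keywords they cover, and that retrieval ranks items by Hamming distance restricted to the selected bits. The names `select_bits`, `masked_hamming`, and `rank`, the scoring rule, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set


def select_bits(bit_labels: Dict[int, Set[str]],
                query_labels: Set[str],
                budget: int) -> List[int]:
    """Pick up to `budget` bits whose label subsets overlap the query keywords.

    Assumption: a bit's relevance is the number of query labels it covers.
    """
    scored = [(len(labels & query_labels), bit)
              for bit, labels in bit_labels.items()]
    # Keep only bits associated with at least one query label.
    scored = [(score, bit) for score, bit in scored if score > 0]
    # Highest overlap first; ties broken by bit index for determinism.
    scored.sort(key=lambda sb: (-sb[0], sb[1]))
    return [bit for _, bit in scored[:budget]]


def masked_hamming(a: int, b: int, bits: List[int]) -> int:
    """Hamming distance between two integer codes, counting only `bits`."""
    mask = 0
    for i in bits:
        mask |= 1 << i
    return bin((a ^ b) & mask).count("1")


def rank(query_code: int, database: Dict[str, int], bits: List[int]) -> List[str]:
    """Order database items by masked Hamming distance to the query code."""
    return sorted(
        database,
        key=lambda name: (masked_hamming(query_code, database[name], bits), name),
    )


# Toy data (hypothetical): four hash bits, each tied to a small label subset.
bit_labels = {0: {"sky", "sea"}, 1: {"dog"}, 2: {"sky"}, 3: {"car", "road"}}
query_labels = {"sky", "sea"}       # keywords supplied with the query image
query_code = 0b0101                 # hash code of the query image
database = {"img_a": 0b0101, "img_b": 0b1110, "img_c": 0b1010}

active = select_bits(bit_labels, query_labels, budget=2)
ranking = rank(query_code, database, active)
```

In this sketch only the bits tied to the query keywords contribute to the distance, which mirrors the abstract's point that a sparse bit-label association leaves few hashing functions active for a given sample. How the label subsets are learned, and how keywords absent from the training data are handled, is not covered here.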

      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 10, Issue 2
        February 2014
        142 pages
        ISSN:1551-6857
        EISSN:1551-6865
        DOI:10.1145/2579228
        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 14 February 2014
        • Accepted: 1 September 2013
        • Revised: 1 May 2013
        • Received: 1 January 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
