DOI: 10.1145/2600428.2609563

Discriminative coupled dictionary hashing for fast cross-media retrieval

Published: 03 July 2014

ABSTRACT

Cross-media hashing, which performs cross-media retrieval by embedding data from different modalities into a common low-dimensional Hamming space, has attracted intensive attention in recent years. Existing cross-media hashing approaches aim only at learning hash functions that preserve intra-modality and inter-modality correlations; they do not directly capture the underlying semantic information of the multi-modal data. In this paper we propose a discriminative coupled dictionary hashing (DCDH) method. In DCDH, the coupled dictionary for each modality is learned with side information (e.g., categories). As a result, the coupled dictionaries not only preserve the intra-similarity and inter-correlation among multi-modal data, but also contain dictionary atoms that are semantically discriminative (i.e., data from the same category are reconstructed by similar dictionary atoms). To perform fast cross-media retrieval, we learn hash functions that map data from the dictionary space to a low-dimensional Hamming space. We further conjecture that a balanced representation is crucial in cross-media retrieval. We therefore introduce multi-view features for the relatively "weak" modalities into DCDH and extend it to multi-view DCDH (MV-DCDH) to enhance their representation capability. Experiments on two real-world data sets show that DCDH and MV-DCDH significantly outperform state-of-the-art methods on cross-media retrieval.
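The retrieval step described above, mapping dictionary-space codes to a Hamming space and ranking by Hamming distance, can be illustrated with a minimal sketch. This is not the DCDH training procedure: the function names `to_hash_bits` and `hamming_rank`, the random projections `W_image` and `W_text`, and the random codes are all assumptions made for illustration, standing in for the learned hash functions and the sparse codes over the coupled dictionaries.

```python
import numpy as np


def to_hash_bits(codes, projection):
    """Map dictionary-space codes (n x k) to hash bits (n x b).

    Assumption: a linear projection thresholded at zero stands in for
    the learned hash functions; the abstract does not specify their form.
    """
    return (codes @ projection > 0).astype(np.uint8)


def hamming_rank(query_bits, database_bits):
    """Rank database items by Hamming distance to a single query.

    Returns (order, distances): indices sorted from nearest to farthest,
    and the per-item Hamming distances in database order.
    """
    distances = np.count_nonzero(database_bits != query_bits, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances


# Hypothetical setup: 16 dictionary atoms per modality, 8-bit hash codes.
rng = np.random.default_rng(0)
n_atoms, n_bits = 16, 8

# Random placeholders for per-modality hash projections (learned in practice).
W_image = rng.standard_normal((n_atoms, n_bits))
W_text = rng.standard_normal((n_atoms, n_bits))

# Random placeholders for the codes of 5 images and 1 text query.
image_codes = rng.standard_normal((5, n_atoms))
text_query_code = rng.standard_normal((1, n_atoms))

# Text-to-image retrieval: hash both sides, then rank images by distance.
image_bits = to_hash_bits(image_codes, W_image)
query_bits = to_hash_bits(text_query_code, W_text)[0]
order, distances = hamming_rank(query_bits, image_bits)
```

Because both modalities end up as bit vectors of the same length, a text query can be compared against an image database with cheap bitwise comparisons, which is the source of the speed that hashing-based retrieval aims for.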


    • Published in

      SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
      July 2014
      1330 pages
      ISBN: 9781450322577
      DOI: 10.1145/2600428

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 3 July 2014

      Qualifiers

      • research-article

      Acceptance Rates

      SIGIR '14 Paper Acceptance Rate: 82 of 387 submissions, 21%
      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
