DOI: 10.1145/2911451.2911493
research-article

Composite Correlation Quantization for Efficient Multimodal Retrieval

Published: 07 July 2016

ABSTRACT

Efficient similarity retrieval from large-scale multimodal databases is pervasive in modern search engines and social networks. To support queries across content modalities, the system should enable cross-modal correlation and computation-efficient indexing. While hashing methods have shown great potential in achieving this goal, current attempts generally fail to learn isomorphic hash codes in a seamless scheme: they embed multiple modalities in a continuous isomorphic space and then separately threshold the embeddings into binary codes, which incurs a substantial loss of retrieval accuracy. In this paper, we approach seamless multimodal hashing by proposing a novel Composite Correlation Quantization (CCQ) model. Specifically, CCQ jointly finds correlation-maximal mappings that transform different modalities into an isomorphic latent space, and learns composite quantizers that convert the isomorphic latent features into compact binary codes. An optimization framework is devised to preserve both intra-modal similarity and inter-modal correlation by minimizing both reconstruction and quantization errors, and it can be trained from both paired and partially paired data in linear time. A comprehensive set of experiments shows the superior effectiveness and efficiency of CCQ against state-of-the-art hashing methods for both unimodal and cross-modal retrieval.
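
The two ingredients named in the abstract, mappings into a shared latent space and quantizers that approximate each latent vector by a sum of codewords, can be illustrated with a small NumPy sketch. This is not the CCQ optimization itself, which the abstract describes as learning both parts jointly. Here the steps run one after the other, the projections come from an SVD of the cross-covariance matrix as a simple stand-in for correlation-maximal mappings, and the codebooks are trained by greedy residual k-means. All function names, sizes, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def shared_projections(X, Y, dim):
    """Stand-in for correlation-maximal mappings (assumption, not the paper's
    solver): top singular vectors of the cross-covariance of centered X, Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U[:, :dim], Vt[:dim].T

def nearest(Z, C):
    """Index of the closest codeword in C for every row of Z."""
    return ((Z[:, None, :] - C[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)

def kmeans(Z, k, iters, rng):
    """Plain Lloyd's k-means; returns a (k, d) codebook."""
    centers = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = nearest(Z, centers)
        for j in range(k):
            members = Z[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def fit_codebooks(Z, num_books, k, iters=10, seed=0):
    """Greedy residual training: each codebook quantizes what is left over
    after subtracting the codewords chosen from the previous codebooks."""
    rng = np.random.default_rng(seed)
    residual, books = Z.copy(), []
    for _ in range(num_books):
        C = kmeans(residual, k, iters, rng)
        residual = residual - C[nearest(residual, C)]
        books.append(C)
    return books

def encode(Z, books):
    """Compact code: one codeword index per codebook for every row of Z."""
    residual, codes = Z.copy(), []
    for C in books:
        idx = nearest(residual, C)
        codes.append(idx)
        residual = residual - C[idx]
    return np.stack(codes, axis=1)

def decode(codes, books):
    """Reconstruct each vector as the sum of its selected codewords."""
    return sum(C[codes[:, m]] for m, C in enumerate(books))

# Synthetic paired data: both "modalities" are noisy views of one latent signal.
rng = np.random.default_rng(0)
S = rng.normal(size=(200, 4))
X = S @ rng.normal(size=(4, 12)) + 0.05 * rng.normal(size=(200, 12))
Y = S @ rng.normal(size=(4, 8)) + 0.05 * rng.normal(size=(200, 8))
X, Y = X - X.mean(axis=0), Y - Y.mean(axis=0)

Px, Py = shared_projections(X, Y, dim=4)
Zx, Zy = X @ Px, Y @ Py

# One set of codebooks is shared by both modalities in the latent space.
books = fit_codebooks(np.vstack([Zx, Zy]), num_books=4, k=16)
codes_y = encode(Zy, books)  # database: modality Y stored as 4 indices per item

# Cross-modal query: compare an uncompressed X-side query with decoded Y items.
query = Zx[0]
dists = ((decode(codes_y, books) - query) ** 2).sum(axis=1)
print("nearest Y item for X item 0:", int(dists.argmin()))
```

With 16 codewords per codebook, each index fits in 4 bits, so the 4 indices stored per item take 16 bits. The query is left uncompressed and compared against the reconstructed database vectors, which is one common way to search quantized codes.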

    • Published in

      SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
      July 2016
      1296 pages
      ISBN: 9781450340694
      DOI: 10.1145/2911451

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%. Overall Acceptance Rate: 792 of 3,983 submissions, 20%.
