research-article

Discriminative coupled dictionary hashing for fast cross-media retrieval

Authors:
Zhou Yu

Zhejiang University, Hangzhou, China

Zhejiang University, Hangzhou, China
View Profile

,
Fei Wu

Zhejiang University, Hangzhou, China

Zhejiang University, Hangzhou, China
View Profile

,
Yi Yang

the University of Queensland, Queensland, Australia

the University of Queensland, Queensland, Australia
View Profile

,
Qi Tian

University of Texas, San Antonio, USA

University of Texas, San Antonio, USA
View Profile

,
Jiebo Luo

University of Rochester, Rochester, USA

University of Rochester, Rochester, USA
View Profile

,
Yueting Zhuang

Zhejiang University, Hangzhou, China

Zhejiang University, Hangzhou, China
View Profile

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrievalJuly 2014Pages 395–404https://doi.org/10.1145/2600428.2609563

Published:03 July 2014Publication History

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval

Pages 395–404

ABSTRACT

Cross-media hashing, which conducts cross-media retrieval by embedding data from different modalities into a common low-dimensional Hamming space, has attracted intensive attention in recent years. The existing cross-media hashing approaches only aim at learning hash functions to preserve the intra-modality and inter-modality correlations, but do not directly capture the underlying semantic information of the multi-modal data. We propose a discriminative coupled dictionary hashing (DCDH) method in this paper. In DCDH, the coupled dictionary for each modality is learned with side information (e.g., categories). As a result, the coupled dictionaries not only preserve the intra-similarity and inter-correlation among multi-modal data, but also contain dictionary atoms that are semantically discriminative (i.e., the data from the same category is reconstructed by the similar dictionary atoms). To perform fast cross-media retrieval, we learn hash functions which map data from the dictionary space to a low-dimensional Hamming space. Besides, we conjecture that a balanced representation is crucial in cross-media retrieval. We introduce multi-view features on the relatively ``weak'' modalities into DCDH and extend it to multi-view DCDH (MV-DCDH) in order to enhance their representation capability. The experiments on two real-world data sets show that our DCDH and MV-DCDH outperform the state-of-the-art methods significantly on cross-media retrieval.

References

M. Aharon, M. Elad, and A. Bruckstein. K-svd: An algorithm for designing overcomplete dictionries for sparse representation. IEEE Trans.Signal Processing, 54(11):4311--4322, 2006. Google ScholarDigital Library
A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459--468, 2006. Google ScholarDigital Library
M. Bronstein, A. Bronstein, F. Michel, and N. Paragios. Data fusion through cross-modality metric learning using similarity-sensitive hashing. In CVPR, pages 3594--3601, 2010.Google ScholarCross Ref
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. The Annals of Sstatistics, 32(2):407--499, 2004.Google ScholarCross Ref
Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, pages 817--824, 2011. Google ScholarDigital Library
K. Jia, X. Tang, and X. Wang. Image transformation based on learning dictionaries across image spaces. IEEE Trans.Pattern Anal. Mach. Intell., 2012. Google ScholarDigital Library
Z. Jiang, G. Zhang, and L. S. Davis. Submodular dictionary learning for sparse coding. In CVPR, pages 3418--3425, 2012. Google ScholarDigital Library
B. Kulis and K. Grauman. Kernelized locality-sensitive hashing for scalable image search. In ICCV, pages 2130--2137, 2009.Google ScholarCross Ref
S. Kumar and R. Udupa. Learning hash functions for cross-view similarity search. In IJCAI, pages 1360--1365, 2011. Google ScholarDigital Library
M.-Y. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa. Entropy rate superpixel segmentation. In CVPR, pages 2097--2104, 2011. Google ScholarDigital Library
W. Liu, J. Wang, S. Kumar, and S. Chang. Hashing with graphs. In ICML, pages 1--8, 2011.Google ScholarDigital Library
Y. Liu, F. Wu, Y. Yi, Y. Zhuang, and A. Hauptman. Spline regression hashing for fast image search. IEEE Trans. Image Processing, 2012.Google Scholar
X. Lu, F. Wu, S. Tang, Z. Zhang, X. He, and Y. Zhuang. A low rank structural large margin method for cross-modal ranking. In SIGIR, pages 433--442. ACM, 2013. Google ScholarDigital Library
G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions i. Mathematical Programming, 14(1):265--294, 1978.Google ScholarDigital Library
M. Ou, P. Cui, F. Wang, J. Wang, W. Zhu, and S. Yang. Comparing apples to oranges: a scalable solution with heterogeneous hashing. In SIGKDD, pages 230--238, 2013. Google ScholarDigital Library
N. Rasiwasia, J. Costa Pereira, E. Coviello, G. Doyle, G. Lanckriet, R. Levy, and N. Vasconcelos. A new approach to cross-modal multimedia retrieval. In ACM MM, pages 251--260, 2010. Google ScholarDigital Library
J. Song, Y. Yang, Z. Huang, H. Shen, and R. Hong. Multiple feature hashing for real-time large scale near-duplicate video retrieval. In ACM MM, pages 423--432, 2011. Google ScholarDigital Library
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267--288, 1996.Google ScholarCross Ref
C. Wang and S. Mahadevan. A general framework for manifold alignment. In AAAI, 2009.Google Scholar
J. Wang, S. Kumar, and S. Chang. Semi-supervised hashing for scalable image retrieval. In CVPR, pages 3424--3431, 2010.Google ScholarCross Ref
Q. Wang, D. Zhang, and L. Si. Semantic hashing using tags and topic modeling. In SIGIR, pages 213--222, 2013. Google ScholarDigital Library
S. Wang, L. Zhang, Y. Liang, and Q. Pan. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. In CVPR, pages 2216--2223, 2012. Google ScholarDigital Library
Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. In NIPS, 2008.Google ScholarDigital Library
J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Trans.Pattern Anal. Mach. Intell., 31(2):210--227, 2009. Google ScholarDigital Library
F. Wu, Z. Yu, Y. Yang, S. Tang, Y. Zhang, and Y. Zhuang. Sparse multi modal hashing. IEEE Trans. Multimedia, 16(2):427--439.Google ScholarDigital Library
D. Zhang, F. Wang, and L. Si. Composite hashing with multiple information sources. In SIGIR, pages 225--234, 2011. Google ScholarDigital Library
D. Zhang, J. Wang, D. Cai, and J. Lu. Self-taught hashing for fast similarity search. In SIGIR, pages 18--25, 2010. Google ScholarDigital Library
Y. Zhen and D. Yeung. A probabilistic model for multimodal hash function learning. In SIGKDD, 2012. Google ScholarDigital Library
Y. Zhen and D.-Y. Yeung. Co-regularized hashing for multimodal data. In NIPS, pages 1385--1393, 2012.Google ScholarDigital Library
X. Zhu, Z. Huang, H. T. Shen, and X. Zhao. Linear cross-modal hashing for efficient multimedia search. In ACM MM, pages 143--152, 2013. Google ScholarDigital Library
Y. Zhuang, Y. Wang, F. Wu, Y. Zhang, and W. Lu. Supervised coupled dictionary learning with group structures for multi-modal retrieval. In AAAI, 2013.Google ScholarDigital Library
Y. Zhuang, Y. Yang, and F. Wu. Mining semantic correlation of heterogeneous multimedia data for cross-media retrieval. IEEE Trans. Multimedia, 10(2):221--229, 2008. Google ScholarDigital Library

Index Terms

Discriminative coupled dictionary hashing for fast cross-media retrieval
1. Information systems
  1. Information retrieval

Recommendations

Modality-Dependent Cross-Media Retrieval
Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers

In this article, we investigate the cross-media retrieval between images and text, that is, using image to search text (I2T) and using text to search images (T2I). Existing cross-media retrieval methods usually learn one couple of projections, by which ...
Read More
Cross-media Relevance Computation for Multimedia Retrieval
MM '17: Proceedings of the 25th ACM international conference on Multimedia

In this paper, we summarize our works for cross-media retrieval where the queries and retrieval content are of different media types. We study cross-media retrieval in the context of two applications, i.e., ~image retrieval by textual queries, and ...
Read More
Semi-supervised modality-dependent cross-media retrieval

In this paper, we propose a modality-dependent cross-media retrieval approach under semi-supervised conditions. The approach utilizes both labeled samples and unlabeled ones to obtain two couples of projection matrices and uses feature distance to ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
July 2014
1330 pages
ISBN:9781450322577
DOI:10.1145/2600428
General Chairs:
Shlomo Geva
Queensland University of Technology
,
Andrew Trotman
University of Dunedin
,
Program Chairs:
Peter Bruza
Queensland University of Technology
,
Charles L.A. Clarke
University of Waterloo
,
Kal Järvelin
University of Tampere
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 3 July 2014
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
coupled dictionary learning
cross-media retrieval
hashing
Qualifiers
- research-article
Conference

Acceptance Rates
SIGIR '14 Paper Acceptance Rate82of387submissions,21%Overall Acceptance Rate792of3,983submissions,20%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 79
  Total Citations
  View Citations
- 1,117
  Total Downloads
- Downloads (Last 12 months)10
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Discriminative coupled dictionary hashing for fast cross-media retrieval

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval

ABSTRACT

References

Cited By

Index Terms

Recommendations

Modality-Dependent Cross-Media Retrieval

Cross-media Relevance Computation for Multimedia Retrieval

Semi-supervised modality-dependent cross-media retrieval