short-paper

Dictionary Learning Based Hashing for Cross-Modal Retrieval

Author:
Xin-Shun Xu

Shandong University, Jinan, China


MM '16: Proceedings of the 24th ACM international conference on Multimedia, October 2016, Pages 177–181
https://doi.org/10.1145/2964284.2967206

Published: 01 October 2016

ABSTRACT

Recent years have witnessed the growing popularity of cross-modal hashing for fast multi-modal data retrieval. Most existing cross-modal hashing methods project heterogeneous data directly into a common space using linear projection matrices. However, such a scheme can incur large errors, because some semantically similar heterogeneous data are hard to bring close together in the latent space when only a linear projection is used. In this paper, we propose dictionary learning cross-modal hashing (DLCMH) to perform cross-modal similarity search. Instead of projecting data directly, DLCMH learns dictionaries and generates a sparse representation for each instance, which is better suited to projection into the latent space. It then assumes that all modalities of one instance share identical hash codes, and obtains the final binary codes by minimizing the quantization error. Experimental results on two real-world datasets show that DLCMH outperforms or is comparable to several state-of-the-art hashing models.
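
The pipeline described in the abstract (learn a dictionary per modality, compute sparse codes, map them into a shared latent space, then binarize) can be illustrated with a minimal NumPy sketch. This is not the paper's objective or optimization procedure, which the abstract does not give. The ISTA sparse coder, the least-squares dictionary update, the SVD-based shared latent space, and the ridge-regression projections are all stand-ins chosen for brevity, and every function name, parameter, and the toy data below are hypothetical.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_codes(X, D, lam=0.1, n_iter=50):
    """ISTA for min_S 0.5*||X - S D||_F^2 + lam*||S||_1 with D fixed.

    X is (n, d), D is (k, d) with atoms as rows, S is (n, k).
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    S = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (S @ D - X) @ D.T
        S = soft_threshold(S - step * grad, step * lam)
    return S


def learn_dictionary(X, n_atoms, lam=0.1, n_outer=10, seed=0):
    """Alternate sparse coding and a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_atoms, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    for _ in range(n_outer):
        S = sparse_codes(X, D, lam)
        D = np.linalg.lstsq(S, X, rcond=None)[0]  # solves S D ~= X for D
        D /= np.linalg.norm(D, axis=1, keepdims=True) + 1e-12
    return D


def shared_codes(S_list, n_bits):
    """One real-valued latent vector and one binary code per instance.

    Assumption: the shared space is the top singular subspace of the
    concatenated sparse codes (a stand-in, not the paper's formulation).
    """
    Z = np.hstack(S_list)
    U, sv, _ = np.linalg.svd(Z, full_matrices=False)
    V = U[:, :n_bits] * sv[:n_bits]
    # For fixed V, sign(V) minimizes ||B - V||_F over B in {-1, +1}.
    B = np.where(V >= 0, 1, -1)
    return V, B


def fit_projection(S, V, ridge=1e-3):
    """Ridge regression from one modality's sparse codes to the latent space."""
    k = S.shape[1]
    return np.linalg.solve(S.T @ S + ridge * np.eye(k), S.T @ V)


def encode(X, D, W, lam=0.1):
    """Hash unseen instances of one modality: sparse-code, project, binarize."""
    return np.where(sparse_codes(X, D, lam) @ W >= 0, 1, -1)


# Toy paired data: two "modalities" generated from the same latent factors.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 8))
X_img = latent @ rng.standard_normal((8, 32))
X_txt = latent @ rng.standard_normal((8, 20))

D_img = learn_dictionary(X_img, n_atoms=16)
D_txt = learn_dictionary(X_txt, n_atoms=16, seed=1)
S_img, S_txt = sparse_codes(X_img, D_img), sparse_codes(X_txt, D_txt)

V, B = shared_codes([S_img, S_txt], n_bits=8)
W_img, W_txt = fit_projection(S_img, V), fit_projection(S_txt, V)

q_img = encode(X_img[:5], D_img, W_img)
q_txt = encode(X_txt[:5], D_txt, W_txt)
print(B.shape, q_img.shape, q_txt.shape)
```

In this sketch the training codes `B` are shared by both modalities of each instance, mirroring the identical-hash-code assumption in the abstract, while `encode` shows how a query from a single modality would be hashed into the same code space for Hamming-distance retrieval.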

Index Terms

Dictionary Learning Based Hashing for Cross-Modal Retrieval
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Visual content-based indexing and retrieval
  2. Machine learning

Published in
MM '16: Proceedings of the 24th ACM international conference on Multimedia
October 2016
1542 pages
ISBN:9781450336031
DOI:10.1145/2964284
General Chairs: Alan Hanjalic (Delft University of Technology), Cees Snoek (Qualcomm Research Netherlands / University of Amsterdam), Marcel Worring (University of Amsterdam)
Moderator: Dick Bulterman (CWI / VU University Amsterdam)
Program Chairs: Benoit Huet (EURECOM), Aisling Kelliher (Virginia Tech), Yiannis Kompatsiaris (CERTH-ITI), Jin Li (Microsoft)
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cross-modal
dictionary learning
hashing
sparse representation
Qualifiers
- short-paper
Acceptance Rates
MM '16 Paper Acceptance Rate: 52 of 237 submissions, 22%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%