Online Multi-modal Hashing with Dynamic Query-adaption

Authors:
Xu Lu, Shandong Normal University, Jinan, China
Lei Zhu, Shandong Normal University, Jinan, China
Zhiyong Cheng, Qilu University of Technology, Jinan, China
Liqiang Nie, Shandong University, Jinan, China
Huaxiang Zhang, Shandong Normal University, Jinan, China

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, July 2019, Pages 715–724. https://doi.org/10.1145/3331184.3331217

Published: 18 July 2019

ABSTRACT

Multi-modal hashing is an effective technique for large-scale multimedia retrieval because it encodes heterogeneous multi-modal features into compact, similarity-preserving binary codes. Although great progress has been made, existing methods still suffer from several problems: 1) All existing methods adopt fixed modality combination weights in the online hashing process to generate the query hash codes, a strategy that cannot adapt to the variations among queries. 2) They either capture insufficient semantics (unsupervised methods) or incur high computation and storage costs (supervised methods, which rely on a pair-wise semantic matrix). 3) They solve for the hash codes with either a relaxed optimization strategy, which causes significant quantization loss, or bit-by-bit discrete optimization, which consumes considerable computation time. To address these limitations, we propose Online Multi-modal Hashing with Dynamic Query-adaption (OMH-DQ). Specifically, a self-weighted fusion strategy is designed to adaptively preserve the multi-modal feature information in the hash codes by exploiting the complementarity of the modalities. The hash codes are learned under the supervision of pair-wise semantic labels to enhance their discriminative capability, while avoiding the challenging symmetric similarity matrix factorization. Under this learning framework, the binary hash codes can be obtained directly with efficient operations and without quantization errors. Our method therefore benefits from the semantic labels while avoiding high computational complexity. Moreover, to accurately capture query variations at the online retrieval stage, we design a parameter-free online hashing module that adaptively learns the query hash codes according to the dynamic query contents. Extensive experiments demonstrate the state-of-the-art performance of the proposed approach from various aspects.
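
To make the idea of query-adaptive modality weights concrete, the following is a minimal sketch of one common parameter-free self-weighting scheme. It is not the OMH-DQ algorithm: the abstract does not give the objective or the update rules, so everything here is an assumption made for illustration. The sketch assumes each modality of a query has already been mapped to a real-valued vector of code length (the `projections` input, a hypothetical name), and it alternates two steps: binarize the weighted sum of the projections, then reset each modality's weight to be inversely proportional to its distance from the current code. The function names `sign`, `hamming`, and `adaptive_query_code` are likewise hypothetical.

```python
import math


def sign(values):
    """Binarize a real-valued vector to {-1, +1} (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def adaptive_query_code(projections, n_iters=5, eps=1e-12):
    """Illustrative self-weighted fusion for a single query (an assumption,
    not the paper's method).

    projections: one real-valued vector per modality, all of the same length.
    Returns the binary query code and the per-modality weights.
    """
    n_modalities = len(projections)
    code_len = len(projections[0])
    # Start from equal weights, i.e. the fixed-weight baseline.
    weights = [1.0 / n_modalities] * n_modalities
    code = [1] * code_len
    for _ in range(n_iters):
        # Step 1: with weights fixed, the best binary code is the sign of
        # the weighted sum of the modality projections.
        fused = [
            sum(w * p[k] for w, p in zip(weights, projections))
            for k in range(code_len)
        ]
        code = sign(fused)
        # Step 2: with the code fixed, re-weight each modality by the
        # inverse of its Euclidean distance to the code, then normalize.
        residuals = [
            math.sqrt(sum((c - v) ** 2 for c, v in zip(code, p)))
            for p in projections
        ]
        inverse = [1.0 / (r + eps) for r in residuals]
        total = sum(inverse)
        weights = [v / total for v in inverse]
    return code, weights


# Toy query: modality 0 is confident, modality 1 is weak and partly disagrees.
projections = [
    [0.9, -0.8, 0.7, -0.9],
    [0.1, 0.2, -0.1, 0.05],
]
code, weights = adaptive_query_code(projections)
print(code, [round(w, 3) for w in weights])
```

In this toy query the weights are recomputed from the query itself, so a modality that agrees poorly with the fused code contributes less, and no weighting hyperparameter has to be tuned. Retrieval would then rank database items by `hamming` distance to the returned code.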



Index Terms

Online Multi-modal Hashing with Dynamic Query-adaption
1. Information systems
  1. Information systems applications
    1. Data mining
      1. Nearest-neighbor search
    2. Multimedia information systems


Published in
SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2019
1512 pages
ISBN: 9781450361729
DOI: 10.1145/3331184
General Chairs: Benjamin Piwowarski (CNRS - Sorbonne Universite, France), Max Chevalier (Universite de Toulouse, CNRS, France), Eric Gaussier (Universite Grenoble Alpes, CNRS, France)
Program Chairs: Yoelle Maarek (Amazon Research, Israel), Jian-Yun Nie (University of Montreal, Canada), Falk Scholer (RMIT University, Australia)
Copyright © 2019 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
dynamic query-adaption
efficient discrete optimization
online multi-modal hashing
self-weighted
Qualifiers
- research-article
Acceptance Rates
SIGIR'19 Paper Acceptance Rate: 84 of 426 submissions, 20%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%