ABSTRACT
To address the scalability of cross-view retrieval on large-scale databases, we propose SimH, a supervised cross-view hashing framework that preserves the semantic similarities of objects in Hamming space. SimH generates one unified hash code for all views of an object. For offline training, SimH first uses the similarity matrix of the training objects to learn similarity-preserving hash codes, and then learns hash functions for each view that map features to these hash codes; any predictive model can serve as a hash function. Afterwards, the hash codes learnt during training are discarded. For online hash encoding, given an unseen object, the learnt hash functions for each of its observed views first predict view-specific hashing results, and a novel expected-value-based combining strategy then merges them to determine the unified hash code. Experiments on benchmark datasets show that SimH outperforms several state-of-the-art cross-view hashing methods.
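The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the optimisation details, so everything below is an assumption made for illustration and not the authors' method: codes are obtained with a simple spectral relaxation (sign of the top eigenvectors of the similarity matrix), each view's hash function is a ridge regressor, and the "expected value" of a bit is taken as `tanh(score)`, which is the mean of a ±1 bit whose probability of being +1 is `sigmoid(2 * score)`. The function names `learn_codes`, `fit_view`, `predict_expected_bits`, and `encode` are hypothetical.

```python
import numpy as np

def learn_codes(S, n_bits):
    """Assumed stage 1: similarity-preserving codes via spectral relaxation."""
    # S is a symmetric n x n similarity matrix (e.g. +1 similar, -1 dissimilar).
    _, eigvecs = np.linalg.eigh(S)        # eigenvalues are returned in ascending order
    top = eigvecs[:, -n_bits:]            # eigenvectors of the n_bits largest eigenvalues
    return np.where(top >= 0, 1, -1)      # binarise to {-1, +1}

def fit_view(X, B, reg=1e-3):
    """Assumed stage 2: ridge regression from one view's features to the code bits."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    gram = Xb.T @ Xb + reg * np.eye(Xb.shape[1])    # regularised Gram matrix
    return np.linalg.solve(gram, Xb.T @ B)          # weights, shape (d + 1, n_bits)

def predict_expected_bits(W, x):
    """Per-bit expected value in [-1, 1] for one feature vector of one view."""
    score = np.append(x, 1.0) @ W
    # tanh(s) = 2 * sigmoid(2 * s) - 1, the mean of a +/-1 bit with P(+1) = sigmoid(2 * s).
    return np.tanh(score)

def encode(models, views):
    """Assumed combining rule: average expected bits over observed views, then take the sign."""
    expected = [predict_expected_bits(models[name], x) for name, x in views.items()]
    mean = np.mean(expected, axis=0)
    return np.where(mean >= 0, 1, -1)
```

Here `models` maps a view name to the weights returned by `fit_view`, and `views` maps the names of the observed views of a query object to its feature vectors. A query with only one observed view is encoded from that view alone, which mirrors the abstract's point that a single unified code is produced regardless of which views are available.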
- SimH: A Supervised Cross-View Hashing Framework Preserving Semantic Similarities in Hamming Space