ABSTRACT
Due to its low storage cost and fast query speed, hashing has been widely adopted for approximate nearest neighbor search in large-scale datasets. Traditional hashing methods try to learn the hash codes in an unsupervised way where the metric (Euclidean) structure of the training data is preserved. Very recently, supervised hashing methods, which try to preserve the semantic structure constructed from the semantic labels of the training points, have exhibited higher accuracy than unsupervised methods. In this paper, we propose a novel supervised hashing method, called latent factor hashing(LFH), to learn similarity-preserving binary codes based on latent factor models. An algorithm with convergence guarantee is proposed to learn the parameters of LFH. Furthermore, a linear-time variant with stochastic learning is proposed for training LFH on large-scale datasets. Experimental results on two large datasets with semantic labels show that LFH can achieve superior accuracy than state-of-the-art methods with comparable training time.
- A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In Proceedings of the Annual Symposium on Foundations of Computer Science, pages 459--468, 2006. Google ScholarDigital Library
- S. Arya, D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Y. Wu. An optimal algorithm for approximate nearest neighbor searching fixed dimensions. Journal of the ACM, 45(6):891--923, 1998. Google ScholarDigital Library
- T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y. Zheng. Nus-wide: A real-world web image database from national university of singapore. In Proceedings of the ACM International Conference on Image and Video Retrieval, 2009. Google ScholarDigital Library
- M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the Annual Symposium on Computational Geometry, pages 253--262, 2004. Google ScholarDigital Library
- A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In Proceedings of the International Conference on Very Large Data Bases, pages 518--529, 1999. Google ScholarDigital Library
- Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 817--824, 2011. Google ScholarDigital Library
- P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Annual ACM Symposium on Theory of Computing, pages 604--613, 1998. Google ScholarDigital Library
- W. Kong and W.-J. Li. Double-bit quantization for hashing. In Proceedings of the AAAI Conference on Artificial Intelligence, 2012.Google Scholar
- W. Kong and W.-J. Li. Isotropic hashing. In Proceedings of the Annual Conference on Neural Information Processing Systems, pages 1655--1663, 2012.Google Scholar
- W. Kong, W.-J. Li, and M. Guo. Manhattan hashing for large-scale image retrieval. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 45--54, 2012. Google ScholarDigital Library
- A. Krizhevsky. Learning multiple layers of features from tiny images. Master's thesis, University of Toronto, 2009.Google Scholar
- B. Kulis and T. Darrell. Learning to hash with binary reconstructive embeddings. In Proceedings of the Annual Conference on Neural Information Processing Systems, pages 1042--1050, 2009.Google Scholar
- B. Kulis and K. Grauman. Kernelized locality-sensitive hashing for scalable image search. In Proceedings of the IEEE International Conference on Computer Vision, pages 2130--2137, 2009.Google ScholarCross Ref
- B. Kulis, P. Jain, and K. Grauman. Fast similarity search for learned metrics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12):2143--2157, 2009. Google ScholarDigital Library
- K. Lange, D. R. Hunter, and I. Yang. Optimization transfer using surrogate objective functions. Journal of Computational and Graphical Statistics, 9(1):1--20, 2000.Google Scholar
- X. Li, G. Lin, C. Shen, A. van den Hengel, and A. R. Dick. Learning hash functions using column generation. In Proceedings of the International Conference on Machine Learning, pages 142--150, 2013.Google Scholar
- W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang. Supervised hashing with kernels. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 2074--2081, 2012. Google ScholarDigital Library
- W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with graphs. In Proceedings of the International Conference on Machine Learning, 2011.Google Scholar
- M. Norouzi and D. J. Fleet. Minimal loss hashing for compact binary codes. In Proceedings of the International Conference on Machine Learning, pages 353--360, 2011.Google Scholar
- M. Norouzi, D. J. Fleet, and R. Salakhutdinov. Hamming distance metric learning. In Proceedings of the Annual Conference on Neural Information Processing Systems, pages 1070--1078, 2012.Google Scholar
- M. Ou, P. Cui, F. Wang, J. Wang, W. Zhu, and S. Yang. Comparing apples to oranges: A scalable solution with heterogeneous hashing. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 230--238, 2013. Google ScholarDigital Library
- M. Raginsky and S. Lazebnik. Locality-sensitive binary codes from shift-invariant kernels. In Proceedings of the Annual Conference on Neural Information Processing Systems, pages 1509--1517, 2009.Google Scholar
- M. Rastegari, J. Choi, S. Fakhraei, D. Hal, and L. S. Davis. Predictable dual-view hashing. In Proceedings of the International Conference on Machine Learning, pages 1328--1336, 2013.Google Scholar
- R. Salakhutdinov and G. E. Hinton. Semantic hashing. International Journal of Approximate Reasoning, 50(7):969--978, 2009. Google ScholarDigital Library
- A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1349--1380, 2000. Google ScholarDigital Library
- J. Song, Y. Yang, Y. Yang, Z. Huang, and H. T. Shen. Inter-media hashing for large-scale retrieval from heterogeneous data sources. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 785--796, 2013. Google ScholarDigital Library
- B. Stein. Principles of hash-based text retrieval. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 527--534, 2007. Google ScholarDigital Library
- C. Strecha, A. A. Bronstein, M. M. Bronstein, and P. Fua. Ldahash: Improved matching with smaller descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(1):66--78, 2012. Google ScholarDigital Library
- A. Torralba, R. Fergus, and W. T. Freeman. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(11):1958--1970, 2008. Google ScholarDigital Library
- A. Torralba, R. Fergus, and Y. Weiss. Small codes and large image databases for recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2008.Google ScholarCross Ref
- F. Ture, T. Elsayed, and J. J. Lin. No free lunch: Brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 943--952, 2011. Google ScholarDigital Library
- J. Wang, O. Kumar, and S.-F. Chang. Semi-supervised hashing for scalable image retrieval. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 3424--3431, 2010.Google ScholarCross Ref
- J. Wang, S. Kumar, and S.-F. Chang. Sequential projection learning for hashing with compact codes. In Proceedings of the International Conference on Machine Learning, pages 1127--1134, 2010.Google Scholar
- Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. In Proceedings of the Annual Conference on Neural Information Processing Systems, pages 1753--1760, 2008.Google Scholar
- F. Wu, Z. Yu, Y. Yang, S. Tang, Y. Zhang, and Y. Zhuang. Sparse multi-modal hashing. IEEE Transactions on Multimedia, 16(2):427--439, 2014.Google ScholarDigital Library
- B. Xu, J. Bu, Y. Lin, C. Chen, X. He, and D. Cai. Harmonious hashing. In Proceedings of the International Joint Conference on Artificial Intelligence, 2013. Google ScholarDigital Library
- D. Zhai, H. Chang, Y. Zhen, X. Liu, X. Chen, and W. Gao. Parametric local multimodal hashing for cross-view similarity search. In Proceedings of the International Joint Conference on Artificial Intelligence, 2013. Google ScholarDigital Library
- D. Zhang and W.-J. Li. Large-scale supervised multimodal hashing with semantic correlation maximization. In Proceedings of the AAAI Conference on Artificial Intelligence, 2014.Google Scholar
- D. Zhang, F. Wang, and L. Si. Composite hashing with multiple information sources. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 225--234, 2011. Google ScholarDigital Library
- D. Zhang, J. Wang, D. Cai, and J. Lu. Self-taught hashing for fast similarity search. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 18--25, 2010. Google ScholarDigital Library
- Q. Zhang, Y. Wu, Z. Ding, and X. Huang. Learning hash codes for efficient content reuse detection. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 405--414, 2012. Google ScholarDigital Library
- Y. Zhen and D.-Y. Yeung. A probabilistic model for multimodal hash function learning. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 940--948, 2012. Google ScholarDigital Library
Index Terms
- Supervised hashing with latent factor models
Recommendations
Semi-supervised manifold-embedded hashing with joint feature representation and classifier learning
We propose a semi-supervised hashing method which uses very limited labeled data.We integrate manifold embedding, feature representation and classifier learning into a joint optimization framework.We adopt the l2,1-norm in our formulation to obtain a ...
Semi-Supervised Hashing for Large-Scale Search
Hashing-based approximate nearest neighbor (ANN) search in huge databases has become popular due to its computational and memory efficiency. The popular hashing methods, e.g., Locality Sensitive Hashing and Spectral Hashing, construct hash functions ...
Semi-supervised Hashing with Semantic Confidence for Large Scale Visual Search
SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information RetrievalSimilarity search is one of the fundamental problems for large scale multimedia applications. Hashing techniques, as one popular strategy, have been intensively investigated owing to the speed and memory efficiency. Recent research has shown that ...
Comments