Abstract
This article defines a new hashing task motivated by real-world applications in content-based image retrieval: effective data indexing and retrieval given a mixed query (a query image together with user-provided keywords). Our work is distinguished from state-of-the-art hashing research by two features: (1) unlike in conventional image retrieval systems, the input query is a combination of an exemplar image and several descriptive keywords, and (2) the input image data are often associated with multiple labels, an assumption that is more consistent with realistic scenarios. The mixed image-keyword query significantly extends the traditional image-based query and better conveys the user's intention. At the same time, it complicates semantics-based indexing on multilabel data. Although several existing hashing methods can be adapted to the indexing task, they all suffer from low effectiveness. To enhance the hashing efficiency, we propose a novel scheme, “boosted shared hashing”. Unlike prior works that learn the hashing functions on either all image labels or a single label, we observe that a hashing function can be more effective if it is designed to index over an optimal label subset. In other words, the association between labels and hash bits is moderately sparse. This sparsity greatly reduces the computation and storage costs of indexing a new sample, since only a limited number of hashing functions become active for that sample. We develop a boosting-style algorithm that jointly optimizes the label subsets and the hashing functions in a unified formulation, and we further propose a query-adaptive retrieval mechanism based on hash bit selection for mixed queries, whether or not the query words exist in the training data.
Moreover, we show that the proposed method can be easily extended to the case where data similarity is gauged by nonlinear kernel functions. Extensive experiments are conducted on standard image benchmarks such as CIFAR-10, NUS-WIDE, and a-TRECVID. The results validate both the sparsity of the bit-label association and the convergence of the proposed algorithm, and demonstrate that the proposed hashing scheme substantially outperforms state-of-the-art methods under the same hash bit budget.
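The idea of query-adaptive hash bit selection over a sparse bit-label association can be illustrated with a small sketch. This is not the algorithm of the article: the association matrix `assoc`, the scoring rule (counting how many query labels each bit is associated with), the function names, and the toy data are all assumptions made for illustration, and the handling of query words absent from the training data is not covered.

```python
import numpy as np

def select_bits(assoc, query_labels, budget):
    """Return up to `budget` bit indices most associated with the query labels.

    assoc is a (num_bits x num_labels) 0/1 matrix; this layout is an assumption.
    """
    scores = assoc[:, query_labels].sum(axis=1)
    order = np.argsort(-scores, kind="stable")  # highest score first, ties by index
    return [int(b) for b in order[:budget] if scores[b] > 0]

def rank_by_hamming(db_codes, query_code, bits):
    """Rank database items by Hamming distance restricted to the selected bits."""
    dists = (db_codes[:, bits] != query_code[bits]).sum(axis=1)
    return np.argsort(dists, kind="stable"), dists

# Hypothetical sparse association: 6 hash bits, 3 labels.
assoc = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
])

# Hypothetical binary codes for 4 database images and one query image.
db_codes = np.array([
    [1, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
])
query_code = np.array([1, 1, 0, 1, 1, 1])

# The query keywords map to labels 0 and 1; keep at most 3 bits.
bits = select_bits(assoc, query_labels=[0, 1], budget=3)
ranking, dists = rank_by_hamming(db_codes, query_code, bits)
print(bits, ranking.tolist(), dists.tolist())
```

Because only the selected bits are compared, bits unrelated to the query keywords (here, bits 3 to 5) do not influence the ranking, which is the intuition behind adapting the code to the query.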