ABSTRACT
One of the fundamental problems in image search is to rank image documents according to a given textual query. Existing search engines depend heavily on surrounding texts to rank images, or leverage query-image pairs annotated by human labelers to train a series of ranking functions. Each approach has a major limitation: 1) the surrounding texts are often too noisy or too sparse to describe the image content accurately, and 2) human annotations are expensive to obtain and thus cannot be scaled up. We demonstrate in this paper that these two challenges can be mitigated by jointly exploring cross-view learning and the use of click-through data. The former aims to create a latent subspace in which information from the originally incomparable views (i.e., textual and visual views) can be compared, while the latter exploits the widely available and freely accessible click-through data (i.e., "crowdsourced" human intelligence) for query understanding. Specifically, we propose a novel cross-view learning method for image search, named Click-through-based Cross-view Learning (CCL), which jointly minimizes the distance between the mappings of query and image in the latent subspace and preserves the inherent structure of each original space. On a large-scale click-based image dataset, CCL improves over a Support Vector Machine-based method by 4.0% in terms of relevance, while reducing the feature dimension by several orders of magnitude (e.g., from thousands to tens). Moreover, the experiments demonstrate that CCL outperforms several state-of-the-art subspace learning techniques.
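To make the idea of a joint objective concrete, the following is a minimal sketch in Python with NumPy. It is not the formulation from the paper. The abstract does not give the objective's exact form, so the click-count weighting, the graph-Laplacian smoothness term used for "preserving the inherent structure", the trade-off weight `lam`, and all function and variable names are assumptions made for illustration.

```python
import numpy as np

def ccl_style_objective(Q, V, C, Wq, Wv, Sq, Sv, lam=0.1):
    """Hypothetical joint objective (an assumption, not the paper's formula).

    Q:  (n_q, d_q) query features      V:  (n_v, d_v) image features
    C:  (n_q, n_v) click counts        Wq, Wv: linear maps into a shared subspace
    Sq: (n_q, n_q) symmetric query similarity graph
    Sv: (n_v, n_v) symmetric image similarity graph
    """
    Zq = Q @ Wq  # queries mapped into the latent subspace
    Zv = V @ Wv  # images mapped into the latent subspace

    # Squared distance between every query and every image in the subspace.
    d2 = ((Zq[:, None, :] - Zv[None, :, :]) ** 2).sum(axis=2)
    # Frequently clicked pairs are pulled closer together.
    click_term = (C * d2).sum()

    def smoothness(Z, S):
        # Laplacian regularizer: neighbors in the original space stay close.
        L = np.diag(S.sum(axis=1)) - S
        return np.trace(Z.T @ L @ Z)

    return click_term + lam * (smoothness(Zq, Sq) + smoothness(Zv, Sv))

def rank_images(q, V, Wq, Wv):
    """Rank images for one query by distance in the latent subspace."""
    zq = q @ Wq
    Zv = V @ Wv
    dist = ((Zv - zq) ** 2).sum(axis=1)
    return np.argsort(dist)  # indices of images, nearest first
```

Under these assumptions, once the two mappings have been learned, ranking reduces to a nearest-neighbor lookup in a low-dimensional space, which is in line with the dimensionality reduction the abstract reports.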
Index Terms
- Click-through-based cross-view learning for image search