research-article

Click-through-based cross-view learning for image search

Authors:
Yingwei Pan, University of Science and Technology of China, Hefei, China
Ting Yao, City University of Hong Kong, Hong Kong, Hong Kong
Tao Mei, Microsoft Research, Beijing, China
Houqiang Li, University of Science and Technology of China, Hefei, China
Chong-Wah Ngo, City University of Hong Kong, Hong Kong, Hong Kong
Yong Rui, Microsoft Research, Beijing, China

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, July 2014, Pages 717–726
https://doi.org/10.1145/2600428.2609568

Published: 03 July 2014

ABSTRACT

One of the fundamental problems in image search is to rank image documents according to a given textual query. Existing search engines depend heavily on surrounding text to rank images, or use query-image pairs annotated by human labelers to train a series of ranking functions. However, each approach has a major limitation: 1) surrounding text is often noisy or too sparse to describe the image content accurately, and 2) human annotation is expensive and therefore does not scale. We demonstrate in this paper that these two challenges can be mitigated by jointly exploring cross-view learning and click-through data. The former creates a latent subspace in which information from the originally incomparable views (i.e., the textual and visual views) can be compared directly, while the latter exploits widely available and freely accessible click-through data (i.e., "crowdsourced" human intelligence) for query understanding. Specifically, we propose a novel cross-view learning method for image search, named Click-through-based Cross-view Learning (CCL), which jointly minimizes the distance between the mappings of query and image in the latent subspace and preserves the inherent structure of each original space. On a large-scale click-based image dataset, CCL improves over a Support Vector Machine-based method by 4.0% in terms of relevance, while reducing the feature dimension by several orders of magnitude (e.g., from thousands to tens). Moreover, the experiments demonstrate that CCL outperforms several state-of-the-art subspace learning techniques.
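
The two terms of the objective described in the abstract can be illustrated with a small sketch. The abstract does not state the exact formulation, so everything below is an assumption made for illustration only: linear projections `Wq` and `Wv`, click counts used as weights on squared query-image distances, a graph-Laplacian smoothness term standing in for "preserving the inherent structure", and the trade-off weight `lam`. All function and parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np


def ccl_style_objective(Q, V, clicks, Wq, Wv, Sq, Sv, lam=1.0):
    """Evaluate an assumed CCL-style objective (illustrative, not the paper's).

    Q: (n_q, d_q) query features      V: (n_v, d_v) image features
    clicks: (n_q, n_v) non-negative click counts for query-image pairs
    Wq: (d_q, k), Wv: (d_v, k) linear maps into a shared k-dim subspace
    Sq: (n_q, n_q), Sv: (n_v, n_v) symmetric affinity matrices (assumed)
    lam: assumed trade-off weight between the two terms
    """
    Zq = Q @ Wq  # queries mapped into the latent subspace
    Zv = V @ Wv  # images mapped into the latent subspace

    # Squared Euclidean distance between every query and every image.
    d2 = ((Zq[:, None, :] - Zv[None, :, :]) ** 2).sum(axis=-1)
    # Clicked pairs are pulled together, weighted by their click counts.
    click_term = (clicks * d2).sum()

    def smoothness(Z, S):
        # For symmetric S, trace(Z^T L Z) = 0.5 * sum_ij S_ij * ||z_i - z_j||^2,
        # where L = D - S is the unnormalized graph Laplacian.
        L = np.diag(S.sum(axis=1)) - S
        return np.trace(Z.T @ L @ Z)

    structure_term = smoothness(Zq, Sq) + smoothness(Zv, Sv)
    return click_term + lam * structure_term


def rank_images(q, V, Wq, Wv):
    """Rank images for one query by latent-space distance, closest first."""
    zq = q @ Wq
    Zv = V @ Wv
    dist = np.linalg.norm(Zv - zq, axis=1)
    return np.argsort(dist, kind="stable")
```

In this sketch, learning would amount to choosing `Wq` and `Wv` that make the objective small under some constraint that rules out the trivial all-zero solution; the abstract does not say which constraint or optimizer is used, so none is shown here. Once the projections are fixed, `rank_images` orders candidate images for a query purely by distance in the low-dimensional subspace.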

Index Terms

Click-through-based cross-view learning for image search
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published in
SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
July 2014
1330 pages
ISBN:9781450322577
DOI:10.1145/2600428
General Chairs: Shlomo Geva (Queensland University of Technology), Andrew Trotman (University of Dunedin)
Program Chairs: Peter Bruza (Queensland University of Technology), Charles L.A. Clarke (University of Waterloo), Kal Järvelin (University of Tampere)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
click-through data
cross-view learning
dnn image representation
image search
multi-view embedding
subspace learning
Qualifiers
- research-article
Acceptance Rates
SIGIR '14 Paper Acceptance Rate: 82 of 387 submissions, 21%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%