ABSTRACT
In this paper we introduce a sparse kernel learning framework for the Continuous Relevance Model (CRM). State-of-the-art image annotation models linearly combine evidence from several different feature types to improve image annotation accuracy. While previous authors have focused on learning the linear combination weights for these features, no work has examined the optimal combination of kernels. We address this gap by formulating a sparse kernel learning framework for the CRM, dubbed the SKL-CRM, that greedily selects an optimal combination of kernels. Our kernel learning framework rapidly converges to an annotation accuracy that substantially outperforms a host of state-of-the-art annotation models. We draw two surprising conclusions: firstly, if the kernels are chosen correctly, only a very small number of features is required to achieve superior performance over models that utilise a full suite of feature types; and secondly, the standard default selection of kernels commonly used in the literature is sub-optimal, and it is much better to adapt the kernel choice to the feature type and image dataset.
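The greedy selection described above can be illustrated with a minimal sketch. It is not the SKL-CRM procedure itself, whose details the abstract does not give; it shows only generic greedy forward selection over (feature, kernel) pairs. The kernel forms, the names `KERNELS` and `greedy_select`, the `score` callback (standing in for annotation accuracy on held-out data), and the stopping rule are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (feature name, kernel name); both hypothetical labels


def gaussian(x: Sequence[float], y: Sequence[float], bandwidth: float = 1.0) -> float:
    """Gaussian-style kernel on squared Euclidean distance (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / bandwidth)


def laplacian(x: Sequence[float], y: Sequence[float], bandwidth: float = 1.0) -> float:
    """Laplacian-style kernel on L1 distance (assumed form)."""
    l1_dist = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-l1_dist / bandwidth)


# Candidate kernels; this particular set is an assumption of the sketch.
KERNELS = {"gaussian": gaussian, "laplacian": laplacian}


def greedy_select(
    features: List[str],
    score: Callable[[List[Pair]], float],
    max_pairs: int,
) -> List[Pair]:
    """Greedy forward selection of (feature, kernel) pairs.

    At each step, try every unused feature with every candidate kernel,
    keep the pair that gives the highest score, and stop as soon as no
    pair improves on the current best score.
    """
    selected: List[Pair] = []
    best = float("-inf")
    remaining = set(features)
    while remaining and len(selected) < max_pairs:
        candidates = [
            (score(selected + [(f, k)]), f, k)
            for f in sorted(remaining)
            for k in sorted(KERNELS)
        ]
        top_score, f, k = max(candidates)
        if top_score <= best:
            break  # no candidate pair improves the score
        selected.append((f, k))
        best = top_score
        remaining.discard(f)
    return selected
```

In this sketch each feature is paired with at most one kernel, and selection stops early when no candidate helps, which is what keeps the chosen set small. In practice `score` would evaluate the annotation model on a validation set using only the selected pairs.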