A two-stage hybrid probabilistic topic model for refining image annotation

Tian, Dongping; Shi, Zhongzhi

doi:10.1007/s13042-019-00983-w

Original Article
Published: 20 July 2019

Volume 11, pages 417–431, (2020)

International Journal of Machine Learning and Cybernetics

Dongping Tian¹ &
Zhongzhi Shi²

Abstract

Refining image annotation has become a core research topic in computer vision and pattern recognition because of its great potential in image retrieval. However, the field is still in its infancy, and current methods cannot reliably extract accurate semantic concepts from low-level image features alone. In this paper, we propose a two-stage hybrid probabilistic topic model to improve the quality of automatic image annotation. First, a probabilistic latent semantic analysis (PLSA) model with asymmetric modalities is learned to estimate the posterior probability of each annotation keyword, which establishes the image-to-word relation. Next, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels. In this way, information from low-level visual features and high-level semantic concepts is integrated by taking into account both the word-to-word and image-to-image relations. Finally, the rank-two relaxation heuristic is exploited to further mine the correlation among the candidate annotations and obtain the refined results, which play a critical role in semantic-based image retrieval. Extensive experiments show that the proposed model achieves not only superior annotation accuracy but also better retrieval performance.
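The graph construction step described in the abstract can be illustrated with a minimal sketch. It assumes two precomputed, symmetric similarity matrices over the same set of candidate labels and a mixing weight `alpha`; the function name, the parameter name, and the default weight are hypothetical and are not taken from the paper, whose exact similarity definitions are not given in this abstract.

```python
import numpy as np


def fuse_similarities(label_sim, visual_sim, alpha=0.5):
    """Combine two label-by-label similarity matrices into one graph.

    label_sim  : word-to-word similarity between candidate labels.
    visual_sim : similarity between the images associated with each label.
    alpha      : hypothetical mixing weight in [0, 1] (an assumption here).
    """
    label_sim = np.asarray(label_sim, dtype=float)
    visual_sim = np.asarray(visual_sim, dtype=float)
    if label_sim.shape != visual_sim.shape:
        raise ValueError("similarity matrices must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Weighted linear combination of the two similarity sources.
    fused = alpha * label_sim + (1.0 - alpha) * visual_sim

    # Drop self-loops so the result can serve as a graph adjacency matrix.
    np.fill_diagonal(fused, 0.0)
    return fused


# Toy example with three candidate labels (values are made up).
label_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
visual_sim = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.4], [0.3, 0.4, 1.0]]
graph = fuse_similarities(label_sim, visual_sim, alpha=0.5)
```

Setting `alpha` closer to 1 makes the graph rely more on word-to-word similarity, while values closer to 0 favor image-to-image similarity; how the weight is chosen in the paper is not stated in this abstract.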

Notes

Here label means the initial annotation generated by the PLSA.
http://vision.sista.arizona.edu/kobus/research/data/eccv_2002/index.html
http://appsrv.cse.cuhk.edu.hk/~jkzhu/felib.html
Downloaded from http://press.liacs.nl/mirflickr/dlform.php

Acknowledgements

The authors would like to sincerely thank the editor and anonymous reviewers for their valuable comments and insightful suggestions that have helped us to improve the paper. Also, the authors thank Prof. Xiaofei Zhao for stimulating discussions and helpful hints. In addition, this work is fully supported by the National Program on Key Basic Research Project (973 Program) (No. 2013CB329502), National Natural Science Foundation of China (No. 61035003, No. 61202212), Tianchenghuizhi Fund for Innovation and Promotion of Education (No. 2018A03036) and Key R&D Program of the Shaanxi Province of China (No. 2018GY-037).

Author information

Authors and Affiliations

Institute of Computer Software, Baoji University of Arts and Sciences, Baoji, 721007, Shaanxi, People’s Republic of China
Dongping Tian
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190, Beijing, People’s Republic of China
Zhongzhi Shi

Corresponding author

Correspondence to Dongping Tian.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tian, D., Shi, Z. A two-stage hybrid probabilistic topic model for refining image annotation. Int. J. Mach. Learn. & Cyber. 11, 417–431 (2020). https://doi.org/10.1007/s13042-019-00983-w

Received: 06 October 2018
Accepted: 10 July 2019
Published: 20 July 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s13042-019-00983-w
