Abstract
Over the past decades, the unprecedented availability of diverse types of data, together with parallel advances in technology, has generated extensive interest in applying machine learning approaches to extract implicit patterns, acquire information, and retrieve latent, meaningful knowledge. These powerful statistical tools have been applied across many fields of science.
Healthcare is one of the most vital domains in which these techniques can be deployed. The main motivation is that medical diagnosis procedures and healthcare examinations generate enormous amounts of data of various types, such as text, images, video, and signals. Handling such large, complex data is beyond the scope of human competence. Machine learning tools are therefore significantly valuable: they assist clinicians in processing medical datasets, gaining broader insight, planning and managing diseases, and providing better care, which leads to better outcomes, including the elimination of unnecessary costs and increased patient satisfaction.
In this work, we focus on one of the main clustering methods in machine learning, namely mixture models. These techniques have demonstrated high potential and flexibility in describing data. Gaussian mixture models (GMMs) have been widely applied across many fields of research to model symmetric data. For asymmetric and non-Gaussian data, however, alternatives such as inverted Dirichlet mixture models can describe the data more accurately. To learn our model, we employ an entropy-based variational approach, which we then evaluate on four medical applications.
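The inverted Dirichlet distribution underlying the proposed mixture model is defined on the positive orthant, which makes it a natural fit for semi-bounded, asymmetric data. As a minimal illustrative sketch (not the chapter's implementation; parameter values are arbitrary), its log-density can be evaluated as follows, using SciPy's `gammaln` for numerical stability:

```python
import numpy as np
from scipy.special import gammaln

def inverted_dirichlet_logpdf(x, alpha):
    """Log-density of the inverted Dirichlet distribution.

    x     : length-D array of strictly positive values
    alpha : length-(D+1) array of positive shape parameters
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    a_sum = alpha.sum()
    # Normalising constant: Gamma(sum alpha) / prod_d Gamma(alpha_d)
    log_norm = gammaln(a_sum) - gammaln(alpha).sum()
    # Kernel: prod_d x_d^(alpha_d - 1) * (1 + sum_d x_d)^(-sum alpha)
    return log_norm + np.sum((alpha[:-1] - 1.0) * np.log(x)) - a_sum * np.log1p(x.sum())
```

For D = 1 this density reduces to the beta prime distribution, which provides a convenient sanity check on the normalising constant.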
Acknowledgements
We thank the Natural Sciences and Engineering Research Council of Canada (NSERC), whose support made this research possible.
© 2021 Springer Nature Switzerland AG
Cite this chapter
Manouchehri, N., Rahmanpour, M., Bouguila, N. (2021). Entropy-Based Variational Inference for Semi-Bounded Data Clustering in Medical Applications. In: Masmoudi, M., Jarboui, B., Siarry, P. (eds) Artificial Intelligence and Data Mining in Healthcare. Springer, Cham. https://doi.org/10.1007/978-3-030-45240-7_9
Print ISBN: 978-3-030-45239-1
Online ISBN: 978-3-030-45240-7