Abstract
In a short period of time, many areas of science have made a sharp transition towards data-dependent methods. In some cases, this process has been enabled by simultaneous advances in data acquisition and the development of networked system technologies. This new situation is particularly clear in the life sciences, where data overabundance has sparked a flurry of new methodologies for data management and analysis. This can be seen as a perfect scenario for the use of machine learning and computational intelligence techniques to address problems in which more traditional data analysis approaches might struggle. But this scenario also poses some serious challenges. One of them is model interpretability and explainability, especially for complex nonlinear models. In areas such as medicine and health care, failure to address this challenge might seriously limit the adoption, in real practice, of computer-based systems that rely on machine learning and computational intelligence methods for data analysis. In this paper, we reflect on recent investigations into the interpretability and explainability of machine learning methods and discuss their impact on medicine and health care. We pay specific attention to one of the ways in which interpretability and explainability can be addressed in this context, namely through data and model visualization. We argue that, beyond improving model interpretability as a goal in itself, we need to involve medical experts in the design of data analysis interpretation strategies. Otherwise, machine learning is unlikely to become part of routine clinical and health care practice.
Acknowledgements
This work was funded by the Spanish MINECO project TIN2016-79576-R.
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
This appendix includes a self-contained summary of research publications in the form of two tables. Table 1 covers a selection of general references on deep learning (DL) methods applied to biomedicine, while Table 2 focuses only on studies that deal with the problem of interpretability of DL methods applied in the field.
Cite this article
Vellido, A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput & Applic 32, 18069–18083 (2020). https://doi.org/10.1007/s00521-019-04051-w