The importance of interpretability and visualization in machine learning for applications in medicine and health care

  • WSOM 2017
  • Published in Neural Computing and Applications

Abstract

In a short period of time, many areas of science have made a sharp transition towards data-dependent methods. In some cases, this process has been enabled by simultaneous advances in data acquisition and the development of networked system technologies. This new situation is particularly clear in the life sciences, where data overabundance has sparked a flurry of new methodologies for data management and analysis. This can be seen as a perfect scenario for the use of machine learning and computational intelligence techniques to address problems in which more traditional data analysis approaches might struggle. But this scenario also poses some serious challenges. One of them is model interpretability and explainability, especially for complex nonlinear models. In areas such as medicine and health care, failing to address this challenge might seriously limit the chances that computer-based systems relying on machine learning and computational intelligence methods for data analysis will be adopted in real practice. In this paper, we reflect on recent investigations into the interpretability and explainability of machine learning methods and discuss their impact on medicine and health care. We pay specific attention to one of the ways in which interpretability and explainability can be addressed in this context, namely through data and model visualization. We argue that, beyond improving model interpretability as a goal in itself, we need to integrate medical experts in the design of data analysis interpretation strategies. Otherwise, machine learning is unlikely to become part of routine clinical and health care practice.
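To make the visualization route concrete, the following minimal Python sketch computes a gradient-based saliency profile for a single prediction, in the spirit of the saliency-map visualizations surveyed in the references (e.g., [31]); the tiny risk model, the 20 hypothetical clinical features and the synthetic patient record are illustrative assumptions, not the method of any study cited here.

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    # Hypothetical risk model: 20 clinical features -> one outcome score.
    model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))
    model.eval()

    # One synthetic patient record; requires_grad lets us differentiate
    # the prediction with respect to each input feature.
    x = torch.randn(1, 20, requires_grad=True)
    score = model(x).squeeze()
    score.backward()

    # Saliency = |d(score)/d(feature)|: a simple local view of which
    # inputs most influence this particular prediction.
    saliency = x.grad.abs().squeeze()

    plt.bar(range(20), saliency.tolist())
    plt.xlabel("feature index")
    plt.ylabel("|gradient| (saliency)")
    plt.title("Per-feature saliency for one prediction (illustrative)")
    plt.show()

Such a plot is only one of many visualization devices covered by the references below (attention weights [53], Grad-CAM heatmaps [30], nomograms [60]); whether the resulting display is meaningful to a clinician is precisely the question this paper argues must be answered with the medical expert in the loop.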

[Figures 1 (adapted from [9]), 2 and 3 not reproduced here.]

References

  1. Wu Q, Zhu Y, Wang X, Li M, Hou J, Masoumi A (2017) Exploring high efficiency hardware accelerator for the key algorithm of Square Kilometer Array telescope data processing. In: Proceedings of the IEEE 25th annual international symposium on field-programmable custom computing machines (FCCM), p 195

  2. Britton D, Lloyd SL (2014) How to deal with petabytes of data: the LHC Grid project. Rep Prog Phys 77(6):065902

  3. Adam-Bourdarios C, Cowan G, Germain-Renaud C, Guyon I, Kégl B, Rousseau D (2015) The Higgs machine learning challenge. J Phys Conf Ser 664(7):072015

  4. Leonelli S (2016) Data-centric biology: a philosophical study. University of Chicago Press, Chicago

  5. Kashyap H, Ahmed HA, Hoque N, Roy S, Bhattacharyya DK (2015) Big data analytics in bioinformatics: a machine learning perspective. arXiv preprint arXiv:1506.05101

  6. Marx V (2013) Biology: the big challenges of big data. Nature 498(7453):255–260

  7. Stein LD (2010) The case for cloud computing in genome informatics. Genome Biol 11(5):207

  8. Stephens ZD, Lee SY, Faghri F, Campbell RH, Zhai C, Efron MJ, Iyer R, Schatz MC, Sinha S, Robinson GE (2015) Big data: Astronomical or genomical? PLoS Biol 13(7):e1002195

  9. Vellido A, Martín-Guerrero JD, Lisboa PJG (2012) Making machine learning models interpretable. In: Proceedings of the 20th European symposium on artificial neural networks, computational intelligence and machine learning (ESANN), Bruges, Belgium, pp 163–172

  10. Dong Y, Su H, Zhu J, Bao F (2017) Towards interpretable deep neural networks by leveraging adversarial examples. arXiv preprint arXiv:1708.05493

  11. Shwartz-Ziv R, Tishby N (2017) Opening the black box of deep neural networks via information. arXiv preprint arXiv:1703.00810v3

  12. Biran O, Cotton C (2017) Explanation and justification in machine learning: a survey. In: IJCAI-17 workshop on explainable AI (XAI), p 8

  13. Pereira-Fariña M, Reed C (2017) Preface to proceedings of the 1st workshop on explainable computational intelligence (XCI 2017)

  14. Adadi A, Berrada M (2018) Peeking inside the black-box: a survey on explainable Artificial Intelligence (XAI). IEEE Access 6:52138–52160

  15. Doshi-Velez F, Kim B (2017) Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608

  16. Doshi-Velez F, Kortz M, Budish R, Bavitz C, Gershman S, O’Brien D, Schieber S, Waldo J, Weinberger D, Wood A (2017) Accountability of AI under the law: the role of explanation. arXiv preprint arXiv:1711.01134

  17. Vignard K (2014) The weaponization of increasingly autonomous technologies: considering how meaningful human control might move discussion forward. UNIDIR Resour 2:1

  18. Davison N (2018) A legal perspective: autonomous weapon systems under international humanitarian law. United Nations Office of Disarmament Affairs (UNODA) Occasional Papers, pp 5–18

  19. Press M (2016) Of robots and rules: autonomous weapon systems in the law of armed conflict. Geo J Int Law 48:1337

  20. Kroll JA (2018) The fallacy of inscrutability. Philos Trans R Soc A 376(2133):20180084

  21. Goodman B, Flaxman S (2017) European Union regulations on algorithmic decision-making and a “right to explanation”. AI Mag 38(3):76

  22. Rossi F (2016) Artificial intelligence: potential benefits and ethical considerations. European Parliament, Policy Department C: Citizens' Rights and Constitutional Affairs, briefing PE 571.380

  23. Wachter S, Mittelstadt B, Floridi L (2017) Why a right to explanation of automated decision-making does not exist in the general data protection regulation. Int Data Priv Law 7(2):76–99

  24. Miller T, Howe P, Sonenberg L (2017) Explainable AI: beware of inmates running the asylum. In: IJCAI-17 workshop on explainable AI (XAI), p 36

  25. Cath C (2018) Governing artificial intelligence: ethical, legal and technical opportunities and challenges. Philos Trans R Soc A 376(2133):20180080

  26. Keim DA, Mansmann F, Schneidewind J, Thomas J, Ziegler H (2008) Visual analytics: scope and challenges. In: Visual data mining, LNCS, vol 4404. Springer, pp 76–90

  27. Liu S, Wang X, Liu M, Zhu J (2017) Towards better analysis of machine learning models: a visual analytics perspective. Vis Inf 1(1):48–56

  28. Vellido A, Martín JD, Rossi F, Lisboa PJ (2011) Seeing is believing: the importance of visualization in real-world machine learning applications. In: Proceedings of the 19th European symposium on artificial neural networks, computational intelligence and machine learning (ESANN), Bruges, Belgium, pp 219–226

  29. Liu M, Shi J, Li Z, Li C, Zhu J, Liu S (2017) Towards better analysis of deep convolutional neural networks. IEEE Trans Vis Comput Gr 23(1):91–100

  30. Selvaraju RR, Das A, Vedantam R, Cogswell M, Parikh D, Batra D (2017) Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp 618–626

  31. Simonyan K, Vedaldi A, Zisserman A (2014) Deep inside convolutional networks: visualising image classification models and saliency maps. In: Workshop proceedings of the international conference on learning representations (ICLR)

  32. Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: Proceedings of the European conference on computer vision (ECCV), pp 818–833

  33. Sacha D, Sedlmair M, Zhang L, Lee JA, Peltonen J, Weiskopf D, North SC, Keim DA (2017) What you see is what you can change: human-centred machine learning by interactive visualization. Neurocomputing 268:164–175

  34. Reza SM (2016) Transforming big data into computational models for personalized medicine and health care. Dialogues Clin Neurosci 18(3):339–343

  35. Jensen PB, Jensen LJ, Brunak S (2012) Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet 13(6):395–405

  36. Hoff T (2011) Deskilling and adaptation among primary care physicians using two work innovations. Health Care Manage Rev 36(4):338–348

  37. Cabitza F, Rasoini R, Gensini GF (2017) Unintended consequences of machine learning in medicine. JAMA 318(6):517–518

  38. Safdar S, Zafar S, Zafar N, Khan NF (2017) Machine learning based decision support systems (DSS) for heart disease diagnosis: a review. Artif Intell Rev 50(4):597–623

  39. Pombo N, Araújo P, Viana J (2014) Knowledge discovery in clinical decision support systems for pain management: a systematic review. Artif Intell Med 60(1):1–11

  40. Vellido A, Ribas V, Morales C, Ruiz-Sanmartín A, Ruiz-Rodríguez JC (2018) Machine learning for critical care: state-of-the-art and a sepsis case study. BioMed Eng OnLine 17(S1):135

  41. Dreiseitl S, Binder M (2005) Do physicians value decision support? A look at the effect of decision support systems on physician opinion. Artif Intell Med 33(1):25–30

  42. Tu JV (1996) Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol 49(11):1225–1231

  43. Angermueller C, Pärnamaa T, Parts L, Stegle O (2016) Deep learning for computational biology. Mol Syst Biol 12(7):878

  44. Mamoshina P, Vieira A, Putin E, Zhavoronkov A (2016) Applications of deep learning in biomedicine. Mol Pharm 13(5):1445–1454

  45. Miotto R, Li L, Kidd BA, Dudley JT (2016) Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep 6:26094

  46. Jackups R (2017) Deep learning makes its way to the clinical laboratory. Clin Chem 63(12):1790–1791

  47. Ravì D, Wong C, Deligianni F, Berthelot M, Andreu-Pérez J, Lo B, Yang GZ (2017) Deep learning for health informatics. IEEE J Biomed Health 21(1):4–21

  48. Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, Ferrero E, Agapow PM, Zietz M, Hoffman MM, Xie W (2018) Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface 15(141):20170387

  49. Bacciu D, Lisboa PJ, Martín JD, Stoean R, Vellido A (2018) Bioinformatics and medicine in the era of deep learning. In: Proceedings of the 26th European symposium on artificial neural networks, computational intelligence and machine learning (ESANN 2018), Bruges, Belgium, pp 345–354

  50. Che Z, Purushotham S, Khemani R, Liu Y (2015) Distilling knowledge from deep networks with applications to healthcare domain. arXiv preprint arXiv:1512.03542

  51. Wu M, Hughes M, Parbhoo S, Doshi-Velez F (2017) Beyond sparsity: tree-based regularization of deep models for interpretability. In: Neural information processing systems (NIPS) conference. Transparent and interpretable machine learning in safety critical environments (TIML) workshop, Long Beach (CA), USA

  52. Che Z, Purushotham S, Khemani R, Liu Y (2016) Interpretable deep models for ICU outcome prediction. In: AMIA annual symposium proceedings, vol 2016. American Medical Informatics Association, p 371

  53. Choi E, Bahadori MT, Sun J, Kulas J, Schuetz A, Stewart W (2016) Retain: an interpretable predictive model for healthcare using reverse time attention mechanism. In: Advances in neural information processing systems (NIPS), pp 3504–3512

  54. Ma F, Chitta R, Zhou J, You Q, Sun T, Gao J (2017) Dipole: diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 1903–1911

  55. Sha Y, Wang MD (2017) Interpretable predictions of clinical outcomes with an attention-based recurrent neural network. In: Proceedings of the 8th ACM international conference on bioinformatics, computational biology, and health informatics (ACM-BCB), pp 233–240

  56. Zhang Z, Xie Y, Xing F, McGough M, Yang L (2017) MDNet: a semantically and visually interpretable medical image diagnosis network. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 6428–6436

  57. Nguyen P, Tran T, Wickramasinghe N, Venkatesh S (2017) Deepr: a convolutional net for medical records. IEEE J Biomed Health Inform 21(1):22–30

  58. Hicks SA, Eskeland S, Lux M, de Lange T, Randel KR, Jeppsson M, Pogorelov K, Halvorsen P, Riegler M (2018) Mimir: an automatic reporting and reasoning system for deep learning based analysis in the medical domain. In: Proceedings of the 9th ACM multimedia systems conference (MMSys), pp 369–374

  59. Rögnvaldsson T, Etchells TA, You L, Garwicz D, Jarman I, Lisboa PJ (2009) How to find simple and accurate rules for viral protease cleavage specificities. BMC Bioinform 10(1):149

  60. Van Belle V, Van Calster B, Van Huffel S, Suykens JAK, Lisboa P (2016) Explaining support vector machines: a color based nomogram. PLoS ONE 11(10):e0164568

  61. Vellido A, Romero E, Julià-Sapé M, Majós C, Moreno-Torres À, Arús C (2012) Robust discrimination of glioblastomas from metastatic brain tumors on the basis of single-voxel proton MRS. NMR Biomed 25(6):819–828

  62. Ash JS, Berg M, Coiera E (2004) Some unintended consequences of information technology in health care: the nature of patient care information system-related errors. J Am Med Inform Assoc 11(2):104–112

  63. Reid MJ (2017) Black-box machine learning: implications for healthcare. Polygeia, London

  64. Berner ES, Graber ML (2008) Overconfidence as a cause of diagnostic error in medicine. Am J Med 121(5):S2–S23

  65. Bhanot G, Biehl M, Villmann T, Zühlke D (2017) Biomedical data analysis in translational research: integration of expert knowledge and interpretable models. In: Proceedings of the 25th European symposium on artificial neural networks, computational intelligence and machine learning (ESANN), pp 177–186

  66. Holzinger A (2016) Interactive machine learning for health informatics: When do we need the human-in-the-loop? Brain Inform 3(2):119–131

  67. Julià-Sapé M, Acosta D, Mier M, Arús C, Watson D, The INTERPRET Consortium (2006) A multi-centre, web-accessible and quality control-checked database of in vivo MR spectra of brain tumour patients. Magn Reson Mater Phys 19(1):22–33

  68. Julià-Sapé M, Lurgi M, Mier M, Estanyol F, Rafael X, Candiota AP, Barceló A, García A, Martínez-Bisbal MC, Ferrer-Luna R, Moreno-Torres À (2012) Strategies for annotation and curation of translational databases: the eTUMOUR project. Database 2012:bas035

  69. Vellido A, Romero E, González-Navarro FF, Belanche-Muñoz LA, Julià-Sapé M, Arús C (2009) Outlier exploration and diagnostic classification of a multi-centre 1H-MRS brain tumour database. Neurocomputing 72(13–15):3085–3097

  70. Vellido A, Romero E, Julià-Sapé M, Majós C, Moreno-Torres À, Pujol J, Arús C (2012) Robust discrimination of glioblastomas from metastatic brain tumors on the basis of single-voxel ¹H MRS. NMR Biomed 25(6):819–828

  71. Mocioiu V, Kyathanahally SP, Arús C, Vellido A, Julià-Sapé M (2016) Automated quality control for proton magnetic resonance spectroscopy data using convex non-negative matrix factorization. In: Proceedings of the 4th international conference on bioinformatics and biomedical engineering (IWBBIO), LNCS/LNBI, vol 9656, pp 719–727

  72. Rajkomar A et al (2018) Scalable and accurate deep learning for electronic health records. NPJ Digit Med 1(1):18

  73. Shah H (2017) The DeepMind debacle demands dialogue on data. Nature 547:259

  74. Miotto R, Wang F, Wang S, Jiang X, Dudley JT (2017) Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform 19(6):1236–1246

  75. Litjens G et al (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88

  76. Chartrand G et al (2017) Deep learning: a primer for radiologists. Radiographics 37(7):2113–2131

  77. Shickel B, Tighe PJ, Bihorac A, Rashidi P (2018) Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomed Health Inform 22(5):1589–1604

  78. Zaharchuk G, Gong E, Wintermark M, Rubin D, Langlotz CP (2018) Deep learning in neuroradiology. AJNR Am J Neuroradiol 39(10):1776–1784

  79. Chen H, Engkvist O, Wang Y, Olivecrona M, Blaschke T (2018) The rise of deep learning in drug discovery. Drug Discov Today 23(6):1241–1250

  80. Kwon BC, Choi MJ, Kim JT, Choi E, Kim YB, Kwon S, Sun J, Choo J (2019) RetainVis: visual analytics with interpretable and interactive recurrent neural networks on electronic medical records. IEEE Trans Vis Comput Graph 25(1):299–309

  81. Wu J, Peck D, Hsieh S, Dialani V, Lehman CD, Zhou B, Syrgkanis V, Mackey L, Patterson G (2018) Expert identification of visual primitives used by CNNs during mammogram classification. In: SPIE medical imaging 2018: computer-aided diagnosis, vol 10575, p 105752T

Acknowledgements

This work was funded by the Spanish MINECO project TIN2016-79576-R.

Author information

Corresponding author

Correspondence to Alfredo Vellido.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This appendix includes a self-contained summary of research publications in the form of two tables. Table 1 covers a selection of general references on DL methods applied to biomedicine, while Table 2 focuses only on studies that deal with the problem of interpretability of DL methods applied in the field.

Table 1 Summary of key bibliographic references concerning DL in the (bio-)medical and health care application areas and, particularly, concerning the problem of interpretability
Table 2 Summary of bibliographic references concerning DL and addressing the problems of model interpretability and explainability in the (bio-)medical and health care domain

About this article

Cite this article

Vellido, A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput & Applic 32, 18069–18083 (2020). https://doi.org/10.1007/s00521-019-04051-w
