DOI: 10.1145/3107411.3107445
research-article

Interpretable Predictions of Clinical Outcomes with An Attention-based Recurrent Neural Network

Published: 20 August 2017

ABSTRACT

The increasing accumulation of healthcare data gives researchers ample opportunities to build machine learning approaches for clinical decision support and to improve the quality of health care. Several studies have developed conventional machine learning approaches that rely heavily on manual feature engineering and result in task-specific models for health care. In contrast, healthcare researchers have begun to use deep learning, which obviates manual feature engineering yet achieves impressive results in research fields such as image classification. However, few of these studies have addressed the lack of interpretability of deep learning models, although interpretability is essential for the successful adoption of machine learning approaches by healthcare communities. In addition, characteristics of healthcare data such as high dimensionality and temporal dependencies pose challenges for model building. To address these challenges, we develop a gated recurrent unit-based recurrent neural network with hierarchical attention for mortality prediction and evaluate it using the diagnostic codes from the Medical Information Mart for Intensive Care. We find that the model achieves higher prediction accuracy than baseline models, and we demonstrate its interpretability through visualizations.
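The architecture named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no layer sizes, attention formula, or input encoding, so everything below is an assumption. In particular, the sketch assumes that diagnostic codes are embedded and read in order within each visit, that attention scores take the additive form u · tanh(Wh + b), and that weights are random rather than trained. All function and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # shift for numerical stability
    return e / e.sum()

def init_gru(rng, d_in, d_h):
    # One weight set per gate: z (update), r (reset), h (candidate state).
    p = {}
    for g in "zrh":
        p["W" + g] = rng.normal(0.0, 0.1, (d_h, d_in))
        p["U" + g] = rng.normal(0.0, 0.1, (d_h, d_h))
        p["b" + g] = np.zeros(d_h)
    return p

def run_gru(X, p):
    # X has shape (steps, d_in); returns all hidden states, shape (steps, d_h).
    h = np.zeros(p["bz"].shape[0])
    states = []
    for x in X:
        z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
        r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
        h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
        h = (1.0 - z) * h + z * h_cand
        states.append(h)
    return np.stack(states)

def init_attention(rng, d_h):
    return {"W": rng.normal(0.0, 0.1, (d_h, d_h)),
            "b": np.zeros(d_h),
            "u": rng.normal(0.0, 0.1, d_h)}

def attention_pool(H, a):
    # Assumed additive attention: one score per hidden state, then softmax.
    scores = np.tanh(H @ a["W"] + a["b"]) @ a["u"]
    alpha = softmax(scores)
    return alpha @ H, alpha  # weighted sum of states, and the weights

def init_params(rng, vocab, d_emb, d_h):
    return {"E": rng.normal(0.0, 0.1, (vocab, d_emb)),  # code embeddings
            "code_gru": init_gru(rng, d_emb, d_h),
            "code_att": init_attention(rng, d_h),
            "visit_gru": init_gru(rng, d_h, d_h),
            "visit_att": init_attention(rng, d_h),
            "w": rng.normal(0.0, 0.1, d_h),
            "b": 0.0}

def predict(visits, params):
    # visits: list of visits, each a list of integer diagnostic-code ids.
    visit_vecs, code_weights = [], []
    for codes in visits:
        H = run_gru(params["E"][codes], params["code_gru"])
        v, alpha = attention_pool(H, params["code_att"])  # code-level attention
        visit_vecs.append(v)
        code_weights.append(alpha)
    H = run_gru(np.stack(visit_vecs), params["visit_gru"])
    patient, beta = attention_pool(H, params["visit_att"])  # visit-level attention
    prob = float(sigmoid(params["w"] @ patient + params["b"]))
    return prob, beta, code_weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng, vocab=50, d_emb=8, d_h=6)
    prob, beta, code_weights = predict([[3, 17, 42], [5, 9]], params)
    print("mortality probability (untrained):", prob)
    print("visit attention weights:", beta)
```

The two sets of attention weights are what would support the kind of interpretability the abstract mentions: `beta` indicates how much each visit contributes to the prediction, and each entry of `code_weights` does the same for the codes within one visit.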


      Published in

        ACM-BCB '17: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
        August 2017
        800 pages
        ISBN: 9781450347228
        DOI: 10.1145/3107411

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        ACM-BCB '17 paper acceptance rate: 42 of 132 submissions (32%). Overall acceptance rate: 254 of 885 submissions (29%).
