DOI: 10.1145/3292500.3330908

Interpretable and Steerable Sequence Learning via Prototypes

Published: 25 July 2019

ABSTRACT

One of the major challenges in machine learning today is to provide predictions that are not only highly accurate but also accompanied by user-friendly explanations. Although recent years have seen increasingly popular use of deep neural networks for sequence modeling, it remains challenging to explain the rationales behind model outputs, which is essential for building trust and for supporting domain experts in validating, critiquing, and refining the model.

We propose ProSeNet, an interpretable and steerable deep sequence model with natural explanations derived from case-based reasoning. A prediction is obtained by comparing the input to a few prototypes, which are exemplar cases in the problem domain. For better interpretability, we define several criteria for constructing the prototypes, including simplicity, diversity, and sparsity, and propose a corresponding learning objective and optimization procedure. ProSeNet also provides a user-friendly approach to model steering: domain experts without any knowledge of the underlying model or its parameters can easily incorporate their intuition and experience by manually refining the prototypes.
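The prototype-based prediction idea described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's actual architecture: the paper uses a recurrent sequence encoder, whereas here a simple embedding average stands in for it, and all names, dimensions, and weights below are invented for the example. The core mechanism is the same: embed the input, score its similarity to each prototype, and combine the similarity scores linearly into class scores.

```python
import math

def encode(sequence, embeddings):
    # Stand-in for a recurrent encoder: average the token embeddings
    # to get a fixed-length representation of the sequence.
    vecs = [embeddings[tok] for tok in sequence]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def similarity(embedding, prototype):
    # Squared-exponential similarity: 1.0 when identical, near 0 when far.
    d2 = sum((a - b) ** 2 for a, b in zip(embedding, prototype))
    return math.exp(-d2)

def predict(embedding, prototypes, weights):
    # Compare the input to every prototype, then combine the
    # similarity scores linearly into per-class scores.
    sims = [similarity(embedding, p) for p in prototypes]
    return [sum(s * w[c] for s, w in zip(sims, weights))
            for c in range(len(weights[0]))]

# Toy setup: 2-dim embeddings, one prototype per class, two classes.
embeddings = {"up": [1.0, 0.0], "down": [0.0, 1.0]}
prototypes = [[1.0, 0.0], [0.0, 1.0]]   # exemplar cases
weights = [[1.0, 0.0], [0.0, 1.0]]      # prototype-to-class weights
scores = predict(encode(["up", "up"], embeddings), prototypes, weights)
# The input matches the first prototype exactly, so class 0 scores highest.
```

Because each prototype is itself a case from the problem domain, an expert can "steer" such a model simply by editing the `prototypes` list, with no knowledge of the encoder's parameters.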

We conduct experiments on a wide range of real-world applications, including predictive diagnostics for automobiles, ECG and protein sequence classification, and sentiment analysis on texts. The results show that ProSeNet achieves accuracy on par with state-of-the-art deep learning models. We also evaluate the interpretability of the results with concrete case studies. Finally, through a user study on Amazon Mechanical Turk (MTurk), we demonstrate that the model selects high-quality prototypes that align well with human knowledge and can be interactively refined for better interpretability without loss of performance.


Published in: KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2019, 3305 pages. ISBN: 9781450362016. DOI: 10.1145/3292500. Copyright © 2019 ACM.

Publisher: Association for Computing Machinery, New York, NY, United States.


Acceptance rates: KDD '19 paper acceptance rate 110 of 1,200 submissions (9%); overall acceptance rate 1,133 of 8,635 submissions (13%).
