ABSTRACT
A major challenge in machine learning today is to provide predictions that are not only highly accurate but also accompanied by user-friendly explanations. Although deep neural networks have become increasingly popular for sequence modeling in recent years, explaining the rationale behind a model's outputs remains difficult. Such explanations are essential for building trust and for helping domain experts validate, critique, and refine the model.
We propose ProSeNet, an interpretable and steerable deep sequence model whose natural explanations are derived from case-based reasoning. Predictions are obtained by comparing the input to a small set of prototypes, which are exemplar cases in the problem domain. To improve interpretability, we define several criteria for constructing the prototypes, including simplicity, diversity, and sparsity, and we propose a corresponding learning objective and optimization procedure. ProSeNet also offers a user-friendly approach to model steering: domain experts with no knowledge of the underlying model or its parameters can incorporate their intuition and experience simply by refining the prototypes manually.
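The prototype-comparison step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the sequence encoder is taken as given (e.g. the final hidden state of an RNN), and the exponential-of-negative-squared-distance similarity and the linear output layer are illustrative choices.

```python
import numpy as np

def prototype_predict(embedding, prototypes, weights):
    """Sketch of prototype-based prediction (illustrative, not the paper's code).

    embedding:  (d,) encoding of the input sequence, assumed precomputed
    prototypes: (k, d) learned exemplar encodings
    weights:    (k, c) linear layer mapping prototype similarities to class scores
    """
    # Similarity to each prototype: exp of negative squared L2 distance,
    # so an input identical to a prototype scores 1 and distant inputs
    # score close to 0. The similarity vector itself is the explanation:
    # it says which exemplar cases the prediction is based on.
    d2 = np.sum((prototypes - embedding) ** 2, axis=1)   # (k,)
    sims = np.exp(-d2)                                   # (k,)
    logits = sims @ weights                              # (c,)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                 # softmax over classes
    return sims, probs
```

Because each class score is a linear combination of similarities to concrete exemplars, a prediction can be explained as "this input looks like prototype 3, which belongs to class A", and steering amounts to editing, adding, or deleting rows of `prototypes`.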
We conduct experiments on a wide range of real-world applications, including predictive diagnostics for automobiles, ECG and protein sequence classification, and sentiment analysis on text. The results show that ProSeNet achieves accuracy on par with state-of-the-art deep sequence models. We also evaluate the interpretability of the results with concrete case studies. Finally, through a user study on Amazon Mechanical Turk (MTurk), we demonstrate that the model selects high-quality prototypes that align well with human knowledge and can be interactively refined for better interpretability without loss of performance.
Index Terms
- Interpretable and Steerable Sequence Learning via Prototypes