ABSTRACT
A major challenge in machine learning today is to provide predictions that are not only highly accurate but also accompanied by user-friendly explanations. Although deep neural networks have become increasingly popular for sequence modeling in recent years, explaining the rationale behind a model's outputs remains difficult. Such explanations are essential for building trust and for helping domain experts validate, critique, and refine the model.
We propose ProSeNet, an interpretable and steerable deep sequence model whose natural explanations are derived from case-based reasoning. Predictions are obtained by comparing the input to a small set of prototypes, which are exemplar cases in the problem domain. To improve interpretability, we define several criteria for constructing the prototypes, including simplicity, diversity, and sparsity, and we propose a corresponding learning objective and optimization procedure. ProSeNet also offers a user-friendly approach to model steering: domain experts with no knowledge of the underlying model or its parameters can incorporate their intuition and experience simply by refining the prototypes manually.
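The prototype-comparison step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the sequence encoder is taken as given (e.g. the final hidden state of an RNN), and the exponential-of-negative-squared-distance similarity and the linear output layer are illustrative choices.

```python
import numpy as np

def prototype_predict(embedding, prototypes, weights):
    """Sketch of prototype-based prediction (illustrative, not the paper's code).

    embedding:  (d,) encoding of the input sequence, assumed precomputed
    prototypes: (k, d) learned exemplar encodings
    weights:    (k, c) linear layer mapping prototype similarities to class scores
    """
    # Similarity to each prototype: exp of negative squared L2 distance,
    # so an input identical to a prototype scores 1 and distant inputs
    # score close to 0. The similarity vector itself is the explanation:
    # it says which exemplar cases the prediction is based on.
    d2 = np.sum((prototypes - embedding) ** 2, axis=1)   # (k,)
    sims = np.exp(-d2)                                   # (k,)
    logits = sims @ weights                              # (c,)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                 # softmax over classes
    return sims, probs
```

Because each class score is a linear combination of similarities to concrete exemplars, a prediction can be explained as "this input looks like prototype 3, which belongs to class A", and steering amounts to editing, adding, or deleting rows of `prototypes`.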
We conduct experiments on a wide range of real-world applications, including predictive diagnostics for automobiles, ECG and protein sequence classification, and sentiment analysis on text. The results show that ProSeNet achieves accuracy on par with state-of-the-art deep sequence models. We also evaluate the interpretability of the results with concrete case studies. Finally, through a user study on Amazon Mechanical Turk (MTurk), we demonstrate that the model selects high-quality prototypes that align well with human knowledge and can be interactively refined for better interpretability without loss of performance.
Index Terms
- Interpretable and Steerable Sequence Learning via Prototypes