ABSTRACT
Explainable machine learning offers the potential to provide stakeholders with insights into model behavior through methods such as feature importance scores, counterfactual explanations, and influential training data. Yet there is little understanding of how organizations use these methods in practice. This study explores how organizations view and use explainability for stakeholder consumption. We find that, currently, the majority of deployments are not for end users affected by the model but rather for machine learning engineers, who use explainability to debug the model itself. There is thus a gap between explainability in practice and the goal of transparency, since explanations primarily serve internal stakeholders rather than external ones. Our study synthesizes the limitations of current explainability techniques that hamper their use by end users. To facilitate end user interaction, we develop a framework for establishing clear goals for explainability. We conclude by discussing concerns raised regarding explainability.
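To make the engineer-facing debugging use case concrete, the sketch below shows one common feature importance method, permutation importance, used to check which inputs a trained model actually relies on. This is a minimal illustration only: the classifier, the synthetic dataset, and the scikit-learn setup are our assumptions for the example, not artifacts or methods from the study itself.

```python
# Minimal sketch: an engineer inspects per-feature permutation importances
# to spot features the model leans on unexpectedly. All data and model
# choices here are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real deployment dataset.
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time and measure the
# drop in held-out accuracy; large drops flag features the model depends on.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature_{i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```

An unexpectedly dominant feature in this ranking (say, an ID-like column) would prompt the kind of internal model debugging the study reports as the most common deployment of explainability today.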