Model-Based Counterfactual Synthesizer for Interpretation

Research article (Public Access) · DOI: 10.1145/3447548.3467333
Published: 14 August 2021

ABSTRACT

Counterfactuals, an emerging type of model interpretation, have recently received attention from both researchers and practitioners. Counterfactual explanations formalize the exploration of "what-if" scenarios and are an instance of example-based reasoning over a set of hypothetical data samples. In essence, counterfactuals show how a model's decision changes under input perturbations. Existing methods for generating counterfactuals are mainly algorithm-based: they are time-inefficient and assume the same counterfactual universe for different queries. To address these limitations, we propose the Model-based Counterfactual Synthesizer (MCS), a framework for interpreting machine learning models. We first analyze the model-based counterfactual process and construct a base synthesizer using a conditional generative adversarial net (CGAN). To better approximate the counterfactual universe for rare queries, we employ the umbrella sampling technique when training the MCS framework. We further enhance MCS by incorporating the causal dependence among attributes as a model inductive bias, and validate the design's correctness from the causality-identification perspective. Experimental results on several datasets demonstrate both the effectiveness and the efficiency of the proposed MCS framework, and verify its advantages over existing alternatives.
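To make the model-based idea concrete, below is a minimal sketch of a conditional-GAN counterfactual synthesizer in PyTorch. It is an illustration of the technique the abstract names, not the authors' MCS implementation: the network sizes, the toy data, and the inverse-norm weights standing in for umbrella sampling are all hypothetical assumptions made for exposition.

```python
# A minimal, self-contained sketch of a conditional-GAN counterfactual
# synthesizer in PyTorch. This is NOT the authors' MCS implementation:
# the network sizes, the toy data, and the inverse-norm weights that
# stand in for umbrella sampling are hypothetical placeholders.
import torch
import torch.nn as nn

FEAT_DIM, COND_DIM, NOISE_DIM = 8, 2, 16  # assumed dimensions

# Generator G(z, c): noise plus a condition vector (encoding the query
# and the desired model outcome) maps to a synthetic counterfactual.
G = nn.Sequential(nn.Linear(NOISE_DIM + COND_DIM, 64), nn.ReLU(),
                  nn.Linear(64, FEAT_DIM))
# Discriminator D(x, c): sample plus condition maps to a real/fake logit.
D = nn.Sequential(nn.Linear(FEAT_DIM + COND_DIM, 64), nn.ReLU(),
                  nn.Linear(64, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss(reduction="none")  # per-sample loss, so it can be weighted

for step in range(200):
    # Toy stand-ins for training instances and their conditions.
    x_real = torch.randn(32, FEAT_DIM)
    c = torch.randn(32, COND_DIM)
    # Umbrella-sampling-flavored weights: up-weight rare conditions so
    # low-density regions of the query space are not under-trained.
    # This heuristic is a placeholder; a density estimate would go here.
    w = 1.0 / (1.0 + c.norm(dim=1, keepdim=True))

    # Discriminator step: real samples labeled 1, generated samples 0.
    z = torch.randn(32, NOISE_DIM)
    x_fake = G(torch.cat([z, c], dim=1)).detach()
    d_real = D(torch.cat([x_real, c], dim=1))
    d_fake = D(torch.cat([x_fake, c], dim=1))
    loss_d = (w * (bce(d_real, torch.ones_like(d_real))
                   + bce(d_fake, torch.zeros_like(d_fake)))).mean()
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: fool the discriminator under the same weights.
    z = torch.randn(32, NOISE_DIM)
    d_fake = D(torch.cat([G(torch.cat([z, c], dim=1)), c], dim=1))
    loss_g = (w * bce(d_fake, torch.ones_like(d_fake))).mean()
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

# Interpretation time: counterfactuals for a query condition c_q come
# from a single forward pass, amortizing the per-query search cost.
with torch.no_grad():
    c_q = torch.randn(1, COND_DIM).expand(10, COND_DIM)
    counterfactuals = G(torch.cat([torch.randn(10, NOISE_DIM), c_q], dim=1))
```

The point of this design is amortization: once the synthesizer is trained, counterfactuals for any new query are a single generator forward pass, rather than a fresh per-query optimization as in algorithm-based methods.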


Published in

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021, 4259 pages
ISBN: 9781450383325
DOI: 10.1145/3447548
Copyright © 2021 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
