Auditing black-box models for indirect influence

Adler, Philip; Falk, Casey; Friedler, Sorelle A.; Nix, Tionney; Rybeck, Gabriel; Scheidegger, Carlos; Smith, Brandon; Venkatasubramanian, Suresh

doi:10.1007/s10115-017-1116-3

Regular Paper
Published: 25 October 2017

Volume 54, pages 95–122 (2018)

Knowledge and Information Systems

Abstract

Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. It is therefore hard to acquire a deeper understanding of model behavior and in particular how different features influence the model prediction. This is important when interpreting the behavior of complex models or asserting that certain problematic attributes (such as race or gender) are not unduly influencing decisions. In this paper, we present a technique for auditing black-box models, which lets us study the extent to which existing models take advantage of particular features in the data set, without knowing how the models work. Our work focuses on the problem of indirect influence: how some features might indirectly influence outcomes via other, related features. As a result, we can find attribute influences even in cases where, upon further direct examination of the model, the attribute is not referred to by the model at all. Our approach does not require the black-box model to be retrained. This is important if, for example, the model is only accessible via an API, and contrasts our work with other methods that investigate feature influence such as feature selection. We present experimental evidence for the effectiveness of our procedure using a variety of publicly available data sets and models. We also validate our procedure using techniques from interpretable learning and feature selection, as well as against other black-box auditing procedures. To further demonstrate the effectiveness of this technique, we use it to audit a black-box recidivism prediction algorithm.

Notes

This is a straightforward application of the standard min-cost flow problem.
This follows from the fact that the earthmover distance between two distributions on the line is the \(\ell_1\) difference between their cumulative distribution functions. In this case, it means that the earthmover distance is precisely the distance between the means.
https://github.com/algofairness/BlackBoxAuditing.
https://archive.ics.uci.edu/ml/datasets.html.
Implemented using Weka’s version 3.6.13 SMO: http://weka.sourceforge.net/doc.dev/weka/classifiers/functions/SMO.html.
Implemented using TensorFlow version 0.6.0: https://www.tensorflow.org/.
Implemented using Weka’s version 3.6.13 J48: http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/J48.html.
Available at: https://bitbucket.org/aheneliu/goldeneye/.
Weka’s REPTree, J48 and M5P models were used for this analysis with the default model-building parameters. J48 was used to predict categorical features and M5P was used for numerical features. REPTree can handle both categorical and numerical features.
Feature selection was implemented in Weka version 3.6.13 using WrapperSubsetEval and GreedyStepwise on J48 and SMO models. Default options were used, except that a complete ranking of all features was generated.
The ProPublica methodology can be found here: https://github.com/propublica/compas-analysis.
http://orange.biolab.si/toolbox/.
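
The second note above relies on the identity that, on the line, the earthmover distance equals the \(\ell_1\) difference between the two cumulative distribution functions. The sketch below is an illustration only, not the authors' code: the function name `emd_1d` and the sample values are assumptions made for this example. It computes that integral for two empirical samples and checks that, when one sample is a pure shift of the other, the result matches the gap between the means.

```python
import numpy as np


def emd_1d(xs, ys):
    """Earthmover distance between two empirical distributions on the line.

    Computed as the integral of |F_x(t) - F_y(t)| over t, where F_x and F_y
    are the empirical cumulative distribution functions of the two samples.
    (Hypothetical helper written for this illustration.)
    """
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))

    # Both CDFs are step functions that only change at sample points,
    # so it suffices to integrate over the intervals between those points.
    grid = np.sort(np.concatenate([xs, ys]))
    widths = np.diff(grid)

    # CDF value held on each interval [grid[i], grid[i + 1]).
    cdf_x = np.searchsorted(xs, grid[:-1], side="right") / xs.size
    cdf_y = np.searchsorted(ys, grid[:-1], side="right") / ys.size

    return float(np.sum(np.abs(cdf_x - cdf_y) * widths))


# Illustrative data: the second sample is the first shifted by 5.
original = np.array([0.0, 1.0, 3.0])
shifted = original + 5.0

gap = emd_1d(original, shifted)
mean_gap = abs(shifted.mean() - original.mean())
print(np.isclose(gap, mean_gap))  # True
```

The agreement with the mean gap depends on one cumulative distribution function lying entirely on one side of the other, as it does for a pure shift. For samples such as `[0, 2]` and `[1, 1]`, the means coincide while the earthmover distance is nonzero.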

Author information

Authors and Affiliations

Department of Computer Science, Haverford College, Haverford, PA, 19041, USA
Philip Adler, Casey Falk, Sorelle A. Friedler, Tionney Nix, Gabriel Rybeck & Brandon Smith
Department of Computer Science, University of Arizona, Tucson, AZ, USA
Carlos Scheidegger
Department of Computer Science, University of Utah, Salt Lake City, UT, USA
Suresh Venkatasubramanian

Corresponding author

Correspondence to Sorelle A. Friedler.

Additional information

A preliminary version of this work with authors Philip Adler, Casey Falk, Sorelle A. Friedler, Gabriel Rybeck, Carlos Scheidegger, Brandon Smith, and Suresh Venkatasubramanian was titled Auditing Black-box Models for Indirect Influence and appeared in the Proceedings of the IEEE International Conference on Data Mining (ICDM) in 2016. This research was funded in part by the NSF under Grants IIS-1251049, CNS-1302688, IIS-1513651, DMR-1307801, IIS-1633724, and IIS-1633387.

About this article

Cite this article

Adler, P., Falk, C., Friedler, S.A. et al. Auditing black-box models for indirect influence. Knowl Inf Syst 54, 95–122 (2018). https://doi.org/10.1007/s10115-017-1116-3

Received: 01 April 2017
Revised: 22 August 2017
Accepted: 10 October 2017
Published: 25 October 2017
Issue Date: January 2018
DOI: https://doi.org/10.1007/s10115-017-1116-3
