Abstract
In the past few decades, artificial intelligence (AI) technology has developed rapidly, reshaping everyone’s daily life and profoundly altering the course of human society. The intention behind developing AI was, and remains, to benefit humans by reducing labor, increasing everyday conveniences, and promoting social good. However, recent research and real-world applications show that AI can also harm humans unintentionally, for example, by making unreliable decisions in safety-critical scenarios or by inadvertently discriminating against a group or groups, thereby undermining fairness. Consequently, trustworthy AI has recently attracted growing attention as a means of avoiding the adverse effects AI could have on people, so that people can fully trust and live in harmony with AI technologies.
A tremendous amount of research on trustworthy AI has been conducted in recent years. In this survey, we present a comprehensive appraisal of trustworthy AI from a computational perspective to help readers understand the latest technologies for achieving it. Trustworthy AI is a large and complex subject involving multiple dimensions. In this work, we focus on six of the most crucial: (i) Safety & Robustness, (ii) Nondiscrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems. We also discuss the accordant and conflicting interactions among the dimensions and outline promising directions for future research on trustworthy AI.
- [366] . 1982. Protocols for secure computations. In 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982). IEEE, 160–164.Google ScholarCross Ref
- [367] . 2020. Defining and evaluating fair natural language generation. In Proceedings of the The Fourth Widening Natural Language Processing Workshop. 107–109.Google ScholarCross Ref
- [368] . 2019. GNNExplainer: Generating explanations for graph neural networks. Advances in Neural Information Processing Systems 32 (2019), 9240.Google Scholar
- [369] . 2021. Differentially Private Fine-tuning of Language Models.
arxiv:2110.06500 [cs.LG]Google Scholar - [370] . 2021. Do not let privacy overbill utility: Gradient embedding perturbation for private learning. In International Conference on Learning Representations. https://openreview.net/forum?id=7aogOj_VYO0.Google Scholar
- [371] . 2021. Large scale private learning via low-rank reparametrization. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event(
Proceedings of Machine Learning Research , Vol. 139), and (Eds.). PMLR, 12208–12218. http://proceedings.mlr.press/v139/yu21f.html.Google Scholar - [372] . 2021. Indiscriminate Poisoning Attacks Are Shortcuts.
arxiv:2111.00898 [cs.LG]Google Scholar - [373] . 2018. Building ethics into artificial intelligence. arXiv preprint arXiv:1812.02953 (2018).Google Scholar
- [374] . 2020. XGNN: Towards model-level explanations of graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 430–438.Google ScholarDigital Library
- [375] . 2020. Explainability in graph neural networks: A taxonomic survey. arXiv preprint arXiv:2012.15445 (2020).Google Scholar
- [376] . 2017. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web. 1171–1180.Google ScholarDigital Library
- [377] . 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).Google Scholar
- [378] . 2013. Learning fair representations. In ICML.Google Scholar
- [379] . 2020. Systematic review of privacy-preserving distributed machine learning from federated databases in health care. JCO Clinical Cancer Informatics 4 (2020), 184–200.Google ScholarCross Ref
- [380] . 2018. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society. 335–340.Google ScholarDigital Library
- [381] . 2020. Demographics should not be the reason of toxicity: Mitigating discrimination in text classifications with instance weighting. arXiv preprint arXiv:2004.14088 (2020).Google Scholar
- [382] . 2019. Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning. PMLR, 7472–7482.Google Scholar
- [383] . 2017. A causal framework for discovering and removing direct and indirect discrimination. In IJCAI.Google Scholar
- [384] . 2021. Graph embedding for recommendation against attribute inference attacks. arXiv preprint arXiv:2101.12549 (2021).Google Scholar
- [385] . 2020. Adversarial attacks on deep-learning models in natural language processing: A survey. ACM Transactions on Intelligent Systems and Technology (TIST) 11, 3 (2020), 1–41.Google ScholarDigital Library
- [386] . 2020. Interpretable deep learning under fire. In 29th \( \lbrace \)USENIX\( \rbrace \) Security Symposium (\( \lbrace \)USENIX\( \rbrace \) Security 20).Google Scholar
- [387] . 2018. Explainable recommendation: A survey and new perspectives. arXiv preprint arXiv:1804.11192 (2018).Google Scholar
- [388] . 2020. The secret revealer: Generative model-inversion attacks against deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 253–261.Google ScholarCross Ref
- [389] . 2014. Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval. 83–92.Google ScholarDigital Library
- [390] . 2016. Identifying significant predictive bias in classifiers. arXiv preprint arXiv:1611.08292 (2016).Google Scholar
- [391] . 2020. IDLG: Improved deep leakage from gradients. arXiv preprint arXiv:2001.02610 (2020).Google Scholar
- [392] . 2019. Gender bias in contextualized word embeddings. arXiv preprint arXiv:1904.03310 (2019).Google Scholar
- [393] . 2017. Men also like shopping: Reducing gender bias amplification using corpus-level constraints. arXiv preprint arXiv:1707.09457 (2017).Google Scholar
- [394] . 2018. Gender bias in coreference resolution: Evaluation and debiasing methods. arXiv preprint arXiv:1804.06876 (2018).Google Scholar
- [395] . 2016. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2921–2929.Google ScholarCross Ref
- [396] . 2021. Bypassing the ambient dimension: Private {SGD} with gradient subspace identification. In International Conference on Learning Representations. https://openreview.net/forum?id=7dpmlkBuJFC.Google Scholar
- [397] . 2020. Deep leakage from gradients. In Federated Learning. Springer, 17–31.Google ScholarCross Ref
- [398] . 2015. A survey on measuring indirect discrimination in machine learning. arXiv preprint arXiv:1511.00148 (2015).Google Scholar
- [399] . 2018. AI Can Be Sexist and Racist–it’s Time to Make it Fair.Google Scholar
- [400] . 2018. Adversarial attacks on neural networks for graph data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2847–2856.Google ScholarDigital Library
- [401] . 2019. Adversarial attacks on graph neural networks via meta learning. arXiv preprint arXiv:1902.08412 (2019).Google Scholar
- [402] . 2018. Adversarial attacks on neural networks for graph data. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (
July 2018).DOI: Google ScholarDigital Library