DOI: 10.1145/3377325.3377480 · IUI Conference Proceedings
Research Article
Honorable Mention

How do visual explanations foster end users' appropriate trust in machine learning?

Published: 17 March 2020

ABSTRACT

We investigated the effects of example-based explanations for a machine learning classifier on end users' appropriate trust. In an in-person user study with 33 participants, we measured participants' appropriate trust in the classifier, quantified the effects of different spatial layouts and visual representations, and observed how users' trust changed over time. The results show that each explanation improved users' trust in the classifier, and that the combination of explanation, human, and classification algorithm yielded much better decisions than either the human or the classification algorithm alone. Yet the visual explanations led to different levels of trust and could cause inappropriate trust when an explanation was difficult to understand. Visual representation and performance feedback strongly affected users' trust, while spatial layout had only a moderate effect. Our results do not indicate that individual differences (e.g., propensity to trust) affected users' trust in the classifier. This work advances the state of the art in trust-able machine learning and informs the design and appropriate use of automated systems.
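The specific explanation designs studied are described in the full paper; purely as a generic illustration of the concept, the sketch below shows one common form of example-based explanation, in which a prediction is accompanied by the most similar training examples so a user can judge whether it rests on sensible evidence. The use of scikit-learn, the iris data, and the explain_by_example helper are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' system): an example-based explanation that
# pairs a classifier's prediction with the nearest training instances.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

X, y = load_iris(return_X_y=True)

# A plain classifier whose individual predictions we want to explain.
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Index the training set so we can retrieve the k most similar examples.
nn = NearestNeighbors(n_neighbors=3).fit(X)

def explain_by_example(x):
    """Return the prediction plus the nearest training examples as evidence."""
    pred = clf.predict([x])[0]
    _, idx = nn.kneighbors([x])
    return pred, [(X[i], y[i]) for i in idx[0]]

prediction, evidence = explain_by_example(X[0])
print("predicted class:", prediction)
for features, label in evidence:
    print("similar training example:", features, "label:", label)
```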
