DOI: 10.1145/3180155.3180220

DeepTest: automated testing of deep-neural-network-driven autonomous cars

Published: 27 May 2018

ABSTRACT

Recent advances in Deep Neural Networks (DNNs) have led to the development of DNN-driven autonomous cars that, using sensors such as cameras and LiDAR, can drive without any human intervention. Most major manufacturers, including Tesla, GM, Ford, BMW, and Waymo/Google, are working on building and testing different types of autonomous vehicles. Lawmakers in several US states, including California, Texas, and New York, have passed new legislation to fast-track the testing and deployment of autonomous vehicles on their roads.

However, despite their spectacular progress, DNNs, just like traditional software, often demonstrate incorrect or unexpected corner-case behaviors that can lead to potentially fatal collisions. Several such real-world accidents involving autonomous cars have already happened, including one that resulted in a fatality. Most existing testing techniques for DNN-driven vehicles depend heavily on the manual collection of test data under different driving conditions, which becomes prohibitively expensive as the number of test conditions increases.

In this paper, we design, implement, and evaluate DeepTest, a systematic testing tool for automatically detecting erroneous behaviors of DNN-driven vehicles that can potentially lead to fatal crashes. Our tool is designed to automatically generate test cases by leveraging real-world changes in driving conditions such as rain, fog, and lighting. DeepTest systematically explores different parts of the DNN logic by generating test inputs that maximize the number of activated neurons. DeepTest found thousands of erroneous behaviors under different realistic driving conditions (e.g., blurring, rain, fog), many of which lead to potentially fatal crashes, in three top-performing DNNs in the Udacity self-driving car challenge.
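
The idea of generating test inputs that activate more neurons can be illustrated with a minimal sketch. It is not DeepTest's implementation: the toy two-layer ReLU network, the activation threshold of 0.2, the brightness shift standing in for a lighting change, and every function name below are assumptions made for illustration only.

```python
import numpy as np

def relu_activations(x, weights):
    """Return the per-layer ReLU activations of a toy fully connected network."""
    activations = []
    h = x
    for W in weights:
        h = np.maximum(0.0, W @ h)
        activations.append(h)
    return activations

def covered_neurons(x, weights, threshold=0.2):
    """Set of (layer, index) pairs whose activation exceeds the threshold."""
    covered = set()
    for layer, h in enumerate(relu_activations(x, weights)):
        for idx in np.flatnonzero(h > threshold):
            covered.add((layer, int(idx)))
    return covered

def neuron_coverage(inputs, weights, threshold=0.2):
    """Fraction of neurons activated by at least one input."""
    total = sum(W.shape[0] for W in weights)
    covered = set()
    for x in inputs:
        covered |= covered_neurons(x, weights, threshold)
    return len(covered) / total

def adjust_brightness(image, delta):
    """Hypothetical lighting change: shift pixel values, clipped to [0, 1]."""
    return np.clip(image + delta, 0.0, 1.0)

def coverage_guided_tests(seeds, weights, deltas, threshold=0.2):
    """Greedily keep transformed seeds that activate a not-yet-covered neuron."""
    covered = set()
    for x in seeds:
        covered |= covered_neurons(x, weights, threshold)
    kept = []
    for x in seeds:
        for delta in deltas:
            candidate = adjust_brightness(x, delta)
            new = covered_neurons(candidate, weights, threshold) - covered
            if new:
                covered |= new
                kept.append(candidate)
    return kept

# Toy stand-ins for a steering model and flattened camera frames (assumptions).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 64)), rng.normal(size=(8, 16))]
seeds = [rng.random(64) for _ in range(3)]

tests = coverage_guided_tests(seeds, weights, deltas=[-0.3, 0.3])
before = neuron_coverage(seeds, weights)
after = neuron_coverage(seeds + tests, weights)
print(f"coverage with seeds only: {before:.2f}, with generated tests: {after:.2f}")
```

Because coverage is computed as a union over inputs, adding generated tests can only keep it the same or raise it. A real tool would also need an oracle for deciding when the model's output on a transformed image counts as erroneous, which this sketch does not attempt.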

Published in

ICSE '18: Proceedings of the 40th International Conference on Software Engineering
May 2018
1307 pages
ISBN: 9781450356381
DOI: 10.1145/3180155
Conference Chair: Michel Chaudron
General Chair: Ivica Crnkovic
Program Chairs: Marsha Chechik, Mark Harman

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 276 of 1,856 submissions, 15%
