ABSTRACT
Recent advances in Deep Neural Networks (DNNs) have led to the development of DNN-driven autonomous cars that, using sensors such as cameras and LiDAR, can drive without any human intervention. Most major manufacturers, including Tesla, GM, Ford, BMW, and Waymo/Google, are working on building and testing different types of autonomous vehicles. Lawmakers in several US states, including California, Texas, and New York, have passed new legislation to fast-track the testing and deployment of autonomous vehicles on their roads.
However, despite their spectacular progress, DNNs, just like traditional software, often demonstrate incorrect or unexpected corner-case behaviors that can lead to potentially fatal collisions. Several such real-world accidents involving autonomous cars have already happened, including one that resulted in a fatality. Most existing testing techniques for DNN-driven vehicles depend heavily on the manual collection of test data under different driving conditions, which becomes prohibitively expensive as the number of test conditions increases.
In this paper, we design, implement, and evaluate DeepTest, a systematic testing tool for automatically detecting erroneous behaviors of DNN-driven vehicles that can potentially lead to fatal crashes. Our tool automatically generates test cases that leverage real-world changes in driving conditions such as rain, fog, and lighting. DeepTest systematically explores different parts of the DNN logic by generating test inputs that maximize the number of activated neurons. DeepTest found thousands of erroneous behaviors under different realistic driving conditions (e.g., blurring, rain, fog), many of which lead to potentially fatal crashes, in three top-performing DNNs in the Udacity self-driving car challenge.
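The idea of generating test inputs that increase the number of activated neurons can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-layer `toy_network`, the activation threshold of 0.2, the brightness shift used as a stand-in for a lighting change, and the greedy selection loop. It only shows one plausible way to measure neuron coverage and to keep transformed images that activate previously inactive neurons.

```python
import numpy as np


def neuron_coverage(activations, threshold=0.2):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations` has shape (n_inputs, n_neurons).
    """
    activated = (activations > threshold).any(axis=0)
    return activated.mean()


def adjust_brightness(image, delta):
    """Hypothetical lighting transformation: shift pixel values, clip to [0, 255]."""
    return np.clip(image.astype(np.float64) + delta, 0, 255)


def toy_network(image, weights):
    """Illustrative stand-in for a DNN: a single ReLU layer over the flattened image."""
    x = image.reshape(-1) / 255.0
    return np.maximum(0.0, weights @ x)


def coverage_guided_generation(seed_images, transforms, weights, threshold=0.2):
    """Greedily keep transformed images that activate at least one new neuron."""
    covered = np.zeros(weights.shape[0], dtype=bool)
    # Record which neurons the seed images already activate.
    for img in seed_images:
        covered |= toy_network(img, weights) > threshold
    kept = []
    for img in seed_images:
        for transform in transforms:
            candidate = transform(img)
            active = toy_network(candidate, weights) > threshold
            if (active & ~covered).any():  # candidate reaches new neurons
                kept.append(candidate)
                covered |= active
    return kept, covered.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(16, 64))  # 16 neurons, 8x8 input images
    seeds = [rng.integers(0, 256, size=(8, 8)) for _ in range(3)]
    transforms = [
        lambda im: adjust_brightness(im, 40),
        lambda im: adjust_brightness(im, -40),
    ]
    kept, coverage = coverage_guided_generation(seeds, transforms, weights)
    print(f"kept {len(kept)} synthetic inputs, coverage = {coverage:.2f}")
```

In a real setting the toy layer would be replaced by the intermediate activations of the model under test, and the transformations by realistic image effects such as blur, rain, or fog.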
Index Terms
- DeepTest: automated testing of deep-neural-network-driven autonomous cars