
Selective Symbolic Type-Guided Checkpointing and Restoration for Autonomous Vehicle Repair

Authors:
Yu Huang (University of Michigan), Kevin Angstadt (University of Michigan), Kevin Leach (University of Michigan), Westley Weimer (University of Michigan)

ICSEW'20: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, June 2020, Pages 3–10. https://doi.org/10.1145/3387940.3392201

Published: 25 September 2020

ABSTRACT

Fault-tolerant design can help autonomous vehicle systems address defects, environmental changes, and security attacks. Checkpoint and restoration techniques save a copy of an application's state before a problem occurs and restore that state afterwards. However, traditional Checkpoint/Restore techniques incur high overhead, may carry along tainted data, and rarely operate in tandem with human-written or automated repairs that modify source code or alter data layout. It can therefore be difficult to apply them to non-environmental defects, security attacks, or software bugs. To address these challenges, we propose and evaluate a selective checkpoint and restore (SCR) technique that records only critical system state, identified by types and minimal symbolic annotations, so that repaired programs can be deployed. In our evaluation, we found that using source-level symbolic information allows an application to be resumed even after its code is modified. We evaluate our approach using a commodity autonomous vehicle system and demonstrate that it admits manual and automated software repairs, does not carry tainted data, and has low overhead.
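To make the idea of restoring by symbolic name and type more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical `critical` annotation on dataclass fields and two illustrative state classes, `NavStateV1` (before repair) and `NavStateV2` (after a repair that reorders fields and adds one). All names here are invented for illustration. Only annotated fields are saved, and they are restored by name, so the changed layout does not matter and unannotated (possibly tainted) data is left behind.

```python
import json
from dataclasses import dataclass, field, fields


def critical(default):
    """Hypothetical annotation: mark a field as critical state to checkpoint."""
    return field(default=default, metadata={"checkpoint": True})


@dataclass
class NavStateV1:
    """Illustrative program state before a repair."""
    waypoint_index: int = critical(0)
    altitude_m: float = critical(0.0)
    sensor_cache: float = 0.0  # not annotated: never saved, may be tainted


@dataclass
class NavStateV2:
    """Illustrative state after a repair that changes the data layout."""
    max_altitude_m: float = 120.0  # new field introduced by the repair
    altitude_m: float = critical(0.0)
    waypoint_index: int = critical(0)
    sensor_cache: float = 0.0


def checkpoint(state) -> str:
    """Serialize only annotated fields, keyed by symbolic name and type."""
    record = {}
    for f in fields(state):
        if f.metadata.get("checkpoint"):
            value = getattr(state, f.name)
            record[f.name] = {"type": type(value).__name__, "value": value}
    return json.dumps(record)


def restore(snapshot: str, target):
    """Copy saved values into target by name; skip missing or retyped fields."""
    record = json.loads(snapshot)
    for f in fields(target):
        entry = record.get(f.name)
        if entry is None or not f.metadata.get("checkpoint"):
            continue
        # Type guard: only restore when the saved type matches the target's.
        if entry["type"] != type(getattr(target, f.name)).__name__:
            continue
        setattr(target, f.name, entry["value"])
    return target
```

For example, checkpointing a `NavStateV1` instance and calling `restore(snapshot, NavStateV2())` carries over `waypoint_index` and `altitude_m`, while `sensor_cache` and the new `max_altitude_m` keep the repaired program's defaults. A binary memory image could not be reused this way once the field order changes, which is the motivation for keying the snapshot on source-level names.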

Published in

ICSEW'20: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops
June 2020
831 pages
ISBN: 9781450379632
DOI: 10.1145/3387940

Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
autonomous vehicle
checkpoint
maintenance
repair
restore
Qualifiers
- research-article
- Research
- Refereed limited