ABSTRACT
Assertion oracles are executable boolean expressions placed inside the program that should pass (return true) for all correct executions and fail (return false) for all incorrect executions. Because designing perfect assertion oracles is difficult, assertions often fail to distinguish between correct and incorrect executions. In other words, they are prone to false positives and false negatives. In this paper, we propose GAssert (Genetic ASSERTion improvement), the first technique to automatically improve assertion oracles. Given an assertion oracle and evidence of false positives and false negatives, GAssert implements a novel co-evolutionary algorithm that explores the space of possible assertions to identify one with fewer false positives and false negatives. Our empirical evaluation on 34 Java methods from 7 different Java code bases shows that GAssert effectively improves assertion oracles. GAssert outperforms two baselines (random and invariant-based oracle improvement), and is comparable with, and in some cases even outperforms, human-improved assertions.
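To make the notions of false positive and false negative concrete, the following minimal sketch counts both for a candidate assertion over a set of observed program states. It assumes the usual oracle-assessment reading, in which a false positive is a correct state on which the assertion fails and a false negative is an incorrect state on which it passes. The names `count_errors`, `initial`, and `improved`, the absolute-value example, and the dictionary representation of states are all hypothetical illustrations. This is not GAssert's implementation, and it is written in Python for brevity although the evaluation targets Java methods.

```python
from typing import Callable, Dict, List, Tuple

# A program state is modelled (as an assumption) as a mapping from
# variable names to integer values observed at the assertion point.
State = Dict[str, int]
Assertion = Callable[[State], bool]


def count_errors(
    assertion: Assertion,
    correct_states: List[State],
    incorrect_states: List[State],
) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) for an assertion."""
    # False positive: the assertion fails on a correct state.
    false_positives = sum(1 for s in correct_states if not assertion(s))
    # False negative: the assertion passes on an incorrect state.
    false_negatives = sum(1 for s in incorrect_states if assertion(s))
    return false_positives, false_negatives


# Hypothetical method under test: result = abs(x).
correct_states = [
    {"x": 3, "result": 3},
    {"x": -4, "result": 4},
    {"x": 0, "result": 0},
]
incorrect_states = [
    {"x": -4, "result": -4},  # sign was not flipped
    {"x": 3, "result": 5},    # wrong magnitude
]


def initial(s: State) -> bool:
    # Too strict for x == 0, too weak for a wrong magnitude.
    return s["result"] > 0


def improved(s: State) -> bool:
    # Non-negative and equal to x or -x.
    return s["result"] >= 0 and (s["result"] == s["x"] or s["result"] == -s["x"])


print(count_errors(initial, correct_states, incorrect_states))
print(count_errors(improved, correct_states, incorrect_states))
```

An oracle-improvement search can use a pair of counts like this as its objectives, preferring candidates that reduce both numbers.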
Index Terms
- Evolutionary improvement of assertion oracles