ABSTRACT
The usage of open source (OS) software is widespread across many industries. While the functional quality of OS projects is considered comparable to that of closed-source software, little is known about their quality in terms of performance. One challenge for OS developers is that, unlike for functional testing, there is a lack of accepted best practices for performance testing. To reveal the state of practice of performance testing in OS projects, we conduct an exploratory study on 111 Java-based OS projects from GitHub. We study the performance tests of these projects from five perspectives: (1) developers, (2) size, (3) test organization, (4) types of performance tests, and (5) tooling used. We show that writing performance tests is not a popular task in OS projects: performance tests form only a small portion of the test suite, are rarely updated, and are usually maintained by a small group of core project developers. Further, even though many projects are aware that they need performance tests, developers appear to struggle to implement them. We argue that future performance testing frameworks should provide better support for low-friction testing, for instance via non-parameterized methods or performance test generation, and should focus on tight integration with standard continuous integration tooling.
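To make the subject of the study concrete, the following is a minimal sketch of the kind of hand-rolled, timing-based performance test that projects often write with plain JUnit or a `main` method when no dedicated framework is used. The class name, workload, and the one-second threshold are all hypothetical illustrations, not taken from the studied projects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example of a hand-rolled timing-based performance test.
public class SortPerformanceTest {

    // Median wall-clock time of a task over several runs, after
    // discarding warm-up iterations so the JIT can stabilize.
    static long medianNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) task.run();
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 100_000; i > 0; i--) data.add(i);

        long median = medianNanos(() -> {
            List<Integer> copy = new ArrayList<>(data);
            copy.sort(null); // natural ordering
        }, 5, 11);

        // A hand-picked threshold serves as the oracle; such absolute
        // thresholds are brittle across machines and JVM versions,
        // which is one reason this style of test is hard to maintain.
        if (median > 1_000_000_000L) {
            throw new AssertionError("sort too slow: " + median + " ns");
        }
        System.out.println("ok");
    }
}
```

Dedicated harnesses such as JMH address the pitfalls this sketch glosses over (dead-code elimination, forked JVMs, statistical reporting), but they require more setup, which is the friction the abstract alludes to.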
An Exploratory Study of the State of Practice of Performance Testing in Java-Based Open Source Projects