Using a Genetic Algorithm to Optimize Configurations in a Data-Driven Application

  • Conference paper
Search-Based Software Engineering (SSBSE 2020)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12420)

Abstract

Users of highly-configurable software systems often want to optimize a particular objective, such as improving a functional outcome or increasing system performance. One approach is to use an evolutionary algorithm. However, many applications today are data-driven, meaning they depend on inputs or data that can be complex and varied. Hence, a search needs to be run (and re-run) for every input, making optimization a heavyweight and potentially impractical process. In this paper, we explore this issue on a data-driven, highly-configurable scientific application. We build an exhaustive database containing 3,000 configurations and 10,000 inputs, leading to almost 100 million records, to serve as our oracle, and then run a genetic algorithm individually on each of the 10,000 inputs. We ask (1) whether a genetic algorithm can find configurations that improve functional objectives; (2) whether patterns of best configurations emerge over all input data; and (3) whether we can use sampling to approximate the results. We find that the original (default) configuration is best only 34% of the time, while clear patterns of other best configurations emerge. Out of 3,000 possible configurations, only 112 distinct configurations achieve the optimal result at least once across all 10,000 inputs, suggesting the potential for lighter-weight optimization approaches. We show that sampling the input data finds similar patterns at a lower cost.
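The workflow described in the abstract (a per-input genetic algorithm whose fitness is a lookup in a precomputed table, followed by a tally of which configurations win) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the option domains, the randomly generated toy oracle, the GA operators and parameters, and the names `build_toy_oracle`, `run_ga`, and `tally_best` are all hypothetical. Because the toy fitness values are random, no meaningful pattern should be expected in its output.

```python
import random
from collections import Counter
from itertools import product

# Hypothetical option domains (not from the paper): 5 * 4 * 3 = 60 configurations.
DOMAINS = [range(5), range(4), range(3)]
ALL_CONFIGS = list(product(*DOMAINS))


def build_toy_oracle(num_inputs, seed=0):
    """Return one exhaustive table per input: oracle[i][config] -> fitness.

    Fitness values are random placeholders; higher is treated as better.
    """
    rng = random.Random(seed)
    return [{cfg: rng.random() for cfg in ALL_CONFIGS} for _ in range(num_inputs)]


def run_ga(table, rng, pop_size=12, generations=15, mutation_rate=0.2):
    """Search a single input's table with a simple generational GA."""

    def fitness(cfg):
        # Oracle lookup replaces an actual run of the application.
        return table[cfg]

    def tournament(pop):
        # Binary tournament: pick two individuals, keep the fitter one.
        return max(rng.sample(pop, 2), key=fitness)

    population = [rng.choice(ALL_CONFIGS) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        children = [max(population, key=fitness)]
        while len(children) < pop_size:
            a, b = tournament(population), tournament(population)
            # Uniform crossover: each option value comes from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally resample an option from its domain.
            for i, domain in enumerate(DOMAINS):
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(domain)
            children.append(tuple(child))
        population = children
    return max(population, key=fitness)


def tally_best(oracle, input_ids, seed=1):
    """Run the GA once per selected input and count the winning configurations."""
    rng = random.Random(seed)
    return Counter(run_ga(oracle[i], rng) for i in input_ids)


if __name__ == "__main__":
    oracle = build_toy_oracle(num_inputs=200)
    # Full pass: one GA run per input.
    full = tally_best(oracle, range(len(oracle)))
    # Sampled pass: the same tally over a random 10% of the inputs.
    sample_ids = random.Random(2).sample(range(len(oracle)), 20)
    sampled = tally_best(oracle, sample_ids)
    print(len(full), "distinct winners over all inputs")
    print(len(sampled), "distinct winners over the sample")
```

Comparing the two tallies mirrors the third question in the abstract: if a small set of configurations dominates the full tally, a sampled tally may recover much of the same set at a fraction of the cost.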

Notes

  1. Each run of the application can return multiple answers leading to many more records than 3,000 \(\times\) 10,000.

  2. Supplementary data website: https://github.com/LavaOps/ssbse-2020-FrDdE.

Acknowledgements

This work is supported in part by NSF Grant CCF-1901543 and by The Center for Bioenergy Innovation (CBI) which is supported by the Office of Biological and Environmental Research in the DOE Office of Science.

Author information

Corresponding author

Correspondence to Myra B. Cohen.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Sinha, U., Cashman, M., Cohen, M.B. (2020). Using a Genetic Algorithm to Optimize Configurations in a Data-Driven Application. In: Aleti, A., Panichella, A. (eds) Search-Based Software Engineering. SSBSE 2020. Lecture Notes in Computer Science, vol 12420. Springer, Cham. https://doi.org/10.1007/978-3-030-59762-7_10

  • DOI: https://doi.org/10.1007/978-3-030-59762-7_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59761-0

  • Online ISBN: 978-3-030-59762-7

  • eBook Packages: Computer Science, Computer Science (R0)
