High-level software requirements and iteration changes: a predictive model

Empirical Software Engineering

Abstract

Knowing whether a software feature will be completed in its planned iteration can help with release planning decisions. However, existing research has focused only on predicting low-level software tasks, such as bug fixes. In this paper, we describe a mixed-method empirical study of three large IBM projects. We investigated the types of iteration changes that occur and show that up to 54% of high-level requirements do not make their planned iteration. Requirements are most often pushed out to the next iteration, but high-level requirements are also commonly moved to the next minor or major release or returned to the product or release backlog. We developed and evaluated a model that uses machine learning to predict whether a high-level requirement will be completed within its planned iteration. The model includes 29 features that were engineered based on prior work, interviews with IBM developers, and domain knowledge. Predictions were made at four different stages of the requirement lifetime. The model achieves up to 100% precision. We ranked the importance of the model features and found that some features are highly dependent on project and prediction stage. However, some features (e.g., the time remaining in the iteration and the creator of the requirement) emerge as important across all projects and stages. We conclude with a discussion of future research directions.
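
To make two terms from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows how a "time remaining in the iteration" feature might be derived at prediction time and how precision is computed for a binary "completed in planned iteration" label. The `Requirement` class, its field names, and the choice of "completed" as the positive class are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Requirement:
    # Hypothetical fields; the paper's actual 29 model features are not listed in the abstract.
    creator: str
    iteration_end: date
    completed_in_iteration: bool  # ground-truth label


def days_remaining(req: Requirement, prediction_date: date) -> int:
    """Days left in the planned iteration when the prediction is made (never negative)."""
    return max((req.iteration_end - prediction_date).days, 0)


def precision(actual: list[bool], predicted: list[bool]) -> float:
    """TP / (TP + FP), treating True ("completed in iteration") as the positive class."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    # With no positive predictions precision is undefined; this sketch returns 0.0.
    return tp / (tp + fp) if (tp + fp) else 0.0


# Illustrative labels only, not data from the study.
actual = [True, False, True, True]
predicted = [True, False, False, True]
print(precision(actual, predicted))
```

In this made-up example every positive prediction is correct, so precision is 1.0 even though one completed requirement was missed. Precision alone therefore says nothing about how many completed requirements the model fails to flag.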

Notes

  1. Note that the term feature refers to a model feature, not to be confused with a software feature.

Acknowledgements

Special thanks to Fabio Calefato from University of Bari, Alan Yeung from Persistent Systems and the IBM team.

Author information

Corresponding author

Correspondence to Kelly Blincoe.

Additional information

Communicated by: Abram Hindle and Lin Tan

About this article

Cite this article

Blincoe, K., Dehghan, A., Salaou, AD. et al. High-level software requirements and iteration changes: a predictive model. Empir Software Eng 24, 1610–1648 (2019). https://doi.org/10.1007/s10664-018-9656-z

  • DOI: https://doi.org/10.1007/s10664-018-9656-z
