Abstract
Knowing whether a software feature will be completed in its planned iteration can help with release planning decisions. However, existing research has focused on predicting only low-level software tasks, such as bug fixes. In this paper, we describe a mixed-method empirical study on three large IBM projects. We investigated the types of iteration changes that occur and show that up to 54% of high-level requirements do not make their planned iteration. Requirements are most often pushed out to the next iteration, but high-level requirements are also commonly moved to the next minor or major release or returned to the product or release backlog. We developed and evaluated a model that uses machine learning to predict whether a high-level requirement will be completed within its planned iteration. The model includes 29 features that were engineered based on prior work, interviews with IBM developers, and domain knowledge. Predictions were made at four different stages of the requirement lifetime. Our model is able to achieve up to 100% precision. We ranked the importance of our model features and found that the importance of some features depends heavily on the project and prediction stage. However, some features (e.g., the time remaining in the iteration and the creator of the requirement) emerge as important across all projects and stages. We conclude with a discussion on future research directions.
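The approach described above can be illustrated with a minimal sketch: train a classifier on engineered requirement features and report precision and per-feature importances. The paper's pipeline and data are not public, so everything below is an assumption for illustration; the feature names (`days_remaining`, `creator_id`, `open_child_tasks`) are hypothetical stand-ins for the 29 engineered features, the labels are synthetic, and scikit-learn's random forest is used as a generic stand-in classifier.

```python
# Hedged sketch, NOT the paper's implementation: synthetic data, hypothetical
# feature names, and a generic random-forest classifier from scikit-learn.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
# Synthetic requirement records: time left in the iteration (days),
# an integer id for the requirement's creator, and open child-task count.
X = np.column_stack([
    rng.uniform(0, 60, n),      # days_remaining (hypothetical feature)
    rng.integers(0, 20, n),     # creator_id (hypothetical feature)
    rng.integers(0, 15, n),     # open_child_tasks (hypothetical feature)
])
# Synthetic label: completed within the planned iteration (1) or not (0),
# loosely tied to the features so the forest has signal to learn.
y = ((X[:, 0] > 20) & (X[:, 2] < 8)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
prec = precision_score(y_te, clf.predict(X_te))
print(f"precision: {prec:.2f}")

# Per-feature importances, analogous in spirit to the paper's feature ranking.
for name, imp in zip(["days_remaining", "creator_id", "open_child_tasks"],
                     clf.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

On this synthetic data, `days_remaining` and `open_child_tasks` dominate the importance ranking by construction; on real project data, the paper found such rankings vary by project and by prediction stage.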
Notes
Note that the term feature refers to a model feature, not to be confused with a software feature.
Acknowledgements
Special thanks to Fabio Calefato from the University of Bari, Alan Yeung from Persistent Systems, and the IBM team.
Additional information
Communicated by: Abram Hindle and Lin Tan
Cite this article
Blincoe, K., Dehghan, A., Salaou, AD. et al. High-level software requirements and iteration changes: a predictive model. Empir Software Eng 24, 1610–1648 (2019). https://doi.org/10.1007/s10664-018-9656-z