ABSTRACT
Predicting the time and effort required to complete software tasks has long been an active area of research. Previous studies have proposed predictive models based on either the text data or the metadata of software tasks to estimate completion time or completion effort, but the literature has paid little attention to integrating all sets of attributes to build better-performing models. We first apply previously proposed models to the datasets of two IBM commercial projects, RQM and RTC, to find the best-performing model for predicting task completion effort on each set of attributes. We then propose an approach that builds a hybrid model from selected individual predictors to achieve more accurate and more stable early predictions of task completion effort, and to ensure that the model is not bound to specific attributes and is therefore applicable to a larger number of tasks. Categorizing task completion effort values into Low and High labels based on their median, we show that our hybrid model is 3-8% more accurate in early prediction of task completion effort than the best individual predictors.
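To make the general idea concrete, the following is a minimal, hypothetical sketch of the kind of pipeline the abstract describes: effort values are binarized at the median into Low/High labels, one classifier is trained per attribute set (text vs. metadata), and a meta-classifier combines their outputs into a hybrid predictor. The data, classifier choices, and feature names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of median-based labeling and a stacked "hybrid" model.
# All data and model choices here are illustrative, not the study's setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Toy stand-ins for task attributes: free-text summaries and numeric metadata.
summaries = np.array(["fix crash in login flow", "update build scripts",
                      "refactor storage layer", "add parser unit tests"] * 50)
metadata = rng.random((200, 5))              # e.g. priority, severity, ...
effort_hours = rng.exponential(10.0, 200)    # synthetic completion effort

# Binarize completion effort at the median: 0 = Low, 1 = High.
y = (effort_hours > np.median(effort_hours)).astype(int)

idx_train, idx_test = train_test_split(np.arange(len(y)), random_state=0)

# Individual predictors, each restricted to a single attribute set.
text_clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
text_clf.fit(summaries[idx_train], y[idx_train])

meta_clf = RandomForestClassifier(n_estimators=100, random_state=0)
meta_clf.fit(metadata[idx_train], y[idx_train])

# Hybrid model: a meta-classifier stacked on the base models' probabilities.
stack_train = np.column_stack([
    text_clf.predict_proba(summaries[idx_train])[:, 1],
    meta_clf.predict_proba(metadata[idx_train])[:, 1],
])
hybrid = LogisticRegression().fit(stack_train, y[idx_train])

stack_test = np.column_stack([
    text_clf.predict_proba(summaries[idx_test])[:, 1],
    meta_clf.predict_proba(metadata[idx_test])[:, 1],
])
print("hybrid accuracy:", hybrid.score(stack_test, y[idx_test]))
```

Note that a proper stacked generalization would train the meta-classifier on out-of-fold predictions of the base models rather than on in-sample ones; the sketch omits that step for brevity.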