DOI: 10.1145/2989238.2989242
research-article

A hybrid model for task completion effort estimation

Published: 13 November 2016

ABSTRACT

Predicting the time and effort needed to complete software tasks has long been an active area of research. Previous studies have proposed predictive models based on either the text data or the metadata of software tasks to estimate either completion time or completion effort, but the literature pays little attention to integrating all sets of attributes to build better-performing models. We first apply the previously proposed models to the datasets of two IBM commercial projects, RQM and RTC, to find the best-performing model for predicting task completion effort on each set of attributes. We then propose an approach for building a hybrid model from selected individual predictors. The hybrid model aims to give more accurate and stable results in early prediction of task completion effort, and to avoid being bound to particular attributes so that it is applicable to a larger number of tasks. Categorizing task completion effort values into Low and High labels based on their measured median value, we show that our hybrid model achieves 3-8% higher accuracy in early prediction of task completion effort than the best individual predictors.
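The two ideas in the abstract, median-based labeling and combining individual predictors, can be illustrated with a minimal Python sketch. The abstract does not say how the hybrid model combines its predictors or how values equal to the median are labeled, so everything here is an assumption: the combiner averages the predictors' P(High) estimates, ties go to Low, and the function names and sample numbers are hypothetical.

```python
from statistics import median
from typing import Optional, Sequence


def label_by_median(efforts: Sequence[float]) -> list[str]:
    """Label an effort value 'High' if it exceeds the median, else 'Low'.

    Assumption: values equal to the median are labeled 'Low'.
    """
    cut = median(efforts)
    return ["High" if e > cut else "Low" for e in efforts]


def combine(probabilities: Sequence[Optional[float]]) -> Optional[str]:
    """Average the P(High) estimates of the predictors that apply to a task.

    None marks a predictor whose attribute set is missing for the task,
    so the combined result does not depend on any single attribute set.
    Assumption: simple averaging; the paper's actual combiner may differ.
    """
    available = [p for p in probabilities if p is not None]
    if not available:
        return None  # no predictor can score this task
    mean_p = sum(available) / len(available)
    return "High" if mean_p >= 0.5 else "Low"


# Made-up effort values, for illustration only.
efforts = [2.0, 5.0, 8.0, 13.0, 40.0]
print(label_by_median(efforts))

# Hypothetical P(High) from a text-based and a metadata-based predictor.
print(combine([0.8, 0.4]))   # both predictors available
print(combine([None, 0.3]))  # text attributes missing for this task
```

Skipping unavailable predictors instead of failing is one way to read the abstract's goal that the model should not be bound to particular attributes.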


Published in

SWAN 2016: Proceedings of the 2nd International Workshop on Software Analytics
November 2016, 53 pages
ISBN: 9781450343954
DOI: 10.1145/2989238

Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States
