
On the value of parameter tuning in heterogeneous ensembles effort estimation

Published in Soft Computing.

Abstract

Accurate software development effort estimation (SDEE) is fundamental to the efficient management of software development projects, as it helps software managers allocate their human resources effectively. Over the last four decades, software engineering researchers have applied many effort estimation techniques, including statistical and machine learning methods, yet no consensus has been reached on a technique that performs best in all circumstances. To tackle this challenge, ensemble effort estimation, which predicts software development effort by combining more than one solo estimation technique, has recently been investigated. In this paper, heterogeneous ensembles based on four well-known machine learning techniques (K-nearest neighbor, support vector regression, multilayer perceptron and decision trees) were developed and evaluated by investigating the impact of the ensemble members' parameter values on estimation accuracy. In particular, this paper evaluates whether setting ensemble parameters using two optimization techniques (grid search and particle swarm optimization) permits more accurate SDEE. The heterogeneous ensembles of this study were built using three combination rules (mean, median and inverse ranked weighted mean) over seven datasets. The results obtained suggest that: (1) single techniques optimized using grid search or particle swarm optimization provide more accurate estimates; (2) in general, ensembles achieve higher accuracy than their single members whatever the optimization technique used, even though ensembles do not dominate all single techniques; (3) heterogeneous ensembles based on optimized single techniques provide more accurate estimates; and (4) particle swarm optimization and grid search generally generate ensembles with the same predictive capability.
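The three combination rules named in the abstract (mean, median and inverse ranked weighted mean) are simple aggregations of the members' outputs. The sketch below is a minimal illustration, not the authors' code: it assumes one common reading of IRWM in which each member's weight is proportional to its reversed accuracy rank (rank 1 = most accurate member gets the largest weight); the function name and ranking convention are hypothetical.

```python
import numpy as np

def combine_predictions(preds, rule="mean", ranks=None):
    """Combine per-learner effort predictions (shape: n_learners x n_projects)
    into one ensemble estimate per project.

    ranks: accuracy ranks of the learners, 1 = most accurate
    (required for the "irwm" rule only).
    """
    preds = np.asarray(preds, dtype=float)
    if rule == "mean":
        return preds.mean(axis=0)
    if rule == "median":
        return np.median(preds, axis=0)
    if rule == "irwm":  # inverse ranked weighted mean (one common formulation)
        m = preds.shape[0]
        ranks = np.asarray(ranks, dtype=float)
        weights = m - ranks + 1            # best learner gets weight m
        weights = weights / weights.sum()  # normalize weights to sum to 1
        return weights @ preds
    raise ValueError(f"unknown combination rule: {rule}")
```

For three members predicting efforts for two projects, `combine_predictions([[10, 20], [20, 40], [30, 60]], "irwm", ranks=[1, 2, 3])` weights the best member three times as heavily as the worst.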




Author information


Corresponding author

Correspondence to Ali Idri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by S. Deb, T. Hanne, K.C. Wong.

Appendices

Appendix

Appendix A: Performance of single and ensemble techniques using five accuracy measures: MAE, Pred, MIBRE, MBRE and LSD

See Tables 25, 26, 27, 28, 29, 30 and 31.

Table 25 Descriptive statistics of the performance of 12 single techniques and nine ensembles techniques using five performance criteria for the Albrecht dataset
Table 26 Descriptive statistics of the performance of 12 single techniques and nine ensemble techniques using five performance criteria for the China dataset
Table 27 Descriptive statistics of the performance of 12 single techniques and nine ensemble techniques using five performance criteria for the COCOMO81 dataset
Table 28 Descriptive statistics of the performance of 12 single techniques and nine ensemble techniques using five performance criteria for the Desharnais dataset
Table 29 Descriptive statistics of the performance of 12 single techniques and nine ensemble techniques using five performance criteria for the ISBSG dataset
Table 30 Descriptive statistics of the performance of 12 single techniques and nine ensemble techniques using five performance criteria for the Kemerer dataset
Table 31 Descriptive statistics of the performance of 12 single techniques and nine ensemble techniques using five performance criteria for the Miyazaki dataset
Table 32 GS optimal parameter values of the four ML techniques
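The five accuracy measures reported in Tables 25 to 31 (MAE, Pred, MIBRE, MBRE and LSD) have widely used, though not fully standardized, definitions. The sketch below gives one common formulation, assuming Pred at the usual 25% level and LSD computed from variance-corrected log residuals; the exact definitions used in the paper may differ slightly.

```python
import numpy as np

def effort_metrics(actual, predicted, pred_level=0.25):
    """Common SDEE accuracy measures (one standard formulation).

    Efforts must be strictly positive for the log residuals used by LSD.
    """
    y = np.asarray(actual, dtype=float)
    yhat = np.asarray(predicted, dtype=float)
    ae = np.abs(y - yhat)                 # absolute errors
    e = np.log(y) - np.log(yhat)          # log residuals
    s2 = e.var(ddof=1)                    # sample variance of log residuals
    return {
        "MAE":   ae.mean(),                               # mean absolute error
        "MBRE":  (ae / np.minimum(y, yhat)).mean(),       # mean balanced relative error
        "MIBRE": (ae / np.maximum(y, yhat)).mean(),       # mean inverted balanced relative error
        "Pred":  (ae / y <= pred_level).mean(),           # Pred(25) by default
        "LSD":   np.sqrt(((e + s2 / 2) ** 2).sum() / (len(y) - 1)),
    }
```

Note that MAE treats over- and underestimates symmetrically, while MBRE penalizes relative error against the smaller of the two values, making it harsher on large misses.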

Appendix B: Optimal configurations of the four ML techniques

See Tables 32 and 33.

Table 33 PSO optimal parameter values of the four ML techniques
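The two tuning strategies behind Tables 32 and 33 work quite differently: grid search exhaustively evaluates a fixed set of candidate parameter values, while PSO moves a swarm of candidate solutions through continuous parameter ranges. The minimal sketches below illustrate both on a generic objective function (in the paper's setting, the objective would be a cross-validated estimation error of a K-NN, SVR, MLP or decision-tree model); they are illustrative sketches, not the experimental setup that produced these tables.

```python
import itertools
import numpy as np

def grid_search(objective, grid):
    """Exhaustive search: evaluate every combination in `grid`
    (a dict mapping parameter name -> list of candidate values)."""
    best, best_score = None, np.inf
    for combo in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        score = objective(params)
        if score < best_score:
            best, best_score = params, score
    return best, best_score

def pso(objective, bounds, n_particles=20, n_iter=50,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over continuous bounds (list of (low, high)
    pairs), using the standard inertia-weight velocity update."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    x = rng.uniform(lo, hi, (n_particles, len(bounds)))  # positions
    v = np.zeros_like(x)                                 # velocities
    pbest = x.copy()                                     # personal bests
    pbest_val = np.array([objective(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()                 # global best
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()
```

Grid search guarantees the best value on its (discrete) grid but scales exponentially with the number of parameters; PSO explores continuous ranges with a fixed evaluation budget, which is one reason the two can land on configurations of equivalent predictive capability.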


About this article


Cite this article

Hosni, M., Idri, A., Abran, A. et al. On the value of parameter tuning in heterogeneous ensembles effort estimation. Soft Comput 22, 5977–6010 (2018). https://doi.org/10.1007/s00500-017-2945-4

