Abstract
Empirical validation of software metrics for predicting quality using machine learning methods is important to ensure their practical relevance in software organizations. It is also of interest to know the relationship between object-oriented metrics and fault proneness at different severity levels. In this paper, we build a support vector machine (SVM) model to find the relationship between the object-oriented metrics given by Chidamber and Kemerer and fault proneness at different severity levels. The proposed models are empirically evaluated using a public domain NASA data set, and the performance of the SVM method is evaluated by receiver operating characteristic (ROC) analysis. Based on these results, it is reasonable to claim that such models could help in planning and performing testing by focusing resources on fault-prone parts of the design and code. The model built for high severity faults performs worse than the models built for medium and low severity faults. Thus, the study shows that the SVM method may also be used in constructing software quality models. However, similar studies need to be carried out to establish the acceptability of the model.
Abbreviations
- Coupling: A measure of the degree of interdependence between classes
- Cohesion: A measure of the degree to which the elements of a module are functionally related (Aggarwal et al. 2006a)
- Severity: A value that quantifies the impact of a fault on the overall environment
- Fault proneness: The probability of detecting a fault
- Sensitivity: The proportion of actual faulty classes that are correctly predicted as faulty
- Specificity: The proportion of actual non-faulty classes that are correctly predicted as non-faulty
- Completeness: The number of faults in classes predicted fault-prone, divided by the total number of faults in the system
- Precision: The number of classes that are predicted correctly (both faulty and not faulty), divided by the total number of classes
- ROC curve: A plot of sensitivity on the y-axis against 1 − specificity on the x-axis
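The evaluation measures defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the study: the function names, the six-class sample data, and the 0.5 cut-off are all hypothetical. Note that "precision" here follows this paper's definition (correct predictions over all classes), which is what is more commonly called accuracy.

```python
from typing import List, Tuple


def confusion_counts(actual: List[int], predicted: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn) for binary labels: 1 = faulty, 0 = not faulty."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(actual: List[int], predicted: List[int]) -> float:
    """Share of actual faulty classes that are predicted faulty."""
    tp, fn, _, _ = confusion_counts(actual, predicted)
    return tp / (tp + fn)


def specificity(actual: List[int], predicted: List[int]) -> float:
    """Share of actual non-faulty classes that are predicted non-faulty."""
    _, _, fp, tn = confusion_counts(actual, predicted)
    return tn / (tn + fp)


def precision(actual: List[int], predicted: List[int]) -> float:
    """Paper's definition: correctly classified classes over all classes."""
    tp, fn, fp, tn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + fn + fp + tn)


def completeness(fault_counts: List[int], predicted: List[int]) -> float:
    """Faults in classes predicted fault-prone over all faults in the system."""
    found = sum(f for f, p in zip(fault_counts, predicted) if p == 1)
    return found / sum(fault_counts)


def roc_points(actual: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """(1 - specificity, sensitivity) for each distinct score used as cut-off."""
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        predicted = [1 if s >= cutoff else 0 for s in scores]
        points.append((1 - specificity(actual, predicted),
                       sensitivity(actual, predicted)))
    return points


def area_under_curve(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under the ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical sample: faults found per class and a model's fault-proneness score.
fault_counts = [3, 0, 1, 0, 2, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.8, 0.6]
actual = [1 if f > 0 else 0 for f in fault_counts]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed cut-off of 0.5

if __name__ == "__main__":
    print("sensitivity :", sensitivity(actual, predicted))
    print("specificity :", specificity(actual, predicted))
    print("precision   :", precision(actual, predicted))
    print("completeness:", completeness(fault_counts, predicted))
    print("ROC area    :", area_under_curve(roc_points(actual, scores)))
```

Sweeping the cut-off over every distinct score, rather than fixing it at 0.5, is what turns the single sensitivity/specificity pair into the ROC curve. The area under that curve summarizes model performance independently of any one cut-off.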
References
Aggarwal KK, Singh Y, Kaur A, Malhotra R (2005) Software reuse metrics for object-oriented systems. In: Proceedings of the 3rd ACIS international conference on software engineering research, management and applications (SERA ’05), Mt. Pleasant, pp 48–55
Aggarwal KK, Singh Y, Kaur A, Malhotra R (2006a) Empirical study of object-oriented metrics. J Object Technol 5(8):149–173
Aggarwal KK, Singh Y, Kaur A, Malhotra R (2006b) Investigating the effect of coupling metrics on fault proneness in object-oriented systems. Software Qual Prof 8(4):4–16
Aggarwal KK, Singh Y, Kaur A, Malhotra R (2009) Empirical analysis for investigating the effect of object-oriented metrics on fault proneness: a replicated case study. Softw Process Improv Pract 14(1):39–62
Barnett V, Price T (1995) Outliers in statistical data. Wiley, New York
Basili V, Briand L, Melo W (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761
Belsley D, Kuh E, Welsch R (1980) Regression diagnostics: identifying influential data and sources of collinearity. Wiley, New York
Bieman J, Kang B (1995) Cohesion and reuse in an object-oriented system. In: Proceedings of the ACM symposium on software reusability (SSR’94), pp 259–262
Binkley A, Schach S (1998) Validation of the coupling dependency metric as a risk predictor. In: Proceedings of the international conference on software engineering, Kyoto, pp 452–455
Briand L, Daly W, Wust J (1998) Unified framework for cohesion measurement in object-oriented systems. Empir Softw Eng 3(1):65–117
Briand L, Daly W, Wust J (1999) A unified framework for coupling measurement in object-oriented systems. IEEE Trans Softw Eng 25(1):91–121
Briand L, Daly W, Wust J (2000) Exploring the relationships between design measures and software quality. J Syst Softw 51(3):245–273
Briand L, Wüst J, Lounis H (2001) Replicated case studies for investigating quality factors in object-oriented designs. Empir Softw Eng 6(1):11–58
Burges C (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Disc 2:121–167
Cartwright M, Shepperd M (1999) An empirical investigation of an object-oriented software system. IEEE Trans Softw Eng 26(8):786–796
Chidamber S, Kemerer C (1991) Towards a metrics suite for object oriented design. In: Proceedings of the conference on object-oriented programming: systems, languages and applications (OOPSLA’91). SIGPLAN Notices 26(11):197–211
Chidamber S, Kemerer C (1994) A metrics suite for object-oriented design. IEEE Trans Softw Eng 20(6):476–493
Chidamber S, Darcy D, Kemerer C (1998) Managerial use of metrics for object-oriented software: an exploratory analysis. IEEE Trans Softw Eng 24(8):629–639
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
El Emam K, Benlarbi S, Goel N, Rai S (1999) A validation of object-oriented metrics. Tech Rep. ERB-1063, NRC
El Emam K, Benlarbi S, Goel N, Rai S (2001) The confounding effect of class size on the validity of object-oriented metrics. IEEE Trans Softw Eng 27(7):630–650
Gyimothy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans Softw Eng 31(10):897–910
Hair J, Anderson R, Tatham R, Black W (2006) Multivariate data analysis. Pearson Education, Upper Saddle River
Hanley J, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic ROC curve. Radiology 143:29–36
Harrison R, Counsell SJ, Nithi RV (1998) An evaluation of MOOD set of object-oriented software metrics. IEEE Trans Softw Eng 24(6):491–496
Henderson-Sellers B (1996) Object-oriented metrics, measures of complexity. Prentice Hall, Englewood Cliffs
Hitz M, Montazeri B (1995) Measuring coupling and cohesion in object-oriented systems. In: Proceedings of the international symposium on applied corporate computing, Monterrey, pp 25–27
Lake A, Cook C (1994) Use of factor analysis to develop OOP software complexity metrics. In: Proceedings of the 6th annual oregon workshop on software metrics, Silver Falls
Lee Y, Liang B, Wu S, Wang F (1995) Measuring the coupling and cohesion of an object-oriented program based on information flow. In: Proceedings of the international conference on software quality, Maribor
Li W, Henry S (1993) Object-oriented metrics that predict maintainability. J Syst Softw 23(2):111–122
Lorenz M, Kidd J (1994) Object-oriented software metrics. Prentice-Hall, Englewood Cliffs
Morris C, Autret A, Boddy L (2001) Support vector machines for identifying organisms-a comparison with strongly partitioned radial basis function networks. Ecol Model 146:57–67
Olague H, Etzkorn L, Gholston S, Quattlebaum S (2007) Empirical validation of three software metrics suites to predict fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. IEEE Trans Softw Eng 33(8):402–419
Pai G (2007) Empirical analysis of software fault content and fault proneness using Bayesian methods. IEEE Trans Softw Eng 33(10):675–686
Sherrod P (2003) DTreg predictive modeling software
Singh Y, Kaur A, Malhotra R (2010) Empirical validation of object-oriented metrics for predicting fault proneness models. Softw Qual J 18(1):3–35
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J Royal Stat Soc 36:111–147
Tang MH, Kao MH, Chen MH (1999) An empirical study on object-oriented metrics. In: Proceedings of 6th international software metrics symposium, pp 242–249
Tegarden D, Sheetz S, Monarchi D (1995) A software complexity model of object-oriented systems. Decision Support Syst 13(3–4):241–262
Wang X, Bi D, Wang S (2007) Fault recognition with labelled multi-category. In: Third conference on natural computation, Haikou, China
NASA metrics data repository (2004). Available at www.mdp.ivv.nasa.gov
Yu P, Systa T, Muller H (2002) Predicting fault-proneness using OO metrics: an industrial case study. In: Proceedings of sixth European conference on software maintenance and reengineering, Budapest, Hungary, pp 99–107
Yuming Z, Hareton L (2006) Empirical analysis of object-oriented design metrics for predicting high severity faults. IEEE Trans Softw Eng 32(10):771–784
Zhao L, Takagi N (2007) An application of support vector machines to Chinese character classification problem. In: IEEE international conference on systems, man and cybernetics, Montreal
Cite this article
Malhotra, R., Kaur, A. & Singh, Y. Empirical validation of object-oriented metrics for predicting fault proneness at different severity levels using support vector machines. Int J Syst Assur Eng Manag 1, 269–281 (2010). https://doi.org/10.1007/s13198-011-0048-7
DOI: https://doi.org/10.1007/s13198-011-0048-7