Abstract
Predicting defects in software systems at an early stage of development has long been a crucial goal for the software industry. Machine learning algorithms now play a major role in classifying and predicting possible bugs during the design phase. In this work, the authors propose a discretization method based on metric threshold values to improve classification accuracy on a given data set. For the experiments, the authors chose defect data sets from NASA repositories, covering the Jedit, Lucene, Tomcat, Velocity, Xalan, and Xerces software systems, and ran them in WEKA. The study focuses specifically on the object-oriented CK metrics. Two widely used classifiers, Naive Bayes and voted perceptron, are used for classification. Performance measures such as ROC and RMSE values are considered and analyzed. The results show that classification accuracy can be improved by using the proposed discretization method with both classifiers.
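As a rough illustration of threshold-based discretization, the sketch below maps each CK metric of a class to a "low" or "high" bin by comparing it with a per-metric threshold. The threshold values, the two-bin scheme, and the names `THRESHOLDS` and `discretize` are assumptions made for this example only; the abstract does not state the thresholds or bin labels the authors used.

```python
from typing import Dict

# Hypothetical thresholds for the six CK metrics, for illustration only.
# The abstract does not give the values used in the study.
THRESHOLDS: Dict[str, float] = {
    "wmc": 20.0,   # weighted methods per class
    "dit": 5.0,    # depth of inheritance tree
    "noc": 5.0,    # number of children
    "cbo": 9.0,    # coupling between objects
    "rfc": 40.0,   # response for a class
    "lcom": 20.0,  # lack of cohesion in methods
}


def discretize(row: Dict[str, float],
               thresholds: Dict[str, float] = THRESHOLDS) -> Dict[str, str]:
    """Label each metric 'high' if it exceeds its threshold, else 'low'."""
    return {
        name: ("high" if row[name] > limit else "low")
        for name, limit in thresholds.items()
    }


# One class described by its raw CK metric values (made-up numbers).
sample = {"wmc": 35, "dit": 2, "noc": 0, "cbo": 12, "rfc": 40, "lcom": 7}
discretized = discretize(sample)
print(discretized)
```

The discretized rows would then replace the continuous metric values as nominal inputs to a classifier such as Naive Bayes. A value equal to its threshold is treated as "low" here, which is a design choice of this sketch rather than something the source specifies.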
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kapoor, P., Arora, D., Kumar, A. (2018). An Approach for Improving Classification Accuracy Using Discretized Software Defect Data. In: Sa, P., Bakshi, S., Hatzilygeroudis, I., Sahoo, M. (eds) Recent Findings in Intelligent Computing Techniques. Advances in Intelligent Systems and Computing, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-8633-5_34
DOI: https://doi.org/10.1007/978-981-10-8633-5_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8632-8
Online ISBN: 978-981-10-8633-5
eBook Packages: Engineering, Engineering (R0)