An Approach for Improving Classification Accuracy Using Discretized Software Defect Data

Kapoor, Pooja; Arora, Deepak; Kumar, Ashwani

doi:10.1007/978-981-10-8633-5_34

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 709)

Abstract

Predicting defects in software systems at an early stage of development has long been a crucial and desirable goal for the software industry. Machine learning algorithms now play a major role in classifying and predicting possible bugs during the design phase. In this work, the authors propose a discretization method based on metric threshold values to improve classification accuracy on a given data set. For experimentation, the authors chose defect data sets from NASA repositories, covering the Jedit, Lucene, Tomcat, Velocity, Xalan, and Xerces software systems, and ran the experiments in WEKA. The study focuses specifically on the object-oriented CK metrics. Two widely used classifiers, Naive Bayes and voted perceptron, are used for classification. Performance measures such as ROC and RMSE values are considered and analyzed. The results show that the proposed discretization method can improve classification accuracy with both classifiers.
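As a rough illustration of what threshold-based discretization of CK metrics can look like, the following minimal sketch maps each continuous metric to a "low" or "high" bin. The abstract does not give the authors' threshold values or number of bins, so the thresholds, the two-bin scheme, and the names `CK_THRESHOLDS` and `discretize` are all assumptions made for illustration only; the six metric names are those of the standard Chidamber-Kemerer suite.

```python
# Hypothetical thresholds for the six CK metrics. These numbers are
# placeholders, not the values used in the chapter.
CK_THRESHOLDS = {
    "wmc": 20.0,    # weighted methods per class
    "dit": 5.0,     # depth of inheritance tree
    "noc": 5.0,     # number of children
    "cbo": 10.0,    # coupling between objects
    "rfc": 50.0,    # response for a class
    "lcom": 100.0,  # lack of cohesion in methods
}


def discretize(row, thresholds=CK_THRESHOLDS):
    """Label each metric 'high' if it exceeds its threshold, else 'low'."""
    return {
        name: "high" if row[name] > limit else "low"
        for name, limit in thresholds.items()
    }


# One made-up class-level record (not taken from any real data set).
sample = {"wmc": 35, "dit": 2, "noc": 0, "cbo": 12, "rfc": 50, "lcom": 400}
binned = discretize(sample)
print(binned)
```

The resulting nominal attributes could then be passed to a classifier such as Naive Bayes in place of the raw continuous values.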

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Amity University, Lucknow, India
Pooja Kapoor & Deepak Arora
Area of IT & Systems, IIM Lucknow, Lucknow, India
Ashwani Kumar

Corresponding author

Correspondence to Pooja Kapoor.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Rourkela, Odisha, India
Pankaj Kumar Sa
Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Rourkela, Odisha, India
Sambit Bakshi
Department of Computer Engineering and Informatics, University of Patras, Patras, Greece
Ioannis K. Hatzilygeroudis
Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Rourkela, Odisha, India
Manmath Narayan Sahoo

About this paper

Cite this paper

Kapoor, P., Arora, D., Kumar, A. (2018). An Approach for Improving Classification Accuracy Using Discretized Software Defect Data. In: Sa, P., Bakshi, S., Hatzilygeroudis, I., Sahoo, M. (eds) Recent Findings in Intelligent Computing Techniques. Advances in Intelligent Systems and Computing, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-8633-5_34

DOI: https://doi.org/10.1007/978-981-10-8633-5_34
Published: 04 November 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8632-8
Online ISBN: 978-981-10-8633-5
eBook Packages: Engineering, Engineering (R0)
