Abstract
Information security is concerned with protecting data from unauthorized access. Classification techniques play an important role in information security by distinguishing normal traffic from attacks. Network traffic today includes a large amount of irrelevant information, which increases the complexity of a classifier and degrades its results, so a robust model that classifies the data with high accuracy is needed. In this paper, several classification techniques are applied to the NSL-KDD data set with tenfold cross-validation from two viewpoints. First, the techniques are applied to a two-class problem as binary classification (normal and attack); second, they are applied to a five-class problem as multiclass classification. Empirical results show that the random forest technique outperforms the other techniques on both the two-class and the five-class problem on the NSL-KDD data set. Because the data contain a large amount of redundancy, feature selection techniques are also applied to the random forest model, which is the best model both as a binary and as a multiclass classifier. The model produces its highest accuracy with 15 features in the case of binary classification. The performance of the various models is also evaluated using other measures, namely true-positive rate (TPR), false-positive rate (FPR), precision, F-measure, and the receiver operating characteristic (ROC) curve, and the results are found to be satisfactory.
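For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how TPR, FPR, precision, and F-measure can be computed for the binary (normal versus attack) case, together with a plain tenfold split. It is not the authors' implementation: the function names `binary_metrics` and `tenfold_indices`, the toy labels, and the choice of an unshuffled, unstratified split are all assumptions made for illustration.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="attack"):
    """Return TPR, FPR, precision and F-measure, treating `positive` as the target class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    tp, fn, fp, tn = (counts[k] for k in ("tp", "fn", "fp", "tn"))

    def ratio(num, den):
        # Avoid division by zero when a class is absent from the sample.
        return num / den if den else 0.0

    tpr = ratio(tp, tp + fn)            # true-positive rate (recall)
    fpr = ratio(fp, fp + tn)            # false-positive rate
    precision = ratio(tp, tp + fp)
    f_measure = ratio(2 * precision * tpr, precision + tpr)
    return {"TPR": tpr, "FPR": fpr, "precision": precision, "F-measure": f_measure}


def tenfold_indices(n_samples, k=10):
    """Yield (train, test) index lists for plain k-fold cross-validation.

    Assumption: folds are contiguous, unshuffled and unstratified.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical toy labels, not taken from NSL-KDD.
y_true = ["attack", "attack", "attack", "normal", "normal", "normal", "normal", "attack"]
y_pred = ["attack", "attack", "normal", "normal", "attack", "normal", "normal", "attack"]
metrics = binary_metrics(y_true, y_pred)
folds = list(tenfold_indices(25))
```

In a five-class setting, the same function could be called once per class with that class as `positive` (a one-versus-rest reading); whether the paper averages per-class values in this way is not stated in the abstract.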
Copyright information
© 2014 Springer India
About this paper
Cite this paper
Hota, H.S., Shrivas, A.K. (2014). Data Mining Approach for Developing Various Models Based on Types of Attack and Feature Selection as Intrusion Detection Systems (IDS). In: Mohapatra, D.P., Patnaik, S. (eds) Intelligent Computing, Networking, and Informatics. Advances in Intelligent Systems and Computing, vol 243. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1665-0_85
DOI: https://doi.org/10.1007/978-81-322-1665-0_85
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1664-3
Online ISBN: 978-81-322-1665-0
eBook Packages: Engineering, Engineering (R0)