Abstract
Network traffic analysis is the process of inferring patterns in communication. Growing reliance on computer networks, together with their increasing connectivity, makes it challenging for network managers to understand the nature of the traffic carried on their networks. Given the volume of traffic generated, however, it is an important data analysis task. Summarization is a key data mining concept and a natural way to create a concise yet accurate summary of network traffic. In this paper, we propose a new definition of summary for network traffic that outperforms existing state-of-the-art summarization techniques. Our approach is based on a clustering algorithm that reduces the information loss incurred by existing techniques. By analysing the summarization results with up-to-date evaluation metrics, we demonstrate that our approach produces better summaries than other techniques on the benchmark KDD Cup 1999 dataset and on real-life network traffic that includes simulated attacks.
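As a rough illustration of clustering-based summarization in general, the sketch below groups records with plain k-means and keeps only the cluster centroids as the summary. It is not the method proposed in the paper, whose summary definition is not given in the abstract. The toy flow records, the function names, and the use of mean squared distance as a stand-in for information loss are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_summary(records, k, iterations=20, seed=0):
    """Return k centroids that stand in for the full record set.

    Plain k-means, used here only as a generic stand-in for a
    clustering-based summarizer (an assumption, not the paper's algorithm).
    """
    rng = random.Random(seed)
    centroids = rng.sample(records, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for record in records:
            nearest = min(range(k),
                          key=lambda i: squared_distance(record, centroids[i]))
            clusters[nearest].append(record)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[i] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids


def mean_squared_error(records, summary):
    """Average squared distance from each record to its nearest summary
    element: a simple proxy for information loss (an assumption)."""
    return sum(min(squared_distance(r, c) for c in summary)
               for r in records) / len(records)


# Hypothetical toy flows: (duration in seconds, kilobytes transferred).
flows = [(1.0, 10.0), (1.2, 11.0), (0.9, 9.5),
         (30.0, 500.0), (31.0, 510.0), (29.5, 495.0)]

summary = kmeans_summary(flows, k=2)
loss = mean_squared_error(flows, summary)
print(summary, loss)
```

Here six records are reduced to two representatives. A larger `k` lowers the loss proxy but makes the summary less compact, which is the trade-off a summarization metric has to balance.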
Copyright information
© 2015 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Ahmed, M., Mahmood, A.N., Maher, M.J. (2015). A Novel Approach for Network Traffic Summarization. In: Jung, J., Badica, C., Kiss, A. (eds) Scalable Information Systems. INFOSCALE 2014. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 139. Springer, Cham. https://doi.org/10.1007/978-3-319-16868-5_5
DOI: https://doi.org/10.1007/978-3-319-16868-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16867-8
Online ISBN: 978-3-319-16868-5
eBook Packages: Computer Science (R0)