DOI: 10.1145/2528228.2528243
research-article

P2P traffic classification using ensemble learning

Published: 17 October 2013

ABSTRACT

Early classification schemes for Peer-to-Peer (P2P) overlay network traffic relied on port-based and payload-based inspection. In recent years, researchers have turned to machine learning approaches instead. This paper presents an ensemble learning approach, which combines multiple models to improve prediction accuracy over a single classifier or semi-supervised learning techniques. We first extract statistical characteristics of TCP and UDP flows from network traces to construct a feature set. We then apply feature selection techniques to reduce the number of features required to train the model, thereby reducing the build time. We use the Stacking and Voting ensemble learning techniques to improve prediction accuracy, with base classifiers built using three Machine Learning (ML) algorithms: Naïve Bayes, Bayesian Network, and Decision Trees. Meta classifiers further improve classification accuracy to 99.9%. Our experimental results show that Stacking performs better than Voting in identifying P2P traffic.
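
The difference between Voting and Stacking can be illustrated with a minimal sketch. It is not the paper's implementation: the flow features (mean packet size, duration, distinct peers), the thresholds, the toy data, and the helper names `stump`, `vote`, and `fit_stacker` are all hypothetical. Simple threshold rules stand in for the Naïve Bayes, Bayesian Network, and Decision Tree base classifiers, and a lookup table stands in for the meta classifier.

```python
from collections import Counter

def stump(feature, threshold):
    """Hypothetical base classifier: label a flow 'p2p' when one feature exceeds a threshold."""
    return lambda flow: "p2p" if flow[feature] > threshold else "other"

def vote(base_classifiers, flow):
    """Voting: every base classifier casts one vote and the majority label wins."""
    votes = Counter(clf(flow) for clf in base_classifiers)
    return votes.most_common(1)[0][0]

def fit_stacker(base_classifiers, flows, labels):
    """Stacking: learn a meta-level mapping from base predictions to the true label."""
    counts = {}
    for flow, label in zip(flows, labels):
        key = tuple(clf(flow) for clf in base_classifiers)
        counts.setdefault(key, Counter())[label] += 1
    # For each combination of base predictions, keep the most frequent true label.
    meta = {key: c.most_common(1)[0][0] for key, c in counts.items()}

    def predict(flow):
        key = tuple(clf(flow) for clf in base_classifiers)
        # Combinations never seen in training fall back to majority voting.
        return meta.get(key, vote(base_classifiers, flow))

    return predict

# Assumed toy flows: (mean packet size in bytes, duration in seconds, distinct peers).
train_flows = [(900, 120, 3), (200, 10, 50), (800, 90, 40), (100, 5, 2)]
train_labels = ["other", "p2p", "p2p", "other"]

bases = [stump(0, 500), stump(1, 60), stump(2, 10)]
stacked = fit_stacker(bases, train_flows, train_labels)

short_chatty_flow = (200, 10, 50)  # small packets, short-lived, many peers
voted_label = vote(bases, short_chatty_flow)
stacked_label = stacked(short_chatty_flow)
print(voted_label, stacked_label)
```

In this toy data the peer-count rule is the reliable one, so the meta level learns to trust it even when the other two rules outvote it. Because the base rules here are fixed rather than trained, the meta level can be fitted on the same flows; with trained base classifiers, stacking would normally fit the meta classifier on held-out predictions.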

      • Published in

        I-CARE '13: Proceedings of the 5th IBM Collaborative Academia Research Exchange Workshop
        October 2013
        68 pages
        ISBN:9781450323208
        DOI:10.1145/2528228

        Copyright © 2013 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        I-CARE '13 paper acceptance rate: 16 of 66 submissions, 24%. Overall acceptance rate: 16 of 66 submissions, 24%.
