DOI: 10.1145/3442381.3450023
research-article
Open Access

MStream: Fast Anomaly Detection in Multi-Aspect Streams

Published: 03 June 2021

ABSTRACT

Given a stream of entries in a multi-aspect data setting, i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account additional attributes of each entry. Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MStream, which can detect unusual group anomalies as they occur, in a dynamic manner. MStream has the following properties: (a) it detects anomalies in multi-aspect data including both categorical and numeric attributes; (b) it is online, thus processing each record in constant time and constant memory; (c) it can capture the correlation between multiple aspects of the data. MStream is evaluated over the KDDCUP99, CICIDS-DoS, UNSW-NB15 and CICIDS-DDoS datasets, and outperforms state-of-the-art baselines.
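
To make properties (a) to (c) concrete, the following is a minimal sketch of an online scorer that keeps a fixed-size table of hashed counts, so memory does not grow with the stream. It is not the algorithm from the paper. The class name `StreamScorer`, the bin width `numeric_width`, the table size, and the chi-squared-style score are all assumptions chosen for illustration.

```python
import zlib


class StreamScorer:
    """Illustrative only: fixed-size hashed counts with a chi-squared-style score."""

    def __init__(self, num_buckets=4096, numeric_width=10.0):
        self.num_buckets = num_buckets
        self.numeric_width = numeric_width   # hypothetical bin width for numeric attributes
        self.current = [0] * num_buckets     # counts within the current time tick
        self.total = [0] * num_buckets       # counts over all ticks seen so far
        self.tick = None                     # most recent time tick
        self.num_ticks = 0                   # number of distinct ticks seen

    def _bucket(self, record):
        # Coarsen numeric attributes into bins; keep categorical ones unchanged.
        key = tuple(
            int(v // self.numeric_width) if isinstance(v, (int, float)) else v
            for v in record
        )
        # Hash the record jointly, so co-occurring attribute values share one counter.
        return zlib.crc32(repr(key).encode("utf-8")) % self.num_buckets

    def update(self, tick, record):
        """Count one record and return its score; cost does not depend on stream length."""
        if tick != self.tick:
            self.current = [0] * self.num_buckets  # reset per-tick counts (fixed cost)
            self.tick = tick
            self.num_ticks += 1
        b = self._bucket(record)
        self.current[b] += 1
        self.total[b] += 1
        a, s, t = self.current[b], self.total[b], self.num_ticks
        if t == 1:
            return 0.0  # no history yet to compare against
        mean = s / t    # average count per tick for this bucket
        return (a - mean) ** 2 * t * t / (s * (t - 1))


scorer = StreamScorer()
# Five quiet ticks with one ordinary record each (protocol, port, bytes).
steady = [scorer.update(t, ("tcp", 80, 512.0)) for t in range(1, 6)]
# A sudden burst of 50 similar records within a single tick.
burst = [scorer.update(6, ("udp", 53, 40.0)) for _ in range(50)]
```

Records that arrive at a steady rate keep a score of zero in this sketch, while the score of the burst grows as more near-identical records land in the same tick. Hash collisions can inflate or deflate counts; a practical detector would likely use several hash functions and also score individual attributes.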

Published in

    WWW '21: Proceedings of the Web Conference 2021
    April 2021
    4054 pages
    ISBN: 9781450383127
    DOI: 10.1145/3442381

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%