ABSTRACT
Given a stream of entries in a multi-aspect data setting, i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account the additional attributes of each entry. Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MStream, which can detect unusual group anomalies as they occur, in a dynamic manner. MStream has the following properties: (a) it detects anomalies in multi-aspect data including both categorical and numeric attributes; (b) it is online, processing each record in constant time and constant memory; (c) it can capture the correlation between multiple aspects of the data. MStream is evaluated on the KDDCUP99, CICIDS-DoS, UNSW-NB15, and CICIDS-DDoS datasets, and outperforms state-of-the-art baselines.
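To make property (b) concrete, the following is a minimal sketch of one way a detector can score each record in constant time and constant memory. It is not the MStream algorithm, which the abstract does not specify. The class name `HashedStreamScorer`, the parameters `num_buckets` and `bin_width`, the binning of numeric attributes, and the chi-squared-style score are all assumptions made for illustration.

```python
import zlib


class HashedStreamScorer:
    """Hypothetical constant-memory scorer (illustrative, not MStream itself)."""

    def __init__(self, num_buckets=1024, bin_width=1.0):
        self.num_buckets = num_buckets
        self.bin_width = bin_width            # assumed width for binning floats
        self.current = [0.0] * num_buckets    # counts within the current tick
        self.total = [0.0] * num_buckets      # counts over all ticks so far
        self.tick = 1

    def _bucket(self, record):
        # Discretize float attributes, keep categorical ones, and hash the
        # whole record so that co-occurring attribute values share a bucket.
        key = tuple(
            int(v // self.bin_width) if isinstance(v, float) else v
            for v in record
        )
        return zlib.crc32(repr(key).encode("utf-8")) % self.num_buckets

    def score(self, record, tick):
        """Return an anomaly score; ticks are assumed to start at 1 and never decrease."""
        if tick > self.tick:
            # Resetting costs O(num_buckets), independent of stream length.
            self.current = [0.0] * self.num_buckets
            self.tick = tick
        b = self._bucket(record)
        self.current[b] += 1
        self.total[b] += 1
        a, s, t = self.current[b], self.total[b], self.tick
        if t == 1:
            return 0.0  # no history to compare against yet
        # Chi-squared-style deviation of the current count from the mean per tick.
        return (a - s / t) ** 2 * t * t / (s * (t - 1))


scorer = HashedStreamScorer(num_buckets=64)
for tick in range(1, 6):
    steady = scorer.score(("tcp", 80, 0.5), tick)    # one similar record per tick
for _ in range(20):
    burst = scorer.score(("udp", 53, 9.9), 6)        # sudden group of records
```

In this sketch a record that arrives at a steady rate keeps a low score, while a sudden group of similar records in one tick drives the score up. Memory stays fixed at two arrays of `num_buckets` counters regardless of how long the stream runs.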