MStream: Fast Anomaly Detection in Multi-Aspect Streams

Authors:
Siddharth Bhatia (National University of Singapore, Singapore)
Arjit Jain (Indian Institute of Technology, Bombay, India)
Pan Li (Purdue University, USA)
Ritesh Kumar (Indian Institute of Technology, Kharagpur, India)
Bryan Hooi (National University of Singapore, Singapore)

WWW '21: Proceedings of the Web Conference 2021, April 2021, Pages 3371–3382
https://doi.org/10.1145/3442381.3450023

Published: 03 June 2021

ABSTRACT

Given a stream of entries in a multi-aspect data setting, i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account additional attributes of each entry. Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MStream, which can detect unusual group anomalies as they occur, in a dynamic manner. MStream has the following properties: (a) it detects anomalies in multi-aspect data including both categorical and numeric attributes; (b) it is online, processing each record in constant time and constant memory; (c) it can capture the correlation between multiple aspects of the data. MStream is evaluated on the KDDCUP99, CICIDS-DoS, UNSW-NB15 and CICIDS-DDoS datasets, and outperforms state-of-the-art baselines.
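To make property (b) concrete, the following is a minimal sketch of how a streaming scorer can process each record in constant time and constant memory. It is not the MStream algorithm, which the abstract does not specify. The class name `HashedCountScorer`, the `num_buckets` parameter, the use of a CRC32 hash, and the chi-squared-style score are all assumptions made for illustration. Records are treated as tuples of categorical values; numeric attributes would first need to be discretized by the caller.

```python
import zlib
from typing import Hashable, List, Sequence


class HashedCountScorer:
    """Toy streaming scorer (hypothetical, not MStream itself).

    Keeps two fixed-size count tables, so memory does not grow with the
    length of the stream, and each update touches a single bucket.
    """

    def __init__(self, num_buckets: int = 1024) -> None:
        self.num_buckets = num_buckets
        self.current: List[int] = [0] * num_buckets  # counts in the current time tick
        self.total: List[int] = [0] * num_buckets    # counts over all ticks so far
        self.tick = 1

    def _bucket(self, record: Sequence[Hashable]) -> int:
        # CRC32 of the record's repr: a simple hash that is stable across runs.
        return zlib.crc32(repr(tuple(record)).encode("utf-8")) % self.num_buckets

    def update(self, record: Sequence[Hashable], tick: int) -> float:
        """Add one record observed at time `tick` and return its anomaly score."""
        if tick > self.tick:
            # New time tick: reset the per-tick counts. The cost depends on
            # num_buckets only, not on how many records have been seen.
            self.current = [0] * self.num_buckets
            self.tick = tick
        b = self._bucket(record)
        self.current[b] += 1
        self.total[b] += 1
        if self.tick == 1:
            return 0.0  # no history yet to compare against
        mean = self.total[b] / self.tick  # average count per tick for this bucket
        # Chi-squared-style deviation of the current-tick count from the mean.
        return (self.current[b] - mean) ** 2 / mean
```

Under these assumptions, a record that arrives once per tick keeps a score of zero, while a sudden burst of identical records within one tick drives the score up, which is the kind of group anomaly the abstract refers to. Distinct records that hash to the same bucket share counts, so the score is approximate.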

Published in
WWW '21: Proceedings of the Web Conference 2021
April 2021
4054 pages
ISBN: 9781450383127
DOI: 10.1145/3442381
Editors: Jure Leskovec (Stanford), Marko Grobelnik (Jožef Stefan Institute), Marc Najork (Google), Jie Tang (Tsinghua University), Leila Zia (Wikimedia Foundation)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Anomaly Detection
Intrusion Detection
Multi-Aspect Data
Stream
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%