ABSTRACT
One of the most challenging problems in the field of intrusion detection is anomaly detection for discrete event logs. While most earlier work focused on applying unsupervised learning to engineered features, more recent work has started to address this challenge by applying deep learning to abstractions of discrete event entries. Inspired by natural language processing, LSTM-based anomaly detection models have been proposed. They try to predict upcoming events and raise an anomaly alert when a prediction fails to meet a certain criterion. However, such a predict-next-event methodology has a fundamental limitation: event predictions may not be able to fully exploit the distinctive characteristics of sequences. This limitation leads to a high number of false positives (FPs). It is also critical to examine the structure of sequences and the bi-directional causality among individual events. To this end, we propose a new methodology: recomposing event sequences for anomaly detection. We propose DabLog, an LSTM-based Deep Autoencoder-Based anomaly detection method for discrete event Logs. The fundamental difference from predict-next-event models is that, rather than predicting upcoming events, our approach determines whether a sequence is normal or abnormal by analyzing (encoding) and reconstructing (decoding) the given sequence. Our evaluation results show that our new methodology can significantly reduce the number of FPs, hence achieving a higher F1 score.
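As a rough illustration of the two decision rules contrasted above, the following Python sketch replaces the LSTM models with simple count-based stand-ins. Everything in it is an assumption made for illustration: the toy event keys, the function names, the top-k criterion, and the neighbour-based "reconstruction" are hypothetical and are not DabLog's architecture or its actual detection criterion. The sketch only shows the structural difference: one rule checks each event against what was expected from the preceding event, while the other rebuilds each event from context on both sides.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each normal sequence is a list of discrete event keys.
NORMAL = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
]


def top_k(counter, k):
    """Return the k most frequent keys of a Counter as a set."""
    return {event for event, _ in counter.most_common(k)}


def fit_next_event(sequences):
    """Count which event follows each event (stand-in for a next-event predictor)."""
    followers = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            followers[prev][nxt] += 1
    return followers


def fit_reconstructor(sequences):
    """Count which event sits between each (left, right) pair of neighbours
    (stand-in for reconstructing an event from context on both sides)."""
    middles = defaultdict(Counter)
    for seq in sequences:
        padded = ["<s>"] + seq + ["</s>"]  # boundary markers (assumed)
        for left, mid, right in zip(padded, padded[1:], padded[2:]):
            middles[(left, right)][mid] += 1
    return middles


def flag_by_prediction(followers, seq, k=2):
    """Predict-next rule: alert if any event is outside the top-k followers
    expected after the event that precedes it."""
    return any(
        nxt not in top_k(followers.get(prev, Counter()), k)
        for prev, nxt in zip(seq, seq[1:])
    )


def flag_by_reconstruction(middles, seq, k=2):
    """Reconstruction rule: alert if any event is outside the top-k candidates
    rebuilt from its left and right neighbours."""
    padded = ["<s>"] + seq + ["</s>"]
    return any(
        mid not in top_k(middles.get((left, right), Counter()), k)
        for left, mid, right in zip(padded, padded[1:], padded[2:])
    )


followers = fit_next_event(NORMAL)
middles = fit_reconstructor(NORMAL)

seen = ["open", "read", "close"]      # appears in the normal data
shuffled = ["open", "close", "read"]  # order never seen in the normal data

print(flag_by_prediction(followers, seen), flag_by_reconstruction(middles, seen))
print(flag_by_prediction(followers, shuffled), flag_by_reconstruction(middles, shuffled))
```

On this toy data both rules accept the seen sequence and flag the shuffled one, so the sketch says nothing about false-positive rates; it is only meant to make the difference between the two checks concrete.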
Index Terms
- Recompose Event Sequences vs. Predict Next Events: A Novel Anomaly Detection Approach for Discrete Event Logs