research-article
Artifacts Available / v1.1

TranAD: deep transformer networks for anomaly detection in multivariate time series data

Published: 01 February 2022

Abstract

Efficient anomaly detection and diagnosis in multivariate time-series data is of great importance for modern industrial applications. However, building a system that can quickly and accurately pinpoint anomalous observations is a challenging problem. This is due to the lack of anomaly labels, high data volatility, and the demand for ultra-low inference times in modern applications. Despite recent developments in deep learning approaches for anomaly detection, only a few of them address all of these challenges. In this paper, we propose TranAD, a deep transformer network-based anomaly detection and diagnosis model that uses attention-based sequence encoders to swiftly perform inference with knowledge of the broader temporal trends in the data. TranAD uses focus score-based self-conditioning to enable robust multi-modal feature extraction and adversarial training to gain stability. Additionally, model-agnostic meta learning (MAML) allows us to train the model using limited data. Extensive empirical studies on six publicly available datasets demonstrate that TranAD can outperform state-of-the-art baseline methods in detection and diagnosis performance with data- and time-efficient training. Specifically, TranAD increases F1 scores by up to 17% and reduces training times by up to 99% compared to the baselines.
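The abstract names focus score-based self-conditioning without spelling out the mechanics. The sketch below is one plausible reading, not the authors' implementation: a first reconstruction pass runs with a zero focus score, its squared error becomes the focus score, and a second pass is conditioned on that score. The names `two_pass_scores` and `toy_reconstruct` are hypothetical, the equal-weight average of the two errors is an assumption, and the toy model (a focus-weighted mean) merely stands in for a trained transformer.

```python
from typing import Callable, List

Window = List[List[float]]  # rows are time steps, columns are features


def squared_error(a: Window, b: Window) -> Window:
    """Element-wise squared difference between two equally shaped windows."""
    return [[(x - y) ** 2 for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def two_pass_scores(window: Window,
                    reconstruct: Callable[[Window, Window], Window]) -> Window:
    """Score each value using two reconstruction passes.

    `reconstruct(window, focus)` is a hypothetical model interface that
    returns a reconstruction with the same shape as `window`.
    """
    zero_focus = [[0.0] * len(row) for row in window]
    first = reconstruct(window, zero_focus)   # pass 1: no conditioning
    focus = squared_error(first, window)      # where pass 1 deviated most
    second = reconstruct(window, focus)       # pass 2: conditioned on focus
    err2 = squared_error(second, window)
    # Assumption: combine both passes with equal weights.
    return [[0.5 * e1 + 0.5 * e2 for e1, e2 in zip(r1, r2)]
            for r1, r2 in zip(focus, err2)]


def toy_reconstruct(window: Window, focus: Window) -> Window:
    """Toy stand-in for a trained model: a per-feature weighted mean that
    down-weights time steps with a high focus score."""
    means = []
    for j in range(len(window[0])):
        weights = [1.0 / (1.0 + focus[t][j]) for t in range(len(window))]
        total = sum(w * window[t][j] for t, w in enumerate(weights))
        means.append(total / sum(weights))
    return [list(means) for _ in window]


# One feature, four time steps; the last value is an obvious outlier.
window = [[1.0], [1.0], [1.0], [5.0]]
scores = two_pass_scores(window, toy_reconstruct)
print(scores)
```

In this toy setting the second pass discounts the outlier when estimating the normal level, so the outlier's score grows relative to a single pass while the scores of the normal points shrink.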

