Abstract
Spreading malware through end users, then escalating privileges and stealing organizational data, is one of the attack techniques most widely used by Advanced Persistent Threat (APT) attackers today. Detecting APT malware on the workstation and raising timely warnings is therefore important: if detection succeeds at this stage, it can stop the entire APT attack campaign against the system. To this end, this study proposes a method for detecting APT malware on the workstation by analyzing malware behavior profiles with a deep learning graph network. The proposed method consists of two main tasks: (i) building behavior profiles of malware: behavior profiles are built by gathering and evaluating Event IDs from the kernel of the workstation. The result is the set of processes performed by executable files, together with a label for each process. The label value is normal, malicious, suspicious, or unknown; (ii) detecting malware by analyzing behavior profiles with a graph network: the behavior profiles built in task (i) are evaluated and analyzed using the Graph Isomorphism Network (GIN), a deep learning graph network method. The resulting classification is used to conclude which behavior profiles were generated by APT malware and which are normal. Detecting APT malware on the workstation by analyzing behavior profiles with a graph network is a novel method; according to our survey, it has not previously been proposed or applied in any research. The experimental results in Section 4.3 of the paper show the efficiency of the proposed method. These results give the proposal both scientific and practical significance.
Using graph networks to analyze and evaluate behavior profiles improves the efficiency of analyzing and detecting APT malware on the workstation.
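To make the two tasks more concrete, the following is a minimal sketch of how a labeled behavior profile might be passed through a single GIN-style layer. It is not the authors' implementation. The four labels come from the abstract; the toy process graph, the undirected treatment of edges, the identity weights, and the helper names (`one_hot`, `gin_aggregate`, `linear_relu`, `sum_readout`) are assumptions made only for illustration. The aggregation follows the standard GIN update, (1 + eps) * h_v plus the sum of neighbor features, followed by a small network and a sum readout.

```python
# Hypothetical sketch: one GIN-style layer over a toy behavior profile.
# The graph, weights, and helper names are illustrative assumptions,
# not taken from the paper.

LABELS = ["normal", "malicious", "suspicious", "unknown"]


def one_hot(label):
    """Encode a process label as a 4-dimensional feature vector."""
    return [1.0 if label == name else 0.0 for name in LABELS]


def gin_aggregate(features, edges, eps=0.0):
    """GIN aggregation: (1 + eps) * h_v + sum of neighbor features."""
    out = [[(1.0 + eps) * x for x in h] for h in features]
    for u, v in edges:  # edges are treated as undirected (assumption)
        for k in range(len(features[u])):
            out[u][k] += features[v][k]
            out[v][k] += features[u][k]
    return out


def linear_relu(vectors, weights, bias):
    """Single dense layer with ReLU, standing in for the GIN MLP."""
    result = []
    for h in vectors:
        row = []
        for j in range(len(bias)):
            z = sum(h[i] * weights[i][j] for i in range(len(h))) + bias[j]
            row.append(max(0.0, z))
        result.append(row)
    return result


def sum_readout(vectors):
    """Sum node embeddings into one graph-level embedding."""
    return [sum(column) for column in zip(*vectors)]


# Toy profile: process 1 (unknown) is linked to processes 0, 2, and 3.
process_labels = ["normal", "unknown", "suspicious", "malicious"]
edges = [(0, 1), (1, 2), (1, 3)]

# Identity weights keep the arithmetic easy to follow; a trained model
# would learn these values.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
zero_bias = [0.0, 0.0, 0.0, 0.0]

node_features = [one_hot(label) for label in process_labels]
hidden = linear_relu(gin_aggregate(node_features, edges), identity, zero_bias)
embedding = sum_readout(hidden)
print(embedding)
```

In a full pipeline, the graph-level embedding would feed a classifier that separates profiles generated by APT malware from normal ones; that step is omitted here because the abstract does not specify it.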
Cite this article
Do Xuan, C., Huong, D. A new approach for APT malware detection based on deep graph network for endpoint systems. Appl Intell 52, 14005–14024 (2022). https://doi.org/10.1007/s10489-021-03138-z
DOI: https://doi.org/10.1007/s10489-021-03138-z