Malware Analysis and Classification: A Survey

Abstract

Malicious software, commonly called malware, is one of the most serious threats on the Internet today. Much of the malware designed by attackers is polymorphic or metamorphic, meaning that it can change its code as it propagates. Moreover, the diversity and volume of malware variants severely undermine the effectiveness of traditional defenses, which typically rely on signature-based techniques and cannot detect previously unknown malicious executables. Variants of a malware family share typical behavioral patterns that reflect their origin and purpose. These patterns, obtained through either static or dynamic analysis, can be exploited to detect unknown malware and classify it into known families using machine learning techniques. This survey provides an overview of techniques for analyzing and classifying malware.
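
As a rough illustration of the classification idea summarized above, the following minimal Python sketch assigns an unknown sample to the family of its most similar known sample. It is not a method taken from the survey: the choice of byte n-gram counts as a static feature, cosine similarity as the distance, 1-nearest-neighbor as the learner, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def ngram_profile(data, n=2):
    """Count overlapping byte n-grams (a simple, assumed static feature)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(sample, labeled, n=2):
    """Return the family label of the most similar known sample (1-NN).

    `labeled` is a list of (family_label, raw_bytes) pairs.
    """
    profile = ngram_profile(sample, n)
    best_label, _ = max(
        labeled,
        key=lambda item: cosine(profile, ngram_profile(item[1], n)),
    )
    return best_label


if __name__ == "__main__":
    # Hypothetical byte strings standing in for known family samples.
    known = [
        ("family_a", b"\x55\x8b\xec\x83\xec\x10\x55\x8b\xec"),
        ("family_b", b"\x90\x90\xeb\xfe\x90\x90\xeb\xfe"),
    ]
    # A slightly modified variant of the first sample.
    variant = b"\x55\x8b\xec\x83\xec\x20\x55\x8b\xec\x90"
    print(classify(variant, known))
```

A signature that matches the first known sample byte for byte would miss the modified variant, whereas the similarity-based comparison still places it closest to the family it was derived from. A practical system would use richer static or dynamic features and a properly trained and evaluated classifier.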

Share and Cite:

Gandotra, E., Bansal, D. and Sofat, S. (2014) Malware Analysis and Classification: A Survey. Journal of Information Security, 5, 56-64. doi: 10.4236/jis.2014.52006.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.