ABSTRACT
We propose a novel approach to inferring protocol state machines in a realistic high-latency network setting, and apply it to the analysis of botnet Command and Control (C&C) protocols. Our techniques reduce by an order of magnitude the number of queries and the time needed to learn a botnet C&C protocol compared to classic algorithms (from days to hours for inferring the MegaD C&C protocol). We also show that the computed protocol state machines enable formal analysis for botnet defense, including finding the weakest links in a protocol, uncovering protocol design flaws, inferring the existence of unobservable communication back-channels among botnet servers, and finding deviations among protocol implementations that can be used for fingerprinting. We validate our technique by inferring the protocol state machine from Postfix's SMTP implementation and comparing it to the SMTP standard. Further, our experimental results offer new insights into MegaD's C&C, showing that our technique can be used as a powerful tool for defense against botnets.
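To make the idea of comparing an inferred state machine against a reference model concrete, the following is a minimal sketch, not the method used in this work. It represents a protocol model as a Mealy machine (each input message yields an output message and a next state) and searches the product of two machines breadth-first for a shortest input sequence on which their outputs differ. Such a sequence could serve as a fingerprint. The two SMTP-like machines, their state names, and the function names `run` and `distinguishing_sequence` are hypothetical and heavily simplified. Both machines are assumed to define every input in every state.

```python
from collections import deque

# Mealy machine: {state: {input: (output, next_state)}}.
# Toy SMTP-like reference model (hypothetical, not taken from RFC 5321).
STANDARD = {
    "init": {"HELO": ("250", "ready"), "MAIL": ("503", "init"), "DATA": ("503", "init")},
    "ready": {"HELO": ("250", "ready"), "MAIL": ("250", "mail"), "DATA": ("503", "ready")},
    "mail": {"HELO": ("250", "ready"), "MAIL": ("503", "mail"), "DATA": ("354", "init")},
}

# Hypothetical implementation that deviates by accepting a repeated MAIL.
IMPLEMENTATION = {state: dict(trans) for state, trans in STANDARD.items()}
IMPLEMENTATION["mail"]["MAIL"] = ("250", "mail")


def run(machine, start, inputs):
    """Feed a sequence of inputs to a machine and collect its outputs."""
    state, outputs = start, []
    for symbol in inputs:
        output, state = machine[state][symbol]
        outputs.append(output)
    return outputs


def distinguishing_sequence(a, b, start_a, start_b):
    """Return a shortest input sequence on which a and b disagree, or None."""
    queue = deque([(start_a, start_b, [])])
    seen = {(start_a, start_b)}
    while queue:
        state_a, state_b, path = queue.popleft()
        for symbol in sorted(a[state_a]):
            out_a, next_a = a[state_a][symbol]
            out_b, next_b = b[state_b][symbol]
            if out_a != out_b:
                return path + [symbol]
            if (next_a, next_b) not in seen:
                seen.add((next_a, next_b))
                queue.append((next_a, next_b, path + [symbol]))
    return None  # no reachable state pair disagrees: machines are equivalent


fingerprint = distinguishing_sequence(STANDARD, IMPLEMENTATION, "init", "init")
print(fingerprint)
print(run(STANDARD, "init", fingerprint), run(IMPLEMENTATION, "init", fingerprint))
```

Because the search is breadth-first over pairs of states, the returned sequence is as short as possible, which matters when every probe sent to a live server is costly.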
Index Terms
- Inference and analysis of formal models of botnet command and control protocols