ABSTRACT
Deep packet inspection (DPI) systems process wire-format network data from untrusted sources, collecting semantic information from a variety of protocols and file formats as they work their way up through the network stack. However, implementing the corresponding dissectors for the potpourri of formats that today's networks carry remains time-consuming and cumbersome, and also poses fundamental security challenges.
We introduce a novel framework, Spicy, for dissecting wire-format data that consists of (i) a format specification language that tightly integrates syntax and semantics; (ii) a compiler toolchain that generates efficient and robust native dissector code from these specifications just-in-time; and (iii) an extensive API for DPI applications to drive the process and leverage results. Spicy can also reverse the process, assembling wire-format data from the high-level specifications. We pursue a number of case studies that showcase dissectors for network protocols and file formats, both individually and chained into a dynamic stack that processes raw packets up to application-layer content. We also demonstrate a number of example host applications, from a generic driver program to integration into Wireshark and Bro. Overall, this work provides a new capability for developing powerful, robust, and reusable dissectors for DPI applications. We publish Spicy as open source under a BSD license.