DOI: 10.1145/3079856.3080207

Parallel Automata Processor

Published: 24 June 2017

ABSTRACT

Finite State Machines (FSMs) are widely used computation models in many application domains. These embarrassingly sequential applications, with their irregular memory access patterns, perform poorly on conventional von Neumann architectures. The Micron Automata Processor (AP) is an in-situ memory-based computational architecture that accelerates non-deterministic finite automata (NFA) processing in hardware. However, each FSM on the AP is processed sequentially, limiting potential speedups.
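To make the sequential bottleneck concrete, the following is a minimal software sketch of NFA execution, not the AP's hardware mechanism. It assumes a hypothetical transition table keyed by `(state, symbol)`; the names `run_nfa` and `CONTAINS_AB` are illustrative only. All active states advance together on each symbol, but the symbols themselves must be consumed one after another.

```python
def run_nfa(transitions, start_states, text):
    """Consume `text` one symbol at a time and return the final active set."""
    active = set(start_states)
    for symbol in text:
        # Every active state takes all of its matching transitions in the same step.
        active = {
            dst
            for src in active
            for dst in transitions.get((src, symbol), ())
        }
    return active


# Hypothetical NFA over {a, b}: state 2 becomes active once "ab" has been seen.
CONTAINS_AB = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "a"): {2},
    (2, "b"): {2},
}

final_states = run_nfa(CONTAINS_AB, {0}, "aab")
print(2 in final_states)
```

The loop over `text` carries a dependency from each symbol to the next, which is the part that parallelization has to break.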

In this paper, we explore the FSM parallelization problem in the context of the AP. Extending classical parallelization techniques to NFAs executing on the AP is non-trivial because of high state-transition tracking overheads and exponential computation complexity. We present the associated challenges and propose solutions that leverage both the unique properties of the NFAs (connected components, input symbol ranges, convergence, common parent states) and unique features in the AP (support for simultaneous transitions, low-overhead flow switching, state vector cache) to realize parallel NFA execution on the AP.
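As a rough illustration of the classical enumerative approach and of how an input symbol range can shrink it, here is a minimal software sketch. It is an assumption-laden toy, not the scheme proposed in the paper: the input is split into segments, each later segment is run from every candidate start state, and the per-segment results are composed afterwards. The helper names (`symbol_range`, `parallel_run`) and the example NFA are hypothetical, and "range" is interpreted here as the set of states that have an incoming transition on a given symbol.

```python
def step(transitions, active, symbol):
    """Advance every active state on one input symbol."""
    return {dst for src in active for dst in transitions.get((src, symbol), ())}


def run(transitions, active, text):
    """Sequential execution from a given active set."""
    for symbol in text:
        active = step(transitions, active, symbol)
    return active


def symbol_range(transitions, symbol):
    """States with an incoming transition on `symbol` (assumed meaning of 'range')."""
    return {
        dst
        for (_, sym), dsts in transitions.items()
        if sym == symbol
        for dst in dsts
    }


def parallel_run(transitions, start_states, text, num_segments):
    """Enumerative execution; num_segments is assumed to be at least 1."""
    size = max(1, -(-len(text) // num_segments))  # ceiling division
    segments = [text[i:i + size] for i in range(0, len(text), size)]
    if not segments:
        return set(start_states)

    # The first segment starts from the known start states.
    active = run(transitions, set(start_states), segments[0])

    # Each later segment is independent of the others, so this loop could run
    # concurrently. Only states in the range of the previous segment's last
    # symbol can be active at the boundary, so only those are enumerated.
    maps = []
    for prev, seg in zip(segments, segments[1:]):
        candidates = symbol_range(transitions, prev[-1])
        maps.append({s: run(transitions, {s}, seg) for s in candidates})

    # Composition: follow the true active set through each segment's map.
    for mapping in maps:
        active = set().union(*(mapping[s] for s in active))
    return active


# Hypothetical NFA over {a, b}: state 2 becomes active once "ab" has been seen.
CONTAINS_AB = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "a"): {2},
    (2, "b"): {2},
}

print(parallel_run(CONTAINS_AB, {0}, "aabba", 2) == run(CONTAINS_AB, {0}, "aabba"))
```

Without the range restriction, every state of the NFA would have to be enumerated at each segment boundary, which is one source of the tracking overhead mentioned above.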

We evaluate our techniques on several important benchmarks, including NFAs used for network intrusion detection, malware detection, text processing, protein motif searching, DNA sequencing, and data analytics. Our proposed parallelization scheme demonstrates significant speedup (25.5x on average) compared to sequential execution on the AP. Prior work has already shown that sequential execution on the AP is at least an order of magnitude better than GPUs, multi-core processors, and the Xeon Phi accelerator.

Published in

ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture
June 2017
736 pages
ISBN: 9781450348928
DOI: 10.1145/3079856

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

ISCA '17 Paper Acceptance Rate: 54 of 322 submissions, 17%
Overall Acceptance Rate: 543 of 3,203 submissions, 17%
