DOI: 10.1145/3400302.3415781
ICCAD '20 Conference Proceedings · research-article

SODA: a new synthesis infrastructure for agile hardware design of machine learning accelerators

Published: 17 December 2020

ABSTRACT

Next-generation systems, such as edge devices, will have to provide efficient processing of machine learning (ML) algorithms while meeting several metrics, including energy, performance, area, and latency. However, the quickly evolving field of ML makes it extremely difficult to generate accelerators able to support a wide variety of algorithms. At the same time, designing accelerators by hand in hardware description languages (HDLs) is laborious and time-consuming, and does not allow quick exploration of the design space. This paper discusses the SODA synthesizer, an automated, open-source compiler from high-level ML frameworks to Verilog, built on the LLVM infrastructure and targeting ML Application-Specific Integrated Circuit (ASIC) chiplets. The SODA synthesizer will allow implementing optimal designs by combining templated, fully tunable IPs and macros with fully custom components generated through high-level synthesis. All these components will be provided through an extensible resource library, characterized using commercial and open-source logic design flows. Through a closed-loop design space exploration engine, developers will be able to quickly explore their hardware designs along different dimensions.
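The closed-loop design space exploration engine is only described at a high level in the abstract. The following Python sketch illustrates the general shape of such a loop with a toy, made-up cost model: sweep a few hardware knobs, estimate each configuration's cost, and keep the Pareto-optimal set. The `estimate` formulas, knob names, and constants are assumptions for illustration only, not SODA's actual estimator.

```python
from itertools import product

# Toy closed-loop DSE sketch. The cost model below is purely
# illustrative (an assumption for this example), not SODA's estimator:
# unrolling a 64-iteration loop reduces latency but multiplies area,
# and pipelining trades extra area for lower latency.
def estimate(unroll, pipeline, trip_count=64, base_area=100):
    latency = trip_count / unroll + (0 if pipeline else 10)
    area = base_area * unroll * (1.5 if pipeline else 1.0)
    return area, latency

def pareto_frontier(points):
    # Keep configurations not dominated on both area and latency.
    return [(cfg, (area, lat)) for cfg, (area, lat) in points
            if not any(a <= area and l <= lat and (a, l) != (area, lat)
                       for _, (a, l) in points)]

# Exhaustively sweep two hardware knobs and report the Pareto set.
designs = [((u, p), estimate(u, p))
           for u, p in product([1, 2, 4, 8], [False, True])]
frontier = pareto_frontier(designs)
print(frontier)
```

With this toy model, unroll factor 8 without pipelining is dominated (unroll 4 with pipelining has both lower area and lower latency), so it drops off the frontier; the real engine would replace `estimate` with feedback from synthesis and characterization runs.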


Published in

ICCAD '20: Proceedings of the 39th International Conference on Computer-Aided Design
November 2020, 1396 pages
ISBN: 9781450380263
DOI: 10.1145/3400302
General Chair: Yuan Xie

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States

            Acceptance Rates

Overall acceptance rate: 457 of 1,762 submissions (26%)

