DOI: 10.1145/2967938.2967969
research-article

A DSL Compiler for Accelerating Image Processing Pipelines on FPGAs

Published: 11 September 2016

ABSTRACT

This paper describes an automatic approach to accelerating image processing pipelines using FPGAs. An image processing pipeline can be viewed as a graph of interconnected stages that processes images successively. Each stage typically performs a point-wise, stencil, or other more complex operation on image pixels. Recent efforts have led to the development of domain-specific languages (DSLs) and optimization frameworks for image processing pipelines. In this paper, we develop an approach to map image processing pipelines expressed in the PolyMage DSL to efficient parallel FPGA designs. Our approach maximally exploits reuse and the available memory bandwidth (or chip resources). When compared to Darkroom, a state-of-the-art approach for compiling a high-level DSL to FPGAs, our approach (a) leads to designs that deliver significantly higher throughput, and (b) supports a greater variety of filters. Furthermore, for some of the benchmarks, the designs we generate improve even on pre-optimized FPGA implementations provided by vendor libraries.
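To make the terms "point-wise" and "stencil" concrete, here is a minimal sketch of a two-stage pipeline in plain Python. It is not PolyMage DSL syntax and not the paper's method; the function names, the gain value, and the 3x3 box filter are all assumptions chosen purely for illustration.

```python
# Illustrative only: plain Python, not PolyMage DSL syntax.
# All names below (pointwise_scale, stencil_blur3x3, pipeline) are hypothetical.

def pointwise_scale(img, gain):
    """Point-wise stage: each output pixel depends on exactly one input pixel."""
    return [[gain * p for p in row] for row in img]


def stencil_blur3x3(img):
    """Stencil stage: each output pixel averages its 3x3 input neighbourhood.

    Border pixels are skipped, so the output has 2 fewer rows and columns.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            total = sum(img[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            row.append(total / 9.0)
        out.append(row)
    return out


def pipeline(img):
    """Two-stage pipeline graph: input -> scale (point-wise) -> blur (stencil)."""
    return stencil_blur3x3(pointwise_scale(img, 2))
```

Neighbouring outputs of the stencil stage read overlapping input pixels, which is one source of the data reuse the abstract refers to.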

References

  1. C. Alias, A. Darte, and A. Plesco. Optimizing remote accesses for offloaded kernels: application to high-level synthesis for FPGA. In International workshop on Polyhedral Compilation Techniques (IMPACT), 2012.
  2. J. Auerbach, D. F. Bacon, I. Burcea, P. Cheng, S. J. Fink, R. Rabbah, and S. Shukla. A compiler and runtime for heterogeneous computing. In Design Automation Conference, pages 271--276, 2012.
  3. D. F. Bacon, R. M. Rabbah, and S. Shukla. FPGA programming for the masses. Commun. ACM, 56(4):56--63, 2013.
  4. Blender Foundation. Big Buck Bunny, 2008. The movie. http://www.bigbuckbunny.org/ License: CC BY 3.0 https://creativecommons.org/licenses/by/3.0/.
  5. U. Bondhugula, J. Ramanujam, and P. Sadayappan. Automatic mapping of nested loops to FPGAs. In ACM SIGPLAN PPoPP, Mar. 2007.
  6. J. M. P. Cardoso and P. C. Diniz. Compilation Techniques for Reconfigurable Architectures. Springer US, 2009.
  7. Creative Commons Attribution 3.0 license (CC BY 3.0). https://creativecommons.org/licenses/by/3.0/.
  8. Creative Commons Attribution-ShareAlike 3.0 license (CC BY-SA 3.0). https://creativecommons.org/licenses/by-sa/3.0/.
  9. A. Darte, R. Schreiber, B. R. Rau, and F. Vivien. A Constructive Solution to the Juggling Problem in Processor Array Synthesis. In IPDPS, pages 815--822, 2000.
  10. C. Dase, J. Falcon, and B. MacCleery. Motorcycle control prototyping using an FPGA-based embedded control system. Control Systems, IEEE, 26(5):17--21, 2006.
  11. P. C. Diniz, M. W. Hall, J. Park, B. So, and H. Ziegler. Bridging the Gap between Compilation and Synthesis in the DEFACTO System. In LCPC, pages 52--70, 2001.
  12. M. B. Gokhale, J. M. Stone, J. Arnold, and M. Kalinowski. Stream-oriented FPGA computing in the Streams-C high level language. In IEEE symposium on Field-Programmable Custom Computing Machines, pages 49--56, 2000.
  13. Z. Guo, W. Najjar, and B. Buyukkurt. Efficient hardware code generation for FPGAs. ACM Trans. Archit. Code Optim., 5(1):6:1--6:26, May 2008.
  14. A. Hagiescu, W.-F. Wong, D. Bacon, and R. Rabbah. A computing origami: Folding streams in FPGAs. In ACM/IEEE Design Automation Conference, pages 282--287, 2009.
  15. J. Hegarty, J. Brunhaver, Z. DeVito, J. Ragan-Kelley, N. Cohen, S. Bell, A. Vasilyev, M. Horowitz, and P. Hanrahan. Darkroom: Compiling high-level image processing code into hardware pipelines. ACM Trans. Graph., 33(4):144:1--144:11, 2014.
  16. The Heterogeneous Image Processing Acceleration Framework. http://hipacc-lang.org/.
  17. J. Holewinski, L.-N. Pouchet, and P. Sadayappan. High-performance code generation for stencil computations on GPU architectures. In International conference on Supercomputing, pages 311--320, 2012.
  18. A. Hormati, M. Kudlur, S. Mahlke, D. Bacon, and R. Rabbah. Optimus: Efficient realization of streaming applications on FPGAs. In 2008 International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES), pages 41--50, 2008.
  19. B. K. P. Horn and B. G. Schunck. Determining optical flow. Artif. Intell., 17(1-3):185--203, 1981.
  20. S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan. Effective Automatic Parallelization of Stencil Computations. In ACM SIGPLAN conference on Programming Languages Design and Implementation, 2007.
  21. MATLAB HDL Coder. The MathWorks Inc. http://in.mathworks.com/products/hdl-coder//.
  22. R. Membarth, O. Reiche, F. Hannig, J. Teich, M. Körner, and W. Eckert. Hipacc: A domain-specific language and compiler for image processing. IEEE Trans. Parallel Distrib. Syst., 27(1):210--224, 2016.
  23. R. T. Mullapudi, V. Vasista, and U. Bondhugula. PolyMage: Automatic optimization for image processing pipelines. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 429--443, 2015.
  24. W. A. Najjar, W. Böhm, B. A. Draper, J. Hammes, R. Rinker, J. R. Beveridge, M. Chawathe, and C. Ross. High-level language abstraction for reconfigurable computing. Computer, 36(8):63--69, Aug. 2003.
  25. R. S. Nikhil and Arvind. What is Bluespec? SIGDA Newsl., 39(1):1--1, Jan. 2009.
  26. M. Owaida, N. Bellas, K. Daloukas, and C. Antonopoulos. Synthesis of platform architectures from OpenCL programs. In IEEE Field-Programmable Custom Computing Machines (FCCM), pages 186--193, May 2011.
  27. P. R. Panda. SystemC: A modeling platform supporting multiple design abstractions. In 14th International symposium on Systems Synthesis, pages 75--80, 2001.
  28. A. Papakonstantinou, K. Gururaj, J. A. Stratton, D. Chen, J. Cong, and W. W. Hwu. Efficient compilation of CUDA kernels for high-performance computing on FPGAs. ACM Trans. Embedded Comput. Syst., 13(2):25, 2013.
  29. PolyMage benchmarks, 2015. https://github.com/bondhugula/polymage-benchmarks.
  30. PolyMage: A DSL and compiler for automatic optimization of image processing pipelines, 2015. http://mcl.csa.iisc.ernet.in/polymage.html.
  31. L. Pouchet, P. Zhang, P. Sadayappan, and J. Cong. Polyhedral-based data reuse optimization for configurable computing. In ACM/SIGDA International symposium on FPGAs, pages 29--38, 2013.
  32. J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In ACM SIGPLAN conference on Programming Languages Design and Implementation, pages 519--530, 2013.
  33. M. Ravishankar, J. Holewinski, and V. Grover. Forma: A DSL for image processing applications to target GPUs and multi-core CPUs. In 8th Workshop on General Purpose Processing Using GPUs, pages 109--120, 2015.
  34. O. Reiche, M. Schmid, F. Hannig, R. Membarth, and J. Teich. Code generation from a domain-specific language for C-based HLS of hardware accelerators. In 2014 International Conference on Hardware/Software Codesign and System Synthesis, pages 17:1--17:10, 2014.
  35. R. Schreiber, S. Aditya, S. Mahlke, V. Kathail, B. R. Rau, D. Cronquist, and M. Sivaraman. PICO-NPA: High-Level synthesis of non-programmable hardware accelerators. J. VLSI Signal Process. Syst., 31(2):127--142, 2002.
  36. B. So, M. W. Hall, and P. C. Diniz. A compiler approach to fast hardware design space exploration in FPGA-based systems. In ACM SIGPLAN conference on Programming Languages Design and Implementation, pages 165--176, 2002.
  37. C. B. Spear. SystemVerilog for Verification: A Guide to Learning the Testbench Language Features. Springer, 2nd edition, 2010.
  38. Adult tortoise, 2016. Finlay Cox. http://www.pasthorizonspr.com/wp-content/uploads/2016/02/tortoise.jpg License: CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0/.
  39. X. Zhou, J.-P. Giacalone, M. J. Garzarán, R. H. Kuhn, Y. Ni, and D. Padua. Hierarchical overlapped tiling. In International symposium on Code Generation and Optimization, pages 207--218, 2012.

        • Published in

          PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation
          September 2016
          474 pages
          ISBN: 9781450341219
          DOI: 10.1145/2967938

          Copyright © 2016 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States




          Acceptance Rates

          PACT '16 Paper Acceptance Rate: 31 of 119 submissions, 26%. Overall Acceptance Rate: 121 of 471 submissions, 26%.
