skip to main content
10.1145/2429069.2429128acmconferencesArticle/Chapter ViewAbstractPublication PagespoplConference Proceedingsconference-collections
research-article

Optimizing data structures in high-level programs: new directions for extensible compilers based on staging

Published:23 January 2013Publication History

ABSTRACT

High level data structures are a cornerstone of modern programming and at the same time stand in the way of compiler optimizations. In order to reason about user- or library-defined data structures compilers need to be extensible. Common mechanisms to extend compilers fall into two categories. Frontend macros, staging or partial evaluation systems can be used to programmatically remove abstraction and specialize programs before they enter the compiler. Alternatively, some compilers allow extending the internal workings by adding new transformation passes at different points in the compile chain or adding new intermediate representation (IR) types. None of these mechanisms alone is sufficient to handle the challenges posed by high level data structures. This paper shows a novel way to combine them to yield benefits that are greater than the sum of the parts.

Instead of using staging merely as a front end, we implement internal compiler passes using staging as well. These internal passes delegate back to program execution to construct the transformed IR. Staging is known to simplify program generation, and in the same way it can simplify program transformation. Defining a transformation as a staged IR interpreter is simpler than implementing a low-level IR to IR transformer. With custom IR nodes, many optimizations that are expressed as rewritings from IR nodes to staged program fragments can be combined into a single pass, mitigating phase ordering problems. Speculative rewriting can preserve optimistic assumptions around loops.

We demonstrate several powerful program optimizations using this architecture that are particularly geared towards data structures: a novel loop fusion and deforestation algorithm, array of struct to struct of array conversion, object flattening and code generation for heterogeneous parallel devices. We validate our approach using several non trivial case studies that exhibit order of magnitude speedups in experiments.

Skip Supplemental Material Section

Supplemental Material

r1d3_talk10.mp4

mp4

226.6 MB

References

  1. S. Ackermann, V. Jovanovic, T. Rompf, and M. Odersky. Jet: An embedded dsl for high performance big data processing. BigData, 2012. http://infoscience.epfl.ch/record/181673/files/paper.pdf.Google ScholarGoogle Scholar
  2. M. S. Ager, O. Danvy, and H. K. Rohde. Fast partial evaluation of pattern matching in strings. ACM Trans. Program. Lang. Syst., 28 (4): 696--714, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. Auerbach, D. F. Bacon, P. Cheng, and R. Rabbah. Lime: a java-compatible and synthesizable language for heterogeneous architectures. OOPSLA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Bravenboer, A. van Dam, K. Olmos, and E. Visser. Program transformation with scoped dynamic rewrite rules. Fundam. Inf., 69: 123--178, July 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. K. J. Brown, A. K. Sujeeth, H. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun. A heterogeneous parallel framework for domain-specific languages. PACT, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. A. Brzozowski. Derivatives of regular expressions. J. ACM, 11 (4): 481--494, 1964. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Calcagno, W. Taha, L. Huang, and X. Leroy. Implementing multi-stage languages using asts, gensym, and reflection. In GPCE, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Carette, O. Kiselyov, and C. chieh Shan. Finally tagless, partially evaluated: Tagless staged interpreters for simpler typed languages. J. Funct. Program., 19 (5): 509--543, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Click and K. D. Cooper. Combining analyses, combining optimizations. ACM Trans. Program. Lang. Syst., 17: 181--196, March 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Consel and O. Danvy. Partial evaluation of pattern matching in strings. Inf. Process. Lett., 30 (2): 79--86, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. W. R. Cook, B. Delaware, T. Finsterbusch, A. Ibrahim, and B. Wiedermann. Model transformation by partial evaluation of model interpreters. Technical Report TR-09-09, UT Austin Department of Computer Science, 2008.Google ScholarGoogle Scholar
  12. D. Coutts, R. Leshchinskiy, and D. Stewart. Stream fusion: from lists to streams to nothing at all. In ICFP, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. T. Ekman and G. Hedin. The jastadd system - modular extensible compiler construction. Sci. Comput. Program., 69 (1--3): 14--26, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. C. Elliott, S. Finne, and O. de Moor. Compiling embedded languages. In W. Taha, editor, phSemantics, Applications, and Implementation of Program Generation, volume 1924 of Lecture Notes in Computer Science, pages 9--26. Springer Berlin / Heidelberg, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Y. Futamura. Partial evaluation of computation process - an approach to a compiler-compiler. Higher-Order and Symbolic Computation, 12 (4): 381--391, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. C. Grelck, K. Hinckfuß, and S.-B. Scholz. With-loop fusion for data locality and parallelism. IFL, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. M. Groenewegen, Z. Hemel, L. C. L. Kats, and E. Visser. WebDSL: a domain-specific language for dynamic web applications. In OOPSLA Companion, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. C. Hofer, K. Ostermann, T. Rendel, and A. Moors. Polymorphic embedding of DSLs. GPCE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. JetBrains. Meta Programming System, 2009. URL http://www.jetbrains.com/mps/.Google ScholarGoogle Scholar
  20. N. D. Jones, C. K. Gomard, and P. Sestoft. Partial evaluation and automatic program generation. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. S. L. P. Jones, R. Leshchinskiy, G. Keller, and M. M. T. Chakravarty. Harnessing the Multicores: Nested Data Parallelism in Haskell. In FSTTCS, 2008.Google ScholarGoogle Scholar
  22. S. P. Jones, A. Tolmach, and T. Hoare. Playing by the rules: rewriting as a practical optimisation technique in ghc. Haskell, 2001.Google ScholarGoogle Scholar
  23. S. Karmesin, J. Crotinger, J. Cummings, S. Haney, W. Humphrey, J. Reynders, S. Smith, and T. J. Williams. Array design and expression evaluation in pooma ii. In ISCOPE, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. L. C. L. Kats and E. Visser. The Spoofax language workbench. rules for declarative specification of languages and IDEs. In SPLASH/OOPSLA Companion, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. R. Kelsey and P. Hudak. Realistic compilation by program transformation. In POPL, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. K. Kennedy, B. Broom, A. Chauhan, R. Fowler, J. Garvin, C. Koelbel, C. McCosh, and J. Mellor-Crummey. Telescoping languages: A system for automatic generation of domain languages. Proceedings of the IEEE, 93 (3): 387--408, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  27. G. Kossakowski, N. Amin, T. Rompf, and M. Odersky. Javascript as an embedded dsl. In ECOOP, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. H. Lee, K. J. Brown, A. K. Sujeeth, H. Chafi, T. Rompf, M. Odersky, and K. Olukotun. Implementing domain-specific languages for heterogeneous parallel computing. IEEE Micro, 31 (5): 42--53, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. S. Lerner, D. Grove, and C. Chambers. Composing dataflow analyses and transformations. SIGPLAN Not., 37: 270--282, January 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. S. Lerner, T. D. Millstein, and C. Chambers. Automatically proving the correctness of compiler optimizations. In PLDI, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. A. Møller. dk.brics.automaton -- finite-state automata and regular expressions for Java, 2010.texttthttp://www.brics.dk/automaton/.Google ScholarGoogle Scholar
  32. A. Moors, T. Rompf, P. Haller, and M. Odersky. Scala-virtualized. PEPM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. N. Nystrom, M. R. Clarkson, and A. C. Myers. Polyglot: An extensible compiler framework for java. In CC, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. N. Nystrom, D. White, and K. Das. Firepile: run-time compilation for gpus in scala. GPCE, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. S. Owens, J. Reppy, and A. Turon. Regular-expression derivatives re-examined. J. Funct. Program., 19 (2): 173--190, Mar. 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. D. J. Quinlan, M. Schordan, Q. Yi, and A. Sæbjørnsen. Classification and utilization of abstractions for optimization. In ISoLA (Preliminary proceedings), 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. T. Rompf. phLightweight Modular Staging and Embedded Compilers: Abstraction Without Regret for High-Level High-Performance Programming. PhD thesis, EPFL, 2012.Google ScholarGoogle Scholar
  38. T. Rompf and M. Odersky. Lightweight modular staging: a pragmatic approach to runtime code generation and compiled dsls. GPCE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. T. Rompf and M. Odersky. Lightweight modular staging: a pragmatic approach to runtime code generation and compiled dsls. Commun. ACM, 55 (6): 121--130, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. T. Rompf, I. Maier, and M. Odersky. Implementing first-class polymorphic delimited continuations by a type-directed selective cps-transform. In ICFP, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. T. Rompf, A. K. Sujeeth, H. Lee, K. J. Brown, H. Chafi, M. Odersky, and K. Olukotun. Building-blocks for performance oriented dsls. DSL, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  42. A. Shali and W. R. Cook. Hybrid partial evaluation. OOPSLA, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. M. Sperber and P. Thiemann. Realistic compilation by partial evaluation. In PLDI, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. A. K. Sujeeth, H. Lee, K. J. Brown, T. Rompf, M. Wu, A. R. Atreya, M. Odersky, and K. Olukotun. OptiML: an implicitly parallel domain-specific language for machine learning. ICML, 2011.Google ScholarGoogle Scholar
  45. E. Sumii and N. Kobayashi. A hybrid approach to online and offline partial evaluation. Higher-Order and Symbolic Computation, 14 (2--3): 101--142, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. W. Taha and T. Sheard. Metaml and multi-stage programming with explicit annotations. Theor. Comput. Sci., 248 (1--2): 211--242, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In POPL, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. R. Tate, M. Stepp, and S. Lerner. Generating compiler optimizations from proofs. In POPL, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. P. Thiemann and D. Dussart. Partial evaluation for higher-order languages with state. Technical report, 1999. URL http://www.informatik.uni-freiburg.de/ thiemann/papers/mlpe.ps.gz.Google ScholarGoogle Scholar
  50. S. Tobin-Hochstadt, V. St-Amour, R. Culpepper, M. Flatt, and M. Felleisen. Languages as libraries. PLDI'11, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. T. L. Veldhuizen. Expression templates, C++gems. SIGS Publications, Inc., New York, NY, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. T. L. Veldhuizen. Arrays in blitz. In ISCOPE, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. T. L. Veldhuizen and J. G. Siek. Combining optimizations, combining theories. Technical report, Indiana University, 2008.Google ScholarGoogle Scholar
  54. P. Wadler. Deforestation: Transforming programs to eliminate trees. Theor. Comput. Sci., 73 (2): 231--248, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. P. Wadler and S. Blott. How to make ad-hoc polymorphism less ad-hoc. In POPL, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Optimizing data structures in high-level programs: new directions for extensible compilers based on staging

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
            January 2013
            586 pages
            ISBN:9781450318327
            DOI:10.1145/2429069
            • cover image ACM SIGPLAN Notices
              ACM SIGPLAN Notices  Volume 48, Issue 1
              POPL '13
              January 2013
              561 pages
              ISSN:0362-1340
              EISSN:1558-1160
              DOI:10.1145/2480359
              Issue’s Table of Contents

            Copyright © 2013 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 23 January 2013

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            Overall Acceptance Rate824of4,130submissions,20%

            Upcoming Conference

            POPL '25

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader