DOI: 10.1145/377792.377849
Article

Register-sensitive selection, duplication, and sequencing of instructions

Published: 17 June 2001

ABSTRACT

In this paper, we present a new framework for selecting, duplicating and sequencing instructions so as to decrease register pressure. The motivation for this work is to target current and future high-performance processors where reductions in register pressure in the compiled programs can lead to improved performance.

For instruction selection and duplication, a unique feature of our approach is the ability to perform these transformations on intermediate-language instructions in a general dependence graph that contains both true and non-true dependences, unlike past work that restricted its attention to a single expression tree or a single expression DAG. For instruction sequencing, we present a new algorithm for reducing register pressure that is based on backwards scheduling.
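To make the link between instruction order and register pressure concrete, the following is a minimal sketch and not the algorithm from the paper, whose details the abstract does not give. It assumes a hypothetical toy IR in which each instruction is a pair (defined value, used values), models only true (def-use) dependences, and uses a generic greedy bottom-up heuristic. The names `max_live`, `backward_schedule`, and `naive` are illustrative assumptions.

```python
from typing import List, Set, Tuple

# Hypothetical toy IR (an assumption, not the paper's intermediate language):
# each instruction is (value it defines, values it uses).
Instr = Tuple[str, List[str]]


def max_live(seq: List[Instr], live_out: Set[str]) -> int:
    """Peak number of simultaneously live values in a straight-line sequence."""
    live = set(live_out)
    peak = len(live)
    for dest, uses in reversed(seq):
        live.discard(dest)   # the value is defined here, so it is not live above
        live.update(uses)    # operands must be live just above this instruction
        peak = max(peak, len(live))
    return peak


def backward_schedule(instrs: List[Instr], live_out: Set[str]) -> List[Instr]:
    """Generic greedy bottom-up ordering (illustrative heuristic only).

    Working from the last instruction to the first, pick a ready instruction
    that leaves the smallest live set. Only def-use dependences are modeled.
    """
    remaining = list(instrs)
    live = set(live_out)
    picked_bottom_up: List[Instr] = []
    while remaining:
        # Bottom-up readiness: no remaining instruction still uses the result.
        ready = [i for i in remaining
                 if not any(i[0] in other[1] for other in remaining if other is not i)]
        best = min(ready, key=lambda i: len((live - {i[0]}) | set(i[1])))
        remaining.remove(best)
        live = (live - {best[0]}) | set(best[1])
        picked_bottom_up.append(best)
    return list(reversed(picked_bottom_up))


# r = (a + b) + (c + d), with all four loads issued first.
naive: List[Instr] = [
    ("t1", []), ("t2", []), ("t3", []), ("t4", []),
    ("s1", ["t1", "t2"]), ("s2", ["t3", "t4"]),
    ("r", ["s1", "s2"]),
]
reordered = backward_schedule(naive, {"r"})
print(max_live(naive, {"r"}), max_live(reordered, {"r"}))
```

In this toy case the loads-first order keeps four values live at once, while the reordered sequence needs three, because each pair of loads is placed next to the add that consumes it.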

We present preliminary performance results to validate our approach. Our results show that register-sensitive instruction duplication can deliver significant speedups (up to 1.22x) for the SPECint95 benchmarks on an IA-32 processor. We also show that register-sensitive sequencing delivers smaller speedups (up to 1.12x) for the SPECjvm and Java Grande benchmarks on a PowerPC processor (when utilizing two-thirds of its registers). We expect to see more significant speedups due to register-sensitive sequencing on processors with fewer registers than the PowerPC (such as the IA-32).


Published in

ICS '01: Proceedings of the 15th International Conference on Supercomputing
June 2001
510 pages
ISBN: 158113410X
DOI: 10.1145/377792

Copyright © 2001 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

ICS '01 paper acceptance rate: 45 of 133 submissions, 34%
Overall acceptance rate: 584 of 2,055 submissions, 28%
