Article

Register-sensitive selection, duplication, and sequencing of instructions

Authors:
Vivek Sarkar (IBM Research, T. J. Watson Research Center)
Mauricio J. Serrano (Intel Microprocessor, Research Labs)
Barbara B. Simons (Stanford University)

ICS '01: Proceedings of the 15th international conference on Supercomputing, June 2001, Pages 277–288
https://doi.org/10.1145/377792.377849

Published: 17 June 2001

ABSTRACT

In this paper, we present a new framework for selecting, duplicating and sequencing instructions so as to decrease register pressure. The motivation for this work is to target current and future high-performance processors where reductions in register pressure in the compiled programs can lead to improved performance.

For instruction selection and duplication, a unique feature of our approach is the ability to perform these transformations on intermediate-language instructions in a general dependence graph that contains both true and non-true dependences, unlike past work that restricted its attention to a single expression tree or a single expression dag. For instruction sequencing, we present a new algorithm for reducing register pressure that is based on backwards scheduling.

We present preliminary performance results to validate our approach. Our results show that register-sensitive instruction duplication can deliver significant speedups (up to 1.22x) for the SPECint95 benchmarks on an IA-32 processor. We also show that register-sensitive sequencing delivers smaller speedups (up to 1.12x) for the SPECjvm and Java Grande benchmarks on a PowerPC processor (when utilizing two-thirds of its registers). We expect to see more significant speedups due to register-sensitive sequencing on processors with fewer registers than the PowerPC (such as the IA-32).
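
To make the idea of register-sensitive sequencing concrete, the following is a minimal sketch of a greedy backward scheduler over a dependence DAG. It is not the algorithm from the paper, whose details are not given in the abstract. The function names (`peak_pressure`, `backward_sequence`), the one-value-per-instruction model, and the tie-breaking rule are all assumptions made for illustration.

```python
# Illustrative sketch only: a greedy backward list scheduler that tries to
# keep few values live at once. NOT the paper's algorithm; all names and the
# cost heuristic are hypothetical.

def peak_pressure(order, operands):
    """Peak number of simultaneously live values for a given order.

    operands[i] lists the instructions whose results instruction i reads.
    Each instruction is assumed to define exactly one value, which is live
    from its definition until its last use.
    """
    pos = {ins: k for k, ins in enumerate(order)}
    last_use = {}
    for ins in order:
        for src in operands[ins]:
            last_use[src] = max(last_use.get(src, -1), pos[ins])
    return max(
        sum(1 for v in order if pos[v] <= k < last_use.get(v, -1))
        for k in range(len(order))
    )


def backward_sequence(operands):
    """Build an instruction order from last to first (assumes an acyclic graph)."""
    consumers = {ins: set() for ins in operands}
    for ins, srcs in operands.items():
        for src in srcs:
            consumers[src].add(ins)

    scheduled, live, reverse_order = set(), set(), []
    while len(scheduled) < len(operands):
        # Going backwards, an instruction is ready once all its consumers
        # have been placed later in the sequence.
        ready = [i for i in operands
                 if i not in scheduled and consumers[i] <= scheduled]

        def cost(i):
            # Placing i ends the live range of its own result (when viewed
            # backwards) and starts live ranges for operands not yet live.
            new_live = sum(1 for s in set(operands[i]) if s not in live)
            return (new_live - (1 if i in live else 0), i)

        pick = min(ready, key=cost)
        live.discard(pick)
        live.update(operands[pick])
        scheduled.add(pick)
        reverse_order.append(pick)
    return reverse_order[::-1]


# Example: r = (a + b) + (c + d), where a..d are loads.
dag = {"a": [], "b": [], "c": [], "d": [],
       "t1": ["a", "b"], "t2": ["c", "d"], "r": ["t1", "t2"]}
naive = ["a", "b", "c", "d", "t1", "t2", "r"]
sched = backward_sequence(dag)
```

On this small example the naive order issues all four loads first, while the backward order finishes one subexpression before starting the next, which lowers the peak number of live values under the model above.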

Published in
ICS '01: Proceedings of the 15th international conference on Supercomputing
June 2001
510 pages
ISBN: 158113410X
DOI: 10.1145/377792
Chairmen: Mario Mango Furnari (Istituto di Cibernetica, CNR, Italy), Efstratios Gallopoulos (Univ. of Patras)
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
ICS '01 paper acceptance rate: 45 of 133 submissions (34%). Overall acceptance rate: 584 of 2,055 submissions (28%).