Efficient instruction scheduling for a pipelined architecture

Authors:
Steven S. Muchnick, San Francisco, CA
Phillip B. Gibbons, Intel Research Pittsburgh, Pittsburgh, PA

ACM SIGPLAN Notices, Volume 39, Issue 4, April 2004, pp. 167–174. https://doi.org/10.1145/989393.989413

Published: 01 April 2004

Abstract

As part of an effort to develop an optimizing compiler for a pipelined architecture, a code reorganization algorithm has been developed that significantly reduces the number of runtime pipeline interlocks. In a pass after code generation, the algorithm uses a dag representation to heuristically schedule the instructions in each basic block. Previous algorithms for reducing pipeline interlocks have had worst-case runtimes of at least O(n⁴). By using a dag representation which prevents scheduling deadlocks and a selection method that requires no lookahead, the resulting algorithm reorganizes instructions almost as effectively in practice, while having an O(n²) worst-case runtime.
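
The general idea in the abstract (build a dependence dag for a basic block, then pick instructions greedily with no lookahead) can be illustrated with a minimal sketch. This is not the paper's algorithm or its heuristics: the `Instr` record, the single-cycle load-delay model in `interlocks`, and the tie-breaking rule (prefer more immediate successors, then original order) are assumptions made for illustration, and memory dependences are not modeled. Both the pairwise dag construction and the selection loop in this sketch take O(n²) time for a block of n instructions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Instr:
    # Hypothetical instruction record; not taken from the paper.
    text: str
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()
    is_load: bool = False  # assumed: a load's result is not ready for the next instruction

def depends(a, b):
    """True if b, which follows a in the block, must stay after a."""
    raw = a.dest is not None and a.dest in b.srcs   # read after write
    war = b.dest is not None and b.dest in a.srcs   # write after read
    waw = a.dest is not None and a.dest == b.dest   # write after write
    return raw or war or waw

def interlocks(prev, cur):
    """Assumed hazard model: cur stalls if it reads the result of a load just before it."""
    return prev is not None and prev.is_load and prev.dest in cur.srcs

def build_dag(block):
    """Compare every ordered pair once: O(n^2) edges at most."""
    n = len(block)
    succs = [set() for _ in range(n)]
    npreds = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if depends(block[i], block[j]):
                succs[i].add(j)
                npreds[j] += 1
    return succs, npreds

def schedule(block):
    """Greedy selection with no lookahead over the ready instructions."""
    succs, npreds = build_dag(block)
    ready = [i for i in range(len(block)) if npreds[i] == 0]
    order, prev = [], None
    while ready:
        # Prefer a candidate that does not interlock with the previous pick,
        # then one with more immediate successors, then original order.
        best = min(ready, key=lambda i: (interlocks(prev, block[i]), -len(succs[i]), i))
        ready.remove(best)
        order.append(best)
        prev = block[best]
        for s in succs[best]:
            npreds[s] -= 1
            if npreds[s] == 0:
                ready.append(s)
    return [block[i] for i in order]

def count_interlocks(seq):
    return sum(interlocks(a, b) for a, b in zip(seq, seq[1:]))

block = [
    Instr("load r1, [a]", "r1", (), True),
    Instr("add  r2, r1, r1", "r2", ("r1",)),
    Instr("load r3, [b]", "r3", (), True),
    Instr("add  r4, r3, r2", "r4", ("r3", "r2")),
]
scheduled = schedule(block)
print(count_interlocks(block))      # 2
print(count_interlocks(scheduled))  # 0
```

In this example the second load is moved up between the first load and its use, so neither add immediately follows the load it depends on. Because an instruction only becomes ready once all of its dag predecessors have been emitted, the reordering preserves every register dependence of the original block.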

Published in
ACM SIGPLAN Notices, Volume 39, Issue 4
20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999: A Selection
April 2004, 673 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/989393
Editor: Kathryn S. McKinley, The University of Texas at Austin, USA
Copyright © 2004 Authors
Publisher: Association for Computing Machinery, New York, NY, United States