Efficient instruction scheduling for a pipelined architecture

Published: 01 April 2004

Abstract

As part of an effort to develop an optimizing compiler for a pipelined architecture, a code reorganization algorithm has been developed that significantly reduces the number of runtime pipeline interlocks. In a pass after code generation, the algorithm uses a dag representation to heuristically schedule the instructions in each basic block. Previous algorithms for reducing pipeline interlocks have had worst-case runtimes of at least O(n^4). By using a dag representation which prevents scheduling deadlocks and a selection method that requires no lookahead, the resulting algorithm reorganizes instructions almost as effectively in practice, while having an O(n^2) worst-case runtime.
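
The general shape of such a pass can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the selection heuristics, so the ones used here are assumptions (avoid a candidate that would interlock with the instruction just scheduled, then prefer the candidate with more dag successors). The `Instr` record, the `is_load` flag, and the one-instruction load-delay model are likewise hypothetical. The sketch builds the dependence dag for one basic block by pairwise comparison and then picks from the ready list greedily, with no lookahead; both steps take O(n^2) time in this form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record for one basic block."""
    text: str
    dest: Optional[str]     # register written, or None
    srcs: tuple = ()        # registers read
    is_load: bool = False   # assumed model: a load's result is not
                            # usable by the very next instruction


def depends(a, b):
    """True if b, which follows a in the block, must stay after a."""
    raw = a.dest is not None and a.dest in b.srcs   # read after write
    war = b.dest is not None and b.dest in a.srcs   # write after read
    waw = a.dest is not None and a.dest == b.dest   # write after write
    return raw or war or waw


def interlocks(a, b):
    """True if issuing b directly after a would stall (assumed model)."""
    return a.is_load and a.dest in b.srcs


def build_dag(block):
    """Pairwise O(n^2) construction of successor sets and predecessor counts."""
    n = len(block)
    succs = [set() for _ in range(n)]
    npreds = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if depends(block[i], block[j]):
                succs[i].add(j)
                npreds[j] += 1
    return succs, npreds


def schedule(block):
    """Greedy list scheduling over the dag, with no lookahead."""
    succs, npreds = build_dag(block)
    ready = [i for i in range(len(block)) if npreds[i] == 0]
    order = []
    while ready:
        last = block[order[-1]] if order else None

        def key(i):
            # Assumed heuristics: avoid an interlock with the previous
            # instruction, then prefer more successors, then original order.
            stalls = last is not None and interlocks(last, block[i])
            return (stalls, -len(succs[i]), i)

        pick = min(ready, key=key)
        ready.remove(pick)
        order.append(pick)
        for j in succs[pick]:
            npreds[j] -= 1
            if npreds[j] == 0:
                ready.append(j)
    return [block[i] for i in order]


def count_interlocks(seq):
    """Number of adjacent pairs that would stall under the assumed model."""
    return sum(interlocks(a, b) for a, b in zip(seq, seq[1:]))


block = [
    Instr("load r1, a", "r1", (), is_load=True),
    Instr("add  r2, r1, r1", "r2", ("r1",)),
    Instr("load r3, b", "r3", (), is_load=True),
    Instr("add  r4, r3, r2", "r4", ("r3", "r2")),
]
reordered = schedule(block)
print(count_interlocks(block), count_interlocks(reordered))
```

In this example each load is immediately followed by a use of its result, so the original order stalls twice under the assumed model. The sketch moves the second load up between the first load and its use, which removes both stalls while respecting every dag edge.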

  • Published in

    ACM SIGPLAN Notices, Volume 39, Issue 4
    20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999: A Selection
    April 2004
    673 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/989393

    Copyright © 2004 Authors

    Publisher

    Association for Computing Machinery

    New York, NY, United States