The superblock: An effective technique for VLIW and superscalar compilation

Hwu, Wen-Mei W.; Mahlke, Scott A.; Chen, William Y.; Chang, Pohua P.; Warter, Nancy J.; Bringmann, Roger A.; Ouellette, Roland G.; Hank, Richard E.; Kiyohara, Tokuzo; Haab, Grant E.; Holm, John G.; Lavery, Daniel M.

doi:10.1007/BF01205185

Published: May 1993

Volume 7, pages 229–248, (1993)

The Journal of Supercomputing

Abstract

A compiler for VLIW and superscalar processors must expose sufficient instruction-level parallelism (ILP) to effectively utilize the parallel hardware. However, ILP within basic blocks is extremely limited for control-intensive programs. We have developed a set of techniques for exploiting ILP across basic block boundaries. These techniques are based on a novel structure called the superblock. The superblock enables the optimizer and scheduler to extract more ILP along the important execution paths by systematically removing constraints due to the unimportant paths. Superblock optimization and scheduling have been implemented in the IMPACT-I compiler. This implementation gives us a unique opportunity to fully understand the issues involved in incorporating these techniques into a real compiler. Superblock optimizations and scheduling are shown to be useful while taking into account a variety of architectural features.

Author information

Authors and Affiliations

Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL 61801
Wen-Mei W. Hwu, Scott A. Mahlke, William Y. Chen, Pohua P. Chang, Nancy J. Warter, Roger A. Bringmann, Roland G. Ouellette, Richard E. Hank, Tokuzo Kiyohara, Grant E. Haab, John G. Holm & Daniel M. Lavery

About this article

Cite this article

Hwu, W.M.W., Mahlke, S.A., Chen, W.Y. et al. The superblock: An effective technique for VLIW and superscalar compilation. J Supercomput 7, 229–248 (1993). https://doi.org/10.1007/BF01205185

Received: 15 March 1992
Accepted: 15 October 1992
Issue Date: May 1993
DOI: https://doi.org/10.1007/BF01205185
