Comparing software and hardware schemes for reducing the cost of branches

Authors:
W. W. Hwu, T. M. Conte, and P. P. Chang
Coordinated Science Laboratory, 1101 W. Springfield Ave., University of Illinois, Urbana, IL

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture, April 1989, Pages 224–233
https://doi.org/10.1145/74925.74951

Published: 01 April 1989

ABSTRACT

Pipelining has become a common technique to increase throughput of the instruction fetch, instruction decode, and instruction execution portions of modern computers. Branch instructions disrupt the flow of instructions through the pipeline, increasing the overall execution cost of branch instructions. Three schemes to reduce the cost of branches are presented in the context of a general pipeline model. Ten realistic Unix domain programs are used to directly compare the cost and performance of the three schemes and the results are in favor of the software-based scheme. For example, the software-based scheme has a cost of 1.65 cycles/branch vs. a cost of 1.68 cycles/branch of the best hardware scheme for a highly pipelined processor (11-stage pipeline). The results are 1.19 (software scheme) vs. 1.23 cycles/branch (best hardware scheme) for a moderately pipelined processor (5-stage pipeline).
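
To put the quoted figures in proportion, the short sketch below computes how much of the best hardware scheme's per-branch cost the software scheme saves at each pipeline depth. It uses only the four cycles/branch numbers stated in the abstract; the names `costs`, `relative_saving`, and `savings` are illustrative and do not come from the paper.

```python
# Cycles-per-branch figures quoted in the abstract.
costs = {
    "11-stage": {"software": 1.65, "best_hardware": 1.68},
    "5-stage": {"software": 1.19, "best_hardware": 1.23},
}


def relative_saving(software: float, hardware: float) -> float:
    """Fraction of the hardware scheme's branch cost saved by the software scheme."""
    return (hardware - software) / hardware


# Hypothetical helper mapping: pipeline depth -> relative saving.
savings = {
    name: relative_saving(c["software"], c["best_hardware"])
    for name, c in costs.items()
}

for name, saving in savings.items():
    print(f"{name} pipeline: {saving:.1%} fewer cycles per branch")
```

Both savings come out at a few percent, so the abstract's conclusion rests on a small but consistent margin rather than a large gap.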

Index Terms

1. Computer systems organization
  1. Architectures
    1. Parallel architectures
      1. Very long instruction word
    2. Serial architectures
2. Computing methodologies
  1. Modeling and simulation
    1. Model development and analysis
      1. Modeling methodologies

Published in

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture
April 1989, 426 pages
ISBN: 0897913191
DOI: 10.1145/74925
Chairman: Jean-Claude Syre

ACM SIGARCH Computer Architecture News, Volume 17, Issue 3
Special Issue: Proceedings of the 16th annual international symposium on Computer Architecture
June 1989, 400 pages
ISSN: 0163-5964
DOI: 10.1145/74926
Editor: Jean-Claude Syre

Copyright © 1989 Authors

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates
Overall Acceptance Rate: 543 of 3,203 submissions, 17%