ABSTRACT
Thread-level speculative execution is a technique that enables a wider range of single-threaded applications to exploit the processing resources of a chip multiprocessor. We consider module-level speculation, i.e., speculative threads that execute the code following a module (procedure, function, or method) call. Unfortunately, previous studies have shown that indiscriminate module-level speculation incurs significant overhead, mainly due to frequent misspeculations. Besides hurting performance, excessive overhead is harmful from a resource-usage and energy-efficiency standpoint. We show that the overhead of spawning speculative threads for all module continuations is on average three times the time spent on useful execution on our baseline 8-way chip multiprocessor. In this paper, we present and evaluate in detail a technique that aims to reduce the overhead associated with misspeculations. History-based prediction is used to prevent speculative threads from being spawned when they are expected to cause misspeculations. We find that this reduces the overhead by a factor of six on average compared to indiscriminate speculation. The impact on speedup is small for most applications, and in several cases speedup is slightly improved.
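The abstract's history-based spawn decision can be sketched as follows. This is a minimal illustrative model, not the paper's exact mechanism: it assumes a per-call-site two-bit saturating counter (a standard prediction structure) that tracks whether past speculative threads spawned at that site committed or were squashed, and suppresses spawning at sites with a poor history. The class and threshold names are hypothetical.

```python
class SpawnPredictor:
    """Hypothetical sketch: per-call-site two-bit saturating counters
    deciding whether a speculative thread should be spawned at a
    module continuation."""

    def __init__(self, threshold=2):
        # call site -> counter in [0, 3]; unseen sites start at the
        # threshold so speculation is tried at least once.
        self.counters = {}
        self.threshold = threshold

    def should_spawn(self, call_site):
        # Spawn only if past speculation at this site mostly succeeded.
        return self.counters.get(call_site, self.threshold) >= self.threshold

    def update(self, call_site, committed):
        # Strengthen the counter on a commit, weaken it on a squash.
        c = self.counters.get(call_site, self.threshold)
        self.counters[call_site] = min(3, c + 1) if committed else max(0, c - 1)
```

Under this model, a call site that repeatedly causes squashes is quickly filtered out (two consecutive squashes drop it below the threshold), while a later run of commits lets it re-qualify for speculation.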
Title: Reducing misspeculation overhead for module-level speculative execution