Early load address resolution via register tracking

Published: 01 May 2000

Abstract

Higher microprocessor frequencies accentuate the performance cost of memory accesses. This is especially noticeable in Intel's IA32 architecture, where a lack of registers results in an increased number of memory accesses. This paper presents a novel, non-speculative technique that partially hides the increasing load-to-use latency by allowing the early issue of load instructions. Early load address resolution relies on register tracking to safely compute the addresses of memory references in the front-end part of the processor pipeline. Register tracking enables decode-time computation of register values by tracking simple operations of the form reg±immediate. Register tracking may be performed in any pipeline stage following instruction decode and prior to execution.
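The idea of tracking reg±immediate operations can be illustrated with a minimal software sketch. This is a toy model, not the hardware design described in the paper: the class `RegisterTracker`, its method names, and the `sync` event (the point at which the tracker learns a correct register value) are all assumptions made for illustration.

```python
from typing import Dict, Optional

MASK32 = 0xFFFFFFFF  # IA32 general-purpose registers are 32 bits wide


class RegisterTracker:
    """Toy decode-time tracker: a register is either known (int) or unknown (None)."""

    def __init__(self) -> None:
        self.known: Dict[str, Optional[int]] = {}

    def sync(self, reg: str, value: int) -> None:
        # Hypothetical event: the tracker learns an architecturally correct value.
        self.known[reg] = value & MASK32

    def add_imm(self, reg: str, imm: int) -> None:
        # reg <- reg +/- imm is trackable, so a known value stays known.
        value = self.known.get(reg)
        if value is not None:
            self.known[reg] = (value + imm) & MASK32

    def clobber(self, reg: str) -> None:
        # Any other write (a load result, a reg-reg ALU op) makes the value unknown.
        self.known[reg] = None

    def resolve_load(self, base: str, disp: int) -> Optional[int]:
        # Address of a load from [base + disp], or None if the load
        # has to wait for normal address generation at execution.
        value = self.known.get(base)
        if value is None:
            return None
        return (value + disp) & MASK32


if __name__ == "__main__":
    tracker = RegisterTracker()
    tracker.sync("ebx", 0x1000)
    tracker.add_imm("ebx", 8)                   # add ebx, 8
    print(hex(tracker.resolve_load("ebx", 4)))  # 0x100c
    tracker.clobber("ebx")                      # e.g. mov ebx, [eax]
    print(tracker.resolve_load("ebx", 4))       # None
```

Because the model never guesses, a load is either resolved with a correct address or left to the normal execution path, which mirrors the non-speculative character of the technique.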

Several tracking schemes are proposed in this paper:

  • Stack pointer tracking allows safe early resolution of stack references by keeping track of the value of the ESP register (the stack pointer). About 25% of all loads are stack loads and 95% of these loads may be resolved in the front-end.

  • Absolute address tracking allows the early resolution of constant-address loads.

  • Displacement-based tracking tackles all loads with addresses of the form reg±immediate by tracking the values of all general-purpose registers. This class corresponds to 82% of all loads, and about 65% of these loads can be safely resolved in the front-end pipeline.
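As a rough illustration of the first scheme, the sketch below walks a short decoded instruction stream and tracks ESP across pushes, pops, and immediate adjustments. The function `resolve_stack_loads`, the tuple encoding of instructions, and the opcode names are hypothetical; treating every other write to ESP as "unknown" is a simplifying assumption of this toy model, not a statement about the paper's hardware.

```python
from typing import List, Optional, Tuple

MASK32 = 0xFFFFFFFF  # ESP is a 32-bit register on IA32


def _bump(esp: Optional[int], delta: int) -> Optional[int]:
    # Apply ESP <- ESP + delta if ESP is currently known.
    return None if esp is None else (esp + delta) & MASK32


def resolve_stack_loads(initial_esp: int,
                        program: List[Tuple[str, int]]) -> List[Optional[int]]:
    """Return the early-resolved address of each stack load (None if unresolved)."""
    esp: Optional[int] = initial_esp & MASK32
    resolved: List[Optional[int]] = []
    for op, arg in program:
        if op == "push":                 # 32-bit push: ESP <- ESP - 4
            esp = _bump(esp, -4)
        elif op == "pop":                # loads from [ESP], then ESP <- ESP + 4
            resolved.append(esp)
            esp = _bump(esp, 4)
        elif op == "add_esp":            # add/sub esp, imm
            esp = _bump(esp, arg)
        elif op == "load_esp":           # mov reg, [esp + disp]
            resolved.append(_bump(esp, arg))
        elif op == "write_esp_unknown":  # any write not of the form reg +/- imm
            esp = None
        else:
            raise ValueError(f"unknown op: {op}")
    return resolved


if __name__ == "__main__":
    program = [
        ("add_esp", -16),          # sub esp, 16
        ("load_esp", 8),           # mov eax, [esp+8]
        ("push", 0),               # push ecx
        ("load_esp", 12),          # mov eax, [esp+12]
        ("pop", 0),                # pop ecx
        ("write_esp_unknown", 0),  # ESP written by an untracked operation
        ("load_esp", 4),           # cannot be resolved early in this model
    ]
    for addr in resolve_stack_loads(0x8000, program):
        print("unresolved" if addr is None else hex(addr))
```

Taking the figures quoted above at face value, stack pointer tracking would resolve roughly 0.25 × 0.95 ≈ 24% of all loads in the front-end, and displacement-based tracking roughly 0.82 × 0.65 ≈ 53%.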

The paper describes the tracking schemes, analyzes their performance potential in a deeply pipelined processor and discusses the integration of tracking with memory disambiguation.

        • Published in

          ACM SIGARCH Computer Architecture News, Volume 28, Issue 2
          Special Issue: Proceedings of the 27th annual international symposium on Computer architecture (ISCA '00)
          May 2000
          325 pages
          ISSN: 0163-5964
          DOI: 10.1145/342001
          • ACM Conferences
            ISCA '00: Proceedings of the 27th annual international symposium on Computer architecture
            June 2000
            327 pages
            ISBN: 1581132328
            DOI: 10.1145/339647

          Copyright © 2000 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States
