DOI: 10.1145/512529.512554
Article

Dynamic hot data stream prefetching for general-purpose programs

Published: 17 May 2002

ABSTRACT

Prefetching data ahead of use has the potential to tolerate the growing processor-memory performance gap by overlapping long-latency memory accesses with useful computation. While sophisticated prefetching techniques have been automated for limited domains, such as scientific codes that access dense arrays in loop nests, a similar level of success has eluded general-purpose programs, especially pointer-chasing codes written in languages such as C and C++. We address this problem by describing, implementing, and evaluating a dynamic prefetching scheme. Our technique runs on stock hardware, is completely automatic, and works for general-purpose programs, including pointer-chasing codes written in weakly typed languages such as C and C++. It operates in three phases. First, the profiling phase gathers a temporal data reference profile from a running program with low overhead. Next, profiling is turned off and a fast analysis algorithm extracts hot data streams, which are data reference sequences that frequently repeat in the same order, from the temporal profile. Then, the system dynamically injects code at appropriate program points to detect and prefetch these hot data streams. Finally, the process enters the hibernation phase, where no profiling or analysis is performed and the program continues to execute with the added prefetch instructions. At the end of the hibernation phase, the program is de-optimized to remove the inserted checks and prefetch instructions, and control returns to the profiling phase. For long-running programs, this profile, analyze and optimize, hibernate cycle will repeat multiple times. Our initial results from applying dynamic prefetching are promising, indicating overall execution time improvements of 5-19% for several memory-performance-limited SPECint2000 benchmarks running their largest (ref) inputs.
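
To make the idea of a hot data stream concrete, here is a minimal Python sketch of the analyze-then-detect steps described in the abstract. It is an illustration under stated assumptions, not the paper's implementation: the analysis is replaced by a naive count of fixed-length sliding windows, prefetching is simulated by recording addresses in a list, and the names `extract_hot_streams`, `PrefixPrefetcher`, `length`, `min_count`, and `head` are all hypothetical.

```python
from collections import Counter


def extract_hot_streams(trace, length=4, min_count=2):
    """Return fixed-length reference sequences seen at least min_count times.

    Assumption: a naive sliding-window count stands in for the fast
    analysis algorithm mentioned in the abstract.
    """
    windows = Counter(
        tuple(trace[i:i + length]) for i in range(len(trace) - length + 1)
    )
    return [seq for seq, n in windows.items() if n >= min_count]


class PrefixPrefetcher:
    """Detect the first `head` references of a hot stream, then prefetch the rest."""

    def __init__(self, streams, head=2):
        # Map each stream prefix to the remaining references (the tail).
        # If two streams share a prefix, the later one wins in this sketch.
        self.rules = {s[:head]: s[head:] for s in streams}
        self.head = head
        self.recent = ()      # the last `head` references observed
        self.prefetched = []  # stand-in for issued prefetch instructions

    def observe(self, ref):
        """Record one data reference; return the tail prefetched, if any."""
        self.recent = (self.recent + (ref,))[-self.head:]
        tail = self.rules.get(self.recent)
        if tail:
            # A real system would issue hardware prefetches here.
            self.prefetched.extend(tail)
        return tail or ()


# Profiling phase (simulated): a short trace of data references.
trace = ["a", "b", "c", "d", "x", "a", "b", "c", "d", "y", "a", "b", "c", "d"]

# Analysis phase: "a b c d" repeats three times, so it is the only hot stream.
hot = extract_hot_streams(trace)

# Optimization phase: seeing "a" then "b" triggers a prefetch of "c" and "d".
prefetcher = PrefixPrefetcher(hot)
prefetcher.observe("a")
prefetcher.observe("b")
```

Matching only a short prefix before prefetching the tail is the design point worth noticing: a longer prefix reduces useless prefetches, while a shorter one leaves more time for the prefetched data to arrive before it is used.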

      • Published in

        PLDI '02: Proceedings of the ACM SIGPLAN 2002 conference on Programming language design and implementation
        June 2002
        338 pages
        ISBN:1581134630
        DOI:10.1145/512529

        Copyright © 2002 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        PLDI '02 Paper Acceptance Rate: 28 of 169 submissions, 17%
        Overall Acceptance Rate: 406 of 2,067 submissions, 20%
