Software prefetching for mark-sweep garbage collection: hardware analysis and software redesign

Published: 07 October 2004

Abstract

Tracing garbage collectors traverse references from live program variables, transitively tracing out the closure of live objects. Memory accesses incurred during tracing are essentially random: a given object may contain references to any other object. Since application heaps are typically much larger than hardware caches, tracing results in many cache misses. Technology trends will make cache misses more important, so tracing is a prime target for prefetching.

Simulation of Java benchmarks running with the Boehm-Demers-Weiser mark-sweep garbage collector for a projected hardware platform reveals high tracing overhead (up to 65% of elapsed time) and shows that cache misses are a problem. Applying Boehm's default prefetching strategy yields improvements in execution time (16% on average with incremental/generational collection for GC-intensive benchmarks), but analysis shows that his strategy suffers from significant timing problems: prefetches occur too early or too late relative to their matching loads. This analysis drives development of a new prefetching strategy that yields up to three times the performance improvement of Boehm's strategy for GC-intensive benchmarks (27% average speedup), and achieves performance close to that of perfect timing (i.e., few misses for tracing accesses) on some benchmarks. Validating these simulation results with live runs on current hardware produces an average speedup of 6% for the new strategy on GC-intensive benchmarks with a GC configuration that tightly controls heap growth. In contrast, Boehm's default prefetching strategy is ineffective on this platform.
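The timing problem described in the abstract can be made concrete with a small sketch. The abstract does not spell out the mechanics of either prefetching strategy, so the code below is only one plausible way to separate a prefetch from its matching load, not the authors' design: references popped from the mark stack pass through a short FIFO, and each object is prefetched when it enters the FIFO and scanned when it leaves. The `Obj` layout, the in-object `marked` flag, `FIFO_CAP`, and `mark_with_prefetch` are all hypothetical names chosen for illustration; `__builtin_prefetch` is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 4     /* assumed fixed fan-out, for illustration only */
#define STACK_CAP  1024  /* assumed mark-stack capacity */
#define FIFO_CAP   8     /* assumed prefetch distance (a tuning knob) */

/* Hypothetical object layout: a mark flag plus a few reference fields. */
typedef struct Obj {
    bool marked;
    size_t nfields;
    struct Obj *fields[MAX_FIELDS];
} Obj;

/* Marks everything reachable from root.
 * Returns the number of objects marked, or -1 if the mark stack overflows. */
long mark_with_prefetch(Obj *root) {
    Obj *stack[STACK_CAP];
    size_t sp = 0;
    Obj *fifo[FIFO_CAP];
    size_t head = 0, count = 0;
    long marked = 0;

    if (root == NULL) return 0;
    stack[sp++] = root;

    while (sp > 0 || count > 0) {
        /* Refill: move references from the mark stack into the FIFO and
         * prefetch each one as it enters, before it is actually needed. */
        while (count < FIFO_CAP && sp > 0) {
            Obj *o = stack[--sp];
            __builtin_prefetch(o);
            fifo[(head + count) % FIFO_CAP] = o;
            count++;
        }
        assert(count > 0);

        /* Scan the oldest FIFO entry; if the FIFO was full, up to
         * FIFO_CAP - 1 other prefetches were issued after this one. */
        Obj *obj = fifo[head];
        head = (head + 1) % FIFO_CAP;
        count--;

        if (obj->marked) continue;   /* already visited via another path */
        obj->marked = true;
        marked++;

        /* Push children without touching them: reading a child's mark
         * flag here would be exactly the cache miss we want to hide. */
        for (size_t i = 0; i < obj->nfields; i++) {
            Obj *child = obj->fields[i];
            if (child != NULL) {
                if (sp == STACK_CAP) return -1;
                stack[sp++] = child;
            }
        }
    }
    return marked;
}
```

In this sketch the FIFO length sets the gap between a prefetch and its load. If the FIFO is too short the line may not have arrived when the object is scanned, and if it is too long the line may be evicted first, which mirrors the "too late" and "too early" cases the abstract mentions. When the mark stack holds only a few entries the FIFO cannot fill, so some prefetches are still consumed almost immediately.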

Published in

ACM SIGOPS Operating Systems Review, Volume 38, Issue 5 (ASPLOS '04), December 2004, 283 pages. ISSN: 0163-5980. DOI: 10.1145/1037949

ASPLOS XI: Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, October 2004, 296 pages. ISBN: 1581138040. DOI: 10.1145/1024393

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States