Abstract
Tracing garbage collectors traverse references from live program variables, transitively tracing out the closure of live objects. Memory accesses incurred during tracing are essentially random: a given object may contain references to any other object. Since application heaps are typically much larger than hardware caches, tracing results in many cache misses. Technology trends will make cache misses more important, so tracing is a prime target for prefetching.

Simulation of Java benchmarks running with the Boehm-Demers-Weiser mark-sweep garbage collector for a projected hardware platform reveals high tracing overhead (up to 65% of elapsed time) and shows that cache misses are a problem. Applying Boehm's default prefetching strategy yields improvements in execution time (16% on average with incremental/generational collection for GC-intensive benchmarks), but analysis shows that his strategy suffers from significant timing problems: prefetches occur too early or too late relative to their matching loads. This analysis drives development of a new prefetching strategy that yields up to three times the performance improvement of Boehm's strategy for GC-intensive benchmarks (27% average speedup), and achieves performance close to that of perfect timing (i.e., few misses for tracing accesses) on some benchmarks. Validating these simulation results with live runs on current hardware produces an average speedup of 6% for the new strategy on GC-intensive benchmarks with a GC configuration that tightly controls heap growth. In contrast, Boehm's default prefetching strategy is ineffective on this platform.
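To make the timing issue concrete, the following is a minimal sketch of a mark loop that issues a prefetch when a reference is pushed onto the mark stack. It is an illustration under stated assumptions, not the Boehm-Demers-Weiser code or the paper's new strategy: the `Obj` layout, `MAX_FIELDS`, `STACK_CAP`, and `mark_from` are hypothetical names, and `__builtin_prefetch` is the GCC/Clang builtin.

```c
#include <stddef.h>
#include <stdbool.h>

#define MAX_FIELDS 4     /* hypothetical: fixed number of reference slots */
#define STACK_CAP  1024  /* hypothetical: mark stack capacity for the sketch */

/* Hypothetical object layout: a mark bit plus a fixed array of references. */
typedef struct Obj {
    bool marked;
    struct Obj *fields[MAX_FIELDS];
} Obj;

/* Marks everything reachable from root and returns the number of objects
 * newly marked. A prefetch is issued when a child is pushed; the matching
 * load is the read of o->marked when that child is later popped. */
static size_t mark_from(Obj *root) {
    Obj *stack[STACK_CAP];
    size_t top = 0, count = 0;

    if (root == NULL) return 0;
    stack[top++] = root;

    while (top > 0) {
        Obj *o = stack[--top];
        if (o->marked) continue;   /* first touch of the object's cache line */
        o->marked = true;
        count++;

        for (int i = 0; i < MAX_FIELDS; i++) {
            Obj *child = o->fields[i];
            /* A real collector must handle mark stack overflow; this sketch
             * simply skips the push when the stack is full. */
            if (child != NULL && top < STACK_CAP) {
                __builtin_prefetch(child, 1, 3);  /* hint only, no semantic effect */
                stack[top++] = child;
            }
        }
    }
    return count;
}
```

Because the stack is last-in, first-out, the most recently pushed child is popped almost immediately, so its prefetch may not have completed, while children pushed earlier can wait long enough for their lines to be evicted again. That is one plausible reading of the "too early or too late" behavior described above.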
Software prefetching for mark-sweep garbage collection: hardware analysis and software redesign
ASPLOS XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems (ASPLOS '04), 2004