Profile guided code positioning

Published: 01 April 2004
Abstract

This paper presents the results of our investigation of code positioning techniques using execution profile data as input into the compilation process. The primary objective of the positioning is to reduce the overhead of the instruction memory hierarchy.

After initial investigation in the literature, we decided to implement two prototypes for the Hewlett-Packard Precision Architecture (PA-RISC). The first, built on top of the linker, positions code based on whole procedures. This prototype has the ability to move procedures into an order that is determined by a "closest is best" strategy.

The second prototype, built on top of an existing optimizer package, positions code based on basic blocks within procedures. Groups of basic blocks that would be better as straight-line sequences are identified as chains. These chains are then ordered according to branch heuristics. Code that is never executed during the data collection runs can be physically separated from the primary code of a procedure by a technique we devised called procedure splitting.

The algorithms we implemented are described through examples in this paper. The performance improvements from our work are also summarized in various tables and charts.
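The abstract names the "closest is best" procedure ordering without spelling it out. The following is a minimal sketch of one plausible reading, not the paper's implementation. It assumes the profile supplies dynamic call counts per caller/callee pair, treats those counts as undirected edge weights, and repeatedly joins the two groups of procedures with the heaviest combined weight, orienting the groups so that the most strongly connected pair lands as close together as possible. The function names `order_procedures` and `join_closest`, the input format, and the tie-breaking rules are all illustrative assumptions.

```python
def join_closest(left, right, pair_weight):
    """Concatenate two procedure lists so the heaviest cross pair is nearest.

    Hypothetical helper: tries all four orientations of the two lists.
    """
    # The pair (one procedure from each list) with the largest call weight.
    p, q = max(((p, q) for p in left for q in right),
               key=lambda pq: pair_weight.get(frozenset(pq), 0))
    candidates = [l + r
                  for l in (left, left[::-1])
                  for r in (right, right[::-1])]
    # Keep the orientation that places p and q closest together.
    return min(candidates, key=lambda c: abs(c.index(p) - c.index(q)))


def order_procedures(call_counts):
    """Return a procedure layout from {(caller, callee): call count}."""
    pair_weight = {}   # undirected weight between two procedures
    chains = {}        # group id -> ordered list of procedures
    for (caller, callee), count in call_counts.items():
        chains.setdefault(caller, [caller])
        chains.setdefault(callee, [callee])
        if caller != callee:
            key = frozenset((caller, callee))
            pair_weight[key] = pair_weight.get(key, 0) + count

    def chain_weight(a, b):
        # Combined weight of all calls crossing between groups a and b.
        return sum(pair_weight.get(frozenset((p, q)), 0)
                   for p in chains[a] for q in chains[b])

    while True:
        ids = list(chains)
        best = None
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                w = chain_weight(a, b)
                if w > 0 and (best is None or w > best[0]):
                    best = (w, a, b)
        if best is None:   # no remaining groups call each other
            break
        _, a, b = best
        chains[a] = join_closest(chains[a], chains.pop(b), pair_weight)

    # Groups that were never joined are simply appended in first-seen order.
    return [p for chain in chains.values() for p in chain]


profile = {
    ("main", "parse"): 10,
    ("parse", "lex"): 100,
    ("main", "emit"): 3,
    ("emit", "error"): 0,   # never called during the profiling run
}
print(order_procedures(profile))
```

In this made-up profile, `parse` and `lex` are joined first because they share the heaviest edge, and `error`, which was never called, ends up at the end of the layout. The quadratic search for the heaviest pair keeps the sketch short; a real linker would likely maintain a priority queue of edges instead.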

  • Published in

    ACM SIGPLAN Notices, Volume 39, Issue 4
    20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999: A Selection
    April 2004
    673 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/989393
    Copyright © 2004 Authors

    Publisher

    Association for Computing Machinery

    New York, NY, United States
