DOI: 10.1145/143365.143488

Design and evaluation of a compiler algorithm for prefetching

Published: 01 September 1992

References

  1. W. Abu-Sufah, D. J. Kuck, and D. H. Lawrie. Automatic program transformations for virtual memory computers. Proc. of the 1979 National Computer Conference, pages 969-974, June 1979.
  2. J-L. Baer and T-F. Chen. An effective on-chip preloading scheme to reduce data access penalty. In Proceedings of Supercomputing '91, 1991.
  3. D. Bailey, J. Barton, T. Lasinski, and H. Simon. The NAS Parallel Benchmarks. Technical Report RNR-91-002, NASA Ames Research Center, August 1991.
  4. D. Callahan, K. Kennedy, and A. Porterfield. Software prefetching. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 40-52, April 1991.
  5. W. Y. Chen, S. A. Mahlke, P. P. Chang, and W. W. Hwu. Data access microarchitectures for superscalar processors with compiler-assisted data prefetching. In Proceedings of Microcomputing 24, 1991.
  6. R. P. Colwell, R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman. A VLIW architecture for a trace scheduling compiler. In Proc. Second Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, pages 180-192, Oct. 1987.
  7. J. C. Dehnert, P. Y.-T. Hsu, and J. P. Bratt. Overlapped loop support in the Cydra 5. In Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS III), pages 26-38, April 1989.
  8. J. Ferrante, V. Sarkar, and W. Thrash. On estimating and enhancing cache effectiveness. In Fourth Workshop on Languages and Compilers for Parallel Computing, Aug 1991.
  9. K. Gallivan, W. Jalby, U. Meier, and A. Sameh. The impact of hierarchical memory systems on linear algebra algorithm design. Technical Report UIUCSRD 625, University of Illinois, 1987.
  10. D. Gannon and W. Jalby. The influence of memory hierarchy on algorithm organization: Programming FFTs on a vector multiprocessor. In The Characteristics of Parallel Algorithms. MIT Press, 1987.
  11. D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5:587-616, 1988.
  12. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 1989.
  13. E. Gornish, E. Granston, and A. Veidenbaum. Compiler-Directed Data Prefetching in Multiprocessors with Memory Hierarchies. In International Conference on Supercomputing, 1990.
  14. E. H. Gornish. Compile time analysis for data prefetching. Master's thesis, University of Illinois at Urbana-Champaign, December 1989.
  15. A. Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, and W-D. Weber. Comparative evaluation of latency reducing and tolerating techniques. In Proceedings of the 18th Annual International Symposium on Computer Architecture, pages 254-263, May 1991.
  16. A. C. Klaiber and H. M. Levy. Architecture for software-controlled data prefetching. In Proceedings of the 18th Annual International Symposium on Computer Architecture, pages 43-63, May 1991.
  17. D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In Proceedings of the 8th Annual International Symposium on Computer Architecture, pages 81-85, 1981.
  18. M. S. Lam. Software pipelining: An effective scheduling technique for VLIW machines. In Proc. ACM SIGPLAN 88 Conference on Programming Language Design and Implementation, pages 318-328, June 1988.
  19. M. S. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 63-74, April 1991.
  20. R. L. Lee. The Effectiveness of Caches and Data Prefetch Buffers in Large-Scale Shared Memory Multiprocessors. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, May 1987.
  21. A. C. McKeller and E. G. Coffman. The organization of matrices and matrix operations in a paged multiprogramming environment. CACM, 12(3):153-165, 1969.
  22. T. Mowry and A. Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 12(2):87-106, 1991.
  23. A. K. Porterfield. Software Methods for Improvement of Cache Performance on Supercomputer Applications. PhD thesis, Department of Computer Science, Rice University, May 1989.
  24. B. R. Rau and C. D. Glaeser. Some Scheduling Techniques and an Easily Schedulable Horizontal Architecture for High Performance Scientific Computing. In Proceedings of the 14th Annual Workshop on Microprogramming, pages 183-198, October 1981.
  25. J. P. Singh, W-D. Weber, and A. Gupta. Splash: Stanford parallel applications for shared memory. Technical Report CSL-TR-91-469, Stanford University, April 1991.
  26. M. D. Smith. Tracing with pixie. Technical Report CSL-TR-91-497, Stanford University, November 1991.
  27. SPEC. The SPEC Benchmark Report. Waterside Associates, Fremont, CA, January 1990.
  28. S.W.K. Tjiang and J. L. Hennessy. Sharlit: A tool for building optimizers. In SIGPLAN Conference on Programming Language Design and Implementation, 1992.
  29. M. E. Wolf and M. S. Lam. A data locality optimizing algorithm. In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, pages 30-44, June 1991.

          • Published in

            ASPLOS V: Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
            September 1992
            308 pages
            ISBN: 0897915348
            DOI: 10.1145/143365

            Copyright © 1992 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 1 September 1992

            Qualifiers

            • Article

            Acceptance Rates

            Overall Acceptance Rate: 535 of 2,713 submissions, 20%
