- 1. W. Abu-Sufah. Improving the Performance of Virtual Memory Computers. PhD thesis, University of Illinois at Urbana-Champaign, Nov 1978.
- 2. U. Banerjee. Data dependence in ordinary programs. Technical Report 76-837, University of Illinois at Urbana-Champaign, Nov 1976.
- 3. U. Banerjee. Dependence Analysis for Supercomputing. Kluwer Academic, 1988.
- 4. U. Banerjee. Unimodular transformations of double loops. In 3rd Workshop on Languages and Compilers for Parallel Computing, Aug 1990.
- 5. D. Callahan, S. Carr, and K. Kennedy. Improving register allocation for subscripted variables. In Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, June 1990.
- 6. J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, pages 1-17, March 1990.
- 7. K. Gallivan, W. Jalby, U. Meier, and A. Sameh. The impact of hierarchical memory systems on linear algebra algorithm design. Technical report, University of Illinois, 1987.
- 8. D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5:587-616, 1988.
- 9. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 1989.
- 10. F. Irigoin and R. Triolet. Computing dependence direction vectors and dependence cones. Technical Report E94, Centre D'Automatique et Informatique, 1988.
- 11. F. Irigoin and R. Triolet. Supernode partitioning. In Proc. 15th Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, January 1988.
- 12. M. S. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.
- 13. A. C. McKellar and E. G. Coffman. The organization of matrices and matrix operations in a paged multiprogramming environment. CACM, 12(3):153-165, 1969.
- 14. A. Porterfield. Software Methods for Improvement of Cache Performance on Supercomputer Applications. PhD thesis, Rice University, May 1989.
- 15. R. Schreiber and J. Dongarra. Automatic blocking of nested loops. 1990.
- 16. M. E. Wolf and M. S. Lam. A loop transformation theory and an algorithm to maximize parallelism. IEEE Transactions on Parallel and Distributed Systems, July 1991.
- 17. M. J. Wolfe. Techniques for improving the inherent parallelism in programs. Technical Report UIUCDCS-R-78-929, University of Illinois, 1978.
- 18. M. J. Wolfe. More iteration space tiling. In Supercomputing '89, Nov 1989.