- 1. J.L. Baer and T.F. Chen. "An Effective On-chip Preloading Scheme to Reduce Data Access Penalty". In Supercomputing 91, pp. 176-186, 1991.
- 2. K. Boland and A. Dollas. "Predicting and Precluding Problems with Memory Latency". IEEE Micro, vol. 14, no. 4, Aug. 1994, pp. 59-67.
- 3. D. Burger, J.R. Goodman and A. Kägi. "Memory Bandwidth Limitations of Future Microprocessors". In Proc. of 23rd Int. Symp. on Computer Architecture, pp. 78-89, May 1996.
- 4. T.F. Chen and J.-L. Baer, "A Performance Study of Software and Hardware Data Prefetching Schemes", Proc. 21st Int. Symp. Computer Architecture, 1994, pp. 223-232.
- 5. B. Cmelik and D. Keppel, "Shade: A Fast Instruction-Set Simulator for Execution Profiling". Proc. of ACM SIGMETRICS, May 1994, pp. 128-137.
- 6. F. Dahlgren, M. Dubois and P. Stenström, "Fixed and Adaptive Sequential Prefetching in Shared-Memory Multiprocessors", Proc. 1993 Int. Conf. Parallel Processing, CRC Press, Boca Raton, Fla., 1993, pp. I-56-I-63.
- 7. F. Dahlgren, M. Dubois and P. Stenström, "Sequential Hardware Prefetching in Shared-Memory Multiprocessors", IEEE Trans. Parallel and Distributed Systems, July 1995, pp. 733-746.
- 8. F. Dahlgren and P. Stenström, "Effectiveness of Hardware-Based Stride and Sequential Prefetching in Shared Memory Multiprocessors", Proc. first IEEE Symp. High-Performance Computer Architecture, 1995, pp. 68-77.
- 9. K. Farkas, N. Jouppi and P. Chow, "How useful are nonblocking loads, stream buffers and speculative execution in multiple issue processors", Proc. first IEEE Symp. High-Performance Computer Architecture, 1995, pp. 78-89.
- 10. J.W.C. Fu, J.H. Patel and B.L. Janssens. "Stride Directed Prefetching in Scalar Processors". In Proc. of 25th Int. Symp. on Microarchitecture (MICRO-25), ACM, pp. 102-110, December 1992.
- 11. E. Hagersten. "Towards Scalable Cache Only Memory Architectures", PhD thesis, Swedish Inst. of Comp. Science, Oct. 1992.
- 12. P. Ibáñez and V. Viñals. "Performance Assessment of Contents Management in Multilevel On-chip Caches". In Proc. of the 22nd Euromicro Conf., pp. 431-440, Sept. 1996.
- 13. L. Jimeno, P. Ibáñez and V. Viñals. "Warm Time-sampling: Fast and Accurate Cycle-level Simulation of Cache Memory". In Proc. of the 22nd Euromicro Conf. Short Contrib., pp. 39-44, Sept. 1996.
- 14. D. Joseph and D. Grunwald, "Prefetching Using Markov Predictors", Proc. of 24th Int. Symp. Computer Architecture, pp. 252-263, June 1997.
- 15. N.P. Jouppi. "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-associative Cache and Prefetch Buffers". In Proc. of 17th Int. Symp. on Comp. Architecture, pp. 364-373, May 1990.
- 16. Y. Jegou and O. Temam. "Speculative Prefetching". Proc. of ICS-93, pp. 1-11, Dec. 1992.
- 17. S. Mehrotra and L. Harrison. "Examination of a Memory Access Classification Scheme for Pointer-Intensive and Numeric Programs". Proc. of ICS-96, pp. 133-140, 1996.
- 18. J.M. Mulder, N.T. Quach and M.J. Flynn. "An Area Model For On-Chip Memories and its Application". IEEE Jour. of Solid State Circuits 26 (2), Feb. 1991, pp. 98-106.
- 19. T. Ozawa, Y. Kimura and S. Nishizaki, "Cache miss heuristics and preloading techniques for general-purpose programs". Proc. 28th Int. Symp. Microarchitecture, 1995, pp. 243-248.
- 20. S. Palacharla and R.E. Kessler, "Evaluating Stream Buffers as a Secondary Cache Replacement", Proc. of 21st Int. Symp. Computer Architecture, April 1994, pp. 24-33.
- 21. A.J. Smith. "Cache Memories". Computing Surveys, 14(3):473-530, Sept. 1982.
- 22. D.M. Tullsen and S.J. Eggers, "Effective Cache Prefetching on Bus-Based Multiprocessors". ACM Transactions on Computer Systems, Vol. 13, No. 1, February 1995, pp. 57-88.
- 23. S. VanderWiel and D.J. Lilja. "When Caches Aren't Enough: Data Prefetching Techniques". IEEE Computer, Vol. 30, No. 7, July 1997, pp. 23-30.