Spatial Data Locality with Respect to Degree of Parallelism in Processor-and-Memory Hierarchies

  • Conference paper
Vector and Parallel Processing – VECPAR’98 (VECPAR 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1573)

Abstract

A system organised as a Hierarchy of Processor-And-Memory (HPAM) extends the familiar notion of a memory hierarchy by including processors of different performance at different levels of the hierarchy. Tasks are assigned to hierarchy levels according to their degree of parallelism. This paper studies the spatial locality behaviour, with respect to degree of parallelism, of simulated parallelised benchmarks in multi-level HPAM systems, and presents an inter-level cache coherence protocol that supports inclusion and multiple block sizes on an HPAM architecture. Simulation results for inter-level miss rates and traffic show that using multiple data transfer sizes across the HPAM hierarchy, rather than a single size, reduces data traffic between the uppermost levels of the hierarchy without degrading the miss rate at the lowest level.
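
The trade-off between transfer size, miss count, and traffic can be illustrated with a toy model. The sketch below is not the paper's protocol or simulator: it assumes a single unbounded cache that suffers cold misses only, and the function name `simulate` and both address traces are hypothetical. It only shows why a block size that suits a dense access pattern can inflate traffic for a sparse one.

```python
def simulate(addresses, block_size):
    """Count cold misses and bytes transferred for an unbounded cache.

    Assumption: no capacity or conflict misses and no coherence traffic;
    every first touch of a block transfers the whole block.
    """
    resident = set()
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block not in resident:
            resident.add(block)
            misses += 1
    return misses, misses * block_size


# Hypothetical traces: 128 accesses each.
sequential = list(range(0, 1024, 8))        # dense, 8-byte stride
strided = list(range(0, 1024 * 64, 512))    # sparse, 512-byte stride

for name, trace in (("sequential", sequential), ("strided", strided)):
    for block_size in (32, 256):
        misses, traffic = simulate(trace, block_size)
        print(f"{name:10s} block={block_size:3d} misses={misses:3d} bytes={traffic}")
```

In this toy setting the larger block cuts misses for the dense trace at no extra traffic, while for the sparse trace it leaves the miss count unchanged and multiplies the bytes moved, which is the kind of behaviour that motivates choosing the transfer size per level.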

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Figueiredo, R.J.O., Fortes, J.A.B., Miled, Z.B. (1999). Spatial Data Locality with Respect to Degree of Parallelism in Processor-and-Memory Hierarchies. In: Hernández, V., Palma, J.M.L.M., Dongarra, J.J. (eds) Vector and Parallel Processing – VECPAR’98. VECPAR 1998. Lecture Notes in Computer Science, vol 1573. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10703040_30

  • DOI: https://doi.org/10.1007/10703040_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66228-0

  • Online ISBN: 978-3-540-48516-2

  • eBook Packages: Springer Book Archive
