
Topology-Aware Quality-of-Service Support in Highly Integrated Chip Multiprocessors

  • Conference paper
Computer Architecture (ISCA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6161)


Abstract

Power limitations and complexity constraints demand modular designs, such as chip multiprocessors (CMPs) and systems-on-chip (SOCs). Today’s CMPs feature up to a hundred discrete cores, with greater levels of integration anticipated in the future. Supporting effective on-chip resource sharing for cloud computing and server consolidation necessitates CMP-level quality-of-service (QOS) for performance isolation, service guarantees, and security. This work takes a topology-aware approach to on-chip QOS. We propose to segregate shared resources into dedicated, QOS-enabled regions of the chip. We then eliminate QOS-related hardware and its associated overheads from the rest of the die via a combination of topology and operating system support. We evaluate several topologies for the QOS-enabled regions, including a new organization called Destination Partitioned Subnets (DPS), which uses a light-weight dedicated network for each destination node. DPS matches or bests other topologies with comparable bisection bandwidth in performance, area- and energy-efficiency, fairness, and preemption resilience.




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grot, B., Keckler, S.W., Mutlu, O. (2011). Topology-Aware Quality-of-Service Support in Highly Integrated Chip Multiprocessors. In: Varbanescu, A.L., Molnos, A., van Nieuwpoort, R. (eds) Computer Architecture. ISCA 2010. Lecture Notes in Computer Science, vol 6161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24322-6_28


  • DOI: https://doi.org/10.1007/978-3-642-24322-6_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24321-9

  • Online ISBN: 978-3-642-24322-6

  • eBook Packages: Computer Science, Computer Science (R0)
