A Survey of Techniques for Cache Partitioning in Multicore Processors

Author:
Sparsh Mittal

Oak Ridge National Laboratory

Oak Ridge National Laboratory

0000-0002-2908-993X

ACM Computing Surveys, Volume 50, Issue 2, Article No. 27, pp. 1–39, https://doi.org/10.1145/3062394

Published: 10 May 2017

Abstract

As the number of on-chip cores and the memory demands of applications increase, judicious management of cache resources has become not merely attractive but imperative. Cache partitioning, that is, dividing cache space between applications based on their memory demands, is a promising approach to providing the capacity benefits of a shared cache together with the performance isolation of private caches. However, naively partitioning the cache may lead to performance loss, unfairness, and a lack of quality-of-service guarantees. Intelligent techniques are therefore required to realize the full potential of cache partitioning. In this article, we present a survey of techniques for partitioning shared caches in multicore processors. We categorize the techniques based on important characteristics and provide a bird’s-eye view of the field of cache partitioning.

Index Terms

A Survey of Techniques for Cache Partitioning in Multicore Processors
1. Computer systems organization
2. General and reference
  1. Document types
    1. Surveys and overviews

Published in
ACM Computing Surveys Volume 50, Issue 2
March 2018
567 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3071073
Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering / University of Florida / Gainesville, FL 32611
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 10 May 2017
- Accepted: 1 March 2017
- Revised: 1 January 2017
- Received: 1 July 2016

Author Tags
QoS
Review
classification
fairness
multicore processor
partitioning
shared cache
Qualifiers
- survey
- Research
- Refereed