An Associativity Threshold Phenomenon in Set-Associative Caches

Authors:
Michael A. Bender, Stony Brook University, Stony Brook, USA (ORCID 0000-0001-7639-530X)
Rathish Das, University of Houston, Houston, USA (ORCID 0000-0002-2416-6422)
Martín Farach-Colton, Rutgers University, Piscataway, USA (ORCID 0000-0003-3616-7788)
Guido Tagliavini, Rutgers University, Piscataway, USA (ORCID 0000-0001-8493-1395)

SPAA '23: Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, June 2023, Pages 117–127. https://doi.org/10.1145/3558481.3591084

Published: 17 June 2023

ABSTRACT

In an α-way set-associative cache, the cache is partitioned into disjoint sets of size α, and each item can only be cached in one set, typically selected via a hash function. Set-associative caches are widely used and have many benefits, e.g., in terms of latency or concurrency, over fully associative caches, but they often incur more cache misses. As the set size α decreases, the benefits increase, but the paging costs worsen.
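The placement rule described above can be made concrete with a small simulator. This is a minimal sketch under stated assumptions, not code from the paper: the class name `SetAssociativeLRU`, the `salt` parameter, and the use of `hashlib.blake2b` as a stand-in for a fully random hash function are all illustrative choices. Setting `alpha` equal to `k` yields a single set, which behaves as a fully associative LRU cache.

```python
import hashlib
from collections import OrderedDict


class SetAssociativeLRU:
    """Hypothetical alpha-way set-associative LRU cache of total size k."""

    def __init__(self, k, alpha, salt=0):
        if alpha <= 0 or k <= 0 or k % alpha != 0:
            raise ValueError("k must be a positive multiple of alpha")
        self.alpha = alpha
        self.num_sets = k // alpha          # disjoint sets, each of size alpha
        self.salt = salt                    # assumption: changing it re-hashes items
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.misses = 0

    def _set_index(self, item):
        # Assumption: BLAKE2b stands in for a fully random hash function.
        data = repr((self.salt, item)).encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_sets

    def access(self, item):
        """Request `item`; return True on a hit and False on a miss."""
        bucket = self.sets[self._set_index(item)]
        if item in bucket:
            bucket.move_to_end(item)        # mark as most recently used
            return True
        self.misses += 1
        if len(bucket) >= self.alpha:
            bucket.popitem(last=False)      # evict the LRU item of this set only
        bucket[item] = None
        return False


if __name__ == "__main__":
    k = 64
    requests = [i % 48 for i in range(48 * 20)]   # cyclic scan over 48 items
    for alpha in (1, 4, 64):
        cache = SetAssociativeLRU(k, alpha)
        for x in requests:
            cache.access(x)
        print(f"alpha={alpha:2d}  misses={cache.misses}")
```

Each eviction is decided inside a single set, so an item can be evicted even while other sets have free slots. That is the source of the extra misses relative to a fully associative cache of the same total size.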

In this paper we characterize the performance of an α-way set-associative LRU cache of total size k, as a function of α = α(k). We prove the following, assuming that sets are selected using a fully random hash function:

- For α = ω(log k), the paging cost of an α-way set-associative LRU cache is within an additive O(1) of that of a fully associative LRU cache of size (1 - o(1))k, with probability 1 - 1/poly(k), for all request sequences of length poly(k).
- For α = o(log k), and for all c = O(1) and r = O(1), the paging cost of an α-way set-associative LRU cache is not within a factor c of that of a fully associative LRU cache of size k/r, for some request sequence of length O(k^{1.01}).
- For α = ω(log k), if the hash function can be occasionally changed, the paging cost of an α-way set-associative LRU cache is within a factor 1 + o(1) of that of a fully associative LRU cache of size (1 - o(1))k, with probability 1 - 1/poly(k), for request sequences of arbitrary (e.g., super-polynomial) length.

Some of our results generalize to other paging algorithms besides LRU, such as least-frequently used (LFU).
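One way to build intuition for why log k separates the two regimes is a standard balls-into-bins experiment. It is offered here only as an illustration and is not taken from the paper's proofs. The sketch hashes a working set into k/α sets uniformly at random and reports the load of the fullest set. If some set receives more than α items of a working set that is requested cyclically, LRU in that set misses on those items in every round. The function name `max_set_load`, the constant `fill` fraction (the results above use a (1 - o(1)) fraction instead), and the chosen values of k and α are all assumptions made for the example.

```python
import random
from collections import Counter


def max_set_load(k, alpha, fill=0.75, seed=0):
    """Hash int(fill * k) distinct items into k // alpha sets; return the fullest set's load."""
    rng = random.Random(seed)               # seeded for reproducibility
    num_sets = k // alpha
    num_items = int(fill * k)
    # Each item picks its set uniformly at random (idealized fully random hashing).
    loads = Counter(rng.randrange(num_sets) for _ in range(num_items))
    return max(loads.values())


if __name__ == "__main__":
    k = 2 ** 14
    for alpha in (2, 8, 256, 1024):
        load = max_set_load(k, alpha)
        status = "overflows" if load > alpha else "fits"
        print(f"alpha={alpha:5d}  fullest set holds {load:4d} items  ({status})")
```

For small α the fullest set typically holds more items than its capacity, while for α well above log k the loads concentrate around their mean and every set has room to spare.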

Index Terms

Theory of computation → Design and analysis of algorithms → Online algorithms → Caching and paging algorithms

Published in
SPAA '23: Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures
June 2023
504 pages
ISBN: 9781450395458
DOI: 10.1145/3558481
General Chair: Kunal Agrawal, Washington University in St. Louis, USA
Program Chair: Julian Shun, MIT, USA
Copyright © 2023 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags: LRU, paging, set-associative cache