ABSTRACT
To reduce the increasing cost of building and operating cloud data centers, cloud providers are seeking various mechanisms to achieve higher resource effectiveness. For example, cloud operators are leveraging dynamic resource management techniques to consolidate a higher density of application workloads into commodity physical servers to maximize server resource utilization. However, higher workload density is a major source of performance interference problems in multi-tenant clouds. Existing performance isolation techniques such as dedicated CPU cores for specific workloads are not enough as there are still common resource (e.g., last-level cache and memory bandwidth in memory subsystem) on the processor that are shared among all CPUs on the same NUMA node. While prior work has proposed a variety of resource partitioning techniques, it still remains unexplored to characterize the impact of memory subsystem resource partitioning for the consolidated workloads with different priorities and investigate software support to dynamically manage memory subsystem resource sharing in a real-time manner. To bridge the gap, we propose Themis, a feedback-based controller that enables a priority-aware and fairness-aware memory subsystem resource management strategy to guarantee the performance of high-priority workloads while maintaining fairness across all colocated workloads in high-density clouds. Themis is evaluated with multiple typical cloud applications in our data center environment. The results show that Themis improves the performance of various workloads by up to 3.15%, and fairness by more than 70% in memory subsystem resource allocation compared to existing state-of-the-art work.
- A. Beitch, B. Liu, T. Yung, R. Griffith, A. Fox, D. A. Patterson, and A. Beitch. 2010. A Workload Generation Toolkit for Cloud Computing Applications. (2010).Google Scholar
- Ruobing Chen, Jinping Wu, Haosen Shi, Yusen Li, Xiaoguang Liu, and Gang Wang. 2021. DRLPart: A Deep Reinforcement Learning Framework for Optimally Efficient and Robust Resource Partitioning on Commodity Servers. In HPDC ’21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, Virtual Event, Sweden, June 21-25, 2021, Erwin Laure, Stefano Markidis, Ana Lucia Verbanescu, and Jay F. Lofstead (Eds.). ACM, 175–188.Google ScholarDigital Library
- Shuang Chen, Christina Delimitrou, and José F. Martínez. 2019. PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2019, Providence, RI, USA, April 13-17, 2019, Iris Bahar, Maurice Herlihy, Emmett Witchel, and Alvin R. Lebeck (Eds.). ACM, 107–120.Google ScholarDigital Library
- Jongsok Choi, Ruolong Lian, Zhi Li, Andrew Canis, and Jason Helge Anderson. 2018. Accelerating Memcached on AWS Cloud FPGAs. In Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, HEART 2018, Toronto, ON, Canada, June 20-22, 2018. ACM, 2:1–2:8.Google ScholarDigital Library
- Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC 2010, Indianapolis, Indiana, USA, June 10-11, 2010, Joseph M. Hellerstein, Surajit Chaudhuri, and Mendel Rosenblum (Eds.). ACM, 143–154.Google ScholarDigital Library
- Eli Cortez, Anand Bonde, Alexandre Muzio, Mark Russinovich, Marcus Fontoura, and Ricardo Bianchini. 2017. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms. In Proceedings of the 26th Symposium on Operating Systems Principles (Shanghai, China) (SOSP ’17). Association for Computing Machinery, New York, NY, USA, 153–167. https://doi.org/10.1145/3132747.3132772Google ScholarDigital Library
- Eli Cortez, Anand Bonde, Alexandre Muzio, Mark Russinovich, Marcus Fontoura, and Ricardo Bianchini. 2017. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms. In Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai, China, October 28-31, 2017. ACM, 153–167.Google ScholarDigital Library
- Stijn Eyerman and Lieven Eeckhout. 2008. System-Level Performance Metrics for Multiprogram Workloads. IEEE Micro 28, 3 (2008), 42–53.Google ScholarDigital Library
- Michael Ferdman, Almutaz Adileh, Yusuf Onur Koçberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi. 2012. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2012, London, UK, March 3-7, 2012, Tim Harris and Michael L. Scott (Eds.). ACM, 37–48.Google ScholarDigital Library
- Senbo Fu, Rui Prior, and Hyong Kim. 2019. DMFD: Non-Intrusive Dependency Inference and Flow Ratio Model for Performance Anomaly Detection in Multi-Tier Cloud Applications. In 12th IEEE International Conference on Cloud Computing, CLOUD 2019, Milan, Italy, July 8-13, 2019, Elisa Bertino, Carl K. Chang, Peter Chen, Ernesto Damiani, Michael Goul, and Katsunori Oyama (Eds.). IEEE, 164–173.Google Scholar
- Samuel Ginzburg and Michael J. Freedman. 2020. Serverless Isn’t Server-Less: Measuring and Exploiting Resource Variability on Cloud FaaS Platforms. In Proceedings of the 2020 Sixth International Workshop on Serverless Computing (Delft, Netherlands) (WoSC’20). Association for Computing Machinery, New York, NY, USA, 43–48.Google ScholarDigital Library
- Jing Guo, Zihao Chang, Sa Wang, Haiyang Ding, Yihui Feng, Liang Mao, and Yungang Bao. 2019. Who limits the resource efficiency of my datacenter: an analysis of Alibaba datacenter traces. In Proceedings of the International Symposium on Quality of Service, IWQoS 2019, Phoenix, AZ, USA, June 24-25, 2019. ACM, 39:1–39:10.Google ScholarDigital Library
- Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy H. Katz, Scott Shenker, and Ion Stoica. [n.d.]. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. In Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2011, Boston, MA, USA, March 30 - April 1, 2011, David G. Andersen and Sylvia Ratnasamy (Eds.).Google Scholar
- Shengsheng Huang, Jie Huang, Jinquan Dai, Tao Xie, and Bo Huang. 2010. The HiBench benchmark suite: Characterization of the MapReduce-based data analysis. In Workshops Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, March 1-6, 2010, Long Beach, California, USA. IEEE Computer Society, 41–51.Google ScholarCross Ref
- Seyyed Ahmad Javadi, Amoghavarsha Suresh, Muhammad Wajahat, and Anshul Gandhi. 2019. Scavenger: A Black-Box Batch Workload Resource Manager for Improving Utilization in Cloud Environments. In Proceedings of the ACM Symposium on Cloud Computing (Santa Cruz, CA, USA) (SoCC ’19). Association for Computing Machinery, New York, NY, USA, 272–285. https://doi.org/10.1145/3357223.3362734Google ScholarDigital Library
- Kostis Kaffes, Dragos Sbirlea, Yiyan Lin, David Lo, and Christos Kozyrakis. 2020. Leveraging application classes to save power in highly-utilized data centers. In SoCC ’20: ACM Symposium on Cloud Computing, Virtual Event, USA, October 19-21, 2020, Rodrigo Fonseca, Christina Delimitrou, and Beng Chin Ooi (Eds.). ACM, 134–149. https://doi.org/10.1145/3419111.3421274Google ScholarDigital Library
- Yoongu Kim, Dongsu Han, Onur Mutlu, and Mor Harchol-Balter. 2010. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers. In 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 9-14 January 2010, Bangalore, India. IEEE Computer Society, 1–12.Google Scholar
- Alok Gautam Kumbhare, Reza Azimi, Ioannis Manousakis, Anand Bonde, Felipe Vieira Frujeri, Nithish Mahalingam, Pulkit A. Misra, Seyyed Ahmad Javadi, Bianca Schroeder, Marcus Fontoura, and Ricardo Bianchini. 2021. Prediction-Based Power Oversubscription in Cloud Platforms. In 2021 USENIX Annual Technical Conference, USENIX ATC 2021, July 14-16, 2021, Irina Calciu and Geoff Kuenning (Eds.). USENIX Association, 473–487. https://www.usenix.org/conference/atc21/presentation/kumbhareGoogle Scholar
- Redis Lab. 2014. Memtier Benchmark. https://github.com/RedisLabs/memtier_benchmark. Accessed Dec 18, 2021.Google Scholar
- Shaohong Li, Xi Wang, Xiao Zhang, Vasileios Kontorinis, Sreekumar Kodakara, David Lo, and Parthasarathy Ranganathan. 2020. Thunderbolt: Throughput-Optimized, Quality-of-Service-Aware Power Capping at Scale. In 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020, Virtual Event, November 4-6, 2020. 1241–1255.Google Scholar
- David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, and Christos Kozyrakis. 2015. Heracles: improving resource efficiency at scale. In Proceedings of the 42nd Annual International Symposium on Computer Architecture, Portland, OR, USA, June 13-17, 2015, Deborah T. Marr and David H. Albonesi (Eds.). ACM, 450–462. https://doi.org/10.1145/2749469.2749475Google ScholarDigital Library
- Shanka Subhra Mondal, Nikhil Sheoran, and Subrata Mitra. 2021. Scheduling of Time-Varying Workloads Using Reinforcement Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 35, 10 (May 2021). 9000–9008.Google ScholarCross Ref
- Onur Mutlu and Thomas Moscibroda. 2008. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems. In 35th International Symposium on Computer Architecture (ISCA 2008), June 21-25, 2008, Beijing, China. IEEE Computer Society, 63–74.Google ScholarDigital Library
- Jinsu Park, Seongbeom Park, and Woongki Baek. 2019. CoPart: Coordinated Partitioning of Last-Level Cache and Memory Bandwidth for Fairness-Aware Workload Consolidation on Commodity Servers. In Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019, George Candea, Robbert van Renesse, and Christof Fetzer (Eds.). ACM, 10:1–10:16.Google ScholarDigital Library
- Tirthak Patel and Devesh Tiwari. 2020. CLITE: Efficient and QoS-Aware Co-Location of Multiple Latency-Critical Jobs for Warehouse Scale Computers. In IEEE International Symposium on High Performance Computer Architecture, HPCA 2020, San Diego, CA, USA, February 22-26, 2020. IEEE, 193–206.Google Scholar
- L.G.B. Ruiz, M.C. Pegalajar, R. Arcucci, and M. Molina-Solana. 2020. A time-series clustering methodology for knowledge extraction in energy consumption data. Expert Systems with Applications 160 (2020), 113731.Google ScholarCross Ref
- Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, and John Wilkes. 2013. Omega: flexible, scalable schedulers for large compute clusters. In Eighth Eurosys Conference 2013, EuroSys ’13, Prague, Czech Republic, April 14-17, 2013, Zdenek Hanzálek, Hermann Härtig, Miguel Castro, and M. Frans Kaashoek (Eds.). ACM, 351–364.Google ScholarDigital Library
- Jiaqi Tan, Soila Kavulya, Rajeev Gandhi, and Priya Narasimhan. 2012. Light-Weight Black-Box Failure Detection for Distributed Systems. In Proceedings of the 2012 Workshop on Management of Big Data Systems (San Jose, California, USA) (MBDS ’12). Association for Computing Machinery, New York, NY, USA, 13–18. https://doi.org/10.1145/2378356.2378360Google ScholarDigital Library
- Gil Tene. 2014. Wrk2: A constant throughput, correct latency recording variant of wrk.http://github.com/giltene/wrk2. Accessed Dec 18, 2021.Google Scholar
- Huangshi Tian, Yunchuan Zheng, and Wei Wang. 2019. Characterizing and Synthesizing Task Dependencies of Data-Parallel Jobs in Alibaba Cloud. In Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, Santa Cruz, CA, USA, November 20-23, 2019. ACM, 139–151.Google ScholarDigital Library
- M. Tirmazi, A. Barker, N. Deng, M. E. Haque, and J. Wilkes. 2020. Borg: the next generation. In EuroSys ’20: Fifteenth EuroSys Conference 2020.Google Scholar
- Yawen Wang, Kapil Arya, Marios Kogias, Manohar Vanga, Aditya Bhandari, Neeraja J. Yadwadkar, Siddhartha Sen, Sameh Elnikety, Christos Kozyrakis, and Ricardo Bianchini. 2021. SmartHarvest: harvesting idle CPUs safely and efficiently in the cloud. In EuroSys ’21: Sixteenth European Conference on Computer Systems, Online Event, United Kingdom, April 26-28, 2021, Antonio Barbalace, Pramod Bhatotia, Lorenzo Alvisi, and Cristian Cadar (Eds.). ACM, 1–16.Google ScholarDigital Library
- Yaocheng Xiang, Chencheng Ye, Xiaolin Wang, Yingwei Luo, and Zhenlin Wang. 2019. EMBA: Efficient Memory Bandwidth Allocation to Improve Performance on Intel Commodity Processor. In Proceedings of the 48th International Conference on Parallel Processing, ICPP 2019, Kyoto, Japan, August 05-08, 2019. ACM, 16:1–16:12.Google ScholarDigital Library
- Mingli Xie, Dong Tong, Kan Huang, and Xu Cheng. 2014. Improving system throughput and fairness simultaneously in shared memory CMP systems via Dynamic Bank Partitioning. In 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014, Orlando, FL, USA, February 15-19, 2014. IEEE Computer Society, 344–355.Google ScholarCross Ref
- Cong Xu, Karthick Rajamani, Alexandre Ferreira, Wesley Felter, Juan Rubio, and Yang Li. 2018. DCat: Dynamic Cache Management for Efficient, Performance-Sensitive Infrastructure-as-a-Service. In Proceedings of the Thirteenth EuroSys Conference(Porto, Portugal) (EuroSys ’18). Association for Computing Machinery, New York, NY, USA, Article 14, 13 pages. https://doi.org/10.1145/3190508.3190555Google ScholarDigital Library
- Xiao Zhang, Eric Tune, Robert Hagmann, Rohit Jnagal, Vrigo Gokhale, and John Wilkes. 2013. CPI2: CPU performance isolation for shared compute clusters. In Eighth Eurosys Conference 2013, EuroSys ’13, Prague, Czech Republic, April 14-17, 2013, Zdenek Hanzálek, Hermann Härtig, Miguel Castro, and M. Frans Kaashoek (Eds.). ACM, 379–391.Google ScholarDigital Library
- Ying Zhang, Jian Chen, Xiaowei Jiang, Qiang Liu, Ian M. Steiner, Andrew J. Herdrich, Kevin Shu, Ripan Das, Long Cui, and Litrin Jiang. 2021. LIBRA: Clearing the Cloud Through Dynamic Memory Bandwidth Management. In IEEE International Symposium on High-Performance Computer Architecture, HPCA 2021, Seoul, South Korea, February 27 - March 3, 2021. IEEE, 815–826.Google Scholar
Index Terms
- Themis: Fair Memory Subsystem Resource Sharing with Differentiated QoS in Public Clouds
Recommendations
PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services
ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating SystemsMulti-tenancy in modern datacenters is currently limited to a single latency-critical, interactive service, running alongside one or more low-priority, best-effort jobs. This limits the efficiency gains from multi-tenancy, especially as an increasing ...
Resource provisioning and scheduling in clouds: QoS perspective
Resource provisioning of appropriate resources to cloud workloads depends on the quality of service (QoS) requirements of cloud applications and is a challenging task. In cloud environment, heterogeneity, uncertainty and dispersion of resources ...
Resource virtualization methodology for on-demand allocation in cloud computing systems
The resources' heterogeneity and unbalanced capability, together with the diversity of resource requirements in cloud computing systems, have produced great contradictions between resources' tight coupling characteristics and user's multi-granularities ...
Comments