Themis: Fair Memory Subsystem Resource Sharing with Differentiated QoS in Public Clouds

Authors:
Wenda Tang (School of Automation Science and Engineering, Xi'an Jiaotong University, China and Huawei Cloud Computing Technologies Co., Ltd., China)
Senbo Fu (Huawei Cloud Computing Technologies Co., Ltd., China)
Yutao Ke (Huawei Cloud Computing Technologies Co., Ltd., China)
Qian Peng (Huawei Cloud Computing Technologies Co., Ltd., China)
Feng Gao (School of Automation Science and Engineering, Xi'an Jiaotong University, China)

ICPP '22: Proceedings of the 51st International Conference on Parallel Processing, August 2022, Article No.: 49, Pages 1–12, https://doi.org/10.1145/3545008.3545064

Published: 13 January 2023

ABSTRACT

To reduce the rising cost of building and operating cloud data centers, cloud providers are seeking mechanisms that achieve higher resource effectiveness. For example, cloud operators use dynamic resource management techniques to consolidate application workloads at higher density onto commodity physical servers and so maximize server resource utilization. However, higher workload density is a major source of performance interference in multi-tenant clouds. Existing performance isolation techniques, such as dedicating CPU cores to specific workloads, are not enough, because some processor resources (e.g., the last-level cache and memory bandwidth in the memory subsystem) are still shared among all CPUs on the same NUMA node. While prior work has proposed a variety of resource partitioning techniques, two questions remain unexplored: how memory subsystem resource partitioning affects consolidated workloads with different priorities, and what software support is needed to manage memory subsystem resource sharing dynamically and in real time. To bridge the gap, we propose Themis, a feedback-based controller that enables a priority-aware and fairness-aware memory subsystem resource management strategy, guaranteeing the performance of high-priority workloads while maintaining fairness across all colocated workloads in high-density clouds. Themis is evaluated with multiple typical cloud applications in our data center environment. The results show that, compared to existing state-of-the-art work, Themis improves the performance of various workloads by up to 3.15% and improves fairness in memory subsystem resource allocation by more than 70%.
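The abstract does not spell out the control loop, so the following is only an illustrative sketch of what one step of a feedback-based, priority-aware and fairness-aware controller could look like. It is not the Themis algorithm. Every name here (`Workload`, `control_step`, `unfairness`), the use of last-level cache ways as the only managed resource, the slowdown-based QoS target of 1.1, and the max/min fairness metric are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    high_priority: bool
    ways: int        # cache ways currently assigned (hypothetical unit)
    slowdown: float  # measured runtime / runtime when running alone (>= 1.0)


def unfairness(workloads):
    """Largest slowdown divided by smallest; 1.0 means perfectly even (assumed metric)."""
    slowdowns = [w.slowdown for w in workloads]
    return max(slowdowns) / min(slowdowns)


def control_step(workloads, qos_slowdown=1.1, min_ways=1):
    """Move at most one cache way; return (donor, receiver) names or None."""
    # Only low-priority workloads that can spare a way may donate (assumption).
    donors = [w for w in workloads if not w.high_priority and w.ways > min_ways]
    violators = [w for w in workloads
                 if w.high_priority and w.slowdown > qos_slowdown]
    if violators:
        # Priority rule: serve the worst high-priority violator first.
        receiver = max(violators, key=lambda w: w.slowdown)
    else:
        # Fairness rule: help whichever workload is slowed down the most.
        receiver = max(workloads, key=lambda w: w.slowdown)
        donors = [w for w in donors
                  if w is not receiver and w.slowdown < receiver.slowdown]
    if not donors:
        return None
    donor = min(donors, key=lambda w: w.slowdown)  # least-affected donor
    donor.ways -= 1
    receiver.ways += 1
    return donor.name, receiver.name
```

In a real deployment each step would be followed by a fresh measurement of every workload's slowdown before the next decision, which is what makes the loop feedback-based; this sketch leaves out both the measurement and the hardware interface that would actually apply the new allocation.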

Index Terms

1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Process management
        Scheduling
    2. Software system structures
      1. Real-time systems software

Published in

ICPP '22: Proceedings of the 51st International Conference on Parallel Processing
August 2022
976 pages
ISBN: 9781450397339
DOI: 10.1145/3545008

Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Cloud computing
cache ways
fairness-aware
interference
memory bandwidth
priority-aware
quality of service
resource management
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 91 of 313 submissions, 29%