research-article

Less Provisioning: A Fine-grained Resource Scaling Engine for Long-running Services with Tail Latency Guarantees

Authors:
Binlei Cai
Tianjin Key Laboratory of Advanced Networking, School of Computer Science and Technology, Tianjin University, Tianjin, China

Rongqi Zhang
Tianjin Key Laboratory of Advanced Networking, School of Computer Science and Technology, Tianjin University, Tianjin, China

Laiping Zhao
Tianjin Key Laboratory of Advanced Networking, School of Computer Software, Tianjin University, Tianjin, China

Keqiu Li
Tianjin Key Laboratory of Advanced Networking, School of Computer Science and Technology, Tianjin University, Tianjin, China

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing, August 2018, Article No. 30, Pages 1–11. https://doi.org/10.1145/3225058.3225113

Published: 13 August 2018

ABSTRACT

Modern resource management frameworks guarantee low tail latency for long-running services by over-provisioning resources, which wastes resources and greatly increases service costs. To reduce the cost of over-provisioning, we present EFRA, an elastic and fine-grained resource allocator that enables much more efficient resource provisioning while guaranteeing the tail latency Service Level Objective (SLO). EFRA achieves this through the cooperation of three key components running on a containerized platform. The period detector identifies the periodic features of the workload through a convolution-based time series analysis. The resource reservation component uses the period analysis and a top-K based collaborative filtering approach to estimate the just-right amount of resources. The online reprovisioning component dynamically adjusts resources to further enforce the tail latency SLO. Testbed experiments show that EFRA increases average resource utilization to 43% and saves up to 66% of resources while guaranteeing the same tail latency objective.
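The abstract does not give the details of the period detector, so the following is only a minimal sketch of one common convolution-based approach: computing the autocorrelation of a workload trace (a sliding dot product of the mean-centered series with itself) and taking the lag of the strongest peak as the period. The function names `autocorrelation` and `detect_period`, the `min_lag` parameter, and the synthetic trace are all assumptions made for illustration and are not taken from EFRA.

```python
import math


def autocorrelation(series):
    """Normalized autocorrelation of `series` at every lag.

    Each value is a sliding dot product of the mean-centered series
    with a shifted copy of itself, i.e. a convolution of the series
    with its own reverse, divided by the zero-lag energy.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    energy = sum(c * c for c in centered)
    if energy == 0:
        # A constant series has no periodic structure to measure.
        return [0.0] * n
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def detect_period(series, min_lag=2):
    """Return the lag of the strongest autocorrelation peak, or None.

    Hypothetical helper: only lags up to half the series length are
    considered, and a lag counts as a peak when its autocorrelation is
    positive and is a local maximum.
    """
    n = len(series)
    if n < 4:
        return None
    acf = autocorrelation(series)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, n // 2 + 1):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > best_score:
            best_lag, best_score = lag, acf[lag]
    return best_lag


if __name__ == "__main__":
    # Synthetic request rate: ten repetitions of a 24-sample cycle.
    load = [100 + 50 * math.sin(2 * math.pi * t / 24) for t in range(240)]
    print("detected period:", detect_period(load))
```

On the synthetic trace in the usage example, the strongest peak falls at a lag of 24 samples, matching the cycle length used to generate it. A real workload trace would be noisier, so a production detector would likely need smoothing and a threshold on peak strength, neither of which is shown here.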

Published in

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing
August 2018
945 pages
ISBN: 9781450365109
DOI: 10.1145/3225058

Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
ICPP '18 paper acceptance rate: 91 of 313 submissions, 29%. Overall acceptance rate: 91 of 313 submissions, 29%.