A Reinforcement Learning Based Approach to Identify Resource Bottlenecks for Multiple Services Interactions in Cloud Computing Environments

Xu, Lingxiao; Xu, Minxian; Semmes, Richard; Li, Hui; Mu, Hong; Gui, Shuangquan; Tian, Wenhong; Wu, Kui; Buyya, Rajkumar

doi:10.1007/978-3-030-67540-0_4


Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 350)

Included in the following conference series:

International Conference on Collaborative Computing: Networking, Applications and Worksharing


Abstract

Cloud service providers provision resources, including a variety of virtual machine instances, to support customers who migrate their services to the cloud. From the customers’ perspective, selecting the appropriate amount of resources is tightly coupled with performance and cost. Identifying potential resource bottlenecks early in the service deployment process can significantly improve resource planning. However, because workloads are unpredictable and resources are heterogeneous, it is difficult to identify the resource bottlenecks that degrade system performance. To better support system non-functional requirements (NFR), we propose a reinforcement learning based approach to NFR management for scenarios with multiple services interactions; it identifies potential resource bottlenecks and optimizes the demanded resources. The proposed approach can predict the resource bottleneck for multiple services interactions, e.g., a bottleneck in CPU or an overload in a specific service, and provide guidance for resource planning. We modeled and simulated the proposed approach using an extended version of the CloudSim toolkit. Comprehensive evaluations with a realistic use case from Siemens Digital Industries Software’s MindSphere solution on AliCloud show that the proposed approach achieves high accuracy in terms of performance metrics such as response time, queries per second (QPS), and resource usage.
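As a rough illustration of how reinforcement learning can guide resource planning, the sketch below trains a tabular Q-learning agent to find the smallest CPU allocation that keeps one service within a response-time target. It is not the authors' method. The queueing-style latency formula, the reward shape, and every name and parameter value (`response_time`, `train`, `greedy_cores`, 90 QPS, 25 requests per second per core, a 0.2 s target) are assumptions made only for this example.

```python
import random

ACTIONS = (-1, 0, 1)  # remove one core, keep allocation, add one core


def response_time(cores, qps, per_core_rate):
    """Toy latency model: 1 / (capacity - load); infinite when overloaded."""
    capacity = cores * per_core_rate
    return float("inf") if capacity <= qps else 1.0 / (capacity - qps)


def reward(cores, qps, per_core_rate, sla_seconds, core_cost):
    """Large penalty for missing the latency target, small cost per core."""
    violated = response_time(cores, qps, per_core_rate) > sla_seconds
    return -(10.0 if violated else 0.0) - core_cost * cores


def train(qps=90.0, per_core_rate=25.0, sla_seconds=0.2, core_cost=0.1,
          max_cores=8, episodes=3000, steps=20, alpha=0.2, gamma=0.9,
          epsilon=0.2, seed=0):
    """Tabular Q-learning; the state is the current number of cores."""
    rng = random.Random(seed)
    q = {(c, a): 0.0 for c in range(1, max_cores + 1) for a in ACTIONS}
    for _ in range(episodes):
        cores = rng.randint(1, max_cores)
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(cores, a)])
            nxt = min(max_cores, max(1, cores + action))
            r = reward(nxt, qps, per_core_rate, sla_seconds, core_cost)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Standard Q-learning update toward the one-step target.
            q[(cores, action)] += alpha * (r + gamma * best_next - q[(cores, action)])
            cores = nxt
    return q


def greedy_cores(q, start, max_cores=8, steps=20):
    """Follow the learned greedy policy from `start` and report where it settles."""
    cores = start
    for _ in range(steps):
        action = max(ACTIONS, key=lambda a: q[(cores, a)])
        cores = min(max_cores, max(1, cores + action))
    return cores


if __name__ == "__main__":
    q_table = train()
    print({start: greedy_cores(q_table, start) for start in range(1, 9)})
```

In this toy setting, three cores provide 75 requests per second of capacity against a load of 90, so CPU is the bottleneck until a fourth core is added. The learned policy is expected to settle on that allocation from any starting point.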


Notes

1. https://siemens.mindsphere.io/en.


Acknowledgments

This research is partially supported by the Key-Area Research and Development Program of Guangdong Province (No. 2020B010164003), the National Natural Science Foundation of China (Grant IDs 61672136 and 61828202), and the SIAT Innovation Program for Excellent Young Researchers. We thank the teams at Siemens Industry Software Co., Ltd., China, for their discussions and comments on this work.

Author information

Authors and Affiliations

School of Software and Information Engineering, University of Electronic Science and Technology of China, Chengdu, China
Lingxiao Xu, Wenhong Tian & Rajkumar Buyya
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Minxian Xu
Siemens Industry Software (Chengdu) Co., Ltd., Chengdu, China
Richard Semmes, Hui Li, Hong Mu & Shuangquan Gui
Department of Computer Science, University of Victoria, Victoria, Canada
Kui Wu
CLOUDS Lab, School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
Rajkumar Buyya


Corresponding authors

Correspondence to Minxian Xu or Wenhong Tian.

Editor information

Editors and Affiliations

Shanghai University, Shanghai, China
Honghao Gao
Xi’an Jiaotong-Liverpool University, Suzhou, China
Xinheng Wang
London South Bank University, London, UK
Muddesar Iqbal
Hangzhou Dianzi University, Hangzhou, China
Yuyu Yin
Zhejiang University, Hangzhou, China
Jianwei Yin
Fudan University, Shanghai, China
Ning Gu


About this paper

Cite this paper

Xu, L. et al. (2021). A Reinforcement Learning Based Approach to Identify Resource Bottlenecks for Multiple Services Interactions in Cloud Computing Environments. In: Gao, H., Wang, X., Iqbal, M., Yin, Y., Yin, J., Gu, N. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 350. Springer, Cham. https://doi.org/10.1007/978-3-030-67540-0_4


DOI: https://doi.org/10.1007/978-3-030-67540-0_4
Published: 22 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67539-4
Online ISBN: 978-3-030-67540-0
eBook Packages: Computer Science, Computer Science (R0)
