Improving utilization of heterogeneous clusters

The Journal of Supercomputing

Abstract

Datacenters often aggregate sets of nodes with different capabilities, which leads to suboptimal resource utilization. One of the best ways to improve utilization is to balance the load while taking the heterogeneity of these clusters into account. This article presents a novel way of expressing computational capacity that is better suited to heterogeneous clusters, and advocates task migration to further improve utilization. The experimental evaluation shows that both proposals are advantageous: they improve the utilization of heterogeneous clusters and reduce the makespan to 16.7% and 17.1%, respectively.

Notes

  1. The code is available at https://github.com/dmtcp.

Acknowledgements

This work has been supported by the Spanish Science and Technology Commission under contracts TIN2016-76635-C2-2-R and TIN2016-81840-REDT (CAPAP-H6 network) and the European HiPEAC Network of Excellence.

Author information

Corresponding author

Correspondence to Esteban Stafford.

About this article

Cite this article

Stafford, E., Bosque, J.L. Improving utilization of heterogeneous clusters. J Supercomput 76, 8787–8800 (2020). https://doi.org/10.1007/s11227-020-03175-4

  • DOI: https://doi.org/10.1007/s11227-020-03175-4
