ABSTRACT
Load balance is critical to achieving scalability for large network emulation studies, which are of compelling interest for emerging Grid, peer-to-peer, and other distributed applications and middleware. Achieving load balance in emulation is difficult because of irregular network structure and unpredictable network traffic. We formulate load balance as a graph partitioning problem and apply classical graph partitioning algorithms to it. The primary challenge in this approach is how to extract useful information from the network emulation and present it to the graph partitioning algorithms in a way that reflects the load balance requirement of the original emulation problem. Using a large-scale network emulation system called MaSSF, we explore three approaches to partitioning: one based on purely static topology information (TOP), one combining topology and application placement information (PLACE), and one combining topology and application profile data (PROFILE). These studies show that exploiting static topology and application placement information can achieve reasonable load balance, but that a profile-based approach further improves load balance, even for large-scale network emulation. In our experiments, PROFILE improves load balance by 50% to 66% and reduces emulation time by up to 50% compared to purely static topology-based approaches.
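The idea of feeding different vertex weights to a partitioner can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from MaSSF: the four router names, the use of node degree as the static topology weight, the packet counts standing in for profile data, and the greedy heaviest-first heuristic, which replaces a classical graph partitioner and ignores edge-cut (communication) cost entirely.

```python
from typing import Dict, List


def greedy_partition(weights: Dict[str, float], k: int) -> List[List[str]]:
    """Place nodes heaviest-first onto the currently lightest part.

    A simple stand-in for a real graph partitioner; it balances vertex
    weight only and does not consider edges between nodes.
    """
    parts: List[List[str]] = [[] for _ in range(k)]
    loads = [0.0] * k
    # Sort by descending weight, breaking ties by name for determinism.
    for node in sorted(weights, key=lambda n: (-weights[n], n)):
        lightest = loads.index(min(loads))
        parts[lightest].append(node)
        loads[lightest] += weights[node]
    return parts


def imbalance(parts: List[List[str]], true_load: Dict[str, float]) -> float:
    """Ratio of the most loaded part to the mean part load (1.0 is ideal)."""
    loads = [sum(true_load[n] for n in part) for part in parts]
    mean = sum(loads) / len(loads)
    return max(loads) / mean


# Hypothetical static weights: node degree in the emulated topology.
topology_weights = {"r1": 3, "r2": 3, "r3": 2, "r4": 2}
# Hypothetical profile weights: packets each router handled in a prior run.
profile_weights = {"r1": 900, "r2": 100, "r3": 800, "r4": 200}

top_parts = greedy_partition(topology_weights, k=2)
profile_parts = greedy_partition(profile_weights, k=2)

# Both partitions are judged against the observed (profile) load.
print("TOP-style partition:    ", top_parts, imbalance(top_parts, profile_weights))
print("PROFILE-style partition:", profile_parts, imbalance(profile_parts, profile_weights))
```

In this toy setup the degree-based weights look balanced to the partitioner but concentrate most of the observed traffic on one part, while weights taken from the profile spread it evenly. The numbers are contrived and say nothing about the magnitudes reported in the abstract.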