ABSTRACT
Work stealing can be implemented in either a cooperative or a coordinated way. We compared the two approaches for lifeline-based global load balancing, the algorithm used by X10's Global Load Balancing framework GLB. We conducted our study with the APGAS library for Java, to which we first ported GLB. Our cooperative variant resembles the original GLB framework, except that strict sequentialization is replaced by Java synchronization constructs such as critical sections. Our coordinated variant enables concurrent access to local task pools by using a split queue data structure. In experiments with modified versions of the UTS and BC benchmarks, the cooperative and coordinated APGAS variants had similar execution times, without a clear winner. Both variants outperformed the original GLB when compiled with Managed X10. Experiments were run on up to 128 nodes, to which we assigned up to 512 places.
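To make the split queue idea concrete, the following Java sketch shows one common way such a structure can be organized. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the class name `SplitQueue`, the methods `push`, `pop`, `release`, and `steal`, the fixed capacity, and the steal-half policy are all hypothetical. The owner works on the private part without locking, while thieves take tasks from the public part under a lock.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical split queue (illustrative only, not the paper's code).
 * Logical indices: [head, split) is public, [split, tail) is private.
 */
final class SplitQueue<T> {
    private final Object lock = new Object();
    private final Object[] slots;      // fixed-capacity ring buffer (assumption)
    private volatile long head = 0;    // advanced by thieves while holding the lock
    private long split = 0;            // written by the owner while holding the lock
    private long tail = 0;             // touched by the owner only

    SplitQueue(int capacity) {
        slots = new Object[capacity];
    }

    /** Owner only: add a task at the private end without taking the lock. */
    void push(T task) {
        if (tail - head == slots.length) {
            throw new IllegalStateException("queue full");
        }
        slots[(int) (tail % slots.length)] = task;
        tail++;
    }

    /** Owner only: take the newest private task, reacquiring public tasks if needed. */
    @SuppressWarnings("unchecked")
    T pop() {
        if (tail == split && !acquire()) {
            return null;               // neither private nor public tasks are left
        }
        tail--;
        int i = (int) (tail % slots.length);
        T task = (T) slots[i];
        slots[i] = null;
        return task;
    }

    /** Owner only: publish half of the private tasks (rounded up) to thieves. */
    void release() {
        synchronized (lock) {
            split += (tail - split + 1) / 2;
        }
    }

    /** Owner only: move half of the public tasks (rounded up) back to the private part. */
    private boolean acquire() {
        synchronized (lock) {
            long available = split - head;
            if (available == 0) {
                return false;
            }
            split -= (available + 1) / 2;
            return true;
        }
    }

    /** Any thread: steal half of the public tasks (rounded up), oldest first. */
    @SuppressWarnings("unchecked")
    List<T> steal() {
        synchronized (lock) {
            long n = (split - head + 1) / 2;
            List<T> loot = new ArrayList<>();
            long h = head;
            for (long k = 0; k < n; k++, h++) {
                int i = (int) (h % slots.length);
                loot.add((T) slots[i]);
                slots[i] = null;
            }
            head = h;                  // volatile write publishes the freed slots
            return loot;
        }
    }
}
```

In this sketch the owner would call `release()` from time to time, for example when it notices that the public part is empty, so that thieves can be served without interrupting the owner's own `push` and `pop` operations. A cooperative design, by contrast, would have the owner itself answer steal requests, with all pool accesses guarded by critical sections.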
Index Terms
- Cooperation vs. coordination for lifeline-based global load balancing in APGAS