DOI: 10.1145/2931028.2931029
short-paper

Cooperation vs. coordination for lifeline-based global load balancing in APGAS

Published: 02 June 2016

ABSTRACT

Work stealing can be implemented in either a cooperative or a coordinated way. We compared the two approaches for lifeline-based global load balancing, which is the algorithm used by X10's Global Load Balancing framework GLB. We conducted our study with the APGAS library for Java, to which we ported GLB in a first step. Our cooperative variant resembles the original GLB framework, except that strict sequentialization is replaced by Java synchronization constructs such as critical sections. Our coordinated variant enables concurrent access to local task pools by using a split queue data structure. In experiments with modified versions of the UTS and BC benchmarks, the cooperative and coordinated APGAS variants had similar execution times, with no clear winner. Both variants outperformed the original GLB when compiled with Managed X10. Experiments were run on up to 128 nodes, to which we assigned up to 512 places.
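To make the split queue idea concrete, the following is a minimal Java sketch of a task pool divided into a private part, which only the owning worker touches, and a public part, from which other threads may take tasks concurrently. It is an illustration under stated assumptions, not the paper's implementation: the class name `SplitQueue` and the methods `push`, `pop`, `release`, and `steal` are hypothetical, and a single coarse lock on the public part is assumed for simplicity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical split task queue; all names here are illustrative assumptions. */
final class SplitQueue<T> {
    // Private part: accessed by the owning worker thread only, so no locking.
    private final ArrayDeque<T> privatePart = new ArrayDeque<>();
    // Public part: shared with thieves, guarded by the monitor of "this".
    private final ArrayDeque<T> publicPart = new ArrayDeque<>();

    /** Owner: add a newly created task to the private part. */
    void push(T task) {
        privatePart.addLast(task);
    }

    /** Owner: take the newest private task; fall back to the public part if none is left. */
    T pop() {
        T task = privatePart.pollLast();
        if (task != null) {
            return task;
        }
        synchronized (this) {
            return publicPart.pollLast(); // null if both parts are empty
        }
    }

    /** Owner: move up to n of the oldest private tasks into the public part. */
    void release(int n) {
        synchronized (this) {
            for (int i = 0; i < n && !privatePart.isEmpty(); i++) {
                publicPart.addLast(privatePart.pollFirst());
            }
        }
    }

    /** Thief (any thread): take half of the public tasks, rounded up, oldest first. */
    synchronized List<T> steal() {
        int n = (publicPart.size() + 1) / 2;
        List<T> loot = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            loot.add(publicPart.pollFirst());
        }
        return loot;
    }
}
```

In this sketch the owner works on its newest tasks without synchronization and only takes the lock when it releases or reacquires work, while a thief never touches the private part. A cooperative design would instead have the owner itself hand over tasks in response to steal requests.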


        • Published in

          X10 2016: Proceedings of the 6th ACM SIGPLAN Workshop on X10
          June 2016
          33 pages
          ISBN: 9781450343862
          DOI: 10.1145/2931028

          Copyright © 2016 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 5 of 5 submissions, 100%
