ABSTRACT
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled, coarse-grained tasks, each comprising a tightly coupled parallel function or program. "Many-task" programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly coupled parallelism at the lower level via multithreading, message passing, and/or partitioned global address spaces. At large scales, however, the management of task distribution, data dependencies, and inter-task data movement is a significant performance challenge. In this work, we describe Turbine, a new highly scalable and distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
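To make the upper-layer dataflow idea concrete, the following is a minimal single-process sketch in Python. It is not Turbine's implementation or API: the function `run_dataflow`, the task-table layout, and the use of a thread pool are all assumptions chosen for illustration. It shows only the core rule of a many-task dataflow model, namely that a task is released for execution as soon as all of its inputs have been produced.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dataflow(tasks, max_workers=4):
    """Run a table of tasks in dataflow order (hypothetical helper).

    tasks maps a task name to (function, [names of input tasks]).
    Returns a dict mapping each task name to its result.
    """
    results = {}           # outputs of finished tasks
    pending = dict(tasks)  # tasks not yet submitted
    running = {}           # future -> task name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Release every task whose inputs are all available.
            ready = [
                name
                for name, (_, deps) in pending.items()
                if all(dep in results for dep in deps)
            ]
            for name in ready:
                fn, deps = pending.pop(name)
                future = pool.submit(fn, *(results[dep] for dep in deps))
                running[future] = name

            if not running:
                # Nothing is running and nothing can start: a cycle or
                # a reference to an undefined task.
                raise ValueError(
                    "unsatisfiable dependencies: " + ", ".join(sorted(pending))
                )

            # Block until at least one task finishes, then record outputs.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()

    return results


if __name__ == "__main__":
    example = {
        "a": (lambda: 2, []),
        "b": (lambda: 3, []),
        "sum": (lambda x, y: x + y, ["a", "b"]),
        "square": (lambda s: s * s, ["sum"]),
    }
    print(run_dataflow(example)["square"])
```

In this sketch the independent tasks `a` and `b` may run concurrently, while `sum` and `square` wait for their inputs. A distributed engine of the kind described in the abstract would additionally have to place tasks and move data across nodes, which this single-process example does not attempt.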
Index Terms
- Turbine: a distributed-memory dataflow engine for extreme-scale many-task applications