Dag-calculus: a calculus for parallel computation (ICFP '16)

ABSTRACT
The increasing availability of multicore systems has led to greater focus on the design and implementation of languages for writing parallel programs. Such languages support various abstractions for parallelism, such as fork-join, async-finish, and futures. While these abstractions may seem similar, they lead to different semantics, language design and implementation decisions, and can significantly impact the performance of end-user applications.
In this paper, we consider the question of whether it is possible to unify various paradigms of parallel computing. To this end, we propose a calculus, called the dag calculus, that can encode fork-join, async-finish, futures, and possibly other paradigms. We describe the dag calculus and its semantics, and establish translations from the aforementioned paradigms into it. These translations establish that the dag calculus is sufficiently powerful to encode programs written in the prevailing paradigms of parallelism. We present concurrent algorithms and data structures for realizing the dag calculus on multicore hardware and prove that the proposed techniques are consistent with the semantics. Finally, we present an implementation of the calculus and evaluate it empirically by comparing its performance to highly optimized code from prior work. The results show that the calculus is expressive and that it competes well with, and sometimes outperforms, the state of the art.
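To make the three paradigms concrete, the following is a minimal sketch of fork-join, async-finish, and futures rendered with Python's standard `concurrent.futures` module. This is purely illustrative: the paper's dag calculus and runtime are language-independent, and the shared thread pool here merely stands in for a scheduler; the `fib` and `finish` helpers are hypothetical names chosen for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Shared pool standing in for a parallel runtime's scheduler
# (hypothetical; the paper targets a custom multicore runtime).
pool = ThreadPoolExecutor(max_workers=64)

# Fork-join: fork one branch, compute the other inline, then join
# both before continuing. Control re-converges at the join point.
def fib(n):
    if n < 2:
        return n
    branch = pool.submit(fib, n - 1)   # fork
    other = fib(n - 2)                 # computed inline
    return branch.result() + other     # join

# Async-finish: spawn a batch of asyncs inside a "finish" scope
# that blocks until every spawned task has completed.
def finish(thunks):
    wait([pool.submit(t) for t in thunks])

# Futures: start a computation eagerly; demand (touch) its result
# only at the point where it is actually needed.
fut = pool.submit(fib, 8)
# ... unrelated work could run here ...
answer = fut.result()                  # touch the future
```

Note how the three differ in where synchronization happens: fork-join synchronizes at a structured join point, async-finish at the end of an enclosing scope, and futures at each individual touch; it is exactly this variety of dependency structure that a dag-based calculus can represent uniformly as edges in a computation dag.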