Commutative set: a language extension for implicit parallel programming

Authors: Prakash Prabhu, Soumyadeep Ghosh, Yun Zhang, Nick P. Johnson, David I. August
Princeton University, Princeton, NJ, USA

ACM SIGPLAN Notices, Volume 46, Issue 6, June 2011, pp. 1–11, https://doi.org/10.1145/1993316.1993500

Published: 04 June 2011

Abstract

Sequential programming models express a total program order, of which a partial order must be respected. This inhibits parallelizing tools from extracting scalable performance. Programmer-written semantic commutativity assertions provide a natural way of relaxing this partial order, thereby exposing parallelism implicitly in a program. Existing implicit parallel programming models based on semantic commutativity either require additional programming extensions or have limited expressiveness. This paper presents a generalized semantic-commutativity-based programming extension, called Commutative Set (COMMSET), and associated compiler technology that enables multiple forms of parallelism. COMMSET expressions are syntactically succinct and enable the programmer to specify commutativity relations between groups of arbitrary structured code blocks. Using only this construct, serializing constraints that inhibit parallelization can be relaxed, independent of any particular parallelization strategy or concurrency control mechanism. COMMSET enables well-performing parallelizations in cases where they were inapplicable or non-performing before. By extending eight sequential programs with only 8 annotations per program on average, COMMSET and the associated compiler technology produced a geomean speedup of 5.7x on eight cores compared to 1.5x for the best non-COMMSET parallelization.
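
The notion of a semantic commutativity assertion can be illustrated with a small sketch. The Python below is not COMMSET syntax and does not reproduce the paper's compiler; the names `run`, `items`, and `outcomes`, and the choice of a multiset comparison as the "semantic" view, are assumptions made purely for illustration. It shows a loop whose iterations all mutate shared state: reordering them changes the concrete result, yet the result is the same if the programmer only cares about which items were recorded. That is the kind of fact a tool cannot infer from dependences alone and a programmer would have to assert.

```python
from collections import Counter
from itertools import permutations


def run(order):
    """Execute the loop body once per item, in the given order."""
    seen = []                # shared state mutated by every iteration
    for item in order:
        seen.append(item)    # loop-carried dependence on `seen`
    return seen


# Hypothetical work items; each permutation models one possible
# interleaving of the loop iterations.
items = ["a", "b", "c"]
outcomes = [run(p) for p in permutations(items)]

# Concrete view: the final list depends on iteration order, so a
# dependence-based parallelizer must keep the sequential order.
concrete_equal = all(o == outcomes[0] for o in outcomes)

# Semantic view (an assumption of this sketch): only the multiset of
# recorded items matters, so every order is acceptable.
semantic_equal = all(Counter(o) == Counter(outcomes[0]) for o in outcomes)

print("orders tried:", len(outcomes))
print("concrete states equal:", concrete_equal)
print("semantically equal:", semantic_equal)
```

Under this assumed notion of equivalence, the iterations could run in any order, provided each update to the shared state is applied atomically; how that atomicity is enforced is a separate choice from the commutativity assertion itself.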

Index Terms

1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
    2. General programming languages
      1. Language types
        1. Parallel programming languages

Published in
ACM SIGPLAN Notices Volume 46, Issue 6
PLDI '11
June 2011
652 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1993316
PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2011
668 pages
ISBN:9781450306638
DOI:10.1145/1993498
General Chair: Mary Hall, University of Utah
Program Chair: David Padua, University of Illinois at Urbana-Champaign
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
automatic parallelization
implicit parallelism
programming model
semantic commutativity
static analysis
Qualifiers
- research-article