Abstract
For many applications, achieving good performance on a private-memory parallel computer requires exploiting data parallelism as well as task parallelism. Depending on the size of the input data set and the number of nodes (i.e., processors), different tradeoffs between task and data parallelism are appropriate for a parallel system. Most existing compilers exploit only one of these two kinds of parallelism, so the programmer must program the data parallelism and the task parallelism separately to achieve the desired results. We have taken a unified approach that exploits both kinds of parallelism in a single framework with an existing language. This approach eases the task of programming and exposes the tradeoffs between data and task parallelism to the compiler. We have implemented a parallelizing Fortran compiler for the iWarp system based on this approach. We discuss the design of our compiler and present performance results to validate our approach.
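To make the tradeoff concrete, the following is a minimal sketch of a Fortran program that mixes both kinds of parallelism. It is hypothetical: the HPF-style DISTRIBUTE directives (written as standard Fortran comments, so the code compiles as ordinary Fortran 90) and the mapping of the two stages onto disjoint node groups are illustrative assumptions, not the directive syntax of the compiler described here.

! Hypothetical sketch only: HPF-style directives express data
! parallelism; the comments indicate where task parallelism would
! map the two stages onto disjoint groups of nodes.
program task_and_data
  implicit none
  integer, parameter :: n = 512
  real :: a(n,n), b(n,n)
!hpf$ distribute a(block,*)   ! stage 1 data: rows spread over its node group
!hpf$ distribute b(*,block)   ! stage 2 data: columns spread over its group

  call random_number(a)

  ! Stage 1: a data-parallel array statement, executed across
  ! the nodes that own a.
  a = a * 2.0

  ! Stage 2: could run as a concurrent task on a second node group,
  ! overlapping with stage 1 work on the next input data set.
  b = transpose(a)

  print *, 'checksum:', sum(b)
end program task_and_data

In such a program, each array statement can run data-parallel within a node group, while the two stages run as concurrent tasks on disjoint groups; which split of the nodes is best depends on the problem size and the node count, which is exactly the tradeoff that a unified framework exposes to the compiler.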