Index array flattening through program transformation

Authors:
Raja Das, Georgia Institute of Technology
Paul Havlak, University of Maryland, College Park
Joel Saltz, University of Maryland, College Park
Ken Kennedy, Rice University

Supercomputing '95: Proceedings of the 1995 ACM/IEEE conference on Supercomputing, December 1995, Pages 70–es. https://doi.org/10.1145/224170.224420

Published: 8 December 1995

ABSTRACT

This paper presents techniques for compiling loops with complex, indirect array accesses into loops whose array references have at most one level of indirection. The transformation allows prefetching of array indices for more efficient structuring of communication on distributed-memory machines. It can also improve performance on other architectures by enabling prefetching of data between levels of the memory hierarchy or exploitation of hardware support for vectorized gather/scatter. Our techniques are implemented in a compiler for Fortran D and execution speed improvements are given for multiprocessor and vector machines.
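As a rough illustration of what "at most one level of indirection" means, the sketch below contrasts a loop that reaches a data array through two index arrays with an equivalent loop that uses a single precomputed index array. This is a minimal hand-written sketch, not the paper's compiler algorithm: the names `x`, `ia`, `ib`, `ic`, and all three functions are hypothetical, and Python stands in for Fortran D.

```python
def nested_sum(x, ia, ib):
    """Original form: x is reached through two levels of indirection."""
    total = 0.0
    for i in range(len(ib)):
        total += x[ia[ib[i]]]
    return total


def flatten_index(ia, ib):
    """Precompute a single index array ic such that ic[i] == ia[ib[i]]."""
    return [ia[j] for j in ib]


def flattened_sum(x, ic):
    """Transformed form: each reference to x has one level of indirection."""
    total = 0.0
    for i in range(len(ic)):
        total += x[ic[i]]
    return total


# Hypothetical data, chosen only to exercise the two forms.
x = [10.0, 20.0, 30.0, 40.0]
ia = [3, 0, 2, 1]
ib = [1, 1, 2, 0]

ic = flatten_index(ia, ib)
```

Because `ic` is fully known before the main loop runs, all the elements of `x` that the loop will touch can be identified up front. That is the property the abstract relies on for prefetching indices, structuring communication, or issuing a single gather.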


Published in
Supercomputing '95: Proceedings of the 1995 ACM/IEEE conference on Supercomputing
December 1995
875 pages
ISBN: 0897918169
DOI: 10.1145/224170
Chairman: Sid Karin, San Diego Supercomputer Center, San Diego, CA
Copyright © 1995 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates
Supercomputing '95 Paper Acceptance Rate: 69 of 241 submissions, 29%
Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%