An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

Ayguadé, Eduard; Badia, Rosa M.; Igual, Francisco D.; Labarta, Jesús; Mayo, Rafael; Quintana-Ortí, Enrique S.

doi:10.1007/978-3-642-03869-3_79

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5704)

Included in the following conference series: European Conference on Parallel Processing

Abstract

While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/power ratio can be attained using specialized hardware accelerators. These accelerators require specific SDKs or programming languages, which are not always easy to use. Thus, the impact of the new programming paradigms on the programmer’s productivity will determine their success in the high-performance computing arena. In this paper we present GPU Superscalar (GPUSs), an extension of the Star Superscalar (StarSs) programming model that targets the parallelization of applications on platforms consisting of a general-purpose processor connected to multiple graphics processors. GPUSs deals with architecture heterogeneity and separate memory address spaces, while preserving simplicity and portability. Preliminary experimental results for a well-known operation in numerical linear algebra illustrate the correct adaptation of the runtime to a multi-GPU system, attaining notable performance results.

Author information

Authors and Affiliations

Barcelona Supercomputing Center – Centro Nacional de Supercomputación (BSC–CNS) and Universitat Politècnica de Catalunya, Nexus II Building, C. Jordi Girona 29, 08034, Barcelona, Spain
Eduard Ayguadé, Rosa M. Badia & Jesús Labarta
Consejo Superior de Investigaciones Científicas (CSIC), Spain
Rosa M. Badia
Depto. de Ingeniería y Ciencia de Computadores, Universidad Jaume I (UJI), 12.071, Castellón, Spain
Francisco D. Igual, Rafael Mayo & Enrique S. Quintana-Ortí

Editor information

Editors and Affiliations

Department of Software Technology, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
Henk Sips, Dick Epema & Hai-Xiang Lin

Cite this paper

Ayguadé, E., Badia, R.M., Igual, F.D., Labarta, J., Mayo, R., Quintana-Ortí, E.S. (2009). An Extension of the StarSs Programming Model for Platforms with Multiple GPUs. In: Sips, H., Epema, D., Lin, HX. (eds) Euro-Par 2009 Parallel Processing. Euro-Par 2009. Lecture Notes in Computer Science, vol 5704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03869-3_79

DOI: https://doi.org/10.1007/978-3-642-03869-3_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03868-6
Online ISBN: 978-3-642-03869-3
eBook Packages: Computer Science, Computer Science (R0)
