Algorithm 842: A set of GMRES routines for real and complex arithmetics on high performance computers

Authors:
Valérie Frayssé, Kvasar Technology LLC, Boston, MA
Luc Giraud, CERFACS, Toulouse Cedex, France
Serge Gratton, CERFACS, Toulouse Cedex, France
Julien Langou, University of Tennessee, Knoxville, Knoxville, TN

ACM Transactions on Mathematical Software, Volume 31, Issue 2, pp. 228–238. https://doi.org/10.1145/1067967.1067970

Published: 01 June 2005

Abstract

In this article we describe our implementations of the GMRES algorithm for real and complex arithmetics, in both single and double precision, suitable for serial, shared memory and distributed memory computers. For the sake of portability, simplicity, flexibility and efficiency, the GMRES solvers have been implemented in Fortran 77 using the reverse communication mechanism for the matrix-vector product, the preconditioning and the dot product computations. For distributed memory computation, several orthogonalization procedures have been implemented to reduce the cost of the dot product calculation, which is a well-known efficiency bottleneck for Krylov methods. Either implicit or explicit calculation of the residual at restart is possible, depending on the actual cost of the matrix-vector product. Finally, the implemented stopping criterion is based on a normwise backward error.
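
To make the reverse communication idea concrete, here is a minimal sketch in Python rather than Fortran 77. It does not reproduce the package's interface: the names `gmres_rc` and `solve`, the request tags, and the choice of the residual norm relative to ||b|| as the backward error measure are all assumptions made for illustration. The solver never sees the matrix. Each time it needs a matrix-vector product or a dot product it hands control back to the caller, which computes the result however it likes (in parallel, for instance) and resumes the solver. Preconditioning and restarting are omitted to keep the sketch short.

```python
import math

def gmres_rc(b, tol=1e-10, max_iter=50):
    """Unrestarted real GMRES with x0 = 0, written as a generator.

    Hypothetical protocol (not the package's Fortran 77 interface):
      yields ("matvec", v)   -> caller sends back A*v
      yields ("dot", u, v)   -> caller sends back the inner product of u and v
      yields ("done", x, eta) when finished
    """
    n = len(b)
    beta = math.sqrt((yield ("dot", b, b)))   # ||b||, equal to ||r0|| since x0 = 0
    if beta == 0.0:
        yield ("done", [0.0] * n, 0.0)
        return
    V = [[bi / beta for bi in b]]             # orthonormal Krylov basis
    R = []                                    # R[j]: column j of the triangular factor
    cs, sn = [], []                           # Givens rotation coefficients
    g = [beta]                                # rotated right-hand side
    eta = 1.0
    for j in range(max_iter):
        w = yield ("matvec", V[j])
        h = []
        for i in range(j + 1):                # modified Gram-Schmidt
            hij = yield ("dot", w, V[i])
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, V[i])]
        hnext = math.sqrt((yield ("dot", w, w)))
        for i in range(j):                    # apply earlier rotations to the new column
            h[i], h[i + 1] = (cs[i] * h[i] + sn[i] * h[i + 1],
                              -sn[i] * h[i] + cs[i] * h[i + 1])
        r = math.hypot(h[j], hnext)           # assumes A is nonsingular, so r > 0
        cs.append(h[j] / r)
        sn.append(hnext / r)
        h[j] = r
        g.append(-sn[j] * g[j])
        g[j] = cs[j] * g[j]
        R.append(h)
        eta = abs(g[j + 1]) / beta            # implicit residual norm divided by ||b||
        if eta <= tol:
            break
        V.append([wk / hnext for wk in w])
    k = len(R)
    y = [0.0] * k
    for i in reversed(range(k)):              # back substitution for R y = g[:k]
        s = g[i] - sum(R[c][i] * y[c] for c in range(i + 1, k))
        y[i] = s / R[i][i]
    x = [sum(y[c] * V[c][m] for c in range(k)) for m in range(n)]
    yield ("done", x, eta)

def solve(A, b, tol=1e-10):
    """Driver: owns the matrix and answers each request from the solver."""
    gen = gmres_rc(b, tol)
    request = next(gen)
    while request[0] != "done":
        if request[0] == "matvec":
            reply = [sum(a * vk for a, vk in zip(row, request[1])) for row in A]
        else:                                 # "dot"
            reply = sum(u * v for u, v in zip(request[1], request[2]))
        request = gen.send(reply)
    return request[1], request[2]

if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    x, eta = solve(A, [1.0, 2.0, 3.0])
    print(x, eta)
```

In this sketch the modified Gram-Schmidt loop issues one dot product request per basis vector, and on a distributed memory machine each request would be a separate global reduction. A classical Gram-Schmidt variant could instead ask for all inner products against the basis in a single request, which is the kind of trade-off the orthogonalization options mentioned in the abstract address.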

Supplemental Material

842.zip (90.5 KB): Software for "A set of GMRES routines for real and complex arithmetics on high performance computers"

Index Terms

1. Mathematics of computing
  1. Mathematical analysis
    1. Numerical analysis
      1. Numerical differentiation
    2. Quadrature

Published in

ACM Transactions on Mathematical Software Volume 31, Issue 2
June 2005
93 pages
ISSN:0098-3500
EISSN:1557-7295
DOI:10.1145/1067967

Copyright © 2005 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Evaluated & Reusable
- Artifacts Available
Author Tags
GMRES
Krylov methods
Linear systems
distributed memory
reverse communication
Qualifiers
- article
Article Metrics
- Total Citations: 47
- Total Downloads: 1,373
- Downloads (Last 12 months): 18
- Downloads (Last 6 weeks): 2