Abstract
In this article we describe our implementations of the GMRES algorithm for real and complex arithmetics in both single and double precision, suitable for serial, shared-memory, and distributed-memory computers. For the sake of portability, simplicity, flexibility, and efficiency, the GMRES solvers have been implemented in Fortran 77 using the reverse communication mechanism for the matrix-vector product, the preconditioning, and the dot product computations. For distributed-memory computation, several orthogonalization procedures have been implemented to reduce the cost of the dot product calculation, which is a well-known efficiency bottleneck for Krylov methods. Either implicit or explicit calculation of the residual at restart is possible, depending on the actual cost of the matrix-vector product. Finally, the implemented stopping criterion is based on a normwise backward error.
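To illustrate what a reverse communication interface can look like, here is a minimal Python sketch. It is not the package's Fortran 77 interface: the names `gmres_rc` and `solve`, the request tags, and the use of a generator are all assumptions made for illustration. The solver never sees the matrix; each time it needs a matrix-vector product it hands a vector back to the caller and waits for the result. For brevity the sketch is unrestarted, unpreconditioned, real-valued, starts from a zero initial guess, computes dot products locally rather than through reverse communication, and uses the residual norm divided by the norm of b as one common form of normwise backward error.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gmres_rc(b, m, tol=1e-10):
    """Hypothetical reverse-communication GMRES sketch (x0 = 0, m >= 1 steps).

    Yields ("matvec", v) and expects the caller to send back A*v.
    Ends by yielding ("done", x, eta) with eta an estimate of ||b - A x|| / ||b||.
    Assumes A is nonsingular.
    """
    n = len(b)
    beta = math.sqrt(dot(b, b))
    if beta == 0.0:
        yield ("done", [0.0] * n, 0.0)
        return
    V = [[bi / beta for bi in b]]   # orthonormal Krylov basis vectors
    R = []                          # columns of the rotated Hessenberg matrix
    g = [beta]                      # rotated right-hand side of the least-squares problem
    rotations = []                  # Givens rotations (c, s) applied so far
    eta = 1.0
    for j in range(m):
        w = list((yield ("matvec", list(V[j]))))   # caller supplies A*V[j]
        h = []
        for v in V:                 # modified Gram-Schmidt orthogonalization
            hij = dot(w, v)
            h.append(hij)
            w = [wi - hij * vi for wi, vi in zip(w, v)]
        hnext = math.sqrt(dot(w, w))
        for i, (c, s) in enumerate(rotations):     # apply earlier rotations
            h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
        r = math.hypot(h[j], hnext)                # new rotation zeroes hnext
        c, s = h[j] / r, hnext / r
        rotations.append((c, s))
        h[j] = r
        g.append(-s * g[j])
        g[j] = c * g[j]
        R.append(h)
        eta = abs(g[j + 1]) / beta  # residual norm estimate relative to ||b||
        if eta <= tol:
            break
        V.append([wi / hnext for wi in w])
    k = len(R)
    y = [0.0] * k
    for i in reversed(range(k)):    # back substitution on the triangular system
        acc = g[i] - sum(R[col][i] * y[col] for col in range(i + 1, k))
        y[i] = acc / R[i][i]
    x = [sum(y[col] * V[col][i] for col in range(k)) for i in range(n)]
    yield ("done", x, eta)

def solve(A, b, m):
    """Driver: the caller owns the matrix and answers each matvec request."""
    gen = gmres_rc(b, m)
    request = next(gen)
    while request[0] == "matvec":
        v = request[1]
        request = gen.send([dot(row, v) for row in A])
    return request[1], request[2]
```

Calling `solve(A, b, m)` with `A` as a list of rows runs the loop to completion. The driver is the only place that knows how the matrix is stored, which is the point of the mechanism: the same solver works with a dense, sparse, or matrix-free operator, and in a distributed setting the caller could answer each request with its own communication layer.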
Supplemental Material
Software for "A set of GMRES routines for real and complex arithmetics on high performance computers"
Index Terms
- Algorithm 842: A set of GMRES routines for real and complex arithmetics on high performance computers