
Algorithm 842: A set of GMRES routines for real and complex arithmetics on high performance computers

Published: 01 June 2005

Abstract

In this article we describe our implementations of the GMRES algorithm for real and complex arithmetics, in both single and double precision, suitable for serial, shared memory and distributed memory computers. For the sake of portability, simplicity, flexibility and efficiency, the GMRES solvers have been implemented in Fortran 77 using the reverse communication mechanism for the matrix-vector product, the preconditioning and the dot product computations. For distributed memory computation, several orthogonalization procedures have been implemented to reduce the cost of the dot product calculation, which is a well-known efficiency bottleneck for Krylov methods. Either implicit or explicit calculation of the residual at restart is possible, depending on the actual cost of the matrix-vector product. Finally, the implemented stopping criterion is based on a normwise backward error.
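
The reverse communication mechanism mentioned above can be illustrated with a small Python sketch. It is not the Fortran 77 interface of the package: the names `minres1_rc` and `solve`, the request tuples and the tolerance are all hypothetical, and the iteration is a one-dimensional minimal residual step (GMRES restarted at every iteration) chosen only to keep the example short. The stopping test uses ||r|| / ||b||, assumed here as one common form of a normwise backward error. The solver never touches the matrix; it hands each matrix-vector product and dot product back to the caller.

```python
import math


def minres1_rc(b, tol=1e-10, max_iter=500):
    """Hypothetical reverse-communication solver for A x = b.

    The generator yields ("matvec", v) or ("dot", u, v) requests and
    receives the result through send(). It never sees A itself.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual of the initial guess x = 0
    norm_b = math.sqrt((yield ("dot", b, b)))
    for _ in range(max_iter):
        rr = yield ("dot", r, r)
        # Stopping test on ||r|| / ||b|| (assumed form of the backward error).
        if math.sqrt(rr) <= tol * norm_b:
            break
        ar = yield ("matvec", r)
        ar_ar = yield ("dot", ar, ar)
        if ar_ar == 0.0:
            break  # breakdown: A r = 0, no further progress possible
        alpha = (yield ("dot", ar, r)) / ar_ar
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
        # Residual updated by recurrence, without an extra matvec.
        r = [ri - alpha * ari for ri, ari in zip(r, ar)]
    return x


def solve(A, b, **kwargs):
    """Caller-side driver: owns A and answers the solver's requests."""

    def matvec(v):
        return [sum(a * vj for a, vj in zip(row, v)) for row in A]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    gen = minres1_rc(b, **kwargs)
    reply = None
    try:
        while True:
            request = gen.send(reply)
            if request[0] == "matvec":
                reply = matvec(request[1])
            else:  # "dot"
                reply = dot(request[1], request[2])
    except StopIteration as done:
        return done.value


x = solve([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Because the driver owns `matvec` and `dot`, the same solver loop could be reused with a sparse format, a matrix-free operator, or a dot product that performs a global reduction across processes. That flexibility is the motivation the abstract gives for the design.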
