
An adaptive blocking strategy for matrix factorizations

  • Conference paper
  • Session: Algorithms for Matrix Factorization

CONPAR 90 — VAPP IV (CONPAR 1990, VAPP 1990)

Abstract

On most high-performance architectures, data movement is slow compared to floating-point (in particular, vector) performance. On such architectures, block algorithms have been successful for matrix computations: by considering a matrix as a collection of submatrices (the so-called blocks), one naturally arrives at algorithms that require little data movement. The optimal blocking strategy, however, depends on the computing environment and on the problem parameters. Current approaches use fixed-width blocking strategies, which are in general not optimal. This paper presents an “adaptive blocking” methodology for determining, in a systematic manner, an optimal blocking strategy for a uniprocessor machine. We demonstrate this technique on a block QR factorization routine on a uniprocessor. After generating timing models for the algorithm's high-level kernels, we formulate the optimal blocking strategy as a recurrence relation that can be solved inexpensively by dynamic programming. Experiments on one processor of a CRAY-2 show that the resulting blocking strategy is in fact as good as any fixed-width blocking strategy. Thus, while the optimal fixed-width blocking strategy cannot be known without re-running the same problem several times, adaptive blocking delivers optimal performance on the very first run.
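The dynamic-programming formulation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `block_cost` is a hypothetical stand-in for the measured kernel timing models, and the recurrence simply chooses, for each starting column, the panel width that minimizes the modeled cost of that step plus the optimal cost of the remainder.

```python
# Sketch of the adaptive-blocking recurrence solved by dynamic programming.
# Assumption: block_cost is a toy cost model standing in for the paper's
# timing models, which were fitted to measured kernel times on the target
# machine (one CRAY-2 processor in the paper's experiments).

def block_cost(j, b, n):
    # Hypothetical cost of factoring a width-b panel starting at column j
    # of an n-column matrix, plus the blocked update of the trailing part.
    factor = b * b * (n - j)                       # panel factorization work
    update = b * (n - j - b) ** 2 / 50.0 + 100.0   # update work + fixed overhead
    return factor + update

def optimal_blocking(n, max_width=64):
    """Return (total_cost, widths): block widths minimizing total modeled cost.

    Recurrence: T(j) = min over b of  block_cost(j, b, n) + T(j + b),
    with T(n) = 0, solved bottom-up in O(n * max_width) time.
    """
    INF = float("inf")
    best = [INF] * (n + 1)     # best[j] = optimal cost of columns j..n-1
    best[n] = 0.0
    choice = [0] * (n + 1)     # choice[j] = optimal panel width at column j
    for j in range(n - 1, -1, -1):
        for b in range(1, min(max_width, n - j) + 1):
            c = block_cost(j, b, n) + best[j + b]
            if c < best[j]:
                best[j], choice[j] = c, b
    # Reconstruct the sequence of block widths from the choice table.
    widths, j = [], 0
    while j < n:
        widths.append(choice[j])
        j += choice[j]
    return best[0], widths

cost, widths = optimal_blocking(256)
```

Because every fixed-width strategy is one feasible path through this recurrence, the adaptive solution is by construction at least as good as the best fixed width under the given cost model, which mirrors the paper's experimental finding.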

This work was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract W-31-109-Eng-38.




Editor information

Helmar Burkhart


Copyright information

© 1990 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bischof, C.H., Lacroute, P.G. (1990). An adaptive blocking strategy for matrix factorizations. In: Burkhart, H. (ed.) CONPAR 90 — VAPP IV. Lecture Notes in Computer Science, vol. 457. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-53065-7_101


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-53065-7

  • Online ISBN: 978-3-540-46597-3

