Adaptive backoff synchronization techniques

Published: 01 April 1989

Abstract

Shared-memory multiprocessors commonly use shared variables for synchronization. Our simulations of real parallel applications show that large-scale cache-coherent multiprocessors suffer significant amounts of invalidation traffic due to synchronization. Large multiprocessors that do not cache synchronization variables are often more severely impacted. If this synchronization traffic is not reduced or managed adequately, synchronization references can cause severe congestion in the network. We propose a class of adaptive backoff methods that do not use any extra hardware and can significantly reduce the memory traffic to synchronization variables. These methods use synchronization state to reduce polling of synchronization variables. Our simulations show that when the number of processors participating in a barrier synchronization is small compared to the time of arrival of the processors, reductions of 20 percent to over 95 percent in synchronization traffic can be achieved at no extra cost. In other situations, adaptive backoff techniques result in a tradeoff between reduced network accesses and increased processor idle time.
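The idea of using synchronization state to reduce polling can be illustrated with a toy model. The sketch below is not the authors' simulator or their exact policy; it assumes a counter-based barrier in which each poll returns the number of processors that have arrived, and a waiting processor delays its next poll in proportion to the number still missing. The function name `simulate_barrier` and the parameter `backoff_per_missing` are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def simulate_barrier(arrivals, backoff_per_missing=0):
    """Toy model of polling on a counter-based barrier.

    arrivals: integer arrival time (in cycles) of each processor.
    backoff_per_missing: extra cycles to wait per processor still missing
        (0 means poll every cycle). This policy is an assumption made for
        illustration, not the scheme evaluated in the paper.

    Returns (total_polls, total_idle), where total_idle is the number of
    cycles processors spend waiting after the barrier has actually opened.
    """
    n = len(arrivals)
    ordered = sorted(arrivals)
    release = ordered[-1]  # the barrier opens when the last processor arrives
    total_polls = 0
    total_idle = 0
    for t_arrive in arrivals:
        t = t_arrive
        while True:
            total_polls += 1
            # Synchronization state seen by this poll: how many have arrived.
            arrived = bisect_right(ordered, t)
            missing = n - arrived
            if missing == 0:
                total_idle += t - release
                break
            # Adaptive step: wait longer when more processors are missing.
            t += 1 + backoff_per_missing * missing
    return total_polls, total_idle


if __name__ == "__main__":
    arrivals = [0, 100, 200, 300]
    print("poll every cycle:", simulate_barrier(arrivals))
    print("adaptive backoff:", simulate_barrier(arrivals, backoff_per_missing=10))
```

In this model, a larger `backoff_per_missing` lowers the poll count but can leave a processor idle for some cycles after the barrier opens, which mirrors the tradeoff between network accesses and idle time described in the abstract.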

Published in

ACM SIGARCH Computer Architecture News, Volume 17, Issue 3
Special Issue: Proceedings of the 16th annual international symposium on Computer Architecture
June 1989, 400 pages
ISSN: 0163-5964
DOI: 10.1145/74926

ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture
April 1989, 426 pages
ISBN: 0897913191
DOI: 10.1145/74925

Copyright © 1989 Authors

Publisher: Association for Computing Machinery, New York, NY, United States
