DOI: 10.1145/1989493.1989549
research-article

A highly-efficient wait-free universal construction

Published: 04 June 2011

ABSTRACT

We present a new, simple wait-free universal construction, called Sim, that uses just a Fetch&Add and an LL/SC object and performs a constant number of shared memory accesses. We have implemented Sim on a real shared-memory machine. In theoretical terms, our practical version of Sim, called P-Sim, has worse complexity than its theoretical analog; in practice, though, we show experimentally that P-Sim outperforms several state-of-the-art lock-based and lock-free techniques, even though it is wait-free, i.e., it satisfies a stronger progress condition than all the algorithms it outperforms.

We have used P-Sim to obtain highly efficient wait-free implementations of stacks and queues. Our experiments show that these implementations outperform the current state-of-the-art shared stack and queue implementations, which ensure only weaker progress properties than wait-freedom.
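
To make the term "universal construction" concrete, the following is a minimal sketch of the general idea: a wrapper that turns any sequential object, given as a pure state-transition function, into a shared concurrent object. It is not Sim or P-Sim. It uses a simple copy-and-CAS retry loop, which is only lock-free rather than wait-free, and it uses neither Fetch&Add nor LL/SC. The names `AtomicRef`, `Universal`, `push` and `pop` are hypothetical, and `AtomicRef` merely emulates a hardware compare-and-swap word with a lock.

```python
import threading
from typing import Any, Callable, Tuple


class AtomicRef:
    """Emulated atomic word (assumption: stands in for hardware CAS)."""

    def __init__(self, value: Any) -> None:
        self._value = value
        self._lock = threading.Lock()  # only models the atomicity of one CAS

    def load(self) -> Any:
        return self._value

    def compare_and_set(self, expected: Any, new: Any) -> bool:
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


class Universal:
    """Lock-free (not wait-free) universal construction by copy-and-CAS."""

    def __init__(self, initial_state: Any) -> None:
        self._state = AtomicRef(initial_state)

    def apply(self, op: Callable[[Any], Tuple[Any, Any]]) -> Any:
        # op maps an immutable state to (new_state, result) without mutation.
        while True:
            old = self._state.load()
            new, result = op(old)
            if self._state.compare_and_set(old, new):
                return result
            # Another thread changed the state first: retry on the new state.


def push(item: Any) -> Callable[[tuple], Tuple[tuple, None]]:
    """Sequential stack push on an immutable tuple state."""
    return lambda state: (state + (item,), None)


def pop(state: tuple) -> Tuple[tuple, Any]:
    """Sequential stack pop; returns None when the stack is empty."""
    if not state:
        return state, None
    return state[:-1], state[-1]


if __name__ == "__main__":
    stack = Universal(())
    stack.apply(push("a"))
    stack.apply(push("b"))
    print(stack.apply(pop))
```

In this sketch a thread can retry indefinitely if others keep succeeding, and each operation copies the whole state. Those are the kinds of costs that a wait-free construction performing a constant number of shared memory accesses is meant to avoid.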

        Published in

        SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
        June 2011
        404 pages
        ISBN: 9781450307437
        DOI: 10.1145/1989493

        Copyright © 2011 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 447 of 1,461 submissions, 31%