DOI: 10.1145/2755573.2755597
research-article

A Top-Down Parallel Semisort

Published: 13 June 2015

ABSTRACT

Semisorting is the problem of reordering an input array of keys so that equal keys are contiguous, while distinct keys need not appear in sorted order. Semisorting is important for collecting equal values and is widely used in practice. For example, it is the core of the MapReduce paradigm, is a key component of the database join operation, and has many other applications. We describe a (randomized) parallel algorithm for the problem that is theoretically efficient (linear work and logarithmic depth), but is designed to be more practically efficient than previous algorithms. We use ideas from the parallel integer sorting algorithm of Rajasekaran and Reif, but instead of processing the bits of integers in a reduced range in a bottom-up fashion, we process the hashed values of keys directly top-down. We implement the algorithm and experimentally show on a variety of input distributions that it outperforms a similarly optimized radix sort on a modern 40-core machine with hyper-threading by about a factor of 1.7--1.9, and achieves a parallel speedup of up to 38x. We discuss the various optimizations used in our implementation and present an extensive experimental analysis of its performance.
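
To make the problem statement concrete, the following is a minimal sequential sketch of a semisort, not the parallel algorithm of the paper. It assumes keys are hashable, scatters records into buckets using the top bits of a mixed hash value (a loose illustration of working top-down on hashed keys), and then groups equal keys inside each bucket. The names `semisort`, `is_semisorted`, and `bucket_bits`, as well as the multiplicative mixing constant, are choices made for this illustration only.

```python
def semisort(records, key=lambda r: r, bucket_bits=4):
    """Return records reordered so that records with equal keys are contiguous.

    Sequential illustration only; the output order of distinct keys is arbitrary.
    """
    num_buckets = 1 << bucket_bits
    buckets = [[] for _ in range(num_buckets)]

    # Step 1: scatter by the top bits of a 64-bit mixed hash of the key.
    # Equal keys hash identically, so they always land in the same bucket.
    for r in records:
        h = (hash(key(r)) * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
        buckets[h >> (64 - bucket_bits)].append(r)

    # Step 2: inside each bucket, group records that share a key.
    out = []
    for bucket in buckets:
        groups = {}  # key -> list of records, in first-seen order
        for r in bucket:
            groups.setdefault(key(r), []).append(r)
        for group in groups.values():
            out.extend(group)
    return out


def is_semisorted(seq, key=lambda r: r):
    """Check that each distinct key occupies one contiguous run of seq."""
    seen = set()
    prev = object()  # sentinel that compares unequal to every key
    for r in seq:
        k = key(r)
        if k != prev:
            if k in seen:
                return False  # key reappeared after its run ended
            seen.add(k)
            prev = k
    return True
```

Calling `semisort([3, 1, 3, 2, 1])` yields some permutation in which the two 3s are adjacent and the two 1s are adjacent; unlike a full sort, the relative order of the groups is not specified, which `is_semisorted` reflects by checking only contiguity.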

References

  1. C. Balkesen, G. Alonso, J. Teubner, and M. T. Ozsu. Main-memory hash joins on modern processor architectures. In IEEE Transactions on Knowledge and Data Engineering (TKDE), 2014.
  2. H. Bast and T. Hagerup. Fast parallel space allocation, estimation and integer sorting. Information and Computation, 123(1):72--110, 1995.
  3. G. E. Blelloch, P. B. Gibbons, and H. V. Simhadri. Low depth cache-oblivious algorithms. In Proc. ACM Symp. on Parallelism in Algorithms and Architectures (SPAA), pages 189--199, 2010.
  4. R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM (JACM), 46(5), Sept. 1999.
  5. R. P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM (JACM), 21(2):201--206, 1974.
  6. E. F. Codd. A relational model of data for large shared data banks. Commun. ACM (CACM), 13(6):377--387, June 1970.
  7. R. Cole. Parallel merge sort. SIAM J. Comput., 17(4):770--785, 1988.
  8. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms (3. ed.). MIT Press, 2009.
  9. J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Commun. ACM (CACM), 51(1):107--113, Jan. 2008.
  10. P. B. Gibbons, Y. Matias, and V. Ramachandran. Efficient low-contention parallel algorithms. Journal of Computer and System Sciences, 53(3):417--442, 1996.
  11. J. Gil, Y. Matias, and U. Vishkin. Towards a theory of nearly constant time parallel algorithms. In Foundations of Computer Science (FOCS), pages 698--710, 1991.
  12. W. Hasenplaugh, T. Kaler, T. B. Schardl, and C. E. Leiserson. Ordering heuristics for parallel graph coloring. In Proc. ACM Symp. on Parallelism in Algorithms and Architectures (SPAA), pages 166--177, 2014.
  13. J. Jaja. Introduction to Parallel Algorithms. Addison-Wesley Professional, 1992.
  14. C. E. Leiserson. The Cilk++ concurrency platform. The Journal of Supercomputing, 51(3):244--257, 2010.
  15. O. Polychroniou and K. A. Ross. A comprehensive study of main-memory partitioning and its application to large-scale comparison- and radix-sort. In Proc. ACM SIGMOD International Conference on Management of Data, pages 755--766, 2014.
  16. S. Rajasekaran and J. H. Reif. Optimal and sublogarithmic time randomized parallel sorting algorithms. SIAM J. Comput., 18(3):594--607, 1989.
  17. J. H. Reif and S. Sen. Parallel computational geometry: An approach using randomization. In J. Sack and J. Urrutia, editors, Handbook of Computational Geometry, chapter 18, pages 765--828. Elsevier Science, 1999.
  18. J. Shun and G. E. Blelloch. Phase-concurrent hash tables for determinism. In Proc. ACM Symp. on Parallelism in Algorithms and Architectures (SPAA), 2014.
  19. J. Shun, G. E. Blelloch, J. T. Fineman, P. B. Gibbons, A. Kyrola, H. V. Simhadri, and K. Tangwongsan. Brief announcement: the Problem Based Benchmark Suite. In Proc. ACM Symp. on Parallelism in Algorithms and Architectures (SPAA), pages 68--70, 2012.
  20. J. Singler, P. Sanders, and F. Putze. MCSTL: The multi-core standard template library. In Euro-Par 2007 Parallel Processing, pages 682--694. Springer, 2007.
  21. L. G. Valiant. General purpose parallel architectures. In Handbook of Theoretical Computer Science (Vol. A), pages 943--973. MIT Press, Cambridge, MA, USA, 1990.

    • Published in

      SPAA '15: Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures
      June 2015
      362 pages
      ISBN:9781450335881
      DOI:10.1145/2755573
      General Chair: Guy Blelloch
      Program Chair: Kunal Agrawal

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      SPAA '15 paper acceptance rate: 31 of 131 submissions, 24%. Overall acceptance rate: 447 of 1,461 submissions, 31%.
