research-article

Engineering a cache-oblivious sorting algorithm

Published: 12 June 2008

Abstract

This paper is an algorithmic engineering study of cache-oblivious sorting. We investigate by empirical methods a number of implementation issues and parameter choices for the cache-oblivious sorting algorithm Lazy Funnelsort and compare the final algorithm with Quicksort, the established standard for comparison-based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which, our experiments show, can be faster than the best Quicksort implementation we are able to find for input sizes well within the limits of RAM. It is also at least as fast as the recent cache-aware implementations included in the test. On disk, the difference is even more pronounced regarding Quicksort and the cache-aware algorithms, whereas the algorithm is slower than a careful implementation of multiway Mergesort, such as TPIE.

      • Published in

        ACM Journal of Experimental Algorithmics  Volume 12, Issue
        2008
        507 pages
        ISSN: 1084-6654
        EISSN: 1084-6654
        DOI: 10.1145/1227161

        Copyright © 2008 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 June 2008
        • Accepted: 1 December 2006
        • Revised: 1 September 2006
        • Received: 1 May 2004

        Qualifiers

        • research-article
        • Research
        • Refereed
