ABSTRACT
This paper studies parallel algorithms for the longest increasing subsequence (LIS) problem. Let n be the input size and k be the LIS length of the input. Sequentially, LIS is a simple problem that can be solved using dynamic programming (DP) in O(n log n) work. However, parallelizing LIS is a long-standing challenge. We are unaware of any parallel LIS algorithm that has optimal O(n log n) work and non-trivial parallelism (i.e., Õ(k) or o(n) span).
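For concreteness, the sequential baseline mentioned above can be sketched as follows. This is a minimal illustration of the textbook binary-search formulation, not the paper's parallel algorithm; the function name `lis_length` is hypothetical, and strictly increasing subsequences are assumed. Because the auxiliary array never holds more than k entries, each binary search costs O(log k), giving O(n log k) work in total.

```python
from bisect import bisect_left


def lis_length(seq):
    """Return the length of the longest strictly increasing subsequence of seq."""
    # tails[j] holds the smallest possible last value of any increasing
    # subsequence of length j + 1 seen so far; tails is always sorted.
    tails = []
    for x in seq:
        pos = bisect_left(tails, x)  # first position whose value is >= x
        if pos == len(tails):
            tails.append(x)          # x extends the longest subsequence so far
        else:
            tails[pos] = x           # x is a smaller tail for length pos + 1
    return len(tails)
```

Each element depends on the state left by all earlier elements, which is one way to see why the sequential formulation does not parallelize directly.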
This paper proposes a parallel LIS algorithm that costs O(n log k) work, Õ(k) span, and O(n) space, and is much simpler than the previous parallel LIS algorithms. We also generalize the algorithm to a weighted version of LIS, which maximizes the weighted sum for all objects in an increasing subsequence. To achieve a better work bound for the weighted LIS algorithm, we designed parallel algorithms for the van Emde Boas (vEB) tree, which has the same structure as the sequential vEB tree, and supports work-efficient parallel batch insertion, deletion, and range queries.
We also implemented our parallel LIS algorithms. Our implementation is lightweight, efficient, and scalable. On input size 10^9, our LIS algorithm outperforms a highly-optimized sequential algorithm (with O(n log k) cost) on inputs with k ≤ 3 × 10^5. Our algorithm is also much faster than the best existing parallel implementation by Shen et al. (2022) on all input instances.
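To make the weighted objective concrete, here is a small sequential sketch. It is not the paper's algorithm: it substitutes a Fenwick (binary indexed) tree over prefix maxima for the vEB tree, and the name `weighted_lis`, the strict-increase convention, and the assumption of positive weights are all choices made for illustration. Each object takes the best total weight among earlier objects with smaller values and adds its own weight.

```python
def weighted_lis(values, weights):
    """Maximum total weight of a strictly increasing subsequence.

    Assumes all weights are positive (illustrative assumption).
    """
    # Coordinate-compress the values to ranks 1..m.
    rank = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    m = len(rank)
    tree = [0] * (m + 1)  # Fenwick tree storing prefix maxima of dp values

    def query(i):
        # Best dp value among ranks 1..i (0 if there are none).
        best = 0
        while i > 0:
            best = max(best, tree[i])
            i -= i & -i
        return best

    def update(i, val):
        # Record that some subsequence ending at rank i has total weight val.
        while i <= m:
            tree[i] = max(tree[i], val)
            i += i & -i

    best_overall = 0
    for v, w in zip(values, weights):
        r = rank[v]
        dp = query(r - 1) + w  # extend the best subsequence ending below v
        update(r, dp)
        best_overall = max(best_overall, dp)
    return best_overall
```

With all weights equal to 1 this reduces to the unweighted LIS length; the prefix-maximum query is the step where a structure with efficient range queries matters.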
- Peyman Afshani and Zhewei Wei. 2017. Independent range sampling, revisited. In European Symposium on Algorithms (ESA). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.
- Reza Akbarinia, Esther Pacitti, and Patrick Valduriez. 2011. Best position algorithms for efficient top-k query processing. Information Systems, Vol. 36, 6 (2011), 973--989.
- Yaroslav Akhremtsev and Peter Sanders. 2016. Fast Parallel Operations on Search Trees. In IEEE International Conference on High Performance Computing (HiPC).
- Muhammad Rashed Alam and M Sohel Rahman. 2013. A divide and conquer approach and a work-optimal parallel algorithm for the LIS problem. Inform. Process. Lett., Vol. 113, 13 (2013), 470--476.
- Stephen F Altschul, Warren Gish, Webb Miller, Eugene W Myers, and David J Lipman. 1990. Basic local alignment search tool. Journal of molecular biology, Vol. 215, 3 (1990), 403--410.
- Nimar S Arora, Robert D Blumofe, and C Greg Plaxton. 2001. Thread scheduling for multiprogrammed multiprocessors. Theory of Computing Systems (TOCS), Vol. 34, 2 (2001), 115--144.
- Michael A Bender, Erik D Demaine, and Martin Farach-Colton. 2000. Cache-oblivious B-trees. In IEEE Symposium on Foundations of Computer Science (FOCS). IEEE, 399--409.
- Jon Louis Bentley and Jerome H Friedman. 1979. Data structures for range searching. Comput. Surveys, Vol. 11, 4 (1979), 397--409.
- Sergei Bespamyatnikh and Michael Segal. 2000. Enumerating longest increasing subsequences and patience sorting. Inform. Process. Lett., Vol. 76, 1--2 (2000), 7--11.
- Guy Blelloch, Daniel Ferizovic, and Yihan Sun. 2022. Joinable Parallel Balanced Binary Trees. ACM Transactions on Parallel Computing (TOPC), Vol. 9, 2 (2022), 1--41.
- Guy E. Blelloch, Daniel Anderson, and Laxman Dhulipala. 2020a. ParlayLib -- a toolkit for parallel algorithms on shared-memory multicore machines. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 507--509.
- Guy E. Blelloch, Daniel Ferizovic, and Yihan Sun. 2016. Just Join for Parallel Ordered Sets. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
- Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Julian Shun. 2012b. Internally deterministic parallel algorithms can be fast. In ACM Symposium on Principles and Practice of Parallel Programming (PPOPP). 181--192.
- Guy E. Blelloch, Jeremy T. Fineman, Yan Gu, and Yihan Sun. 2020b. Optimal parallel algorithms in the binary-forking model. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 89--102.
- Guy E. Blelloch, Jeremy T. Fineman, and Julian Shun. 2012a. Greedy sequential maximal independent set and matching are parallel on average. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
- Guy E. Blelloch and Yan Gu. 2020. Improved Parallel Cache-Oblivious Algorithms for Dynamic Programming. In SIAM Symposium on Algorithmic Principles of Computer Systems (APOCS).
- Guy E. Blelloch, Yan Gu, Julian Shun, and Yihan Sun. 2018. Parallel Write-Efficient Algorithms and Data Structures for Computational Geometry. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
- Guy E. Blelloch, Yan Gu, Julian Shun, and Yihan Sun. 2020c. Parallelism in Randomized Incremental Algorithms. J. ACM, Vol. 67, 5 (2020), 1--27.
- Guy E. Blelloch, Yan Gu, Julian Shun, and Yihan Sun. 2020d. Randomized Incremental Convex Hull is Highly Parallel. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
- Guy E. Blelloch and Margaret Reid-Miller. 1998. Fast Set Operations Using Treaps. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 16--26.
- Robert D. Blumofe and Charles E. Leiserson. 1998. Space-Efficient Scheduling of Multithreaded Computations. SIAM J. on Computing, Vol. 27, 1 (1998).
- Nairen Cao, Shang-En Huang, and Hsin-Hao Su. 2023. Nearly optimal parallel algorithms for longest increasing subsequence. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
- Wun-Tat Chan, Yong Zhang, Stanley PY Fung, Deshi Ye, and Hong Zhu. 2007. Efficient algorithms for finding a longest common increasing subsequence. Journal of Combinatorial Optimization, Vol. 13, 3 (2007), 277--288.
- Y-J Chiang and Roberto Tamassia. 1992. Dynamic algorithms in computational geometry. Proc. IEEE, Vol. 80, 9 (1992), 1412--1434.
- Rezaul A. Chowdhury and Vijaya Ramachandran. 2008. Cache-efficient dynamic programming algorithms for multicores. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM.
- Francisco Claude, J Ian Munro, and Patrick K Nicholson. 2010. Range queries over untangled chains. In International Symposium on String Processing and Information Retrieval. Springer, 82--93.
- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms (3rd edition). MIT Press.
- Maxime Crochemore and Ely Porat. 2010. Fast computation of a longest increasing subsequence and application. Information and Computation, Vol. 208, 9 (2010), 1054--1059.
- Sanjoy Dasgupta, Christos H Papadimitriou, and Umesh Virkumar Vazirani. 2008. Algorithms. McGraw-Hill Higher Education New York.
- Percy Deift. 2000. Integrable systems and combinatorial theory. Notices AMS, Vol. 47 (2000), 631--640.
- Arthur L Delcher, Simon Kasif, Robert D Fleischmann, Jeremy Peterson, Owen White, and Steven L Salzberg. 1999. Alignment of whole genomes. Nucleic acids research, Vol. 27, 11 (1999), 2369--2376.
- Xiaojun Dong, Yan Gu, Yihan Sun, and Yunming Zhang. 2021. Efficient Stepping Algorithms and Implementations for Parallel Shortest Paths. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 184--197.
- David Eppstein, Zvi Galil, and Raffaele Giancarlo. 1988. Speeding up dynamic programming. In IEEE Symposium on Foundations of Computer Science (FOCS). 488--496.
- Manuela Fischer and Andreas Noever. 2018. Tight analysis of parallel randomized greedy MIS. In ACM-SIAM Symposium on Discrete Algorithms (SODA). 2152--2160.
- Michael L Fredman. 1975. On computing the length of longest increasing subsequences. Discrete Mathematics, Vol. 11, 1 (1975), 29--35.
- Zvi Galil and Kunsoo Park. 1992. Dynamic programming with convexity, concavity and sparsity. Theoretical Computer Science (TCS), Vol. 92, 1 (1992), 49--76.
- Zvi Galil and Kunsoo Park. 1994. Parallel algorithms for dynamic programming recurrences with more than O(1) dependency. J. Parallel Distrib. Comput., Vol. 21, 2 (1994), 213--222.
- Paweł Gawrychowski, Shunsuke Inenaga, Dominik Köppl, Florin Manea, et al. 2015. Efficiently Finding All Maximal α-gapped Repeats. arXiv preprint arXiv:1509.09237 (2015).
- Michael T Goodrich and Roberto Tamassia. 2015. Algorithm design and applications. Wiley Hoboken.
- Yan Gu, Zachary Napier, and Yihan Sun. 2022. Analysis of Work-Stealing and Parallel Cache Complexity. In SIAM Symposium on Algorithmic Principles of Computer Systems (APOCS). SIAM, 46--60.
- Yan Gu, Zachary Napier, Yihan Sun, and Letong Wang. 2022b. Parallel Cover Trees and their Applications. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 259--272.
- Yan Gu, Zheqi Shen, Yihan Sun, and Zijin Wan. 2022c. Work-Efficient Parallel Implementations on Longest Increasing Subsequence. https://github.com/ucrparlay/Parallel-LIS.
- Dan Gusfield. 1997. Algorithms on strings, trees, and sequences: Computer science and computational biology. Acm Sigact News, Vol. 28, 4 (1997), 41--60.
- Hoai Phuong Ha, Ngoc Nha Vi Tran, Ibrahim Umar, Philippas Tsigas, Anders Gidenstam, Paul Renaud-Goud, Ivan Walulya, and Aras Atalar. 2014. Models for energy consumption of data structures and algorithms. (2014).
- William Hasenplaugh, Tim Kaler, Tao B. Schardl, and Charles E. Leiserson. 2014. Ordering heuristics for parallel graph coloring. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 166--177.
- James W Hunt and Thomas G Szymanski. 1977. A fast algorithm for computing longest common subsequences. Commun. ACM, Vol. 20, 5 (1977), 350--353.
- Takafumi Inoue, Shunsuke Inenaga, Heikki Hyyrö, Hideo Bannai, and Masayuki Takeda. 2018. Computing longest common square subsequences. In Annual Symposium on Combinatorial Pattern Matching (CPM). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Dagstuhl Publishing.
- Kurt Johansson. 1998. The longest increasing subsequence in a random permutation and a unitary random matrix model. Mathematical Research Letters, Vol. 5, 1 (1998), 68--82.
- Mark T. Jones and Paul E. Plassmann. 1993. A parallel graph coloring heuristic. , Vol. 14, 3 (1993), 654--669.
- Donald E. Knuth. 1973. The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley.
- Arie MCA Koster, Hans L Bodlaender, and Stan PM Van Hoesel. 2001. Treewidth: computational experiments. Electronic Notes in Discrete Mathematics, Vol. 8 (2001), 54--57.
- Peter Krusche and Alexander Tiskin. 2009. Parallel longest increasing subsequences in scalable time and memory. In International Conference on Parallel Processing and Applied Mathematics. Springer, 176--185.
- Peter Krusche and Alexander Tiskin. 2010. New algorithms for efficient parallel string comparison. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 209--216.
- Wei Quan Lim. 2019. Optimal Multithreaded Batch-Parallel 2--3 Trees. arXiv preprint arXiv:1905.05254 (2019).
- Witold Lipski and Franco P Preparata. 1981. Efficient algorithms for finding maximum matchings in convex bipartite graphs and related problems. Acta Informatica, Vol. 15, 4 (1981), 329--346.
- Witold Lipski Jr. 1983. Finding a Manhattan path and related problems. Networks, Vol. 13, 3 (1983), 399--409.
- Takaaki Nakashima and Akihiro Fujiwara. 2002. Parallel algorithms for patience sorting and longest increasing subsequence. In International Conference in Networks, Parallel and Distributed Processing and Applications. 7--12.
- Takaaki Nakashima and Akihiro Fujiwara. 2006. A cost optimal parallel algorithm for patience sorting. Parallel processing letters, Vol. 16, 01 (2006), 39--51.
- Shintaro Narisada, Kazuyuki Narisawa, Shunsuke Inenaga, Ayumi Shinohara, et al. 2017. Computing longest single-arm-gapped palindromes in a string. In International Conference on Current Trends in Theory and Practice of Informatics. Springer, 375--386.
- Ryan O'Donnell and John Wright. [n.d.]. A primer on the statistics of longest increasing subsequences and quantum states. SIGACT News ([n.d.]).
- Xinghao Pan, Dimitris Papailiopoulos, Samet Oymak, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. 2015. Parallel correlation clustering on big graphs. In Advances in Neural Information Processing Systems (NIPS). 82--90.
- Craige Schensted. 1961. Longest increasing and decreasing subsequences. Canadian Journal of Mathematics, Vol. 13 (1961), 179--191.
- David Semé. 2006. A CGM algorithm solving the longest increasing subsequence problem. In International Conference on Computational Science and Its Applications. Springer, 10--21.
- Zheqi Shen, Zijin Wan, Yan Gu, and Yihan Sun. 2022. Many Sequential Iterative Algorithms Can Be Parallel and (Nearly) Work-efficient. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
- Julian Shun, Yan Gu, Guy E. Blelloch, Jeremy T. Fineman, and Phillip B Gibbons. 2015. Sequential random permutation, list contraction and tree contraction are highly parallel. In ACM-SIAM Symposium on Discrete Algorithms (SODA). 431--448.
- Jack Snoeyink. 1992. Two-and Three-Dimensional Point Location in Rectangular Subdivisions. In Scandinavian Workshop on Algorithm Theory (SWAT), Vol. 621. Springer Verlag, 352.
- Yihan Sun and Guy E Blelloch. 2019. Parallel Range, Segment and Rectangle Queries with Augmented Maps. In SIAM Symposium on Algorithm Engineering and Experiments (ALENEX). 159--173.
- Yihan Sun, Daniel Ferizovic, and Guy E Blelloch. 2018. PAM: Parallel Augmented Maps. In ACM Symposium on Principles and Practice of Parallel Programming (PPOPP).
- Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, and Rezaul A Chowdhury. 2015. Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. In ACM Symposium on Principles and Practice of Parallel Programming (PPOPP). 205--214.
- Garcia Thierry, Myoupo Jean-Frédéric, and Semé David. 2001. A work-optimal CGM algorithm for the LIS problem. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 330--331.
- Alexander Tiskin. 2015. Fast distance multiplication of unit-Monge matrices. Algorithmica, Vol. 71, 4 (2015), 859--888.
- Daniel Tomkins, Timmie Smith, Nancy M Amato, and Lawrence Rauchwerger. 2014. SCCMulti: an improved parallel strongly connected components algorithm. ACM Symposium on Principles and Practice of Parallel Programming (PPOPP), Vol. 49, 8 (2014), 393--394.
- Ibrahim Umar, Otto Anshus, and Phuong Ha. 2013. Deltatree: A practical locality-aware concurrent search tree. arXiv preprint arXiv:1312.2628 (2013).
- Peter van Emde Boas. 1977. Preserving order in a forest in less than logarithmic time and linear space. Inform. Process. Lett., Vol. 6, 3 (1977), 80--82.
- Peter van Emde Boas, Robert Kaas, and Erik Zijlstra. 1976. Design and implementation of an efficient priority queue. Mathematical systems theory, Vol. 10, 1 (1976), 99--127.
- Yiqiu Wang, Shangdi Yu, Yan Gu, and Julian Shun. 2021. A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem. In ACM Symposium on Computational Geometry (SoCG).
- Marvin Williams, Peter Sanders, and Roman Dementiev. 2021. Engineering MultiQueues: Fast Relaxed Concurrent Priority Queues. In European Symposium on Algorithms (ESA).
- I-Hsuan Yang, Chien-Pin Huang, and Kun-Mao Chao. 2005. A fast algorithm for computing a longest common increasing subsequence. Inform. Process. Lett., Vol. 93, 5 (2005), 249--253.
- Hongyu Zhang. 2003. Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm. Bioinformatics, Vol. 19, 11 (2003), 1391--1396.
- Tingzhe Zhou, Maged Michael, and Michael Spear. 2019. A Practical, Scalable, Relaxed Priority Queue. In International Conference on Parallel Processing (ICPP). 1--10.
Index Terms
- Parallel Longest Increasing Subsequence and van Emde Boas Trees