ABSTRACT
Computing the Euclidean minimum spanning tree (EMST) is a computationally demanding step of many algorithms. While work-efficient serial and multithreaded algorithms for computing the EMST are known, designing an efficient GPU algorithm is challenging due to a complex branching structure, data dependencies, and load imbalances. In this paper, we propose a single-tree Borůvka-based algorithm for computing the EMST on GPUs. We use an efficient nearest neighbor algorithm and reduce the number of required distance calculations by avoiding the traversal of subtrees whose leaf nodes all belong to the same component. The developed algorithms are implemented in a performance-portable way using ArborX, an open-source geometric search library based on the Kokkos framework. We evaluate the proposed algorithm on various 2D and 3D datasets and compare it with the current state-of-the-art open-source CPU implementations. We demonstrate a 4-24× speedup over the fastest multithreaded implementation. We demonstrate the portability of our implementation by providing results on a variety of hardware: AMD EPYC 7763, Nvidia A100, and AMD MI250X. We show the scalability of the implementation by computing the EMST of a 3D cosmological dataset with 37 million points in under 0.5 seconds on a single Nvidia A100 GPU.
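To make the Borůvka structure mentioned in the abstract concrete, the following is a minimal CPU sketch in Python. It is not the paper's GPU implementation and does not use ArborX or Kokkos: a brute-force scan over all points stands in for the tree-based nearest neighbor search, and the function name `emst_boruvka` is hypothetical. The same-component check is applied per point here, whereas the abstract describes skipping whole subtrees.

```python
import math


def emst_boruvka(points):
    """Return EMST edges (i, j) for a list of coordinate tuples.

    Brute-force Boruvka sketch: in each round, every component finds its
    shortest edge to a point in a different component, then the
    components joined by those edges are merged.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = []
    num_components = n
    while num_components > 1:
        # best[c] = (distance, i, j): shortest outgoing edge of component c.
        best = {}
        for i in range(n):
            ci = find(i)
            for j in range(n):
                # Skip points in the same component (assumption: a per-point
                # check replaces the subtree pruning described in the text).
                if find(j) == ci:
                    continue
                # Including the endpoint indices breaks distance ties
                # consistently across components.
                cand = (math.dist(points[i], points[j]), min(i, j), max(i, j))
                if ci not in best or cand < best[ci]:
                    best[ci] = cand
        for _, i, j in best.values():
            ri, rj = find(i), find(j)
            if ri != rj:  # two components may have picked the same edge
                parent[ri] = rj
                edges.append((i, j))
                num_components -= 1
    return edges
```

For the four points (0, 0), (1, 0), (3, 0), (3, 4), the sketch returns three edges with total length 7. Each round costs quadratic time in this form; the point of a tree-based search is to replace the inner loop with a pruned traversal.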