Faster output-sensitive parallel algorithms for 3D convex hulls and vector maxima

https://doi.org/10.1016/S0743-7315(03)00035-2

Abstract

In this paper we focus on the problem of designing very fast parallel algorithms for the convex hull and the vector maxima problems in three dimensions that are output-size sensitive. Our algorithms achieve O(log log² n · log h) parallel time and optimal O(n log h) work with high probability in the CRCW PRAM, where n and h are the input and output sizes, respectively. These bounds are independent of the input distribution and are faster than those of the previously known algorithms. We also present an optimal speed-up (with respect to the input size only) sublogarithmic time algorithm that uses a super-linear number of processors for vector maxima in three dimensions.

Introduction

The study of algorithms is mainly concerned with developing algorithms for well-defined problems that are close to the best possible (for the problem at hand). The most common measure of the performance of an algorithm is its running time. For any non-trivial problem, the running time of an algorithm increases monotonically with the size of the input. Hence the efficiency of an algorithm is usually measured as a function of the input size. However, for many geometric problems, additional parameters like the size of the output capture the complexity of the problem more accurately, enabling us to design superior algorithms. Algorithms whose running times are sensitive to the output size have been actively pursued for problems like convex hulls [Cha95], [CM92], [CS89], [Mat92], [Sei86].

The primary objective of designing parallel algorithms is to obtain very fast running times while keeping the total work (the processor-time product) close to that of the best sequential algorithms. For problems for which output-size sensitive sequential algorithms are known, it becomes especially important to have similar parallel algorithms. Unless we design parallel algorithms that are output-sensitive, the corresponding (output-sensitive) sequential algorithm may be faster for many instances. (We will often use the shorter term output-sensitive instead of output-size sensitive.)

The task of designing output-size sensitive algorithms on parallel machines is even more challenging than designing their sequential counterparts. Not only is the output size an unknown parameter, but we also have to rapidly eliminate input points that do not contribute to the final output without incurring a high cost. The divide-and-conquer method is not directly effective, as it cannot divide the output evenly; herein lies the crux of the difficulty in designing ‘fast’ output-sensitive algorithms. By ‘fast’ we mean O(log h) or something very close, where h is the output size. The sequential output-size sensitive algorithms often use techniques like gift wrapping or incremental construction, which are inherently sequential.

The parallel random access machine (PRAM) has been the most popular model for designing algorithms because of its close relationship with the sequential models. For example, if S(n) is the best-known sequential time complexity for input size n, then we aim for a parallel algorithm with P(n) processors and running time T(n) so as to minimize T(n) subject to keeping the product P(n)·T(n) close to O(S(n)). A parallel algorithm that actually does total work O(S(n)) is called a work-optimal algorithm. If, in addition, one can also match the time lower bound, then the algorithm is the best possible (theoretically).

The fastest possible time bound clearly depends on the parallel machine model. For example, in the concurrent-read exclusive-write (CREW) PRAM model, the convex hull cannot be constructed faster than Ω(log n) time, because of an Ω(log n) lower bound for computing the maximum (minimum). Note that this bound is independent of the output size. However, this bound is not applicable to the CRCW model. For the parallel algebraic decision-tree model, Sen [Sen97] has obtained a trade-off between the number of processors and the possible speed-up for a wide range of problems in computational geometry. For convex hulls, the following is known.

Lemma 1.1 Sen [Sen97]

Any randomized algorithm in the parallel bounded-degree decision-tree model for constructing the convex hull of n points with output size h has a parallel time bound of Ω(log h / log k) using kn processors, k > 1, in the worst case.

In other words, for a super-linear number of processors, a proportional speed-up is not achievable, and hence such parallel algorithms cannot be considered efficient. The best that one can hope for under the circumstances is an algorithm that achieves O(log h) time using n processors.

Our algorithms are designed for the CRCW PRAM model. Although we always work in a Euclidean space E^d, we are mainly interested in the combinatorial complexity of the algorithm. We assume that a single memory location can hold a real number and that a processor can perform an arithmetic operation involving real numbers in a single step. In reality there may be numerical errors due to finite word lengths, which we ignore; this is a non-trivial issue and deserves a separate treatment. This model is consistent with the model used for sequential computational geometry, the ‘Real’-RAM. For our randomized algorithms, we further assume that processors have access to O(log n) random bits in constant time.

Convex polytopes are very important objects in computational geometry and, in a large number of cases, give rise to very efficient algorithms because of their defining property, convexity.

Definition 1.1

Given a set S = {p_1, p_2, …, p_n} of n points in E^d (the Euclidean d-dimensional space), the convex hull of S is the smallest convex polytope containing all the points of S. The convex hull problem is to determine the ordered list CH(S) of the points of S defining the boundary of the convex hull of S.

In two dimensions, the ‘ordering’ in the output is simply the clockwise (or counter-clockwise) ordering of the vertices of the hull. In three dimensions it can be the cyclic ordering of the hull edges around each hull vertex.
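For concreteness, the two-dimensional output convention can be illustrated by the following short sketch (ours, using the standard monotone-chain method, not the algorithm of this paper), which returns the hull vertices in counter-clockwise order:

    # A minimal 2D illustration of the output convention, via Andrew's
    # monotone chain (a standard textbook method, not this paper's
    # algorithm). Input: a list of (x, y) tuples.
    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def convex_hull_2d(points):
        pts = sorted(set(points))
        if len(pts) <= 2:
            return pts
        lower, upper = [], []
        for p in pts:                  # lower chain, left to right
            while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
                lower.pop()
            lower.append(p)
        for p in reversed(pts):        # upper chain, right to left
            while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
                upper.pop()
            upper.append(p)
        # concatenation gives the hull vertices in counter-clockwise order
        return lower[:-1] + upper[:-1]

For example, convex_hull_2d([(0, 0), (2, 0), (1, 1), (2, 2), (0, 2)]) returns [(0, 0), (2, 0), (2, 2), (0, 2)].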

The problem of constructing convex hulls has attracted a great deal of attention since the inception of computational geometry. In three dimensions, Preparata and Hong [PH77] described the first O(n log n) time algorithm. Clarkson and Shor [CS89] presented the first (randomized) optimal output-sensitive algorithm, which ran in O(n log h) expected time, where h is the number of hull vertices. Their algorithm was subsequently derandomized optimally by Chazelle and Matoušek [CM92]. Chan [Cha95] presented a very elegant approach for output-sensitive construction of convex hulls using ray shooting that achieves optimal Θ(n log h) running times in dimensions two and three. In higher dimensions, the quest is still on to design optimal output-sensitive algorithms. Recently, Chan [Cha95] designed an output-sensitive algorithm for convex hulls in d dimensions which runs in O(n log^{O(1)} h + h^⌊d/2⌋) time. Seidel [Sei86] computes F faces of the convex hull of n points in a fixed dimension d in O(n² + F log n) time. This can be slightly improved to O(n^{2−2/(⌊d/2⌋+1)+ε} + F log n), for any fixed ε > 0, using a technique of Matoušek [Mat92]. In a recent paper, Amato and Ramos [AR96] have shown that the convex hull of n points in R^{d+1}, 2 ⩽ d ⩽ 4, can be computed in O((n+F) log^{d−1} F) time.

In the context of parallel algorithms for convex hulls in three dimensions (3-D hulls), Chow [Cho80] described an O(log³ n) time algorithm using n CREW processors. An O(log² n log* n) time algorithm using n processors was obtained by Dadoun and Kirkpatrick [DK89], and an O(log² n) time algorithm was designed by Amato and Preparata [AP92]. Reif and Sen [RS92] presented an O(log n) time and O(n log n) work randomized algorithm, which was the first known worst-case optimal algorithm for three-dimensional hulls in the CREW model. Their algorithm deals with the dual equivalent of the convex hull problem, namely the intersection of half-spaces in three dimensions. It is based on divide-and-conquer and works in O(log n) phases, pruning away the redundant half-spaces and keeping the work linear in each phase. This algorithm was derandomized by Goodrich [Goo93], who obtained an O(log² n) time, O(n log n) work method for the EREW PRAM. In the context of output-size sensitive algorithms, one faces the problem of dividing the output points evenly so that one can finish in O(log h) phases, keeping the work linear in each phase, especially when the output size is not known. In [GS97], the authors have shown that convex hulls in two dimensions can be constructed in O(log log n · log h) time with optimal work with high probability. In three dimensions, Goodrich and Ghouse [GG91] described an O(log² n) expected time, O(min{n log² h, n log n}) work method, which is output-sensitive but not work-optimal. More recently, Amato et al. [AGR94] gave a deterministic O(log³ n) time, O(n log h) work algorithm for convex hulls in R³ on the EREW PRAM.

In higher dimensions, Amato et al. [AGR94] have shown that the convex hull of n points in R^d can be constructed in O(log n) time with O(n log n + n^⌊d/2⌋) work with high probability, and in O(log n) time with O(n^⌊d/2⌋ log^{c(⌈d/2⌉−⌊d/2⌋)} n) work deterministically, where c > 0 is a constant.

In this paper, we present a randomized algorithm for convex hulls in three dimensions that is faster for all output sizes while maintaining the optimal output-sensitive work bound. The algorithm exploits an observation of Clarkson and Shor [CS89], namely the iterative pruning of non-extreme points. We also make use of a number of sophisticated techniques, like bootstrapping and super-linear-processor parallel algorithms for convex hulls [Sen97], combined with a very fine-tuned analysis.

Let T be a set of vectors in R^d. The partial order ⩽_M is defined for two vectors x = (x_1, x_2, …, x_d) and y = (y_1, y_2, …, y_d) as follows: x ⩽_M y (x is dominated by y) if and only if x_i ⩽ y_i for all 1 ⩽ i ⩽ d.

Definition 1.2

A vector v ∈ T is said to be maximal if it is not dominated by any other vector w ∈ T. The problem of maximal vectors is to determine all maximal vectors in a given set of input vectors.
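As a direct illustration of this definition (our own sketch; the paper's algorithms are, of course, far more efficient), a brute-force O(n²d) method simply tests every vector against every other:

    # Brute-force maxima straight from Definition 1.2 (an O(n^2 d)
    # illustration only; vectors are tuples of numbers).
    def dominates(y, x):
        # True iff x <=_M y with x != y, i.e. x_i <= y_i for all i
        return x != y and all(xi <= yi for xi, yi in zip(x, y))

    def maxima(vectors):
        return [v for v in vectors
                if not any(dominates(w, v) for w in vectors)]

For instance, maxima([(1, 2, 3), (3, 2, 1), (1, 1, 1)]) returns [(1, 2, 3), (3, 2, 1)], since (1, 1, 1) is dominated by both of the other vectors.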

Relatively little work has been done in the context of parallel algorithms for the vector maxima problem. Efficient sequential algorithms are known for two and three dimensions (the O(n log n)-time algorithm is worst-case optimal). However, for h ∈ o(log n) (h is the number of maximal vectors), a better (O(nh) time) algorithm can easily be obtained in two dimensions by finding the point with maximum y-coordinate, deleting all the points dominated by it, and repeating on the reduced problem; a sketch follows this paragraph. This algorithm takes linear time for constant h. Kirkpatrick and Seidel presented an O(n log h) time algorithm and showed that the bound is tight with respect to the input and output sizes [KS86]. In the context of parallel algorithms for two dimensions, an O(log n) time optimal algorithm can be obtained easily by using any (n, log n) sorting algorithm followed by a straightforward divide-and-conquer. The best known result for three-dimensional maxima is O(log n) time and O(n log n) operations, due to Atallah et al. [ACG89], who used parallel merging techniques. Using the techniques of [GS97], an O(log log n · log h) time, optimal-work algorithm can be obtained for vector maxima in two dimensions as well. We present an algorithm for vector maxima in three dimensions.
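The simple output-sensitive method just described can be sketched as follows (our own rendering):

    # The simple O(nh)-time 2D maxima method described above: each
    # round reports one maximal point (the remaining point with the
    # largest y, ties broken by larger x) and deletes everything it
    # dominates, so the number of rounds equals the output size h.
    def maxima_2d(points):
        remaining = list(points)
        result = []
        while remaining:
            top = max(remaining, key=lambda p: (p[1], p[0]))
            result.append(top)
            # every remaining point has y <= top's y, so a point is
            # dominated by top exactly when its x-coordinate is <= top's
            remaining = [p for p in remaining if p[0] > top[0]]
        return result

Each iteration costs O(n) and reports one maximal point, so the total time is O(nh); in particular it is linear for constant h.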

Our algorithms are randomized; they always provide correct output, and the running-time bounds hold with high probability, independent of the input distribution. Such algorithms are often referred to as Las Vegas algorithms. The term high probability means probability exceeding 1 − 1/n^c for any predetermined constant c, where n is the input size. We will use the notation Õ instead of O to denote that a bound holds with high probability.

Our algorithms achieve O(log log² n · log h) time with optimal O(n log h) work with high probability. This is one of the first non-trivial applications of super-linear-processor algorithms in computational geometry to obtain speed-up in a situation where initially there is no processor advantage. Our work establishes a close connection between fast output-sensitive parallel algorithms and super-linear-processor algorithms. Consequently, our algorithms become increasingly faster than the previous algorithms as the output size decreases. We are not aware of any previous work where parallel algorithms speed up optimally with the output size in the sublogarithmic time domain.

We also present an optimal speed-up (with respect to the input size only) sublogarithmic time algorithm that uses a super-linear number of processors for vector maxima in three dimensions.


Some known useful results

Definition 2.1

For all n, m ∈ N and λ ⩾ 1, the m-color semi-sorting problem of size n and with slack λ is defined as follows: Given n integers (colors) x_1, …, x_n in the range 0, …, m, compute n non-negative integers y_1, …, y_n (the placement of x_i) such that (a sequential sketch is given after this definition):

  • (1)

    All the x_i of the same color are placed in contiguous locations (not necessarily consecutive).

  • (2)

    max{y_j : 1 ⩽ j ⩽ n} = O(λn).
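For intuition, a sequential sketch of semi-sorting follows (ours; the point of the parallel primitive is precisely that it can be realized much faster than a sequential pass, and the cited parallel routines are what the algorithms actually use):

    # Sequential sketch of m-color semi-sorting (for intuition only).
    # Buckets the indices by color, then assigns each color a
    # contiguous block of placements; here the slack is lambda = 1.
    from collections import defaultdict

    def semi_sort(colors):
        buckets = defaultdict(list)       # color -> indices of that color
        for i, c in enumerate(colors):
            buckets[c].append(i)
        placement = [0] * len(colors)
        nxt = 0
        for c, idxs in buckets.items():   # one contiguous block per color
            for i in idxs:
                placement[i] = nxt
                nxt += 1
        return placement                  # max placement <= n - 1 = O(λn)

For example, semi_sort([2, 0, 2, 1]) returns [0, 2, 1, 3]: the two elements of color 2 receive the contiguous placements 0 and 1.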

Definition 2.2

For all n ∈ N, the interval allocation problem is defined as follows: Given n non-negative integers x_1, …, x_n, compute n non-negative integers y_1, …, y_n (the

Brief overview of the algorithm for convex hulls

The problem of constructing the convex hull of points in three dimensions is well known to be equivalent to the problem of finding the intersection of half-spaces. Here we give an algorithm for the latter which implies a solution for the former.
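The duality underlying this equivalence is standard; one common formulation (a textbook version, not necessarily the exact transform used in this paper) is the following:

    % Point-plane duality in three dimensions (a standard textbook
    % formulation; the paper may use a different but equivalent map).
    \[
      p = (a, b, c) \quad\longleftrightarrow\quad
      p^{*}:\; z = a x + b y - c .
    \]
    % The map preserves above/below relations: p lies above q* if and
    % only if q lies above p*. Hence the upper hull of a point set
    % corresponds to the lower envelope of the dual planes, i.e. the
    % intersection of the half-spaces below them (and symmetrically
    % for the lower hull), so the two problems are equivalent.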

Let us denote the input set of half-spaces by S and their intersection by P(S). We construct the intersection P(R) of a random sample R of r half-spaces and filter out the redundant half-spaces, i.e., the half-spaces that do not contribute to P(S).

Random sampling lemma

Let H(R) denote the set of cones induced by a sample R and let H*(R) denote the set of critical cones. We will denote the set of half-spaces intersecting a cone △ ∈ H(R) by L(△) and its cardinality |L(△)| by l(△). L(△) will also be referred to as the conflict list of △, and l(△) as its conflict size. We will use the following results related to bounding the size of the reduced problem.

Lemma 4.1

Clarkson and Shor [CS89], Rajasekaran and Sen [RS93]

For some suitable constant k and large n,

    Pr[ ∑_{△ ∈ H(R)} l(△) ⩾ kn ] ⩽ 1/4,

where the probability is taken over all possible choices of the sample R.

Algorithm

We give below a general algorithm for both the convex hull and the vector maxima problems. In later sections we describe how each step is carried out in the context of each specific problem, and then give a combined analysis.

Let S be the input set of n objects. The objects are half-spaces for the half-space intersection (convex hull) problem, and vectors (points) for the vector maxima problem. The algorithm is iterative; a schematic sketch is given below. Let n_i (respectively r_i) denote the size of the problem
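The overall iterative structure can be summarized by the following schematic (our own sketch; build_structure, find_conflicts, is_critical and brute_force are hypothetical placeholders for the problem-specific subroutines developed in the next sections, and the constants are illustrative rather than the paper's choices):

    import random

    # Schematic of the iterative sample-and-prune framework (a sketch).
    def sample_and_prune(S, build_structure, find_conflicts,
                         is_critical, brute_force, cutoff=32):
        active = list(S)
        while len(active) > cutoff:
            r = max(2, int(len(active) ** 0.25))   # sample of size ~ n_i^ε
            R = random.sample(active, r)
            regions = build_structure(R)           # cones / regions of P(R)
            conflicts = {t: find_conflicts(active, t) for t in regions}
            critical = [t for t in regions
                        if is_critical(t, conflicts[t])]
            # prune: keep only objects that conflict with a critical region
            keep = set()
            for t in critical:
                keep.update(conflicts[t])
            active = [s for s in active if s in keep]
        return brute_force(active)

The work bound hinges on the conflict lists having total size O(n_i) (Lemma 4.1) and on the pruning step reducing the problem size rapidly.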

Convex hulls

In order to get a fast algorithm we must be able to determine the intersections of the half-spaces with the cones and determine the critical cones quickly.

Vector maxima

Let S be the set of input vectors. We pick a random sample R of the input vectors, compute its maxima, and define regions. We say that a vector conflicts with a region if it can potentially dominate the vectors in that region. We determine the critical regions, i.e., the regions containing an output vector, delete the vectors not conflicting with any critical region, and iterate on the reduced problem. As in the case of convex hulls, in order to get a fast algorithm we must be able to detect

Analysis

Assume that h = O(n^δ) for some δ between 0 and 1, for otherwise the problem can be solved in O(∑ log r_i) = O(log n) = O(log h) time with O(n log h) work.
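These equalities can be seen as follows (a short justification we add for completeness, assuming the usual doubling schedule r_{i+1} = r_i² for the sample sizes; the paper's exact schedule may differ):

    % With r_{i+1} = r_i^2 we have log r_i = 2^i log r_0, so the sum
    % grows geometrically and is dominated by its last term:
    \[
      \sum_{i=0}^{k} \log r_i \;=\; \log r_0 \sum_{i=0}^{k} 2^{i}
      \;\le\; 2^{k+1} \log r_0 \;=\; 2 \log r_k \;=\; O(\log n).
    \]
    % Moreover, if h = \Omega(n^{\delta}) for some fixed \delta > 0,
    % then \log n \le (1/\delta) \log h, so O(\log n) = O(\log h).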

Since r_i = O(n^ε), ε < 1, Step 2 can be done in constant time by a brute-force method followed by sorting with r_i² processors. Step 3(a)i (finding conflicts) can be done in O(log r_i / log(p/n_i)) time using p processors, by Lemma 6.2 for convex hulls and by Lemma 7.1 for vector maxima. An application of semi-sorting then gives us the set of input objects

Sublogarithmic algorithm for 3D maxima

In this section we present a sublogarithmic time algorithm that achieves optimal speed-up (with respect to the input size only) and uses a super-linear number of processors for the vector maxima problem in three dimensions.

References (25)

  • B. Chazelle, J. Matoušek, Derandomizing an output-sensitive convex-hull algorithm in three dimensions, Tech. Report,...
  • A. Chow, Parallel algorithms for geometric problems, Ph.D. Thesis, University of Illinois, Urbana-Champaign,...

Some of the results of this paper appeared in a preliminary version [GS96] in the Twelfth Annual Symposium on Computational Geometry, 1996, Philadelphia, USA.
