ABSTRACT
The skyline of a set of multi-dimensional points (tuples) consists of those points for which no clearly better point exists in the given set, using component-wise comparison on domains of interest. Skyline queries, i.e., queries that involve computation of a skyline, can be computationally expensive, so it is natural to consider parallelized approaches which make good use of multiple processors. We approach this problem by using hyperplane projections to obtain useful partitions of the data set for parallel processing. These partitions not only ensure small local skyline sets, but enable efficient merging of results as well. Our experiments show that our method consistently outperforms similar approaches for parallel skyline computation, regardless of data distribution, and provides insights on the impacts of different optimization strategies.
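The partition, local skyline, and merge pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes smaller values are better in every dimension and that all coordinates are non-negative. The function `partition_by_projection` is a hypothetical stand-in: it projects each point onto the hyperplane x_1 + ... + x_d = 1 and buckets by the first projected coordinate, which may differ from the scheme the authors actually use. The local skylines are computed sequentially here; in a parallel setting each partition would go to its own processor.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive quadratic skyline: keep points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def partition_by_projection(points: Sequence[Point], k: int) -> List[List[Point]]:
    """Hypothetical partitioning (not taken from the paper): project each
    point centrally onto the hyperplane x_1 + ... + x_d = 1 and assign it
    to one of k equal-width buckets on the first projected coordinate.
    Assumes non-negative coordinates."""
    parts: List[List[Point]] = [[] for _ in range(k)]
    for p in points:
        total = sum(p)
        frac = p[0] / total if total > 0 else 0.0
        idx = min(int(frac * k), k - 1)  # clamp frac == 1.0 into the last bucket
        parts[idx].append(p)
    return parts


def partitioned_skyline(points: Sequence[Point], k: int) -> List[Point]:
    """Compute a local skyline per partition, then merge by taking the
    skyline of the union of the local results."""
    local_results = [skyline(part) for part in partition_by_projection(points, k)]
    candidates = [p for part in local_results for p in part]
    return skyline(candidates)


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
    print(sorted(partitioned_skyline(data, 2)))
```

The merge step is correct for any partitioning, because every global skyline point also appears in the local skyline of its own partition. The partitioning strategy therefore affects only how small the local skylines are and how cheap the merge is, which is the aspect the abstract says the hyperplane projections are meant to improve.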
Index Terms
- Efficient parallel skyline processing using hyperplane projections