Article

Supporting ad-hoc ranking aggregates

Authors:
Chengkai Li, University of Illinois at Urbana-Champaign, Urbana, IL
Kevin Chen-Chuan Chang, University of Illinois at Urbana-Champaign, Urbana, IL
Ihab F. Ilyas, University of Waterloo, Waterloo, Ontario, Canada

SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 2006, Pages 61–72
https://doi.org/10.1145/1142473.1142481

Published: 27 June 2006

ABSTRACT

This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.
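To make the contrast in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm or operators. The function `top_k_groups_naive` illustrates the materialize-group-sort baseline for a SUM aggregate, and `can_prune` illustrates the general idea of upper-bound pruning. The names, the SUM aggregate, the known group sizes, and the bound `max_score` on non-negative per-tuple scores are all assumptions made for this illustration.

```python
import heapq
from collections import defaultdict


def top_k_groups_naive(rows, k):
    """Materialize-group-sort baseline (illustrative).

    rows: iterable of (group, score) pairs.
    Every group is fully aggregated before any ranking happens.
    """
    totals = defaultdict(float)
    for group, score in rows:
        totals[group] += score
    # Rank all materialized groups and keep the k largest sums.
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])


def can_prune(partial_sum, seen, group_size, max_score, kth_best_lower_bound):
    """Upper-bound test (hypothetical helper, assumptions as stated above).

    Assumes each score lies in [0, max_score] and group_size is known.
    The best total the group can still reach is its partial sum plus
    max_score for every tuple not yet seen. If even that cannot beat the
    k-th best guaranteed total, the group cannot be in the top k.
    """
    upper_bound = partial_sum + (group_size - seen) * max_score
    return upper_bound < kth_best_lower_bound


rows = [("a", 1), ("b", 5), ("a", 3), ("c", 2)]
print(top_k_groups_naive(rows, 2))
```

The baseline touches every tuple of every group. A pruning test of this kind lets a processor stop reading tuples of groups that can no longer qualify, which is the sort of early termination the abstract attributes to the Upper-Bound Principle.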


Index Terms

Supporting ad-hoc ranking aggregates
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)


Published in
SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data
June 2006
830 pages
ISBN:1595934340
DOI:10.1145/1142473
General Chairs: Clement Yu (University of Illinois at Chicago), Peter Scheuermann (Northwestern University)
Program Chair: Surajit Chaudhuri (Microsoft Research)
Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
OLAP
aggregate query
decision support
ranking
top-k query processing
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%
