DOI: 10.1145/1142473.1142481
Article

Supporting ad-hoc ranking aggregates

Published: 27 June 2006

ABSTRACT

This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.
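As a rough illustration of the contrast drawn in the abstract, the sketch below compares a materialize-group-sort evaluation of a top-k SUM query with a simple form of upper-bound pruning. It is not the paper's algorithm or operators, which the abstract does not spell out. The function names `naive_top_k` and `pruned_top_k`, the parameters `group_sizes` and `max_score`, and the assumptions that scores lie in [0, max_score] and that group sizes are known in advance are all hypothetical choices made for this example.

```python
from collections import defaultdict
import heapq


def naive_top_k(tuples, k):
    """Materialize-group-sort: aggregate every group fully, then sort."""
    sums = defaultdict(float)
    for group, score in tuples:
        sums[group] += score
    # Highest sum first; ties broken by group name for determinism.
    return sorted(sums.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def pruned_top_k(tuples, k, group_sizes, max_score):
    """Skip tuples of groups whose upper bound cannot reach the top k.

    Assumption (not from the paper): every score is in [0, max_score]
    and group_sizes maps each group to its exact tuple count.
    Returns (top-k list, number of tuples actually aggregated).
    """
    partial = {g: 0.0 for g in group_sizes}   # lower bound per group
    remaining = dict(group_sizes)             # tuples not yet aggregated
    pruned = set()
    processed = 0
    for group, score in tuples:
        if group in pruned:
            continue                          # early pruning: skip the tuple
        partial[group] += score
        remaining[group] -= 1
        processed += 1
        if len(partial) >= k:
            # k-th largest lower bound seen so far.
            threshold = heapq.nlargest(k, partial.values())[-1]
            for g in group_sizes:
                upper = partial[g] + remaining[g] * max_score
                if g not in pruned and upper < threshold:
                    pruned.add(g)
    survivors = [(g, s) for g, s in partial.items() if g not in pruned]
    top = sorted(survivors, key=lambda kv: (-kv[1], kv[0]))[:k]
    return top, processed
```

A group is pruned only when its best possible sum is strictly below the k-th largest partial sum, so the surviving groups are aggregated completely and the result matches the naive plan. How much work is saved depends heavily on the order in which tuples arrive, which is the kind of scheduling question the Group-Ranking and Tuple-Ranking Principles address. The threshold is recomputed per tuple here for clarity rather than efficiency.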


      • Published in

        SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data
        June 2006
        830 pages
        ISBN: 1595934340
        DOI: 10.1145/1142473

        Copyright © 2006 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate: 785 of 4,003 submissions, 20%
