Abstract
Most research work on optimization of nested queries focuses on aggregate subqueries. In this article, we show that existing approaches are not adequate for nonaggregate subqueries, especially for those having multiple subqueries and certain comparison operators. We then propose a new efficient approach, the nested relational approach, based on the nested relational algebra. The nested relational approach treats all subqueries in a uniform manner, being able to deal with nested queries of any type and any level. We report on experimental work that confirms that existing approaches have difficulties dealing with nonaggregate subqueries, and that the nested relational approach offers better performance. We also discuss algebraic optimization rules for further optimizing the nested relational approach and the issue of integrating it into relational database systems.
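To make the idea of treating subqueries uniformly more concrete, the following is a minimal sketch and not the article's algorithm. It assumes a hypothetical correlated query of the form `SELECT * FROM emp e WHERE e.salary > ALL (SELECT b.amount FROM bonus b WHERE b.dept = e.dept)`. The table names, column names, and the helper functions `nest_join`, `select_all`, and `select_any` are all illustrative assumptions, and NULL handling is ignored. Each outer row is first paired with the (possibly empty) set of correlated inner values, and the comparison operator is then applied to that set.

```python
import operator
from collections import defaultdict


def nest_join(outer, inner, outer_key, inner_key, inner_attr):
    """Pair every outer row with the list of correlated inner values.

    Outer rows without a match keep an empty list, so no row is lost
    (this plays the role of an outer join followed by a nest).
    """
    groups = defaultdict(list)
    for row in inner:
        groups[row[inner_key]].append(row[inner_attr])
    return [(row, groups.get(row[outer_key], [])) for row in outer]


def select_all(nested, outer_attr, op):
    """Keep outer rows whose attribute satisfies op against ALL nested values."""
    return [row for row, values in nested
            if all(op(row[outer_attr], v) for v in values)]


def select_any(nested, outer_attr, op):
    """Keep outer rows whose attribute satisfies op against SOME nested value."""
    return [row for row, values in nested
            if any(op(row[outer_attr], v) for v in values)]


# Hypothetical data, for illustration only.
emp = [
    {"name": "ann", "dept": 1, "salary": 50},
    {"name": "bob", "dept": 1, "salary": 20},
    {"name": "cy", "dept": 2, "salary": 10},
]
bonus = [
    {"dept": 1, "amount": 30},
    {"dept": 1, "amount": 40},
]

nested = nest_join(emp, bonus, "dept", "dept", "amount")
gt_all = [r["name"] for r in select_all(nested, "salary", operator.gt)]
gt_any = [r["name"] for r in select_any(nested, "salary", operator.gt)]
print(gt_all)  # ['ann', 'cy']
print(gt_any)  # ['ann']
```

In this sketch the same nested intermediate result serves both the `ALL` and the `ANY` comparison; only the final selection differs. The row for `cy` qualifies under `ALL` because its nested set is empty, which matches the SQL semantics of `ALL` over an empty subquery result.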