SQL query optimization through nested relational algebra

Published: 01 August 2007
Abstract

Most research work on optimization of nested queries focuses on aggregate subqueries. In this article, we show that existing approaches are not adequate for nonaggregate subqueries, especially for those having multiple subqueries and certain comparison operators. We then propose a new efficient approach, the nested relational approach, based on the nested relational algebra. The nested relational approach treats all subqueries in a uniform manner, being able to deal with nested queries of any type and any level. We report on experimental work that confirms that existing approaches have difficulties dealing with nonaggregate subqueries, and that the nested relational approach offers better performance. We also discuss algebraic optimization rules for further optimizing the nested relational approach and the issue of integrating it into relational database systems.
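To make the idea of handling a nonaggregate subquery through nesting more concrete, here is a minimal Python sketch. It is an illustration only, not the article's algorithm: the helper names (`nest_left_outer`, `select_gt_all`), the sample tables, and the choice of a correlated `> ALL` subquery are all assumptions made for the example, and NULL handling is ignored. Each outer row is paired with the set of matching inner rows (an outer join followed by a nest), and the comparison is then evaluated against that nested set.

```python
from collections import defaultdict


def nest_left_outer(outer, inner, key_outer, key_inner):
    """Pair each outer row with the list of inner rows matching on the key.

    An outer row with no matches keeps an empty list, which mimics an
    outer join followed by a nest operation.
    """
    groups = defaultdict(list)
    for row in inner:
        groups[row[key_inner]].append(row)
    return [(row, groups.get(row[key_outer], [])) for row in outer]


def select_gt_all(nested, outer_attr, inner_attr):
    """Keep outer rows whose attribute exceeds every value in the nested set.

    all() over an empty list is True, matching SQL's "> ALL" on an
    empty subquery result.
    """
    return [row for row, matches in nested
            if all(row[outer_attr] > m[inner_attr] for m in matches)]


# Hypothetical data: employees and the managers of each department.
employees = [
    {"name": "ann", "dept": 1, "salary": 90},
    {"name": "bob", "dept": 1, "salary": 50},
    {"name": "cy", "dept": 2, "salary": 40},
]
managers = [
    {"dept": 1, "salary": 80},
    {"dept": 1, "salary": 60},
]

# Roughly: SELECT name FROM employees e
#          WHERE e.salary > ALL (SELECT m.salary FROM managers m
#                                WHERE m.dept = e.dept)
nested = nest_left_outer(employees, managers, "dept", "dept")
result = [r["name"] for r in select_gt_all(nested, "salary", "salary")]
print(result)
```

Because the nested set is kept alongside each outer row, the same structure could in principle serve other comparison operators (for example `NOT IN` or `> ANY`) by swapping the predicate applied to the set.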
