ABSTRACT
While checking containment of Datalog programs is undecidable, checking whether a Datalog program is contained in a union of conjunctive queries (UCQ), in the context of relational databases, or in a union of conjunctive 2-way regular path queries (UC2RPQ), in the context of graph databases, is decidable. The complexity of these problems is, however, prohibitive: 2exptime-complete. We investigate to what extent restrictions on UCQs and UC2RPQs that are known to reduce the complexity of query containment for these classes yield a more "manageable" single-exponential time bound, which is the norm for several static analysis and verification tasks.
Checking containment of a UCQ Theta' in a UCQ Theta is NP-hard, in general, but better bounds can be obtained if Theta is restricted to belong to a "tractable" class of UCQs, e.g., a class of bounded treewidth or hypertreewidth. Also, each Datalog program Pi is equivalent to an infinite union of CQs. This motivated us to study the question of whether restricting Theta to belong to a tractable class also helps alleviate the complexity of checking whether Pi is contained in Theta.
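To make the UCQ-in-UCQ containment problem concrete, here is a minimal Python sketch of the classical homomorphism test: a CQ q1 is contained in a CQ q2 exactly when there is a homomorphism from q2 to q1 that respects the head, and a UCQ Theta' is contained in a UCQ Theta exactly when every CQ of Theta' is contained in some CQ of Theta. The query encoding (a pair of a head tuple and a list of `(relation, args)` atoms, variables only) and the function names are assumptions made for illustration; they do not come from the paper. The backtracking search is exponential in the worst case, in line with the NP-hardness noted above.

```python
def cq_contained_in(q1, q2):
    """Return True if CQ q1 is contained in CQ q2.

    A CQ is (head, atoms): head is a tuple of variables and atoms is a
    list of (relation_name, args) pairs. Hypothetical encoding, variables only.
    Containment holds iff some homomorphism maps q2 into q1, head to head.
    """
    head1, atoms1 = q1
    head2, atoms2 = q2
    if len(head1) != len(head2):
        return False

    # The homomorphism must send the head of q2 to the head of q1, position by position.
    h = {}
    for v2, v1 in zip(head2, head1):
        if h.setdefault(v2, v1) != v1:
            return False

    def extend(i, h):
        # Try to map atoms2[i:] into atoms1, extending the partial mapping h.
        if i == len(atoms2):
            return True
        rel, args = atoms2[i]
        for rel1, args1 in atoms1:
            if rel1 != rel or len(args1) != len(args):
                continue
            candidate = dict(h)
            if all(candidate.setdefault(a, b) == b for a, b in zip(args, args1)):
                if extend(i + 1, candidate):
                    return True
        return False

    return extend(0, h)


def ucq_contained_in(theta_prime, theta):
    """Theta' is contained in Theta iff each CQ of Theta' is contained in some CQ of Theta."""
    return all(any(cq_contained_in(q1, q2) for q2 in theta) for q1 in theta_prime)


# Example: a path of length two versus a looser pattern over the same edge relation E.
path2 = (("x", "z"), [("E", ("x", "y")), ("E", ("y", "z"))])
loose = (("x", "z"), [("E", ("x", "y")), ("E", ("w", "z"))])
print(ucq_contained_in([path2], [loose]))  # path2 is the more specific query
print(ucq_contained_in([loose], [path2]))
```

The sketch only covers finite unions; for a Datalog program Pi, the union of CQs on the left-hand side is infinite, which is why this direct test does not apply there.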
We study this question in detail and show that the situation is much more delicate than expected: First, tractability of UCQs does not help in general, but further restricting Theta to be acyclic and to have a bounded number of shared variables between atoms yields better complexity bounds. As corollaries, we obtain that checking containment of Pi in Theta is in exptime if Theta is of treewidth one, or if it is acyclic and the arity of the schema is fixed. In the case of UC2RPQs, we show an exptime bound when queries are acyclic and have a bounded number of edges connecting pairs of variables. As a corollary, we obtain that checking whether Pi is contained in a UC2RPQ Gamma is in exptime if Gamma is a strongly acyclic UC2RPQ. Our positive results for UCQs and UC2RPQs are optimal, in a sense, since slightly extending the conditions makes the problem 2exptime-complete.
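As an illustration of the two syntactic restrictions mentioned above, the following Python sketch tests acyclicity of a CQ with the standard GYO reduction and computes the largest number of variables shared by two distinct atoms. The atom encoding and function names are hypothetical, and reading "shared variables between atoms" as the maximum pairwise intersection is an assumption; the paper's formal definition may differ.

```python
from itertools import combinations


def is_acyclic(atoms):
    """GYO reduction on the query hypergraph (one hyperedge per atom).

    atoms is a list of (relation_name, args) pairs (hypothetical encoding).
    The query is acyclic iff the two rules below erase every hyperedge.
    """
    edges = [set(args) for _, args in atoms]
    changed = True
    while changed:
        changed = False
        # Rule 1: drop variables that occur in exactly one hyperedge.
        for e in edges:
            lonely = {v for v in e if sum(v in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: drop one hyperedge that is empty or contained in another one.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges


def max_shared_variables(atoms):
    """Largest number of variables shared by two distinct atoms (assumed reading)."""
    return max(
        (len(set(a) & set(b)) for (_, a), (_, b) in combinations(atoms, 2)),
        default=0,
    )


path = [("E", ("x", "y")), ("E", ("y", "z"))]
triangle = [("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))]
wide = [("R", ("x", "y", "z")), ("S", ("x", "y")), ("T", ("y", "z"))]

print(is_acyclic(path), max_shared_variables(path))
print(is_acyclic(triangle), max_shared_variables(triangle))
print(is_acyclic(wide), max_shared_variables(wide))
```

The `wide` query is acyclic yet shares two variables between atoms, which hints at why acyclicity alone and a bound on shared variables are treated as separate conditions.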