ABSTRACT
We consider the problem of minimizing tree pattern queries (TPQs) that arise in XML and in LDAP-style network directories. In [Minimization of Tree Pattern Queries, Proc. ACM SIGMOD Intl. Conf. Management of Data, 2001, pp. 497-508], Amer-Yahia, Cho, Lakshmanan and Srivastava presented an O(n^4) algorithm for minimizing TPQs in the absence of integrity constraints (Case 1); n is the number of nodes in the query. Then they considered the problem of minimizing TPQs in the presence of three kinds of integrity constraints: required-child, required-descendant and subtype (Case 2). They presented an O(n^6) algorithm for minimizing TPQs in the presence of only required-child and required-descendant constraints (i.e., no subtypes allowed; Case 3). We present O(n^2), O(n^4) and O(n^2) algorithms for minimizing TPQs in these three cases, respectively, based on the concept of graph simulation. We believe that our O(n^2) algorithms for Cases 1 and 3 are runtime optimal.
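As a rough illustration of how graph simulation can drive minimization in Case 1 (no integrity constraints), the Python sketch below computes a simulation relation between the nodes of a small query and deletes any branch that another branch simulates. It is not the O(n^2) algorithm announced in the abstract: the names `Node`, `simulates`, `find_redundant` and `minimize` are hypothetical, the tree encoding is an assumption, and the relation is naively recomputed after every deletion, which is polynomial but far from optimal.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # eq=False keeps identity-based hashing, so nodes can be dict keys
class Node:
    label: str
    edge: str = "/"        # edge from the parent: "/" (child) or "//" (descendant)
    output: bool = False   # marks the single output node of the query
    children: list = field(default_factory=list)

def descendants(u):
    """Yield every proper descendant of u."""
    for c in u.children:
        yield c
        yield from descendants(c)

def size(root):
    """Number of nodes in the query rooted at root."""
    return 1 + sum(1 for _ in descendants(root))

def simulates(u, v, memo):
    """True if the pattern rooted at v can be mapped into the pattern rooted at u."""
    key = (u, v)
    if key in memo:
        return memo[key]
    # Labels must agree, and the output node may only map to the output node.
    ok = u.label == v.label and (u.output or not v.output)
    if ok:
        for w in v.children:
            if w.edge == "/":
                # A child edge must be matched by a child edge of u.
                cands = [c for c in u.children if c.edge == "/"]
            else:
                # A descendant edge may be matched by any proper descendant of u.
                cands = list(descendants(u))
            if not any(simulates(c, w, memo) for c in cands):
                ok = False
                break
    memo[key] = ok
    return ok

def find_redundant(root):
    """Return (parent, child) for one branch simulated by another node, else None."""
    memo = {}
    for x in [root, *descendants(root)]:
        for y in x.children:
            if y.edge == "/":
                witnesses = [z for z in x.children if z is not y and z.edge == "/"]
            else:
                witnesses = [z for z in descendants(x) if z is not y]
            if any(simulates(z, y, memo) for z in witnesses):
                return x, y
    return None

def minimize(root):
    """Delete redundant branches one at a time, recomputing the relation each round."""
    while (hit := find_redundant(root)) is not None:
        parent, child = hit
        parent.children.remove(child)
    return root

# Example: a[//b][/c/b]. The branch //b is implied by /c/b and is removed.
query = Node("a", output=True, children=[
    Node("b", edge="//"),
    Node("c", children=[Node("b")]),
])
print(size(minimize(query)))  # prints 3
```

Because the output flag is part of the simulation test, the branch containing the output node is never deleted in this sketch. Integrity constraints (Cases 2 and 3) are not handled here.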
- S. Abiteboul, P. Buneman and D. Suciu. Data on the Web. Morgan Kaufmann, San Francisco, CA, 2000.
- S. Amer-Yahia, SR. Cho, L. V. S. Lakshmanan and D. Srivastava. Minimization of Tree Pattern Queries, Proc. ACM SIGMOD Intl. Conf. Management of Data, 2001, pp. 497-508.
- B. Bloom and R. Paige. Transformational Design and Implementation of a New Efficient Solution to the Ready Simulation Problem, Science of Computer Programming 24 (1995), pp. 189-220.
- P. Buneman, S. Davidson, M. Fernandez and D. Suciu. Adding Structure to Unstructured Data, Proc. Internat. Conf. Database Theory, 1997, pp. 336-350.
- D. Calvanese, G. De Giacomo and M. Lenzerini. On the Decidability of Query Containment under Constraints, Proc. 17th ACM Symp. Principles of Database Systems, 1998, pp. 149-158.
- D. D. Chamberlin, J. Robie and D. Florescu. Quilt: An XML Query Language for Heterogeneous Data Sources, WebDB 2000.
- A. K. Chandra and P. M. Merlin. Optimal Implementation of Conjunctive Queries in Relational Databases, Proc. 9th ACM Symp. Theory of Computing, 1977, pp. 77-90.
- A. Deutsch, M. Fernandez, D. Florescu, A. Levy and D. Suciu. A Query Language for XML, Intl. WWW Conf., 1999.
- W. Fan and J. Simeon. Integrity Constraints for XML, Proc. 19th ACM Symp. Principles of Database Systems, 2000, pp. 23-34.
- D. Florescu, A. Levy and D. Suciu. Query Containment for Conjunctive Queries with Regular Expressions, Proc. 17th ACM Symp. Principles of Database Systems, 1998, pp. 139-148.
- M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., NY, 1979.
- M. R. Henzinger, T. A. Henzinger and P. W. Kopke. Computing Simulations on Finite and Infinite Graphs, Proc. IEEE Symp. Foundations of Computer Science, 1995, pp. 453-462.
- T. Howes, M. Smith and G. S. Wood. Understanding and Deploying LDAP Directory Services. MacMillan Technical Publishing, Indianapolis, 1999.
- H. V. Jagadish, L. V. S. Lakshmanan, T. Milo, D. Srivastava and D. Vista. Querying Network Directories, Proc. ACM SIGMOD Intl. Conf. Management of Data, 1999.
- D. Maier, A. O. Mendelzon and Y. Sagiv. Testing Implications of Data Dependencies, ACM Trans. Database Systems 4 (1979), pp. 455-469.
- G. Miklau and D. Suciu. Containment and Equivalence for an XPath Fragment, Proc. 21st ACM Symp. Principles of Database Systems, 2002.
- Y. Papakonstantinou and V. Vianu. DTD Inference for Views of XML Data, Proc. 19th ACM Symp. Principles of Database Systems, 2000, pp. 35-46.
- P. Ramanan. Inferring DTDs for Views of XML Data, Tech. Rep. WSUCS-01-1, Comp. Sci. Dept, Wichita State Univ, August 2001.
- J. D. Ullman. Principles of Database and Knowledge Base Systems, Vol. I & II. Computer Science Press, Maryland, 1989.
- P. T. Wood. Optimizing Web Queries Using Document Type Definitions, Proc. 2nd ACM CIKM Intl. Workshop on Web Information and Data Management, 1999, pp. 28-32.
- P. T. Wood. On the Equivalence of XML Patterns, Proc. 1st Intl. Conf. Computational Logic, Lecture Notes in Artificial Intelligence 1861, pp. 1152-1166, Springer Verlag, New York, 2000.
- P. T. Wood. Rewriting XQL Queries on XML Repositories, Proc. 17th British National Conf. on Databases, Lecture Notes in Computer Science 1832, pp. 209-226, Springer Verlag, New York, 2000.
- P. T. Wood. Minimising Simple XPath Expressions, WebDB 2001.
- World Wide Web Consortium. XML Path Language (XPath), W3C Recommendation, Version 1.0, November 1999. See http://www.w3.org/TR/xpath.
- World Wide Web Consortium. XQuery 1.0: An XML Query Language, W3C Recommendation, Version 1.0, December 2001. See http://www.w3.org/TR/xquery.
Index Terms
- Efficient algorithms for minimizing tree pattern queries