DOI: 10.1145/1007568.1007709
Article

XSeq: an indexing infrastructure for tree pattern queries

Published: 13 June 2004

ABSTRACT

Given a tree-pattern query, most XML indexing approaches decompose it into multiple sub-queries and then join their results to answer the original query. Join operations have been identified as the most time-consuming component of XML query processing. XSeq is an XML indexing infrastructure that makes tree patterns a first-class citizen in XML query processing. Unlike most indexing methods, which directly manipulate tree structures, XSeq builds its indexing infrastructure on a much simpler data model: sequences. That is, we represent both XML data and XML queries by structure-encoded sequences. We have shown that this new data representation preserves query equivalence and, more importantly, that through subsequence matching, structured queries can be answered directly without resorting to expensive join operations. Moreover, the XSeq infrastructure unifies indices on both the content and the structure of XML documents, and hence achieves an additional performance advantage over methods that index only content or only structure, or that index them separately.
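
As a rough illustration of the idea in the abstract, the Python sketch below encodes a small XML tree as a preorder sequence of (tag, path prefix) pairs and tests a query by plain subsequence matching. The encoding, the function names `encode` and `is_subsequence`, and the sample documents are assumptions made for illustration. They are not taken from the paper, and naive subsequence matching on such an encoding is not claimed to give the query-equivalence guarantee the abstract describes.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return a preorder list of (tag, path-prefix) pairs for an XML string.

    Hypothetical, simplified encoding: the prefix is the tuple of ancestor
    tags from the root down to the node's parent.
    """
    root = ET.fromstring(xml_text)
    seq = []

    def visit(node, prefix):
        seq.append((node.tag, prefix))
        for child in node:
            visit(child, prefix + (node.tag,))

    visit(root, ())
    return seq


def is_subsequence(query, data):
    """True if the items of `query` occur in `data` in order, gaps allowed."""
    it = iter(data)
    # `item in it` advances the iterator until it finds `item` (or runs out),
    # so later items can only match at later positions.
    return all(item in it for item in query)


# Illustrative document and tree-pattern query (made-up tags).
doc = "<P><S><N/><I><M/></I></S><B><L/></B></P>"
query = "<P><S><I/></S><B/></P>"

data_seq = encode(doc)
query_seq = encode(query)
matched = is_subsequence(query_seq, data_seq)
print(matched)
```

In this sketch the branching query is checked with a single in-order scan of the data sequence, with no per-branch sub-queries and no join step, which is the contrast the abstract draws with decomposition-based approaches.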

Published in

SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data
June 2004
988 pages
ISBN: 1581138598
DOI: 10.1145/1007568

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
