XIRQL: An XML query language based on information retrieval concepts

Published: 01 April 2004
Abstract

XIRQL ("circle") is an XML query language that incorporates imprecision and vagueness for both structural and content-oriented query conditions. The corresponding uncertainty is handled by a consistent probabilistic model. The core features of XIRQL are (1) document ranking based on index term weighting, (2) specificity-oriented search for retrieving the most relevant parts of documents, (3) datatypes with vague predicates for dealing with specific types of content, and (4) structural vagueness for vague interpretation of structural query conditions. A XIRQL database may contain several classes of documents, where all documents in a class conform to the same DTD; links between documents are also supported. XIRQL queries are translated into a path algebra, which can be processed by our HyREX retrieval engine.
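To make the idea of probabilistic ranking over index term weights concrete, the following is a minimal illustrative sketch in Python. It is not XIRQL syntax, not the path algebra, and not the HyREX implementation. The `Element` class, the `rank` function, the example paths and the weights are all hypothetical, and the sketch assumes that term weights can be read as probabilities of independent events, which the abstract does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical XML element with probabilistic index term weights."""
    path: str
    # term -> assumed probability that the element is about that term
    weights: dict = field(default_factory=dict)


def p_and(a: float, b: float) -> float:
    """Probability of a conjunction, assuming independent events."""
    return a * b


def p_or(a: float, b: float) -> float:
    """Probability of a disjunction, assuming independent events."""
    return a + b - a * b


def rank(elements, terms):
    """Score each element for a conjunctive query and sort by descending score.

    A term missing from an element's index gets weight 0.0 (an assumption
    of this sketch), so the conjunction for that element scores 0.0.
    """
    scored = []
    for el in elements:
        p = 1.0
        for term in terms:
            p = p_and(p, el.weights.get(term, 0.0))
        scored.append((p, el.path))
    return sorted(scored, reverse=True)


# Hypothetical index: a whole chapter and a more specific section.
elements = [
    Element("/book/chapter[1]", {"xml": 0.8, "retrieval": 0.5}),
    Element("/book/chapter[2]/section[1]", {"xml": 0.6, "retrieval": 0.9}),
]

ranked = rank(elements, ["xml", "retrieval"])
for probability, path in ranked:
    print(f"{probability:.2f}  {path}")
```

Because elements at any depth can be scored in the same way, a ranking like this can return a section rather than its enclosing document, which is the intuition behind retrieving the most relevant parts of documents. How XIRQL actually derives and combines weights is defined in the paper itself, not by this sketch.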
