Abstract
XIRQL (pronounced "circle") is an XML query language that incorporates imprecision and vagueness for both structural and content-oriented query conditions. The corresponding uncertainty is handled by a consistent probabilistic model. The core features of XIRQL are (1) document ranking based on index term weighting, (2) specificity-oriented search for retrieving the most relevant parts of documents, (3) datatypes with vague predicates for dealing with specific types of content, and (4) structural vagueness for vague interpretation of structural query conditions. A XIRQL database may contain several classes of documents, where all documents in a class conform to the same DTD; links between documents are also supported. XIRQL queries are translated into a path algebra, which can be processed by our HyREX retrieval engine.
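To make features (1) and (2) more concrete, the following Python sketch ranks the elements of a toy document tree by a probabilistic score. It is only an illustration under stated assumptions, not the XIRQL or HyREX implementation: the names `Element`, `term_weight`, `score_and`, and `rank`, the independence assumption between term events, the down-weighting factor `augment=0.5`, and all example weights are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A node of a toy XML tree carrying probabilistic index term weights."""
    name: str
    weights: dict = field(default_factory=dict)    # term -> probability-like weight
    children: list = field(default_factory=list)   # child Element objects


def p_or(p, q):
    """Probability of the disjunction of two independent events."""
    return p + q - p * q


def term_weight(elem, term, augment=0.5):
    """Weight of `term` for `elem`, including down-weighted evidence from descendants.

    `augment` is an illustrative factor (an assumption), not a value from the paper.
    """
    w = elem.weights.get(term, 0.0)
    for child in elem.children:
        w = p_or(w, augment * term_weight(child, term, augment))
    return w


def score_and(elem, terms, augment=0.5):
    """Probability that all query terms hold for `elem`, assuming independence."""
    score = 1.0
    for t in terms:
        score *= term_weight(elem, t, augment)
    return score


def rank(root, terms, augment=0.5):
    """Return (score, element name) pairs for every element, best first."""
    results = []
    stack = [root]
    while stack:
        e = stack.pop()
        results.append((score_and(e, terms, augment), e.name))
        stack.extend(e.children)
    return sorted(results, reverse=True)


# Hypothetical document: an article with two sections.
sec1 = Element("sec1", {"xml": 0.8, "retrieval": 0.5})
sec2 = Element("sec2", {"retrieval": 0.4})
article = Element("article", children=[sec1, sec2])

ranking = rank(article, ["xml", "retrieval"])
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the section that contains both terms ranks above the enclosing article, because evidence passed up to a container is down-weighted. That is one simple way to favor the most specific relevant part of a document.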