Abstract
We describe the construction of a generic natural language query interface to an XML database. The interface accepts an arbitrary English sentence as a query; such a query can be quite complex and may include aggregation, nesting, and value joins, among other things. The query is translated, potentially after reformulation, into an XQuery expression. The translation maps the grammatical proximity of parsed natural language tokens in the parse tree of the query sentence to the proximity of the corresponding elements in the XML data to be retrieved. Our experimental assessment, through a user study, demonstrates that this type of natural language interface is good enough to be usable now, with no restrictions on the application domain.
Supported in part by NSF 0219513 and 0438909, and NIH 1-U54-DA021519-01A1.
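The proximity idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that two query tokens which are grammatically close (here, "title" and "author") should be matched to the XML elements that are structurally closest, measured by the depth of their lowest common ancestor. The sample document, the function names (`paths_to`, `lca_depth`, `closest_pairs`), and the choice of closeness measure are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical sample data; not taken from the paper.
SAMPLE = """
<bib>
  <book>
    <title>Book A</title>
    <author>Author A</author>
  </book>
  <book>
    <title>Book B</title>
    <author>Author B</author>
  </book>
</bib>
"""


def paths_to(root, tag):
    """Return the root-to-node path (a list of elements) for every element named `tag`."""
    found = []

    def walk(node, trail):
        trail = trail + [node]
        if node.tag == tag:
            found.append(trail)
        for child in node:
            walk(child, trail)

    walk(root, [])
    return found


def lca_depth(path_a, path_b):
    """Count the ancestors shared by two root-to-node paths (depth of their lowest common ancestor)."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        depth += 1
    return depth


def closest_pairs(root, tag_a, tag_b):
    """Pair elements named `tag_a` and `tag_b` whose lowest common ancestor is deepest."""
    scored = [
        (lca_depth(pa, pb), pa[-1], pb[-1])
        for pa, pb in product(paths_to(root, tag_a), paths_to(root, tag_b))
    ]
    if not scored:
        return []
    best = max(depth for depth, _, _ in scored)
    return [(a.text, b.text) for depth, a, b in scored if depth == best]


root = ET.fromstring(SAMPLE)
RESULT = closest_pairs(root, "title", "author")
print(RESULT)
```

In this sketch a title is paired only with the author inside the same `book` element, because that pair shares a deeper common ancestor than a pair drawn from two different books. A real translation into XQuery would need to handle much more, such as aggregation, nesting, and value joins, which this example does not attempt.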
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Y., Yang, H., Jagadish, H.V. (2006). Constructing a Generic Natural Language Interface for an XML Database. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_44
DOI: https://doi.org/10.1007/11687238_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science (R0)