ABSTRACT
With the amount of available text data in relational databases growing rapidly, the need for ordinary users to search such information is increasing dramatically. Although the major RDBMSs provide full-text search capabilities, they still require users to know the database schema and to use a structured query language to search for information. This search model is too complicated for most ordinary users. Inspired by the success of information retrieval (IR) style keyword search on the web, keyword search in relational databases has recently emerged as a new research topic. The differences between text databases and relational databases result in three new challenges: (1) Answers needed by users are not limited to individual tuples; they may be assembled by joining tuples from multiple tables, forming answers as tuple trees. (2) A single score for each answer (i.e., a tuple tree) is needed to estimate its relevance to a given query. These scores are used to rank the most relevant answers as high as possible. (3) Relational databases have much richer structures than text databases, and existing IR strategies are not adequate for ranking relational outputs. In this paper, we propose a novel IR ranking strategy for effective keyword search. We are the first to conduct comprehensive experiments on search effectiveness using a real-world database and a set of keyword queries collected by a major search company. Experimental results show that our strategy is significantly better than existing strategies. Our approach can be used at the application level and can also be incorporated into an RDBMS to support keyword-based search in relational databases.