ABSTRACT
While database management systems offer a comprehensive solution to data storage, they require deep knowledge of the schema, as well as of the data manipulation language, in order to perform effective retrieval. Since these requirements pose a problem to lay or occasional users, several methods incorporate keyword search (KS) into relational databases. However, most of the existing techniques focus on querying a single DBMS. On the other hand, the proliferation of distributed databases in several conventional and emerging applications necessitates support for keyword-based data sharing and querying over multiple DBMSs. In order to avoid the high cost of searching in numerous, potentially irrelevant, databases in such systems, we propose G-KS, a novel method for selecting the top-K candidates based on their potential to contain results for a given query. G-KS summarizes each database by a keyword relationship graph, where nodes represent terms and edges describe relationships between them. Keyword relationship graphs are utilized for computing the similarity between each database and a KS query, so that, during query processing, only the most promising databases are searched. An extensive experimental evaluation demonstrates that G-KS outperforms the current state-of-the-art technique in all aspects, including precision, recall, efficiency, space overhead and flexibility of accommodating different semantics.
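The selection idea described above (summarize each database as a graph over terms, score each summary against the query, search only the best-scoring databases) can be illustrated with a toy sketch. This is not the G-KS algorithm: the abstract does not specify how edges are weighted or how similarity is computed, so the sketch assumes that an edge counts how many rows contain both terms and that a database's score is the average edge weight over all query keyword pairs. The function names `build_keyword_graph`, `score`, and `select_top_k` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_keyword_graph(rows):
    """Summarize a database as (nodes, edges).

    rows: iterable of strings, each standing in for the text of one tuple.
    Nodes are terms. ASSUMPTION: an edge weight is the number of rows in
    which the two terms co-occur (the paper's actual edge definition is
    not given in the abstract).
    """
    nodes = set()
    edges = Counter()
    for row in rows:
        terms = sorted(set(row.lower().split()))
        nodes.update(terms)
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1
    return nodes, edges


def score(graph, query):
    """ASSUMED similarity: average edge weight over all query keyword pairs.

    Returns 0.0 if any keyword is missing from the summary.
    """
    nodes, edges = graph
    keywords = sorted({k.lower() for k in query})
    if not keywords or any(k not in nodes for k in keywords):
        return 0.0
    if len(keywords) == 1:
        return 1.0
    pairs = list(combinations(keywords, 2))
    return sum(edges[p] for p in pairs) / len(pairs)


def select_top_k(summaries, query, k):
    """Return the names of at most k databases with a positive score, best first."""
    scored = [(score(graph, query), name) for name, graph in summaries.items()]
    ranked = sorted((s, n) for s, n in scored if s > 0)
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical toy databases, each reduced to a few rows of text.
    databases = {
        "pubs": ["keyword search relational databases", "graph keyword search"],
        "movies": ["search party film", "keyword film"],
        "bio": ["protein graph"],
    }
    summaries = {name: build_keyword_graph(rows) for name, rows in databases.items()}
    print(select_top_k(summaries, ["keyword", "search"], 2))
```

In this toy data, `movies` contains both query keywords but never in the same row, so it is filtered out. That is the kind of case in which a relationship-based summary can prune a database that a per-keyword frequency summary would still consider.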
Index Terms
- A graph method for keyword-based selection of the top-K databases