ABSTRACT
The proliferation of knowledge-sharing communities and advances in information extraction have enabled the construction of large knowledge bases that use the RDF data model to represent entities and relationships. However, as the Web and its latently embedded facts evolve, a knowledge base can never be complete and up-to-date. At the same time, a rapidly growing suite of Web services provides access to timely, high-quality information, but this information is encapsulated behind the service interface. We propose to leverage the information that can be dynamically obtained from Web services to enrich RDF knowledge bases on the fly whenever the knowledge base does not suffice to answer a user query.
To this end, we develop a sound framework for generating queries to encapsulated Web services, together with efficient algorithms for query execution and result integration. The query generator composes sequences of function calls based on the available service interfaces. Because Web service calls are expensive, our method aims to reduce the number of calls needed to retrieve results with sufficient recall. Our approach is fully implemented in a prototype system named ANGIE. The user can query and browse the RDF knowledge base as if it already contained all the facts from the Web services. This data, however, is gathered and integrated on the fly, transparently to the user. We demonstrate the viability and efficiency of our approach in experiments based on real-life data provided by popular Web services.
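The on-the-fly enrichment described above can be illustrated with a minimal Python sketch. It is not ANGIE's implementation: the class `ActiveKnowledgeBase`, the predicate `sang`, and the stand-in service `fake_music_service` are all hypothetical names chosen for illustration. The sketch covers only the simplest case, in which a single service call supplies the missing facts, and it does not attempt the composition of call sequences that the query generator performs.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


class ActiveKnowledgeBase:
    """Toy triple store that calls a service when local facts are missing.

    Hypothetical illustration only; not the ANGIE system.
    """

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        # Assumed registry: predicate -> function from a subject to objects.
        self.services: Dict[str, Callable[[str], List[str]]] = {}
        # (subject, predicate) pairs already sent to a service.
        self.asked: Set[Tuple[str, str]] = set()
        self.calls_made = 0

    def register_service(self, predicate: str,
                         fn: Callable[[str], List[str]]) -> None:
        self.services[predicate] = fn

    def _local(self, subject: str, predicate: str) -> List[str]:
        return sorted(o for (s, p, o) in self.triples
                      if s == subject and p == predicate)

    def objects(self, subject: str, predicate: str) -> List[str]:
        """Answer from local triples; otherwise call the service once."""
        found = self._local(subject, predicate)
        key = (subject, predicate)
        if found or predicate not in self.services or key in self.asked:
            return found
        # Service calls are expensive, so each pair is asked at most once.
        self.asked.add(key)
        self.calls_made += 1
        for obj in self.services[predicate](subject):
            # Integrate the returned facts into the knowledge base.
            self.triples.add((subject, predicate, obj))
        return self._local(subject, predicate)


def fake_music_service(artist: str) -> List[str]:
    """Stand-in for a real Web service; the data is made up for the demo."""
    catalogue = {"Frank_Sinatra": ["My_Way", "Fly_Me_to_the_Moon"]}
    return catalogue.get(artist, [])


kb = ActiveKnowledgeBase()
kb.register_service("sang", fake_music_service)
first = kb.objects("Frank_Sinatra", "sang")   # triggers one service call
second = kb.objects("Frank_Sinatra", "sang")  # answered from local triples
```

From the caller's point of view both queries look the same, which mirrors the transparency the abstract describes. Remembering which pairs were already asked is a deliberately crude way to limit calls and should not be read as the paper's call-reduction method.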
Index Terms
- Active knowledge: dynamically enriching RDF knowledge bases by web services