Abstract
The data integration problem is to provide uniform access to multiple heterogeneous information sources available online (e.g., databases on the WWW). This problem has recently received considerable attention from researchers in the fields of Artificial Intelligence and Database Systems. The problem is complicated by three facts: (1) sources contain closely related and overlapping data, (2) data is stored in multiple data models and schemas, and (3) data sources have differing query processing capabilities.
A key element of a data integration system is the language used to describe the contents and capabilities of the data sources. While such a language needs to be as expressive as possible, it should also make it possible to solve efficiently the main inference problem that arises in this context: translating a user query formulated over a mediated schema into a query over the local schemas. This paper describes several languages for describing the contents of data sources, the tradeoffs between them, and the associated reformulation algorithms.
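To make the reformulation problem concrete, the following is a minimal sketch of one well-known style of source description, in which each mediated-schema relation is defined as a conjunctive query over source relations, so that reformulation amounts to unfolding those definitions. It is an illustration only and is not taken from the chapter: the relation names (`movie`, `plays`, `src_imdb`, `src_listings`), the `DEFINITIONS` table, and the `unfold` helper are all hypothetical, and the sketch assumes that definition heads contain distinct variables and that user queries do not use variable names ending in `_<number>`.

```python
from itertools import count

# An atom is (predicate, args). By convention in this sketch, a term whose
# first character is uppercase is a variable; anything else is a constant.
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

# Hypothetical source descriptions: each mediated relation maps to
# (head variables, body atoms over source relations).
#   movie(Title, Director) :- src_imdb(Title, Director, Year)
#   plays(Title, Cinema)   :- src_listings(Cinema, Title)
DEFINITIONS = {
    "movie": (("Title", "Director"), [("src_imdb", ("Title", "Director", "Year"))]),
    "plays": (("Title", "Cinema"), [("src_listings", ("Cinema", "Title"))]),
}

def unfold(query_body, definitions):
    """Rewrite a conjunctive query over mediated relations into one over sources."""
    fresh = count()
    result = []
    for pred, args in query_body:
        head_vars, body = definitions[pred]
        n = next(fresh)
        # Bind the definition's head variables to the query atom's arguments.
        subst = dict(zip(head_vars, args))
        for src_pred, src_args in body:
            # Variables that appear only in the definition body are renamed
            # with a per-atom suffix so separate unfoldings do not clash.
            new_args = tuple(
                subst[a] if a in subst else (f"{a}_{n}" if is_var(a) else a)
                for a in src_args
            )
            result.append((src_pred, new_args))
    return result

# User query over the mediated schema:
#   q(D) :- movie(T, D), plays(T, roxy)
query = [("movie", ("T", "D")), ("plays", ("T", "roxy"))]
reformulated = unfold(query, DEFINITIONS)
for atom in reformulated:
    print(atom)
```

The reverse direction, in which each source is described as a view over the mediated schema, requires answering queries using views rather than simple unfolding, and is not covered by this sketch.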
Copyright information
© 2000 Springer Science+Business Media New York
About this chapter
Cite this chapter
Levy, A.Y. (2000). Logic-Based Techniques in Data Integration. In: Minker, J. (ed.) Logic-Based Artificial Intelligence. The Springer International Series in Engineering and Computer Science, vol 597. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1567-8_24
DOI: https://doi.org/10.1007/978-1-4615-1567-8_24
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5618-9
Online ISBN: 978-1-4615-1567-8
eBook Packages: Springer Book Archive