Logic-Based Techniques in Data Integration

Logic-Based Artificial Intelligence

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 597)

Abstract

The data integration problem is to provide uniform access to multiple heterogeneous information sources available online (e.g., databases on the WWW). This problem has recently received considerable attention from researchers in the fields of Artificial Intelligence and Database Systems. The problem is complicated by the fact that (1) sources contain closely related and overlapping data, (2) data is stored in multiple data models and schemas, and (3) data sources have differing query processing capabilities.

A key element in a data integration system is the language used to describe the contents and capabilities of the data sources. While such a language needs to be as expressive as possible, it should also make it possible to solve efficiently the main inference problem that arises in this context: translating a user query formulated over a mediated schema into a query over the local schemas. This paper describes several languages for describing the contents of data sources, the tradeoffs between them, and the associated reformulation algorithms.
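As a rough illustration of the reformulation problem, the sketch below answers a selection over a single mediated relation by rewriting it into one scan per local source and taking the union of the results. Everything in it is an assumption made for illustration: the `Movie(title, year)` relation, the two sources and their rows, and the `DESCRIPTIONS` and `answer` names. It is not one of the description languages or algorithms covered in the chapter, only a minimal example of sources with overlapping data and differing schemas.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple

# Hypothetical local sources, each with its own column layout.
SOURCE_A: List[Row] = [("Alien", 1979), ("Heat", 1995)]       # (title, year)
SOURCE_B: List[Row] = [(1995, "Heat", "Mann"),                 # (year, title, director)
                       (1982, "Tron", "Lisberger")]

# Hypothetical source descriptions: for each mediated relation, the sources
# that contribute to it and a mapping from a source row to a mediated row.
DESCRIPTIONS: Dict[str, List[Tuple[List[Row], Callable[[Row], Row]]]] = {
    "Movie": [                                  # mediated relation Movie(title, year)
        (SOURCE_A, lambda r: (r[0], r[1])),
        (SOURCE_B, lambda r: (r[1], r[0])),     # reorder columns, drop director
    ],
}


def answer(relation: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """Rewrite a selection over a mediated relation into one scan per
    source, then union the results and remove duplicates."""
    results = set()
    for rows, to_mediated in DESCRIPTIONS[relation]:
        for row in rows:
            mediated_row = to_mediated(row)
            if predicate(mediated_row):
                results.add(mediated_row)
    return sorted(results)


# User query over the mediated schema: movies released in 1980 or later.
recent = answer("Movie", lambda m: m[1] >= 1980)
print(recent)
```

Here "Heat" appears in both sources but only once in the answer, which shows why overlapping sources call for duplicate handling. A real system would also have to account for sources that cannot evaluate arbitrary predicates, which this sketch ignores.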



Copyright information

© 2000 Springer Science+Business Media New York

About this chapter

Cite this chapter

Levy, A.Y. (2000). Logic-Based Techniques in Data Integration. In: Minker, J. (eds) Logic-Based Artificial Intelligence. The Springer International Series in Engineering and Computer Science, vol 597. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1567-8_24

  • DOI: https://doi.org/10.1007/978-1-4615-1567-8_24

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5618-9

  • Online ISBN: 978-1-4615-1567-8

  • eBook Packages: Springer Book Archive
