research-article

Data fusion: resolving data conflicts for integration

Published: 01 August 2009

Abstract

The amount of information produced in the world increases by 30% every year and this rate will only go up. With advanced network technology, more and more sources are available either over the Internet or in enterprise intranets. Modern data management applications, such as setting up Web portals, managing enterprise data, managing community data, and sharing scientific data, often require integrating available data sources and providing a uniform interface for users to access data from different sources; such requirements have been driving fruitful research on data integration over the last two decades [11, 13].

References

  1. P. A. Bernstein and S. Melnik. Model management 2.0: manipulating richer mappings. In Proc. of SIGMOD, pages 1--12, 2007.
  2. L. Berti-Equille, A. D. Sarma, X. L. Dong, A. Marian, and D. Srivastava. Sailing the information ocean with awareness of currents: Discovery and application of source dependence. In CIDR, 2009.
  3. J. Bleiholder and F. Naumann. Data fusion. ACM Computing Surveys, 41(1):1--41, 2008.
  4. W. W. Cohen, P. Ravikumar, and S. E. Fienberg. A comparison of string distance metrics for name-matching tasks. In Proc. of IIWEB, pages 73--78, 2003.
  5. X. L. Dong, L. Berti-Equille, and D. Srivastava. Integrating conflicting data: the role of source dependence. PVLDB, 2(1), 2009.
  6. X. L. Dong, L. Berti-Equille, and D. Srivastava. Truth discovery and copying detection in a dynamic world. PVLDB, 2(1), 2009.
  7. A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios. Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering (TKDE), 19(1):1--16, 2007.
  8. R. Fagin, P. G. Kolaitis, and L. Popa. Data exchange: Getting to the core. ACM Transactions on Database Systems (TODS), 30(1):174--201, 2005.
  9. C. A. Galindo-Legaria. Outerjoins as disjunctions. In Proceedings of the ACM International Conference on Management of Data (SIGMOD), pages 348--358, Minneapolis, Minnesota, May 1994.
  10. S. Greco, L. Pontieri, and E. Zumpano. Integrating and managing conflicting data. In Revised Papers from the 4th International Andrei Ershov Memorial Conference on Perspectives of System Informatics, pages 349--362, 2001.
  11. L. M. Haas. Beauty and the beast: The theory and practice of information integration. In Proc. of ICDT, pages 28--43, 2007.
  12. A. Y. Halevy. Answering queries using views: A survey. VLDB Journal, 10(4):270--294, 2001.
  13. A. Y. Halevy, A. Rajaraman, and J. J. Ordille. Data integration: The teenage years. In Proc. of VLDB, pages 9--16, 2006.
  14. E. Rahm and P. A. Bernstein. A survey of approaches to automatic schema matching. The VLDB Journal, 10(4):334--350, 2001.
  15. W. Winkler. Overview of record linkage and current research directions. Technical report, Statistical Research Division, U. S. Bureau of the Census, 2006.
  16. M. Wu and A. Marian. Corroborating answers from multiple web sources. In Proc. of WebDB, 2007.
  17. L. L. Yan and M. T. Özsu. Conflict tolerant queries in AURORA. In Proceedings of the International Conference on Cooperative Information Systems (CoopIS), pages 279--290, 1999.
  18. X. Yin, J. Han, and P. S. Yu. Truth discovery with multiple conflicting information providers on the web. In Proc. of SIGKDD, 2007.


Published in

Proceedings of the VLDB Endowment, Volume 2, Issue 2
August 2009
367 pages

Publisher

VLDB Endowment

