Abstract
The amount of information produced in the world increases by 30% every year, and this rate will only go up. With advanced network technology, more and more sources are available, either over the Internet or in enterprise intranets. Modern data management applications, such as setting up Web portals, managing enterprise data, managing community data, and sharing scientific data, often require integrating available data sources and providing a uniform interface for users to access data from different sources; such requirements have been driving fruitful research on data integration over the last two decades [11, 13].
References
- P. A. Bernstein and S. Melnik. Model management 2.0: manipulating richer mappings. In Proc. of SIGMOD, pages 1--12, 2007.
- L. Berti-Equille, A. D. Sarma, X. L. Dong, A. Marian, and D. Srivastava. Sailing the information ocean with awareness of currents: Discovery and application of source dependence. In CIDR, 2009.
- J. Bleiholder and F. Naumann. Data fusion. ACM Computing Surveys, 41(1):1--41, 2008.
- W. W. Cohen, P. Ravikumar, and S. E. Fienberg. A comparison of string distance metrics for name-matching tasks. In Proc. of IIWEB, pages 73--78, 2003.
- X. L. Dong, L. Berti-Equille, and D. Srivastava. Integrating conflicting data: the role of source dependence. PVLDB, 2(1), 2009.
- X. L. Dong, L. Berti-Equille, and D. Srivastava. Truth discovery and copying detection in a dynamic world. PVLDB, 2(1), 2009.
- A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios. Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering (TKDE), 19(1):1--16, 2007.
- R. Fagin, P. G. Kolaitis, and L. Popa. Data exchange: Getting to the core. ACM Transactions on Database Systems (TODS), 30(1):174--201, 2005.
- C. A. Galindo-Legaria. Outerjoins as disjunctions. In Proceedings of the ACM International Conference on Management of Data (SIGMOD), pages 348--358, Minneapolis, Minnesota, May 1994.
- S. Greco, L. Pontieri, and E. Zumpano. Integrating and managing conflicting data. In Revised Papers from the 4th International Andrei Ershov Memorial Conference on Perspectives of System Informatics, pages 349--362, 2001.
- L. M. Haas. Beauty and the beast: The theory and practice of information integration. In Proc. of ICDT, pages 28--43, 2007.
- A. Y. Halevy. Answering queries using views: A survey. VLDB Journal, 10(4):270--294, 2001.
- A. Y. Halevy, A. Rajaraman, and J. J. Ordille. Data integration: The teenage years. In Proc. of VLDB, pages 9--16, 2006.
- E. Rahm and P. A. Bernstein. A survey of approaches to automatic schema matching. The VLDB Journal, 10(4):334--350, 2001.
- W. Winkler. Overview of record linkage and current research directions. Technical report, Statistical Research Division, U. S. Bureau of the Census, 2006.
- M. Wu and A. Marian. Corroborating answers from multiple web sources. In Proc. of WebDB, 2007.
- L. L. Yan and M. T. Özsu. Conflict tolerant queries in AURORA. In Proceedings of the International Conference on Cooperative Information Systems (CoopIS), pages 279--290, 1999.
- X. Yin, J. Han, and P. S. Yu. Truth discovery with multiple conflicting information providers on the web. In Proc. of SIGKDD, 2007.
Index Terms
- Data fusion: resolving data conflicts for integration