Abstract
This chapter presents data mining and knowledge discovery (DMKD): its basic concepts, a brief history of its evolution, its mathematical foundations, and usable techniques, together with the data warehouse and the decision support system (DSS). First, datasets and knowledge are defined and explained in the context of DMKD. DMKD is a discovery process that operates at different hierarchies, granularities, and/or scales. Because these concepts are best understood when viewed from several perspectives, the chapter starts with a definition, followed by a table explaining DMKD from different views (Sect. 5.1). The evolution of DMKD is then briefly traced from the rapid growth of massive data to the birth of DMKD (Sect. 5.2). Section 5.3 gives some mathematical foundations, namely probability theory, statistics, fuzzy sets, rough sets, data fields, and cloud models. Section 5.4 introduces some usable DMKD techniques, which are used to discover rules and exceptions through association, classification, clustering, prediction, discrimination, and exception detection. Sections 5.5 and 5.6 present data warehouses and decision support systems: the former is one of the data sources for DMKD, and DMKD is a new technique that assists the latter. Finally, trends and perspectives are summarized, with a forecast for two promising fields, web mining and spatial data mining (Sect. 5.7).
Abbreviations
- BI: business intelligence
- DBMS: database management system
- DMKD: data mining and knowledge discovery
- DSS: decision support system
- ERP: enterprise resource planning
- ETL: extraction, transformation, and loading
- GIS: Geographic Information System
- GNSS: Global Navigation Satellite System
- HTML: Hypertext Markup Language
- OLAP: online analytical processing
- OLTP: online transactional processing
- RS: remote sensing
- SDMKD: spatial data mining and knowledge discovery
- SDSS: spatial decision support system
© 2011 Springer-Verlag
About this chapter
Cite this chapter
Wang, S., Shi, W. (2011). Data Mining and Knowledge Discovery. In: Kresse, W., Danko, D. (eds) Springer Handbook of Geographic Information. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72680-7_5
DOI: https://doi.org/10.1007/978-3-540-72680-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72678-4
Online ISBN: 978-3-540-72680-7
eBook Packages: Earth and Environmental Science (R0)