
Information-Theoretic Measures for Knowledge Discovery and Data Mining

Chapter in: Entropy Measures, Maximum Entropy Principle and Emerging Applications

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 119)

Abstract

A database may be viewed as a statistical population, and an attribute as a statistical variable taking values from its domain; one can therefore carry out statistical and information-theoretic analyses on a database. Based on its attribute values, a database can be partitioned into smaller populations. An attribute is deemed important if it partitions the database in such a way that previously unknown regularities and patterns become observable. Many information-theoretic measures have been proposed and applied in various fields to quantify the importance of attributes and the relationships between attributes. In the context of knowledge discovery and data mining (KDD), we present a critical review and analysis of information-theoretic measures of attribute importance and attribute association, with emphasis on their interpretations and connections.
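
As an informal illustration (not part of the original chapter), the sketch below treats each column of a small table as an attribute and computes two standard quantities used to assess attribute importance and association: the Shannon entropy H(A) of an attribute and the mutual information I(A; B) = H(A) + H(B) - H(A, B) between two attributes, each of which induces a partition of the rows. The toy table, column names, and helper functions are hypothetical and serve only to make the idea concrete.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy H(A) of an attribute, estimated from value frequencies."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(a_values, b_values):
    """I(A; B) = H(A) + H(B) - H(A, B): an entropy-based measure of association."""
    joint = list(zip(a_values, b_values))  # joint distribution of the two attributes
    return entropy(a_values) + entropy(b_values) - entropy(joint)


# Hypothetical toy table: each row is a record, each key an attribute.
table = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
]

play = [row["play"] for row in table]
for attr in ("outlook", "windy"):
    col = [row[attr] for row in table]
    print(f"{attr}: H = {entropy(col):.3f}, "
          f"I({attr}; play) = {mutual_information(col, play):.3f}")
```

In this hypothetical example, the attribute whose induced partition shares more mutual information with the target attribute separates the target values more cleanly; this is one common entropy-based way of quantifying attribute importance and association of the kind reviewed in the chapter.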

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Yao, Y.Y. (2003). Information-Theoretic Measures for Knowledge Discovery and Data Mining. In: Karmeshu (eds) Entropy Measures, Maximum Entropy Principle and Emerging Applications. Studies in Fuzziness and Soft Computing, vol 119. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-36212-8_6

  • DOI: https://doi.org/10.1007/978-3-540-36212-8_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05531-7

  • Online ISBN: 978-3-540-36212-8
