Mining data streams with concept drifts using genetic algorithm

Vivekanandan, Periasamy; Nedunchezhian, Raju

doi:10.1007/s10462-011-9209-y

Published: 17 February 2011

Artificial Intelligence Review, Volume 36, pages 163–178 (2011)

Abstract

Recent research shows that rule-based models perform well when classifying large data sets such as data streams with concept drifts. The genetic algorithm is a strong rule-based classification algorithm, but it has been used only for mining small, static data sets. If the genetic algorithm can be made scalable and adaptable by reducing its I/O intensity, it will become an efficient and effective tool for mining large data sets such as data streams. This paper proposes a scalable and adaptable online genetic algorithm that mines classification rules from data streams with concept drifts. Because data streams are generated continuously and at a rapid rate, the proposed method does not use a fixed, static data set for fitness calculation. Instead, it extracts a small snapshot of training examples from the current part of the data stream whenever data is required for fitness calculation. The proposed method also builds the rules for each class separately, in a parallel, independent, and iterative manner. This makes the method scalable to data streams and lets it adapt to concept drifts quickly and naturally, without storing the whole stream or a part of the stream in compressed form, as other rule-based algorithms do. The results of the proposed method are comparable with those of other standard methods used for mining data streams.
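The idea described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not specify the rule encoding, the genetic operators, or the fitness function, so everything below (interval rules, one-vs-rest accuracy as fitness, the names `make_stream`, `evolve_step`, `run`, and all parameter values) is an assumption chosen for illustration. The sketch only shows the two points the abstract does state: fitness is computed on a small snapshot drawn from the current part of the stream, and one independent rule population is evolved per class. The per-class populations are stepped in a plain loop here; since they share no state, they could be run in parallel.

```python
import random
from itertools import islice

def make_stream(n, drift_at, seed=0):
    """Hypothetical synthetic stream: the label depends on attribute 0
    before `drift_at` and on attribute 1 afterwards (a concept drift)."""
    rng = random.Random(seed)
    for i in range(n):
        x = (rng.random(), rng.random())
        key = x[0] if i < drift_at else x[1]
        yield x, int(key > 0.5)

# Assumed rule encoding: one (low, high) interval per attribute.
def covers(rule, x):
    return all(lo <= v <= hi for (lo, hi), v in zip(rule, x))

def fitness(rule, cls, snapshot):
    """Assumed fitness: fraction of snapshot examples on which
    'rule fires' agrees with 'label equals cls'."""
    if not snapshot:
        return 0.0
    hits = sum(covers(rule, x) == (y == cls) for x, y in snapshot)
    return hits / len(snapshot)

def random_rule(rng, dims):
    return [tuple(sorted((rng.random(), rng.random()))) for _ in range(dims)]

def mutate(rule, rng, step=0.1):
    out = []
    for lo, hi in rule:
        lo = min(1.0, max(0.0, lo + rng.uniform(-step, step)))
        hi = min(1.0, max(0.0, hi + rng.uniform(-step, step)))
        out.append((min(lo, hi), max(lo, hi)))
    return out

def crossover(a, b, rng):
    # Uniform crossover: each interval comes from one of the two parents.
    return [rng.choice(pair) for pair in zip(a, b)]

def evolve_step(pop, cls, snapshot, rng):
    """One generation: keep the better half, refill with mutated offspring."""
    ranked = sorted(pop, key=lambda r: fitness(r, cls, snapshot), reverse=True)
    elite = ranked[: max(1, len(pop) // 2)]
    children = [
        mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
        for _ in range(len(pop) - len(elite))
    ]
    return elite + children

def run(stream, classes=(0, 1), dims=2, pop_size=20, snap_size=50, seed=1):
    rng = random.Random(seed)
    stream = iter(stream)
    # One independent population per class.
    pops = {c: [random_rule(rng, dims) for _ in range(pop_size)] for c in classes}
    snapshot = []
    while True:
        # Pull a fresh snapshot from the current part of the stream;
        # earlier examples are discarded, not buffered.
        batch = list(islice(stream, snap_size))
        if len(batch) < snap_size:
            break
        snapshot = batch
        for c in classes:
            pops[c] = evolve_step(pops[c], c, snapshot, rng)
    best = {c: max(pops[c], key=lambda r: fitness(r, c, snapshot)) for c in classes}
    return best, snapshot
```

Calling `run(make_stream(2000, drift_at=1000))` returns the best rule found for each class together with the last snapshot used. Because every generation is scored on the newest snapshot, rules fitted to the old concept lose fitness after the drift and are replaced without any explicit drift detector, which is the adaptation behaviour the abstract describes in general terms.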

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Park College of Engineering and Technology, Coimbatore, India
Periasamy Vivekanandan
Department of Computer Science and Engineering, Kalaignar Karunanidhi Institute of Technology, Coimbatore, India
Raju Nedunchezhian

Corresponding author

Correspondence to Periasamy Vivekanandan.

About this article

Cite this article

Vivekanandan, P., Nedunchezhian, R. Mining data streams with concept drifts using genetic algorithm. Artif Intell Rev 36, 163–178 (2011). https://doi.org/10.1007/s10462-011-9209-y

Issue Date: October 2011