Abstract
Data stream mining extracts information from large volumes of data that arrive quickly and continuously (data streams). Such streams are usually affected by changes in the data distribution, a phenomenon referred to as concept drift. Learning models must therefore detect and adapt to these changes in order to maintain good predictive performance after a drift has occurred. The development of effective drift detection algorithms is thus a key factor in data stream mining. In this work we propose \(\textit{CURIE}\), a drift detector that relies on cellular automata. Specifically, in \(\textit{CURIE}\) the distribution of the data stream is represented in the grid of a cellular automaton, whose neighborhood rule can then be used to detect possible distribution changes over the stream. Computer simulations are presented and discussed to show that \(\textit{CURIE}\), when hybridized with other base learners, achieves competitive detection metrics and classification accuracy. \(\textit{CURIE}\) is compared with well-established drift detectors over synthetic datasets with varying drift characteristics.
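To make the idea of a grid plus a neighborhood rule more concrete, the following is a minimal illustrative sketch, not the published \(\textit{CURIE}\) algorithm, whose details are not given in the abstract. It assumes two features scaled to [0, 1], a square grid whose cells remember the last label seen, and a Moore (8-neighbor) neighborhood; the class name `GridDriftSketch` and the parameters `cells_per_dim`, `window` and `threshold` are hypothetical choices made for this example only.

```python
from collections import deque


class GridDriftSketch:
    """Toy grid-based drift detector (illustrative; NOT the published CURIE)."""

    def __init__(self, cells_per_dim=10, window=20, threshold=10):
        self.cells_per_dim = cells_per_dim
        self.grid = {}                      # (row, col) -> last label stored in that cell
        self.recent = deque(maxlen=window)  # 1 if a sample contradicted its neighborhood
        self.threshold = threshold          # contradictions in the window that raise an alarm

    def _cell(self, x):
        # Map two features in [0, 1] onto integer grid coordinates.
        n = self.cells_per_dim
        return tuple(min(int(v * n), n - 1) for v in x)

    def _neighborhood_label(self, cell):
        # Majority label over the cell and its 8 Moore neighbors (None if all empty).
        r, c = cell
        counts = {}
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                label = self.grid.get((r + dr, c + dc))
                if label is not None:
                    counts[label] = counts.get(label, 0) + 1
        return max(counts, key=counts.get) if counts else None

    def update(self, x, y):
        """Process one labeled sample; return True if drift is flagged."""
        cell = self._cell(x)
        expected = self._neighborhood_label(cell)
        # A contradiction is a label that disagrees with the local majority.
        self.recent.append(int(expected is not None and expected != y))
        self.grid[cell] = y                 # the cell always adopts the newest label
        drift = sum(self.recent) >= self.threshold
        if drift:
            self.recent.clear()             # restart counting after an alarm
        return drift
```

In this sketch the grid acts as a coarse, continuously updated picture of the class distribution, and an alarm is raised only when many recent samples disagree with their local neighborhood, so isolated disagreements near a class boundary do not trigger a detection on their own.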
Acknowledgements
This work has received funding support from the ECSEL Joint Undertaking (JU) under grant agreement No 783163 (iDev40 project). The JU receives support from the European Union’s Horizon 2020 research and innovation programme, national grants from Austria, Belgium, Germany, Italy, Spain and Romania, as well as the European Structural and Investment Funds. The authors would also like to thank the ELKARTEK and EMAITEK funding programmes of the Basque Government (Spain).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Responsible editor: Annalisa Appice, Sergio Escalera, Jose A. Gamez, Heike Trautmann.
Dedicated to Tom Fawcett and J. H. Conway, who passed away in 2020, for their noted contributions to the field of cellular automata and machine learning, and for inspiring this research work.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lobo, J.L., Del Ser, J., Osaba, E. et al. CURIE: a cellular automaton for concept drift detection. Data Min Knowl Disc 35, 2655–2678 (2021). https://doi.org/10.1007/s10618-021-00776-2
DOI: https://doi.org/10.1007/s10618-021-00776-2