ABSTRACT
Autonomous construction of deep neural networks (DNNs) is desirable for data streams because it potentially offers two advantages: a model capacity matched to the problem and a quick reaction to concept drift and shift. While the self-organizing mechanism of DNNs remains an open issue, the task is even more challenging for standard multi-layer DNNs than for different-depth structures, because adding a new layer causes loss of previously trained knowledge. This paper proposes NADINE, a Neural Network with Dynamically Evolved Capacity. NADINE features a fully open structure: both the depth and the width of the network can be evolved from scratch in an online manner, without problem-specific thresholds. NADINE is built on a standard MLP architecture, and the catastrophic forgetting that would otherwise occur during the hidden-layer addition phase is resolved by the proposed soft-forgetting and adaptive-memory methods. The advantages of NADINE, namely its elastic structure and online learning trait, are numerically validated on nine data-stream classification and regression problems, where it improves over prominent algorithms in all problems and handles regression and classification equally well.
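The structure-evolution idea described in the abstract can be illustrated with a toy sketch (our own minimal construction, not the paper's actual NADINE algorithm): an MLP that starts from scratch with one hidden unit and can later grow in width or depth. Here, new hidden units are attached with zero-initialized outgoing weights and a new layer is initialized near the identity, so the network's previously learned function is preserved at the moment of growth, a crude stand-in for the soft-forgetting idea; the class name `EvolvingMLP` and all initialization choices are assumptions for illustration.

```python
import numpy as np

# Toy sketch of a self-evolving MLP (illustration only; NOT the NADINE
# algorithm). Structure starts minimal and grows while preserving the
# function the network currently computes.
class EvolvingMLP:
    def __init__(self, n_in, n_out, rng=None):
        self.rng = rng or np.random.default_rng(0)
        # Start from scratch: one hidden unit between input and output.
        self.W = [self.rng.normal(0.0, 0.1, (n_in, 1)),
                  self.rng.normal(0.0, 0.1, (1, n_out))]

    def forward(self, x):
        h = x
        for W in self.W[:-1]:
            h = np.maximum(h @ W, 0.0)  # ReLU hidden layers
        return h @ self.W[-1]           # linear output layer

    def grow_width(self, layer):
        """Add one hidden unit to `layer`; its outgoing weights are
        zero, so the network's output is unchanged at growth time."""
        n_in = self.W[layer].shape[0]
        new_in = self.rng.normal(0.0, 0.1, (n_in, 1))
        self.W[layer] = np.hstack([self.W[layer], new_in])
        self.W[layer + 1] = np.vstack(
            [self.W[layer + 1],
             np.zeros((1, self.W[layer + 1].shape[1]))])

    def grow_depth(self):
        """Insert a new top hidden layer initialized to the identity.
        Because hidden activations are already non-negative, ReLU acts
        as identity here and previously trained knowledge survives."""
        width = self.W[-1].shape[0]
        self.W.insert(len(self.W) - 1, np.eye(width))
```

A usage check confirms that both growth operations leave the current input-output mapping intact, which is the property the layer-addition phase must protect:

```python
net = EvolvingMLP(3, 2)
x = np.ones((4, 3))
y0 = net.forward(x)
net.grow_width(0)   # wider, same function
net.grow_depth()    # deeper, same function
assert np.allclose(y0, net.forward(x))
```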
- Automatic Construction of Multi-layer Perceptron Network from Streaming Examples