
A comprehensive survey on optimizing deep learning models by metaheuristics


Abstract

Deep neural networks (DNNs), which extend artificial neural networks, learn higher-level feature hierarchies built on lower-level features by transforming the raw feature space into progressively more abstract feature spaces. Although deep networks succeed in a wide range of problems across many fields, several issues affect their overall performance, such as selecting appropriate values for model parameters, deciding on the optimal architecture and feature representation, and determining optimal weight and bias values. Recently, metaheuristic algorithms have been proposed to automate these tasks. This survey gives a brief overview of common basic DNN architectures, including convolutional neural networks, unsupervised pre-trained models, recurrent neural networks and recursive neural networks. We formulate the optimization problems in DNN design, such as architecture optimization, hyper-parameter optimization, training, and feature-representation-level optimization. We categorize the encoding schemes used in metaheuristics to represent network architectures, summarize the evolutionary and selection operators as well as speed-up methods, and outline the main approaches for validating the results of networks designed by metaheuristics. Moreover, we group the studies on metaheuristics for deep neural networks by the problem type considered and present the datasets most frequently used in these studies. We discuss the pros and cons of utilizing metaheuristics in the deep learning field and give some future directions for connecting metaheuristics and deep learning. To the best of our knowledge, this is the most comprehensive survey on metaheuristics used in the deep learning field.
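To make the kind of hyper-parameter optimization surveyed here concrete, the following minimal sketch shows how a simple genetic algorithm with a direct value encoding could search a CNN hyper-parameter space. It is an illustrative assumption throughout, not a method from the paper: the hyper-parameter names, their domains, and the stand-in `fitness` function are all hypothetical. In a real study, fitness would be obtained by briefly training the candidate network and returning its validation accuracy.

```python
import random

# Hypothetical search space (illustrative values, not from the survey):
# each hyper-parameter is one gene in a direct value encoding.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [32, 64, 128],
    "num_filters": [16, 32, 64],
    "dropout": [0.2, 0.3, 0.5],
}

def random_individual():
    """Sample one candidate configuration."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def fitness(ind):
    """Placeholder objective: a real evaluation would train the network
    and return validation accuracy. This stand-in just rewards a
    mid-range learning rate and low dropout."""
    return -abs(ind["learning_rate"] - 1e-3) - ind["dropout"] / 10

def crossover(a, b):
    """Uniform crossover: each gene is inherited from either parent."""
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(ind, rate=0.2):
    """With probability `rate`, resample a gene from its domain."""
    return {k: (random.choice(SEARCH_SPACE[k]) if random.random() < rate else v)
            for k, v in ind.items()}

def evolve(pop_size=10, generations=5):
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children         # elitism: parents survive
    return max(population, key=fitness)

if __name__ == "__main__":
    print("Best configuration found:", evolve())
```

The same loop structure carries over to the other metaheuristics discussed in the survey; only the encoding, the variation operators, and the (much more expensive) fitness evaluation change.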




Author information


Corresponding author

Correspondence to Bahriye Akay.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Akay, B., Karaboga, D. & Akay, R. A comprehensive survey on optimizing deep learning models by metaheuristics. Artif Intell Rev 55, 829–894 (2022). https://doi.org/10.1007/s10462-021-09992-0

