Investigating Deep Recurrent Connections and Recurrent Memory Cells Using Neuro-Evolution

Chapter in: Deep Neural Evolution

Part of the book series: Natural Computing Series (NCS)

Abstract

Neural architecture search is one of the most difficult problems in statistical learning because the architectural search space is vast. The problem is compounded for recurrent neural networks (RNNs), where every node in an architecture can be connected to any other node by weighted recurrent connections that pass information from previous passes through the RNN. Most modern-day RNNs focus on recurrent connections that pass information from the immediately preceding pass, using gated constructs known as memory cells; however, connections reaching farther back in time, or deep recurrent connections, are also possible. A novel neuro-evolutionary metaheuristic called EXAMM is used to conduct extensive experiments evolving RNNs built from a suite of memory cells and simple neurons, with and without deep recurrent connections. These experiments evolved and trained 10.56 million RNNs. The results show that networks with deep recurrent connections perform significantly better than those without, and that in some cases the best evolved RNNs consist of only simple neurons and deep recurrent connections. These results strongly suggest that complex recurrent connectivity patterns in RNNs deserve further study, and they showcase the strong potential of neuro-evolutionary metaheuristic algorithms as tools for understanding and training effective RNNs.
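
To make the idea of a deep recurrent connection concrete, the following is a minimal sketch and not the EXAMM implementation. It assumes a single simple neuron with a tanh activation, no bias, and one recurrent edge that reads the neuron's own output from `depth` passes earlier. The function name, the parameter names, and the choice to treat outputs before the start of the sequence as zero are all illustrative assumptions.

```python
import math


def run_simple_neuron(inputs, w_in, w_rec, depth):
    """Run one tanh neuron whose recurrent edge looks `depth` passes back.

    Hypothetical helper for illustration only: depth=1 is the usual
    recurrent connection to the immediately preceding pass, while
    depth > 1 is a "deep" recurrent connection.
    """
    if depth < 1:
        raise ValueError("depth must be >= 1")
    outputs = []
    for t, x in enumerate(inputs):
        # Assumption: outputs before the start of the sequence are zero.
        past = outputs[t - depth] if t - depth >= 0 else 0.0
        outputs.append(math.tanh(w_in * x + w_rec * past))
    return outputs


# A single input pulse followed by silence.
pulse = [1.0, 0.0, 0.0, 0.0]
shallow = run_simple_neuron(pulse, w_in=1.0, w_rec=1.0, depth=1)
deep = run_simple_neuron(pulse, w_in=1.0, w_rec=1.0, depth=3)
```

With `depth=1` the pulse is fed back on every pass and shrinks a little each time. With `depth=3` the neuron is silent for two passes and the pulse reappears on the fourth, having gone through the activation only once more.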

Notes

  1. The bias is omitted for clarity and simplicity of presentation.

  2. https://ngafid.org

  3. https://github.com/travisdesell/exact

Acknowledgements

This material is supported in part by the U.S. Department of Energy, Office of Science, Office of Advanced Combustion Systems under Award Number FE0031547 and by the Federal Aviation Administration National General Aviation Flight Information Database (NGAFID) award. We also thank Microbeam Technologies, Inc., as well as Mark Dusenbury, James Higgins, and Brandon Wild at the University of North Dakota, for their help in collecting and preparing the coal-fired power plant and NGAFID data, respectively.

Corresponding author

Correspondence to Travis Desell.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Desell, T., ElSaid, A.A., Ororbia, A.G. (2020). Investigating Deep Recurrent Connections and Recurrent Memory Cells Using Neuro-Evolution. In: Iba, H., Noman, N. (eds) Deep Neural Evolution. Natural Computing Series. Springer, Singapore. https://doi.org/10.1007/978-981-15-3685-4_10

  • DOI: https://doi.org/10.1007/978-981-15-3685-4_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-3684-7

  • Online ISBN: 978-981-15-3685-4

  • eBook Packages: Computer Science, Computer Science (R0)
