tutorial

A Survey on Deep Learning: Algorithms, Techniques, and Applications

Authors:
Samira Pouyanfar

Florida International University, Miami, FL

Florida International University, Miami, FL
View Profile

,
Saad Sadiq

University of Miami, Coral Gables, FL

University of Miami, Coral Gables, FL
View Profile

,
Yilin Yan

University of Miami, Coral Gables, FL

University of Miami, Coral Gables, FL
View Profile

,
Haiman Tian

Florida International University, Miami, FL

Florida International University, Miami, FL
View Profile

,
Yudong Tao

University of Miami, Coral Gables, FL

University of Miami, Coral Gables, FL
View Profile

,
Maria Presa Reyes

Florida International University, Miami, FL

Florida International University, Miami, FL
View Profile

,
Mei-Ling Shyu

University of Miami, Coral Gables, FL

University of Miami, Coral Gables, FL
View Profile

,
Shu-Ching Chen

Florida International University

Florida International University

0000-0001-9209-390X
View Profile

,
S. S. Iyengar

Florida International University

Florida International University
View Profile

Authors Info & Claims

ACM Computing Surveys Volume 51 Issue 5Article No.: 92pp 1–36https://doi.org/10.1145/3234150

Published:18 September 2018Publication History

ACM Computing Surveys

Abstract

The field of machine learning is witnessing its golden era as deep learning slowly becomes the leader in this domain. Deep learning uses multiple layers to represent the abstractions of data to build computational models. Some key enabler deep learning algorithms such as generative adversarial networks, convolutional neural networks, and model transfers have completely changed our perception of information processing. However, there exists an aperture of understanding behind this tremendously fast-paced domain, because it was never previously represented from a multiscope perspective. The lack of core understanding renders these powerful methods as black-box machines that inhibit development at a fundamental level. Moreover, deep learning has repeatedly been perceived as a silver bullet to all stumbling blocks in machine learning, which is far from the truth. This article presents a comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing, followed by the in-depth analysis on pivoting and groundbreaking advances in deep learning applications. It was also undertaken to review the issues faced in deep learning such as unsupervised learning, black-box models, and online learning and to illustrate how these challenges can be transformed into prolific future research avenues.

References

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. CoRR abs/1603.04467 (2016). Retrieved from http://arxiv.org/abs/1603.04467.Google Scholar
Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn, and Dong Yu. 2014. Convolutional neural networks for speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing 22, 10 (2014), 1533--1545. Google ScholarDigital Library
Johannes Abel and Tim Fingscheidt. 2017. A DNN regression approach to speech enhancement by artificial bandwidth extension. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. IEEE, 219--223.Google ScholarCross Ref
Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, and Sudheendra Vijayanarasimhan. 2016. YouTube-8M: A large-scale video classification benchmark. CoRR abs/1609.08675 (2016). Retrieved from http://arxiv.org/abs/1609.08675.Google Scholar
Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, and Ying Zhang. 2016. Theano: A Python framework for fast computation of mathematical expressions. CoRR abs/1605.02688 (2016). Retrieved from http://arxiv.org/abs/1605.02688.Google Scholar
Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, Jie Chen, Jingdong Chen, Zhijie Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Ke Ding, Niandong Du, Erich Elsen, Jesse Engel, Weiwei Fang, Linxi Fan, Christopher Fougner, Liang Gao, Caixia Gong, Awni Hannun, Tony Han, Lappi Vaino Johannes, Bing Jiang, Cai Ju, Billy Jun, Patrick LeGresley, Libby Lin, Junjie Liu, Yang Liu, Weigao Li, Xiangang Li, Dongpeng Ma, Sharan Narang, Andrew Ng, Sherjil Ozair, Yiping Peng, Ryan Prenger, Sheng Qian, Zongfeng Quan, Jonathan Raiman, Vinay Rao, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Kavya Srinet, Anuroop Sriram, Haiyuan Tang, Liliang Tang, Chong Wang, Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu, Likai Wei, Bo Xiao, Wen Xie, Yan Xie, Dani Yogatama, Bin Yuan, Jun Zhan, and Zhenyao Zhu. 2016. Deep speech 2: End-to-end speech recognition in English and Mandarin. In International Conference on Machine Learning, Maria Florina Balcan and Kilian Q. Weinberger (Eds.), Vol. 48. PMLR, 173--182. Google ScholarDigital Library
Christof Angermueller, Tanel Pärnamaa, Leopold Parts, and Oliver Stegle. 2016. Deep learning for computational biology. Molecular Systems Biology 12, 7 (2016), 878.Google ScholarCross Ref
Erfan Azarkhish, Davide Rossi, Igor Loi, and Luca Benini. 2017. Neurostream: Scalable and energy efficient deep learning with smart memory cubes. CoRR abs/1701.06420 (2017). Retrieved from http://arxiv.org/abs/1701.06420.Google Scholar
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. CoRR abs/1409.0473 (2014). Retrieved from http://arxiv.org/abs/1409.0473.Google Scholar
Nicolas Ballas, Li Yao, Chris Pal, and Aaron C. Courville. 2015. Delving deeper into convolutional networks for learning video representations. CoRR abs/1511.06432 (2015). Retrieved from http://arxiv.org/abs/1511.06432.Google Scholar
Yoshua Bengio, Eric Laufer, Guillaume Alain, and Jason Yosinski. 2014. Deep generative stochastic networks trainable by backprop. In International Conference on Machine Learning. Omnipress, 226--234. Google ScholarDigital Library
Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on freebase from question-answer pairs. In Empirical Methods in Natural Language Processing, Vol. 2. Association for Computational Linguistics, 6.Google Scholar
Leo Breiman. 2003. Statistical modeling: The two cultures. Quality Control and Applied Statistics 48, 1 (2003), 81--82.Google Scholar
Davide Castelvecchi. 2016. Can we open the black box of AI? Nature 538, 7623 (2016), 20--23.Google Scholar
Chenyi Chen, Ari Seff, Alain Kornhauser, and Jianxiong Xiao. 2015. Deepdriving: Learning affordance for direct perception in autonomous driving. In IEEE International Conference on Computer Vision. IEEE, 2722--2730. Google ScholarDigital Library
Ting Chen and Christophe Chefd’hotel. 2014. Deep learning based automatic immune cell detection for immunohistochemistry images. In International Workshop on Machine Learning in Medical Imaging. Springer, 17--24.Google ScholarCross Ref
Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. CoRR abs/1512.01274 (2015). Retrieved from http://arxiv.org/abs/1512.01274.Google Scholar
Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer. 2014. cuDNN: Efficient primitives for deep learning. CoRR abs/1410.0759 (2014). Retrieved from http://arxiv.org/abs/1410.0759.Google Scholar
Jen-Tzung Chien and Hsin-Lung Hsieh. 2013. Nonstationary source separation using sequential and variational Bayesian learning. IEEE Transactions on Neural Networks and Learning Systems 24, 5 (2013), 681--694.Google ScholarCross Ref
Kyunghyun Cho, Bart van Merrienboer, Çaglar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In The Conference on Empirical Methods in Natural Language Processing. 1724--1734.Google ScholarCross Ref
Min Chee Choy, Dipti Srinivasan, and Ruey Long Cheu. 2006. Neural networks for continuous online learning and control. IEEE Transactions on Neural Networks 17, 6 (2006), 1511--1531. Google ScholarDigital Library
CIFAR. 2009. CIFAR-10 and CIFAR-100 datasets. Retrieved from https://www.cs.toronto.edu/&sim;kriz/cifar.html. Accessed April 18, 2017.Google Scholar
Dan C. Cireşan, Alessandro Giusti, Luca M. Gambardella, and Jürgen Schmidhuber. 2013. Mitosis detection in breast cancer histology images with deep neural networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 411--418.Google ScholarCross Ref
Adam Coates, Brody Huval, Tao Wang, David J. Wu, Andrew Y. Ng, and Bryan Catanzaro. 2013. Deep learning with COTS HPC systems. In International Conference on Machine Learning. Omnipress, 1337--1345. Google ScholarDigital Library
Ronan Collobert, Samy Bengio, and Johnny Mariéthoz. 2002. Torch: A Modular Machine Learning Software Library. Idiap-RR Idiap-RR-46-2002. Idiap.Google Scholar
George E. Dahl, Dong Yu, Li Deng, and Alex Acero. 2012. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing 20, 1 (2012), 30--42. Google ScholarDigital Library
Navneet Dalal and Bill Triggs. 2005. Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1. IEEE, 886--893. Google ScholarDigital Library
Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. 2012. Large scale distributed deep networks. In The 25th International Conference on Neural Information Processing Systems. Curran Associates, 1223--1231. Google ScholarDigital Library
Li Deng. 2014. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing 3 (2014), 1--29.Google Scholar
Li Deng, Xiaodong He, and Jianfeng Gao. 2013. Deep stacking networks for information retrieval. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 3153--3157.Google ScholarCross Ref
Bill Dolan, Chris Quirk, and Chris Brockett. 2004. Unsupervised construction of large paraphrase corpora: Exploiting massively parallel news sources. In 20th International Conference on Computational Linguistics. Association for Computational Linguistics, 350. Google ScholarDigital Library
Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell. 2015. Long-term recurrent convolutional networks for visual recognition and description. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2625--2634.Google ScholarCross Ref
Li Dong, Furu Wei, Ming Zhou, and Ke Xu. 2015. Question answering over freebase with multi-column convolutional neural networks. In 53rd Annual Meeting of the Association for Computational Linguistics, Vol. 1. Association for Computational Linguistics, 260--269.Google ScholarCross Ref
Timothy Dozat. 2016. Incorporating Nesterov momentum into Adam. In International Conference on Learning Representations Workshop. 1--4.Google Scholar
John C. Duchi, Elad Hazan, and Yoram Singer. 2010. Adaptive subgradient methods for online learning and stochastic optimization. In Conference on Learning Theory. Omnipress, 257--269.Google Scholar
Moataz El Ayadi, Mohamed S. Kamel, and Fakhri Karray. 2011. Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recognition 44, 3 (2011), 572--587. Google ScholarDigital Library
Rasool Fakoor, Faisal Ladhak, Azade Nazi, and Manfred Huber. 2013. Using deep learning to enhance cancer diagnosis and classification. In International Conference on Machine Learning. Omnipress.Google Scholar
Christoph Feichtenhofer, Axel Pinz, and Andrew Zisserman. 2016. Convolutional two-stream network fusion for video action recognition. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1933--1941.Google ScholarCross Ref
Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, and Bowen Zhou. 2015. Applying deep learning to answer selection: A study and an open task. In IEEE Workshop on Automatic Speech Recognition and Understanding. IEEE, 813--820.Google ScholarCross Ref
Kunihiko Fukushima. 1980. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics 36, 4 (1980), 193--202.Google ScholarCross Ref
Kavita Ganesan, ChengXiang Zhai, and Jiawei Han. 2010. Opinosis: A graph-based approach to abstractive summarization of highly redundant opinions. In 23rd International Conference on Computational Linguistics. Association for Computational Linguistics, 340--348. Google ScholarDigital Library
John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathon G. Fiscus, and David S. Pallett. 1993. DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1. NASA STI/Recon Technical Report N 93 (1993).Google Scholar
Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. 2013. Vision meets robotics: The KITTI dataset. International Journal of Robotics Research 32, 11 (2013), 1231--1237. Google ScholarDigital Library
Ross Girshick. 2015. Fast R-CNN. In IEEE International Conference on Computer Vision. IEEE, 1440--1448. Google ScholarDigital Library
Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 580--587. Google ScholarDigital Library
Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In The 13th International Conference on Artificial Intelligence and Statistics, Vol. 9. JMLR.org, 249--256.Google Scholar
Christoph Goller and Andreas Kuchler. 1996. Learning task-dependent distributed representations by backpropagation through structure. In IEEE International Conference on Neural Networks, Vol. 1. IEEE, 347--352.Google Scholar
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning, Vol. 1. MIT Press. Google ScholarDigital Library
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems. Curran Associates, 2672--2680. Google ScholarDigital Library
Google. 2016. Alphago. Retrieved from https://deepmind.com/research/alphago. Accessed April 18, 2017.Google Scholar
Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. 2013. Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 6645--6649.Google ScholarCross Ref
Hayit Greenspan, Bram van Ginneken, and Ronald M. Summers. 2016. Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Medical Imaging 35, 5 (2016), 1153--1159.Google ScholarCross Ref
Karol Gregor and Yann LeCun. 2010. Learning fast approximations of sparse coding. In The 27th International Conference on Machine Learning. Omnipress, 399--406. Google ScholarDigital Library
Hsin-Yu Ha, Yimin Yang, Samira Pouyanfar, Haiman Tian, and Shu-Ching Chen. 2015. Correlation-based deep learning for multimedia semantic concept detection. In International Conference on Web Information Systems Engineering. Springer, 473--487.Google ScholarCross Ref
Raia Hadsell, Ayse Erkan, Pierre Sermanet, Marco Scoffier, Urs Muller, and Yann LeCun. 2008. Deep belief net learning in a long-range vision system for autonomous off-road driving. In IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 628--633.Google ScholarCross Ref
Kun Han, Dong Yu, and Ivan Tashev. 2014. Speech emotion recognition using deep neural network and extreme learning machine. In Interspeech. ISCA, 223--227.Google Scholar
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. 2017. Mask R-CNN. In IEEE International Conference on Computer Vision. IEEE, 2980--2988.Google Scholar
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 770--778.Google ScholarCross Ref
Irina Higgins, Loic Matthey, Xavier Glorot, Arka Pal, Benigno Uria, Charles Blundell, Shakir Mohamed, and Alexander Lerchner. 2016. Early visual concept learning with unsupervised deep learning. CoRR abs/1606.05579 (2016). Retrieved from http://arxiv.org/abs/1606.05579.Google Scholar
Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury. 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine 29, 6 (2012), 82--97.Google ScholarCross Ref
Geoffrey E. Hinton. 2009. Deep belief networks. Scholarpedia 4, 5 (2009), 5947.Google ScholarCross Ref
Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18, 7 (July 2006), 1527--1554. Google ScholarDigital Library
Sunpyo Hong and Hyesoon Kim. 2010. An integrated GPU power and performance model. In The 37th International Symposium on Computer Architecture, Vol. 38. ACM, 280--289. Google ScholarDigital Library
Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, and Kilian Q. Weinberger. 2016. Deep networks with stochastic depth. In European Conference on Computer Vision. Springer, 646--661.Google Scholar
Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry P. Heck. 2013. Learning deep structured semantic models for web search using clickthrough data. In The 22nd ACM International Conference on Information and Knowledge Management. ACM, 2333--2338. Google ScholarDigital Library
Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, and Paris Smaragdis. 2014. Deep learning for monaural speech separation. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 1562--1566.Google ScholarCross Ref
David H. Hubel and Torsten N. Wiesel. 1962. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology 160, 1 (1962), 106--154.Google ScholarCross Ref
ImageNet. 2017. Retrieved from http://image-Net.org. Accessed April 18, 2017.Google Scholar
Intel Nervana Systems. 2017. Neon deep learning framework. Retrieved from https://www.nervanasys.com/technology/neon. Accessed April 4, 2017.Google Scholar
Anastasia Ioannidou, Elisavet Chatzilari, Spiros Nikolopoulos, and Ioannis Kompatsiaris. 2017. Deep learning advances in computer vision with 3D data: A survey. Computing Surveys 50, 2 (2017), 20. Google ScholarDigital Library
Herbert Jaeger and Harald Haas. 2004. Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 304, 5667 (2004), 78--80.Google Scholar
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross B. Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. In ACM International Conference on Multimedia. ACM, 675--678. Google ScholarDigital Library
Michael I. Jordan. 1986. Serial order: A parallel distributed processing approach. Advances in Psychology 121 (1986), 471--495.Google ScholarCross Ref
Jean-Claude Junqua and Jean-Paul Haton. 2012. Robustness in Automatic Speech Recognition: Fundamentals and Applications, Vol. 341. Springer Science 8 Business Media.Google ScholarDigital Library
Mikael Kågebäck, Olof Mogren, Nina Tahmasebi, and Devdatt Dubhashi. 2014. Extractive summarization using continuous vector space models. In 2nd Workshop on Continuous Vector Space Models and their Compositionality. Citeseer, Association for Computational Linguistics, 31--39.Google Scholar
Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, and Jakob Uszkoreit. 2017. One model to learn them all. CoRR abs/1706.05137 (2017). Retrieved from http://arxiv.org/abs/1706.05137.Google Scholar
Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei. 2014. Large-scale video classification with convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 1725--1732. Google ScholarDigital Library
Will Kay, João Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, and Andrew Zisserman. 2017. The kinetics human action video dataset. CoRR abs/1705.06950 (2017). Retrieved from http://arxiv.org/abs/1705.06950.Google Scholar
Yoon Kim. 2014. Convolutional neural networks for sentence classification. CoRR abs/1408.5882 (2014). Retrieved from http://arxiv.org/abs/1408.5882.Google Scholar
Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. CoRR abs/1412.6980 (2014). Retrieved from http://arxiv.org/abs/1412.6980.Google Scholar
Diederik P. Kingma and Max Welling. 2013. Auto-encoding variational bayes. CoRR abs/1312.6114 (2013). Retrieved from http://arxiv.org/abs/1312.6114.Google Scholar
Morten Kolbæk, Zheng-Hua Tan, and Jesper Jensen. 2017. Speech intelligibility potential of general and specialized deep neural network based speech enhancement systems. IEEE/ACM Transactions on Audio, Speech and Language Processing 25, 1 (2017), 153--167. Google ScholarDigital Library
Jan Koutník, Giuseppe Cuccu, Jürgen Schmidhuber, and Faustino Gomez. 2013. Evolving large-scale neural networks for vision-based reinforcement learning. In 15th Annual Conference on Genetic and Evolutionary Computation. ACM, 1061--1068. Google ScholarDigital Library
Vassili Kovalev, Alexander Kalinovsky, and Sergey Kovalev. 2016. Deep learning with Theano, Torch, Caffe, Tensorflow, and Deeplearning4J: Which one is the best in speed and accuracy? In The 13th International Conference on Pattern Recognition and Information Processing.Google Scholar
Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, and Li Fei-Fei. 2016. The unreasonable effectiveness of noisy data for fine-grained recognition. In European Conference on Computer Vision. Springer, 301--320.Google ScholarCross Ref
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.). Curran Associates, 1097--1105. Google ScholarDigital Library
Michel Lang, Helena Kotthaus, Peter Marwedel, Claus Weihs, Jörg Rahnenführer, and Bernd Bischl. 2015. Automatic model selection for high-dimensional survival analysis. Journal of Statistical Computation and Simulation 85, 1 (2015), 62--76.Google ScholarCross Ref
Quoc V. Le. 2013. Building high-level features using large scale unsupervised learning. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 8595--8598.Google ScholarCross Ref
Yann LeCun and Yoshua Bengio. 1995. Convolutional networks for images, speech, and time series. Handbook of Brain Theory and Neural Networks 3361, 10 (1995), 255--257. Google ScholarDigital Library
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (2015), 436--444.Google Scholar
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (1998), 2278--2324.Google ScholarCross Ref
Mu Li, David G. Andersen, Jun Woo Park, Alexander J. Smola, Amr Ahmed, Vanja Josifovski, James Long, Eugene J. Shekita, and Bor-Yiing Su. 2014. Scaling distributed machine learning with the parameter server. In USENIX Symposium on Operating Systems Design and Implementation. USENIX Association, 583--598. Google ScholarDigital Library
Xiangang Li and Xihong Wu. 2015. Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 4520--4524.Google ScholarCross Ref
Yuxi Li. 2017. Deep reinforcement learning: An overview. CoRR abs/1701.07274 (2017). Retrieved from https://arxiv.org/abs/1701.07274.Google Scholar
Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. 2016. R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems. Curran Associates, 379--387. Google ScholarDigital Library
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. In European Conference on Computer Vision. Springer, 740--755.Google Scholar
Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A. W. M. van der Laak, Bram van Ginneken, and Clara I. Sánchez. 2017. A survey on deep learning in medical image analysis. CoRR abs/1702.05747 (2017). Retrieved from http://arxiv.org/abs/1702.05747.Google Scholar
Geert Litjens, Clara I. Sánchez, Nadya Timofeeva, Meyke Hermsen, Iris Nagtegaal, Iringo Kovacs, Christina Hulsbergen-Van De Kaa, Peter Bult, Bram Van Ginneken, and Jeroen Van Der Laak. 2016. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Scientific Reports 6 (2016), 26286.Google ScholarCross Ref
Feng Liu, Bingquan Liu, Chengjie Sun, Ming Liu, and Xiaolong Wang. 2015. Deep belief network-based approaches for link prediction in signed social networks. Entropy 17, 4 (2015), 2140--2169.Google ScholarCross Ref
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single shot multibox detector. In European Conference on Computer Vision. Springer, 21--37.Google Scholar
Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 3431--3440.Google ScholarCross Ref
David G. Lowe. 1999. Object recognition from local scale-invariant features. In IEEE International Conference on Computer Vision, Vol. 2. IEEE, 1150--1157. Google ScholarDigital Library
Junjie Lu, Steven Young, Itamar Arel, and Jeremy Holleman. 2015. A 1 TOPS/W analog deep machine-learning engine with floating-gate storage in 0.13 m CMOS. IEEE Journal of Solid-State Circuits 50, 1 (2015), 270--281.Google ScholarCross Ref
Jianzhu Ma, Michael Ku Yu, Samson Fong, Keiichiro Ono, Eric Sage, Barry Demchak, Roded Sharan, and Trey Ideker. 2018. Using deep learning to model the hierarchical structure and function of a cell. Nature Methods 15, 4 (2018), 290--298.Google ScholarCross Ref
Xiaolei Ma, Haiyang Yu, Yunpeng Wang, and Yinhai Wang. 2015. Large-scale transportation network congestion evolution prediction using deep learning theory. PLoS ONE 10, 3 (2015), e0119044.Google ScholarCross Ref
Christopher Manning. 2016. Understanding human language: Can NLP and deep learning help? In The 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1--1. Google ScholarDigital Library
Warren S. McCulloch and Walter Pitts. 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 4 (1943), 115--133.Google ScholarCross Ref
H. Brendan McMahan, Eider Moore, Daniel Ramage, and Blaise Agüera y Arcas. 2016. Federated learning of deep networks using model averaging. CoRR abs/1602.05629 (2016). Retrieved from http://arxiv.org/abs/1602.05629.Google Scholar
Alessio Micheli. 2009. Neural network for graphs: A contextual constructive approach. IEEE Transactions on Neural Networks 20, 3 (2009), 498--511. Google ScholarDigital Library
Azalia Mirhoseini, Anna Goldie, Hieu Pham, Benoit Steiner, Quoc V. Le, and Jeff Dean. 2018. A hierarchical model for device placement. In International Conference on Learning Representations.Google Scholar
Marc’Aurelio Ranzato, Volodymyr Mnih, Joshua M. Susskind, and Geoffrey E. Hinton. 2013. Modeling natural images using gated MRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 9 (2013), 2206--2222. Google ScholarDigital Library
MNIST. 2017. The MNIST database of handwritten digits. Retrieved from http://yann.lecun.com/exdb/mnist/. Accessed April 18, 2017.Google Scholar
Alexander Mordvintsev, Christopher Olah, and Mike Tyka. 2015. Inceptionism: Going deeper into neural networks. Google Research Blog. Retrieved from https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html. Accessed March 26, 2018.Google Scholar
Igor Mozetic, Miha Grcar, and Jasmina Smailovic. 2016. Multilingual Twitter sentiment classification: The role of human annotators. CoRR abs/1602.07563 (2016). arxiv:1602.07563. Retrieved from http://arxiv.org/abs/1602.07563.Google Scholar
Maryam M. Najafabadi, Flavio Villanustre, Taghi M. Khoshgoftaar, Naeem Seliya, Randall Wald, and Edin Muharemagic. 2015. Deep learning applications and challenges in big data analytics. Journal of Big Data 2, 1 (2015), 1--21.Google ScholarCross Ref
Preslav Nakov, Alan Ritter, Sara Rosenthal, Fabrizio Sebastiani, and Veselin Stoyanov. 2016. SemEval-2016 task 4: Sentiment analysis in Twitter. In The 10th International Workshop on Semantic Evaluation. Association for Computer Linguistics, 1--18.Google ScholarCross Ref
Kazuhiro Negi, Keisuke Dohi, Yuichiro Shibata, and Kiyoshi Oguri. 2011. Deep pipelined one-chip FPGA implementation of a real-time image-based human detection algorithm. In International Conference on Field-Programmable Technology. IEEE, 1--8.Google ScholarCross Ref
Michael Neumann and Ngoc Thang Vu. 2017. Attentive convolutional neural network based speech emotion recognition: A study on the impact of input features, signal length, and acted speech. CoRR abs/1706.00612 (2017). arxiv:1706.00612. Retrieved from http://arxiv.org/abs/1706.00612.Google Scholar
Evan W. Newell and Yang Cheng. 2016. Mass cytometry: Blessed with the curse of dimensionality. Nature Immunology 17, 8 (2016), 890--895.Google ScholarCross Ref
Dat Tien Nguyen, Shafiq R. Joty, Muhammad Imran, Hassan Sajjad, and Prasenjit Mitra. 2016. Applications of online deep learning for crisis response using social media information. CoRR abs/1610.01030 (2016). Retrieved from http://arxiv.org/abs/1610.01030.Google Scholar
Laisen Nie, Dingde Jiang, Lei Guo, Shui Yu, and Houbing Song. 2016. Traffic matrix prediction and estimation based on deep learning for data center networks. In IEEE Globecom Workshops. IEEE, 1--6.Google ScholarCross Ref
Hyeonwoo Noh, Seunghoon Hong, and Bohyung Han. 2015. Learning deconvolution network for semantic segmentation. In IEEE International Conference on Computer Vision. IEEE, 1520--1528. Google ScholarDigital Library
Pascal VOC. 2012. The PASCAL Visual Object Classes. Retrieved from http://host.robots.ox.ac.uk/pascal/VOC/. Accessed April 18, 2017.Google Scholar
Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio. 2013. How to construct deep recurrent neural networks. CoRR abs/1312.6026 (2013). Retrieved from http://arxiv.org/abs/1312.6026.Google Scholar
Santiago Pascual, Antonio Bonafonte, and Joan Serrà. 2017. SEGAN: Speech enhancement generative adversarial network. CoRR abs/1703.09452 (2017). arxiv:1703.09452. Retrieved from http://arxiv.org/abs/1703.09452.Google Scholar
Ryan Poplin, Avinash V. Varadarajan, Katy Blumer, Yun Liu, Michael V. McConnell, Greg S. Corrado, Lily Peng, and Dale R. Webster. 2018. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nature Biomedical Engineering 2, 3 (2018), 158--164.Google ScholarCross Ref
Samira Pouyanfar and Shu-Ching Chen. 2017. Automatic video event detection for imbalance data using enhanced ensemble deep learning. International Journal of Semantic Computing 11, 1 (2017), 85--109.Google ScholarCross Ref
Samira Pouyanfar and Shu-Ching Chen. 2017. T-LRA: Trend-based learning rate annealing for deep neural networks. In The 3rd IEEE International Conference on Multimedia Big Data. IEEE, 50--57.Google ScholarCross Ref
Samira Pouyanfar, Shu-Ching Chen, and Mei-Ling Shyu. 2017. An efficient deep residual-inception network for multimedia classification. In International Conference on Multimedia and Expo. IEEE, 373--378.Google ScholarCross Ref
Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR abs/1511.06434 (2015). Retrieved from http://arxiv.org/abs/1511.06434.Google Scholar
Rajesh Ranganath, Adler J. Perotte, Noémie Elhadad, and David M. Blei. 2016. Deep survival analysis. In Machine Learning in Health Care. JMLR.org, 101--114.Google Scholar
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You only look once: Unified, real-time object detection. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 779--788.Google ScholarCross Ref
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems. MIT Press, 91--99. Google ScholarDigital Library
Frank Rosenblatt. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65, 6 (1958), 386.Google ScholarCross Ref
Ruslan Salakhutdinov and Geoffrey Hinton. 2009. Deep Boltzmann machines. In Artificial Intelligence and Statistics. PMLR, 448--455.Google Scholar
Ruslan Salakhutdinov and Geoffrey Hinton. 2012. An efficient learning procedure for deep Boltzmann machines. Neural Computation 24, 8 (2012), 1967--2006. Google ScholarDigital Library
Dominik Scherer, Andreas Müller, and Sven Behnke. 2010. Evaluation of pooling operations in convolutional architectures for object recognition. International Conference on Artificial Neural Networks 6354 (2010), 92--101. Google ScholarDigital Library
Jürgen Schmidhuber. 2015. Deep learning in neural networks: An overview. Neural Networks 61 (2015), 85--117. Google ScholarDigital Library
Frank Seide, Gang Li, and Dong Yu. 2011. Conversational speech transcription using context-dependent deep neural networks. In 12th Annual Conference of the International Speech Communication Association. ISCA, 437--440.Google ScholarCross Ref
Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, and Yann LeCun. 2013. Pedestrian detection with unsupervised multi-stage feature learning. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 3626--3633. Google ScholarDigital Library
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Grégoire Mesnil. 2014. Learning semantic representations using convolutional neural networks for web search. In The 23rd International World Wide Web Conference. ACM, 373--374. Google ScholarDigital Library
Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014). Retrieved from http://arxiv.org/abs/1409.1556.Google Scholar
Skymind. 2017. Deeplearning4j deep learning framework. Retrieved from https://deeplearning4j.org. Accessed April 18, 2017.Google Scholar
Paul Smolensky. 1986. Information Processing in Dynamical Systems: Foundations of Harmony Theory. Technical Report. DTIC Document.Google Scholar
Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. 2011. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Advances in Neural Information Processing Systems, Vol. 24. Neural Information Processing Systems Foundation, 801--809. Google ScholarDigital Library
Richard Socher, Cliff C. Lin, Chris Manning, and Andrew Y. Ng. 2011. Parsing natural scenes and natural language with recursive neural networks. In International Conference on Machine Learning. Omnipress, 129--136. Google ScholarDigital Library
Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Conference on Empirical Methods in Natural Language Processing. Citeseer, Association for Computational Linguistics, 1631--1642.Google Scholar
Khurram Soomro, Amir Roshan Zamir, and Mubarak Shah. 2012. UCF101: A dataset of 101 human actions classes from videos in the wild. CoRR abs/1212.0402 (2012). Retrieved from http://arxiv.org/abs/1212.0402.Google Scholar
Hang Su and Haoyu Chen. 2015. Experiments on parallel training of deep neural network using model averaging. CoRR abs/1507.01239 (2015). Retrieved from http://arxiv.org/abs/1507.01239.Google Scholar
Ilya Sutskever, James Martens, George E. Dahl, and Geoffrey E. Hinton. 2013. On the importance of initialization and momentum in deep learning. In International Conference on Machine Learning. JMLR.org, 1139--1147. Google ScholarDigital Library
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 1--9.Google ScholarCross Ref
Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, Damian Borth, and Li-Jia Li. 2016. YFCC100M: The new data in multimedia research. Communications of the ACM 59, 2 (2016), 64--73. Google ScholarDigital Library
Haiman Tian and Shu-Ching Chen. 2017. MCA-NN: Multiple correspondence analysis based neural network for disaster information detection. In The 3rd IEEE International Conference on Multimedia Big Data. IEEE, 268--275.Google ScholarCross Ref
Haiman Tian and Shu-Ching Chen. 2017. A video-aided semantic analytics system for disaster information integration. In The 3rd IEEE International Conference on Multimedia Big Data. IEEE, 242--243.Google ScholarCross Ref
Antonio Torralba, Rob Fergus, and William T. Freeman. 2008. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 11 (2008), 1958--1970. Google ScholarDigital Library
Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. 2015. Learning spatiotemporal features with 3D convolutional networks. In IEEE International Conference on Computer Vision. IEEE, 4489--4497. Google ScholarDigital Library
Transfer Learning. 2017. Convolutional Neural Network for Visual Recognition. Retrieved from http://cs231n.github.io/transfer-learning/. Accessed April 25, 2017.Google Scholar
Trecvid. 2017. TREC Video Retrieval Evaluation. Retrieved from http://trecvid.nist.gov. Accessed April 18, 2017.Google Scholar
Grigorios Tsagkatakis, Mustafa Jaber, and Panagiotis Tsakalides. 2017. Goal&excl;&excl; Event detection in sports video. Electronic Imaging 2017, 16 (2017), 15--20.Google ScholarCross Ref
Nicolas Vasilache, Jeff Johnson, Michaël Mathieu, Soumith Chintala, Serkan Piantino, and Yann LeCun. 2014. Fast convolutional nets with fbfft: A GPU performance evaluation. CoRR abs/1412.7580 (2014). Retrieved from http://arxiv.org/abs/1412.7580.Google Scholar
Soroush Vosoughi, Prashanth Vijayaraghavan, and Deb Roy. 2016. Tweet2Vec: learning Tweet embeddings using character-level CNN-LSTM encoder-decoder. In The 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1041--1044. Google ScholarDigital Library
Chao Wang, Lei Gong, Qi Yu, Xi Li, Yuan Xie, and Xuehai Zhou. 2016. DLAU: A scalable deep learning accelerator unit on FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 36, 3 (2016), 513--517. Google ScholarDigital Library
Peng Wang, Baowen Xu, Yurong Wu, and Xiaoyu Zhou. 2015. Link prediction in social networks: The state-of-the-art. Science China Information Sciences 58, 1 (2015), 1--38.Google ScholarCross Ref
Joonatas Wehrmann, Willian Becker, Henry E. L. Cagnini, and Rodrigo C. Barros. 2017. A character-based convolutional neural network for language-agnostic Twitter sentiment analysis. In International Joint Conference on Neural Networks. IEEE, 2384--2391.Google Scholar
Chao Weng, Dong Yu, Michael L. Seltzer, and Jasha Droppo. 2015. Deep neural networks for single-channel multi-talker speech recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing 23, 10 (2015), 1670--1679. Google ScholarDigital Library
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. CoRR abs/1609.08144 (2016). arxiv:1609.08144. Retrieved from http://arxiv.org/abs/1609.08144.Google Scholar
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. 2016. Aggregated residual transformations for deep neural networks. CoRR abs/1611.05431 (2016). Retrieved from http://arxiv.org/abs/1611.05431.Google Scholar
Omry Yadan, Keith Adams, Yaniv Taigman, and Marc’Aurelio Ranzato. 2013. Multi-GPU training of ConvNets. CoRR abs/1312.5853 (2013).Google Scholar
Yilin Yan, Min Chen, Saad Sadiq, and Mei-Ling Shyu. 2017. Efficient imbalanced multimedia concept retrieval by deep learning on spark clusters. International Journal of Multimedia Data Engineering and Management 8, 1 (2017), 1--20. Google ScholarDigital Library
Yilin Yan, Min Chen, Mei-Ling Shyu, and Shu-Ching Chen. 2015. Deep learning for imbalanced multimedia data classification. In The IEEE International Symposium on Multimedia. IEEE, 483--488.Google ScholarCross Ref
Yilin Yan, Qiusha Zhu, Mei-Ling Shyu, and Shu-Ching Chen. 2016. A classifier ensemble framework for multimedia big data classification. In The 17th IEEE International Conference on Information Reuse and Integration. IEEE, 615--622.Google ScholarDigital Library
Wenpeng Yin, Hinrich Schütze, Bing Xiang, and Bowen Zhou. 2015. ABCNN: Attention-based convolutional neural network for modeling sentence pairs. CoRR abs/1512.05193 (2015). arxiv:1512.05193. Retrieved from http://arxiv.org/abs/1512.05193.Google Scholar
Dong Yu, Adam Eversole, Michael L. Seltzer, Kaisheng Yao, Brian Guenter, Oleksii Kuchaiev, Frank Seide, Huaming Wang, Jasha Droppo, Zhiheng Huang, Geoffrey Zweig, Christopher J. Rossbach, and Jon Currey. 2014. An introduction to computational networks and the computational network toolkit. In The 15th Annual Conference of the International Speech Communication Association. ISCA.Google Scholar
Dong Yu, Morten Kolbæk, Zheng-Hua Tan, and Jesper Jensen. 2017. Permutation invariant training of deep models for speaker-independent multi-talker speech separation. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 241--245.Google ScholarCross Ref
Qi Yu, Chao Wang, Xiang Ma, Xi Li, and Xuehai Zhou. 2015. A deep learning prediction process accelerator based FPGA. In 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, 1159--1162.Google ScholarDigital Library
Matthew D. Zeiler. 2012. ADADELTA: An adaptive learning rate method. CoRR abs/1212.5701 (2012). Retrieved from http://arxiv.org/abs/1212.5701.Google Scholar
Chen Zhang, Peng Li, Guangyu Sun, Yijin Guan, Bingjun Xiao, and Jason Cong. 2015. Optimizing FPGA-based accelerator design for deep convolutional neural networks. In ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 161--170. Google ScholarDigital Library
Xueliang Zhang and DeLiang Wang. 2017. Deep learning based binaural speech separation in reverberant environments. IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (2017), 1075--1084. Google ScholarDigital Library
Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In Advances in Neural Information Processing Systems. 649--657. Google ScholarDigital Library
Yue Zhao, Xingyu Jin, and Xiaolin Hu. 2017. Recurrent convolutional neural networks for speech processing. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE SigPort, 5300--5304.Google ScholarCross Ref
Zhiwei Zhao and Youzheng Wu. 2016. Attention-based convolutional neural networks for sentence classification. In The 17th Annual Conference of the International Speech Communication Association. ISCA, 705--709.Google ScholarCross Ref

Index Terms

A Survey on Deep Learning: Algorithms, Techniques, and Applications
1. Computing methodologies
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory

Recommendations

Multi-agent deep reinforcement learning: a survey
Abstract
The advances in reinforcement learning have recorded sublime success in various domains. Although the multi-agent domain has been overshadowed by its single-agent counterpart during this progress, multi-agent reinforcement learning gains rapid ...
Read More
Deep learning: systematic review, models, challenges, and research directions
Abstract
The current development in deep learning is witnessing an exponential transition into automation applications. This automation transition can provide a promising framework for higher performance and lower complexity. This ongoing transition ...
Read More
Deep Learning: Methods and Applications

This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks. The application areas are chosen with the following three criteria in mind: (1) expertise or knowledge ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Computing Surveys Volume 51, Issue 5
September 2019
791 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3271482
Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering
Issue’s Table of Contents
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 18 September 2018
- Accepted: 1 June 2018
- Revised: 1 April 2018
- Received: 1 May 2017
Published in csur Volume 51, Issue 5

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Deep learning
big data
distributed processing
machine learning
neural networks
survey
Qualifiers
- tutorial
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 874
  Total Citations
  View Citations
- 18,049
  Total Downloads
- Downloads (Last 12 months)2,101
- Downloads (Last 6 weeks)246
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

A Survey on Deep Learning: Algorithms, Techniques, and Applications

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

Multi-agent deep reinforcement learning: a survey

Deep learning: systematic review, models, challenges, and research directions

Deep Learning: Methods and Applications