Convolutional neural network: a review of models, methodologies and applications to object detection

Dhillon, Anamika; Verma, Gyanendra K.

doi:10.1007/s13748-019-00203-0

Review
Published: 20 December 2019

Volume 9, pages 85–112 (2020)

Progress in Artificial Intelligence

Abstract

Deep learning has emerged as an effective machine learning approach that learns multiple layers of features, or representations, of the data and delivers state-of-the-art results. It has shown impressive performance in many application areas, particularly image classification, segmentation and object detection. Recent advances in deep learning techniques have brought encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories. This task is extremely challenging because of high intra-class and low inter-class variance. In this paper, we provide a detailed review of various deep architectures and models, highlighting the characteristics of each model. First, we describe the functioning of CNN architectures and their components, followed by a detailed description of various CNN models, from the classical LeNet model to AlexNet, ZFNet, GoogleNet, VGGNet, ResNet, ResNeXt, SENet, DenseNet, Xception and PNAS/ENAS. We focus mainly on three major applications of deep learning architectures, namely (i) wild animal detection, (ii) small arm detection and (iii) human being detection. For each model, a detailed review summary covering the systems, databases, applications and claimed accuracy is also provided to serve as a guideline for future work in these application areas.

Author information

Authors and Affiliations

Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra, 136119, India
Anamika Dhillon & Gyanendra K. Verma

Corresponding author

Correspondence to Anamika Dhillon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dhillon, A., Verma, G.K. Convolutional neural network: a review of models, methodologies and applications to object detection. Prog Artif Intell 9, 85–112 (2020). https://doi.org/10.1007/s13748-019-00203-0

Received: 28 May 2019
Accepted: 25 November 2019
Published: 20 December 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s13748-019-00203-0
