Abstract
Modern industry requires modern solutions for monitoring automated production and detecting defective materials. Smart monitoring of the mechanical parts of technological systems and machines is a mandatory step towards automated production. Deep Learning has proven its efficiency in feature extraction from images, videos and text, and has thereby succeeded in various object detection, recognition, segmentation and classification tasks. Despite these advances, the effectiveness of specially designed Convolutional Neural Networks (CNNs) for defect detection and industrial object recognition has received little investigation. In this study, we employed six publicly available industrial image datasets containing defective materials, industrial tools and engine parts, aiming to develop a specialized model to classify them. Motivated by the success of the Visual Geometry Group (VGG) network, we propose a modified version of VGG19, called Multipath VGG19 (MVGG19), which allows for extra local and global feature extraction (multi-level feature extraction) by making use of several processing paths. The extra features are fused via concatenation. The experiments verified the effectiveness of MVGG19 over the baseline VGG19. Specifically, top classification performance was achieved in five of the six image datasets, whilst the average classification improvement was 6.95%. MVGG19 also showed better overall stability and robustness to dataset variation compared to other baseline state-of-the-art CNNs.
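The fusion step described in the abstract can be illustrated with a minimal, framework-free sketch. It assumes that each processing path yields a feature map of shape channels x height x width and that each map is reduced by global average pooling before concatenation. The pooling choice, the number of paths, the shapes and the values are all illustrative assumptions and are not taken from the paper. The names global_average_pool and fuse_by_concatenation are hypothetical.

```python
from typing import List

# A feature map is modelled as channels x height x width nested lists.
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Reduce each channel (an H x W grid) to its mean value."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse_by_concatenation(feature_maps: List[FeatureMap]) -> List[float]:
    """Pool the output of every path and concatenate into one vector."""
    fused: List[float] = []
    for fmap in feature_maps:
        fused.extend(global_average_pool(fmap))
    return fused


# Hypothetical outputs of three processing paths tapped at different depths
# (assumed shapes): an early block (2 channels, 4x4), a middle block
# (3 channels, 2x2) and the final block (4 channels, 1x1).
early = [[[float(c + 1)] * 4 for _ in range(4)] for c in range(2)]
middle = [[[float(c + 10)] * 2 for _ in range(2)] for c in range(3)]
final = [[[float(c + 100)]] for c in range(4)]

fused_vector = fuse_by_concatenation([early, middle, final])
print(len(fused_vector))  # 2 + 3 + 4 pooled channels
```

In this sketch the fused vector keeps one value per channel from every path, so a classifier placed on top of it would see shallow (local) and deep (global) features side by side.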
Data availability
The datasets analysed during the current study are available in the following repositories: Kaggle: Dabhi (2020). Github: https://github.com/lvxiaoming2019/GC10-DET-Metallic-Surface-Defect-Datasets. Github: https://github.com/abin24/Magnetic-tile-defect-datasets. MVTec: https://www.mvtec.com/company/research/datasets/mvtec-itodd. Utah State University Libraries: https://digitalcommons.usu.edu/all_datasets/48/. Github: https://github.com/zae-bayern/elpv-dataset.
References
Apostolopoulos ID, Papathanasiou ND, Panayiotakis GS (2021) Classification of lung nodule malignancy in computed tomography imaging utilising generative adversarial networks and semi-supervised transfer learning. Biocybern Biomed Eng 41(4):1243–1257. https://doi.org/10.1016/j.bbe.2021.08.006
Buerhop-Lutz C, Deitsch S, Maier A et al (2018) A benchmark for visual identification of defective solar cells in electroluminescence imagery. In: 35th European photovoltaic solar energy conference and exhibition. pp 1287–1289. https://doi.org/10.4229/35THEUPVSEC20182018-5CV.3.15
Caggiano A, Zhang J, Alfieri V et al (2019) Machine learning-based image processing for on-line defect recognition in additive manufacturing. CIRP Ann 68:451–454. https://doi.org/10.1016/j.cirp.2019.03.021
Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1251–1258. https://doi.org/10.1109/CVPR.2017.195
Dabhi R (2020) Casting product image data for quality inspection, Kaggle Data, v2. https://www.kaggle.com/ravirajsinh45/real-life-industrial-dataset-of-casting-product/metadata. Accessed 1 Sept 2020
Deitsch S, Christlein V, Berger S et al (2019) Automatic classification of defective photovoltaic module cells in electroluminescence images. Sol Energy 185:455–468. https://doi.org/10.1016/j.solener.2019.02.067
Deitsch S, Buerhop-Lutz C, Sovetkin E et al (2020) Segmentation of photovoltaic module cells in electroluminescence images. arXiv:1806.06530 [cs]
Diez-Olivan A, Del Ser J, Galar D, Sierra B (2019) Data fusion and machine learning for industrial prognosis: trends and perspectives towards Industry 4.0. Inf Fusion 50:92–111. https://doi.org/10.1016/j.inffus.2018.10.005
Drost B, Ulrich M, Bergmann P et al (2017) Introducing MVTec ITODD: a dataset for 3D object recognition in industry. In: Proceedings of the IEEE international conference on computer vision workshops. pp 2200–2208. https://doi.org/10.1109/ICCVW.2017.257
Du C, Wang Y, Wang C et al (2020) Selective feature connection mechanism: Concatenating multi-layer CNN features with a feature selector. Pattern Recogn Lett 129:108–114. https://doi.org/10.1016/j.patrec.2019.11.015
Fawzi A, Samulowitz H, Turaga D, Frossard P (2016) Adaptive data augmentation for image classification. In: 2016 IEEE international conference on image processing (ICIP). IEEE, pp 3688–3692. https://doi.org/10.1109/ICIP.2016.7533048
Fu G, Sun P, Zhu W et al (2019) A deep-learning-based approach for fast and robust steel surface defects classification. Opt Lasers Eng 121:397–405. https://doi.org/10.1016/j.optlaseng.2019.05.005
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
Gu J, Wang Z, Kuen J et al (2017) Recent advances in convolutional neural networks. arXiv:1512.07108 [cs]
Han J, Zhang D, Cheng G et al (2018) Advanced deep-learning techniques for salient and category-specific object detection: a survey. IEEE Signal Process Mag 35:84–100. https://doi.org/10.1109/MSP.2017.2749125
He K, Zhang X, Ren S, Sun J (2016) Identity mappings in deep residual networks. In: European conference on computer vision. Springer, New York, pp 630–645. https://doi.org/10.1007/978-3-319-46493-0_38
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 4700–4708
Huang Y, Qiu C, Guo Y et al (2018) Surface defect saliency of magnetic tile. In: 2018 IEEE 14th international conference on automation science and engineering (CASE). IEEE, Munich, pp 612–617. https://doi.org/10.1007/s00371-018-1588-5
Iandola FN, Han S, Moskewicz MW et al (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360
Khan S, Rahmani H, Shah SAA, Bennamoun M (2018) A guide to convolutional neural networks for computer vision. Synth Lect Comput Vis 8:1–207. https://doi.org/10.2200/S00822ED1V01Y201712COV015
Khan S, Yong S-P (2016) A comparison of deep learning and hand crafted features in medical image modality classification. In: 2016 3rd international conference on computer and information sciences (ICCOINS). IEEE, pp 633–638. https://doi.org/10.1109/ICCOINS.2016.7783289
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Kornblith S, Shlens J, Le QV (2019) Do better ImageNet models transfer better? In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 2661–2671. https://doi.org/10.1109/CVPR.2019.00277
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60:84–90. https://doi.org/10.1145/3065386
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444. https://doi.org/10.1038/nature14539
Lin Y, Nie Z, Ma H (2017) Structural damage detection with automatic feature-extraction through deep learning. Comput Aided Civ Infrastruct Eng 32:1025–1046. https://doi.org/10.1111/mice.12313
Lv X, Duan F, Jiang J et al (2020) Deep metallic surface defect detection: the new benchmark and detection network. Sensors 20:1562. https://doi.org/10.3390/s20061562
Maguire M, Dorafshan S, Thomas RJ (2018) SDNET2018: a concrete crack image dataset for machine learning applications. https://doi.org/10.15142/T3TD19
Mehta S, Paunwala C, Vaidya B (2019) CNN based traffic sign classification using Adam optimizer. In: 2019 international conference on intelligent computing and control systems (ICCS). IEEE, pp 1293–1298. https://doi.org/10.1109/ICCS45141.2019.9065537
Najafabadi MM, Villanustre F, Khoshgoftaar TM et al (2015) Deep learning applications and challenges in big data analytics. J Big Data 2:1–21. https://doi.org/10.1186/s40537-014-0007-7
Poernomo A, Kang D-K (2018) Biased dropout and crossmap dropout: learning towards effective dropout regularization in convolutional neural network. Neural Netw 104:60–67. https://doi.org/10.1016/j.neunet.2018.03.016
Sandler M, Howard A, Zhu M et al (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 4510–4520
Shin H-C, Roth HR, Gao M et al (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35:1285–1298. https://doi.org/10.1109/TMI.2016.2528162
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Szegedy C, Vanhoucke V, Ioffe S et al (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 2818–2826. https://doi.org/10.1109/CVPR.2016.308
Tajbakhsh N, Shin JY, Gurudu SR et al (2016) Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging 35:1299–1312. https://doi.org/10.1109/TMI.2016.2535302
Tan M, Le Q (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning. PMLR, pp 6105–6114
Wagner J, Schiller D, Seiderer A, André E (2018) Deep learning in paralinguistic recognition tasks: are hand-crafted features still relevant? https://doi.org/10.21437/Interspeech.2018-1238
Wang J, Ma Y, Zhang L et al (2018a) Deep learning for smart manufacturing: methods and applications. J Manuf Syst 48:144–156. https://doi.org/10.1016/j.jmsy.2018.01.003
Wang P, Liu H, Wang L, Gao RX (2018b) Deep learning-based human motion recognition for predictive context-aware human-robot collaboration. CIRP Ann 67:17–20. https://doi.org/10.1016/j.cirp.2018.04.066
Wang J, Fu P, Gao RX (2019) Machine vision intelligence for product defect inspection based on deep learning and Hough transform. J Manuf Syst 51:52–60. https://doi.org/10.1016/j.jmsy.2019.03.002
Weiss K, Khoshgoftaar TM, Wang D (2016) A survey of transfer learning. J Big Data 3:9. https://doi.org/10.1186/s40537-016-0043-6
Wong SC, Gatt A, Stamatescu V, McDonnell MD (2016) Understanding data augmentation for classification: when to warp? In: 2016 international conference on digital image computing: techniques and applications (DICTA). IEEE, pp 1–6. https://doi.org/10.1109/DICTA.2016.7797091
Yan LC, Yoshua B, Geoffrey H (2015) Deep learning. Nature 521:436–444. https://doi.org/10.1038/nature14539
Yue-Hei Ng J, Yang F, Davis LS (2015) Exploiting local features from deep networks for image retrieval. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. pp 53–61. https://doi.org/10.1109/CVPRW.2015.7301272
Zeng G, Zhou J, Jia X et al (2018) Hand-crafted feature guided deep learning for facial expression recognition. In: 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018). IEEE, pp 423–430. https://doi.org/10.1109/FG.2018.00068
Zhuang F, Qi Z, Duan K et al (2020) A comprehensive survey on transfer learning. Proc IEEE 109:43–76. https://doi.org/10.1109/JPROC.2020.3004555
Zoph B, Cubuk ED, Ghiasi G et al (2020) Learning data augmentation strategies for object detection. In: European conference on computer vision. Springer, New York, pp 566–583. https://doi.org/10.1007/978-3-030-58583-9_34
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest. The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Apostolopoulos, I.D., Tzani, M.A. Industrial object and defect recognition utilizing multilevel feature extraction from industrial scenes with Deep Learning approach. J Ambient Intell Human Comput 14, 10263–10276 (2023). https://doi.org/10.1007/s12652-021-03688-7
DOI: https://doi.org/10.1007/s12652-021-03688-7