INTRODUCTION

COVID-19, a coronavirus disease that affects the human respiratory tract, was first identified in Wuhan, China, in December 2019. As of July 20, 2020, WHO reported 14 348 858 confirmed COVID-19 cases worldwide, including 603 691 deaths. The disease spreads easily from person to person through respiratory droplets expelled when an infected person sneezes, coughs, or exhales. Almost all countries are struggling to slow down the rate of COVID-19 infection. Slowing the infection rate through preventive action is possible only if the actual number of COVID-19 cases in a geographical region is known, which in turn requires that a sufficient number of COVID-19 tests be performed.

The important factors to consider when selecting a COVID-19 test are its accuracy in correctly identifying positive cases, the time needed to obtain the result, and the cost of the test. WHO has recommended the RT-PCR test as the gold standard for COVID-19 testing. The advantage of this test is its very high accuracy in disease detection; its drawbacks are the cost and the time taken to obtain the result. Computed tomography (CT) of the chest can be used for initial screening of COVID-19 patients [1]. The advantage of this method is its rapid diagnostic rate, but its accuracy is low compared to the RT-PCR test. The loop-mediated isothermal amplification (LAMP) method [2] works on the principle of amplifying the target genetic material and can produce a result within an hour, but its lower accuracy has limited its use. Antibody tests are simple: the antibodies developed by the immune system are used to diagnose COVID cases. Their problem is low accuracy; they are good at identifying persons who were previously infected with COVID, but not those who are currently infected.

Deep learning was used by [3] to differentiate COVID-19 from other Pneumonia cases with an AUC of 0.87. Social mimic optimization was used by [4] to extract efficient features from X-ray images, with a CNN model for initial feature extraction, in order to detect COVID cases. A customized deep convolutional neural network called COVID-Net was developed by [5] to detect the disease from X-ray images. X-ray and CT scan images were used by [6] to detect the disease using a convolutional neural network with a transfer learning approach; the results show that accuracy was higher when CT scan images were used. A DarkNet model was used in [7] to detect corona and non-corona cases from X-ray images; the accuracy obtained by the DarkNet model was 87% when applied to multiple classes (COVID, no findings, and Pneumonia).

The Fractional Multichannel Exponent Moments (FrMEMs) method was used by [8] for feature extraction from chest X-ray images and classification of COVID and non-COVID cases using machine learning techniques. A deep convolutional neural network based transfer learning model was used by [9] to detect COVID patients from chest X-ray images. A deep learning based system that uses a VB-Net neural network was developed by [10] for identification and quantification of infected regions in CT images. 2D and 3D deep learning models were used by [11] to categorize coronavirus and non-coronavirus cases from CT images. A transfer learning approach, in which several pre-trained CNN models were used, was adopted by [12] for classification of COVID and non-COVID cases on a chest X-ray dataset.

Some of the important problems faced by existing approaches in detecting COVID-19 patients are listed below.

1. Respiratory tract specimens have to be collected by healthcare professionals. Nasopharyngeal swab sampling is one of the commonly used methods, in which the nurse has to be in close contact with the patient [13]. This may increase the risk of cross infection.

2. The RT-PCR kits recommended by WHO for COVID testing are costly, and the quantity of kits available in developing countries is not sufficient for their large populations. Hence there is a need for developing cost-effective testing solutions [14].

3. The sensitivity of rapid antigen tests is not high enough for them to be used alone for initial screening [15].

4. The delay in obtaining the test result leads to a delay in tracing the contacts between the affected person and healthy persons [16].

The objective of the proposed research work is to perform a comparative analysis of the effectiveness of various deep learning and machine learning approaches in detecting COVID-affected patients from chest X-ray images. A comprehensive outline of the various tests used to identify COVID-positive cases, together with their merits and demerits, is given in this introduction. The dataset used and the methodology followed are described in Section 1, followed by the experimental results in Section 2. The performance of the considered models is then discussed with the help of evaluation metrics in the Discussion section, and the conclusions derived from the research work are given in the Conclusions section.

1 MATERIALS AND METHODS

a. Dataset

The data used for this research work were taken from Kaggle (https://www.kaggle.com/prashant268/chest-xray-covid19-pneumonia). The dataset contains train and test directories with a total of 5144 and 1288 chest X-ray images, respectively. The train directory contains chest X-ray images of 1266 healthy persons, 3418 Pneumonia patients, and 460 COVID patients. The test directory contains chest X-ray images of 317 healthy persons, 855 Pneumonia patients, and 116 COVID patients. Sample chest X-ray images of a healthy person, a Pneumonia patient, and a COVID patient are shown in Fig. 1.
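A minimal sketch of loading such a directory-structured dataset with Keras is shown below; the paths, class sub-folder names, and use of image_dataset_from_directory are illustrative assumptions, not details stated in the paper.

```python
import tensorflow as tf

# Assumed directory layout: Data/train/<class_name>/*.jpg and Data/test/<class_name>/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "Data/train",                 # hypothetical path to the unpacked Kaggle data
    image_size=(224, 224),        # resize on load (see the pre-processing step below)
    label_mode="categorical",     # one-hot labels for the 3 classes
    batch_size=32)

test_ds = tf.keras.utils.image_dataset_from_directory(
    "Data/test",
    image_size=(224, 224),
    label_mode="categorical",
    batch_size=32)

print(train_ds.class_names)       # e.g. ['COVID19', 'NORMAL', 'PNEUMONIA'] (assumed folder names)
```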

Fig. 1. Sample images of healthy, Pneumonia, and COVID patient.

b. Methodology

A total of 6432 chest X-ray images were used for the research work. Jupyter notebooks hosted by Google Colab (Google Colaboratory offers a 12 GB NVIDIA Tesla K80 GPU that can be used continuously for up to 12 h for free) were used for the implementation of the deep learning and machine learning algorithms. The proposed research requires analysis of chest X-ray images, which can be done faster on a GPU than on a CPU. The flow diagram of the proposed work is shown in Fig. 2.

Fig. 2. Flow diagram of proposed work.

(i) Pre-processing. The images taken from the dataset are of different sizes and hence cannot be processed effectively by the subsequent algorithms. Therefore, all the images are resized to a dimension of 224 × 224. Moreover, all the RGB images read are normalized by dividing them by 255, so that all the image intensity values are brought to a common range of [0, 1].
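As a minimal sketch of this pre-processing step (the use of OpenCV and the file path are illustrative assumptions):

```python
import cv2

def preprocess(path):
    """Read an image, resize it to 224 x 224, and scale intensities to [0, 1]."""
    img = cv2.imread(path)                      # image read as uint8 in BGR order
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # convert to RGB channel order
    img = cv2.resize(img, (224, 224))           # common input size for all models here
    return img.astype("float32") / 255.0        # normalize to [0, 1]

# x = preprocess("Data/train/COVID19/img_001.jpg")   # hypothetical file name
```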

(ii) Train, validation, and test split. In order to validate the constructed model, the training dataset was split, with 80% used for constructing the model and the remaining 20% for validation. The test dataset was not shown to the model during the training process and was used to verify the actual performance of the constructed model. The 80% training split was further divided into 5 subsets in order to perform fivefold cross-validation.
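A sketch of such a split using scikit-learn (the array names and placeholder data are assumptions):

```python
import numpy as np
from sklearn.model_selection import KFold

# X: array of pre-processed training images, y: integer class labels (assumed)
X = np.random.rand(100, 224, 224, 3).astype("float32")   # placeholder data for illustration
y = np.random.randint(0, 3, size=100)                     # 3 classes: healthy, Pneumonia, COVID

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kfold.split(X), start=1):
    X_train, X_val = X[train_idx], X[val_idx]             # 80% train, 20% validation per fold
    y_train, y_val = y[train_idx], y[val_idx]
    # ...build and train the model on (X_train, y_train), validate on (X_val, y_val)
```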

(iii) Construction of the Convolutional Neural Network (CNN) model. The CNN model was constructed as a sequence of the following layers:

• input layer,

• convolutional layer,

• pooling layer,

• dense/fully connected layer.

Input layer. The image given to the input layer is of size 224 × 224 × 3, where 224 × 224 represents the X and Y dimensions and ‘3’ represents the RGB channels.

Convolutional layer. The CNN performs feature extraction from the input with the help of a kernel/filter. The filter is made to stride over the entire image, and a dot product is computed between the portion of the image covered by the filter and the filter itself. The multiplied values are summed up to produce a single output value at each stride. When a single filter is used, this process yields a 2D output called a feature map. The Rectified Linear Unit (ReLU) activation function was used to non-linearly transform the input received at a node. The feature map values are calculated according to the following formula, where the input image is denoted by f and the filter by h; the row and column indices of the result matrix are denoted by m and n, respectively:

$$G[m,n] = (f * h)[m,n] = \sum\limits_j \sum\limits_k h[j,k]\, f[m - j,\, n - k].$$
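To make the formula concrete, the following NumPy sketch implements a plain 2D convolution loop (written for illustration only; the function name, the 'valid' output size, and the example values are assumptions):

```python
import numpy as np

def conv2d_valid(f, h):
    """2D convolution of image f with filter h (no padding, stride 1).

    The kernel is flipped so that each output value is a sum of
    h[j, k] * f[. - j, . - k] terms, matching the convolution formula above
    (up to the usual index offset of a 'valid' convolution).
    """
    h_flipped = h[::-1, ::-1]
    H, W = f.shape
    kH, kW = h.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for m in range(out.shape[0]):
        for n in range(out.shape[1]):
            out[m, n] = np.sum(f[m:m + kH, n:n + kW] * h_flipped)
    return out

# Tiny example: a 3 x 3 vertical-edge-like filter applied to a 5 x 5 image
f = np.arange(25, dtype=float).reshape(5, 5)
h = np.array([[1.0, 0.0, -1.0]] * 3)
print(conv2d_valid(f, h))   # 3 x 3 feature map
```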

Pooling layer. This layer is added after the convolutional layer in order to reduce the variance that may appear in the feature map when the input is slightly shifted or rotated. Max pooling was used in the research work, in which the maximum value covered by the pooling filter in the feature map is retained. The pooling layer also helps to minimize the number of parameters and computations required. A dropout layer was added after pooling to avoid overfitting during training.

Dense/fully connected layer. The output of the pooling layer was flattened to a 1D vector and given as input to the fully connected layer. This layer updates its weights in such a way that it is able to predict the probability of each class to which the input may belong. Softmax activation was used at the final dense layer, since the probability of each class needs to be found. The Softmax activation function is given by

$$\mathrm{Softmax}(x_i) = \frac{\exp(x_i)}{\sum\limits_j \exp(x_j)},$$

where \(x_i\) is the ith element of the input vector x, and the sum in the denominator runs over all elements \(x_j\) of x. A summary of the CNN model used in the research work is given in Table 1.

Table 1. Summary of CNN model

The CNN model was compiled with the categorical cross-entropy loss using the Adam optimizer. The Adam version of stochastic gradient descent was used to fit the deep learning model, since its adaptive learning rates help the model converge fast. The model was trained with a batch size of 128. The trained model was validated with the data allotted for validation. Finally, the constructed model was evaluated on the test dataset.
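A minimal Keras sketch consistent with this description is given below; only the layer sequence, loss, optimizer, batch size, and number of epochs are taken from the text, while the filter counts and layer sizes are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Simple CNN: input -> convolution -> pooling -> dropout -> dense layers (softmax over 3 classes)
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),   # filter count/size are assumptions
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(3, activation="softmax"),          # healthy, Pneumonia, COVID
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# history = model.fit(X_train, y_train_onehot, batch_size=128, epochs=50,
#                     validation_data=(X_val, y_val_onehot))
```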

(iv) Transfer learning model using VGG16. Transfer learning is a technique in which a deep learning model trained on a different dataset is used to solve the classification task of a new problem. VGG16 is a deep learning model that was trained with more than a million images to categorize 1000 objects. The proposed research work made use of the pre-trained VGG16 deep learning model and its corresponding weights for the classification of healthy, Pneumonia, and COVID patients. The advantage of this approach is that it reduces the time required to construct a new model from scratch, since the weights are readily available in the pre-trained model. In the proposed research work, the last fully connected layer of VGG16, which contains 1000 nodes to categorize 1000 objects, was replaced with 3 nodes in order to classify the input X-ray image as a healthy, Pneumonia, or COVID patient. The model summary of VGG16 is shown in Table 2.

Table 2. Summary of VGG16 model
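A minimal Keras sketch of this transfer-learning setup (freezing the pre-trained convolutional weights and the size of the added dense layer are assumptions not stated above):

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Load VGG16 pre-trained on ImageNet, without its original 1000-way classifier
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                      # keep pre-trained weights fixed (assumption)

vgg_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),   # size of this head is an assumption
    layers.Dense(3, activation="softmax"),  # healthy, Pneumonia, COVID
])

vgg_model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
```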

(v) Machine learning techniques with CNN as feature extractor. A convolutional neural network was used to extract features from the input image. The extracted features were used to classify the input image into three classes (healthy, Pneumonia, and COVID) using the machine learning techniques described below; a code sketch of the full pipeline follows their descriptions.

Support Vector Machine (SVM). SVM is a simple and powerful machine learning technique whose objective is to find a hyperplane that best separates the different classes. The separating hyperplane should be positioned so that it maximizes the margin between the different classes. If the classes are not linearly separable, SVM transforms the lower-dimensional data to a higher dimension using a non-linear kernel function, such as the radial basis function, polynomial, or sigmoid kernel, and then performs the classification.

Gradient Boosted Decision Tree (XGBoost). This ensemble learning algorithm works by adding new models to the existing ensemble so that the residual errors made by the previous models are corrected. Decision trees are considered weak learners because they work in a greedy manner to find the best feature split with the help of the Gini or entropy measures. A new decision tree is added sequentially to correct the classification errors made previously by the existing trees.
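A minimal sketch of this pipeline, reusing the `model` and data splits from the sketches above (the layer chosen for feature extraction and all classifier hyperparameters are assumptions):

```python
from tensorflow.keras import models
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Truncate the CNN at its flatten layer and use it as a fixed feature extractor
feature_extractor = models.Model(inputs=model.input,
                                 outputs=model.get_layer(index=-3).output)  # flatten layer (assumed index)

train_feats = feature_extractor.predict(X_train)   # shape: (n_samples, n_features)
val_feats = feature_extractor.predict(X_val)

# SVM with an RBF kernel on the CNN features
svm = SVC(kernel="rbf")
svm.fit(train_feats, y_train)                      # y_train: integer labels 0/1/2
print("SVM accuracy:", svm.score(val_feats, y_val))

# Gradient boosted decision trees (XGBoost) on the same features
xgb = XGBClassifier(n_estimators=100, eval_metric="mlogloss")
xgb.fit(train_feats, y_train)
print("XGBoost accuracy:", xgb.score(val_feats, y_val))
```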

(vi) Evaluation of the constructed models. The models constructed using the deep learning and machine learning algorithms were evaluated based on the following metrics, using the test dataset that was not seen by the models before. The evaluation metrics computed with the help of the confusion matrix are:

$${\text{Accuracy}},\quad Acc = \frac{{True~Positive + True~Negative}}{{True~Positive + True~Negative + False~Positive + False~Negative}};$$
$${\text{Precision}},\quad P = \frac{{True~Positive}}{{True~Positive + False~Positive~}};$$
$${\text{Recall}},\quad R = \frac{{True~Positive}}{{True~Positive + False~Negative~}};$$
$${\text{F1-Score}},\quad F1 = 2\frac{{P \times R}}{{P + R}}.$$

Fivefold cross-validation was performed in order to estimate the actual performance of the models.
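These metrics can be computed from the test-set predictions, for example with scikit-learn (a sketch; `y_test`, `y_pred`, and the macro-averaging of the per-class scores are assumptions):

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

# y_test: true labels of the test set, y_pred: labels predicted by a model (assumed)
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="macro"))  # macro-average over the 3 classes
print("Recall   :", recall_score(y_test, y_pred, average="macro"))
print("F1-score :", f1_score(y_test, y_pred, average="macro"))
print(confusion_matrix(y_test, y_pred))
```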

2 EXPERIMENTAL RESULTS

Experiments were carried out to classify the input images into three categories: healthy, Pneumonia, and COVID. The images obtained from the dataset were pre-processed, as shown in Fig. 3. The training dataset was divided into 5 folds such that, in each fold, 20% of the training data was used for validation and the remaining 80% for training the model. This step was repeated five times with different sets of image samples for training and validation.

Fig. 3. Pre-processed chest X-ray images.

The validation and training accuracy obtained for the Convolutional Neural Network (CNN) model over 50 epochs is shown in Fig. 4. The validation and training loss for the CNN model is shown in Fig. 5. The results show that as the number of epochs increases, the training accuracy also increases. From Figs. 4 and 5 it can also be seen that there may be a risk of overfitting, since the training loss continues to decrease and the training accuracy continues to increase, while there is no significant improvement in the validation accuracy after 15 epochs.
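Curves such as those in Figs. 4 and 5 can be drawn from the Keras training history (a sketch assuming a `history` object returned by `model.fit`, as in the commented call of the CNN sketch above):

```python
import matplotlib.pyplot as plt

# history = model.fit(...) as in the CNN sketch above
plt.figure(figsize=(10, 4))

plt.subplot(1, 2, 1)
plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("epoch"); plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch"); plt.legend()

plt.tight_layout()
plt.show()
```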

Fig. 4. Validation and training accuracy for CNN model.

Fig. 5. Validation and training loss for CNN model.

The validation and training accuracy obtained for the transfer learning model (VGG Net) for classification of healthy, Pneumonia, and COVID cases is shown in Fig. 6. The validation and training loss for the VGG Net model is shown in Fig. 7. A comparative graph of the validation and training loss and accuracy of the CNN and VGG Net models is shown in Fig. 8.

Fig. 6. Validation and training accuracy for transfer learning model.

Fig. 7. Validation and training loss for transfer learning model.

Fig. 8. Comparison of validation and training loss and accuracy of CNN and VGG model.

The normalized confusion matrices for fold 3 of the CNN model, the transfer learning model, the SVM model with CNN feature extractor, and the XGBoost model with CNN feature extractor are shown in Figs. 9–12, respectively. The confusion matrices of the above-mentioned models without normalization are shown in Figs. 13–16.
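A sketch of how such confusion matrices can be produced with scikit-learn (the display helper, label order, and normalization option are assumptions about tooling, not the authors' exact procedure):

```python
from sklearn.metrics import ConfusionMatrixDisplay
import matplotlib.pyplot as plt

class_names = ["COVID", "Healthy", "Pneumonia"]   # assumed label order

# Normalized confusion matrix (rows sum to 1)
ConfusionMatrixDisplay.from_predictions(
    y_test, y_pred, display_labels=class_names, normalize="true")
plt.show()

# Raw counts
ConfusionMatrixDisplay.from_predictions(
    y_test, y_pred, display_labels=class_names)
plt.show()
```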

Fig. 9. Normalized confusion matrix of CNN model.

Fig. 10. Normalized confusion matrix of transfer learning model.

Fig. 11. Normalized confusion matrix of CNN + SVM model.

Fig. 12. Normalized confusion matrix of CNN + XGBoost model.

Fig. 13. Confusion matrix of CNN model.

Fig. 14. Confusion matrix of transfer learning model.

Fig. 15. Confusion matrix of CNN + SVM model.

Fig. 16. Confusion matrix of CNN + XGBoost model.

The K-fold cross-validation results (K = 5) for classification of an input chest X-ray image into one of the categories healthy, Pneumonia, or COVID using the CNN model are shown in Table 3. Similarly, the cross-validation results for the transfer learning model, the CNN + SVM model, and the CNN + XGBoost model are shown in Tables 4–6, respectively. The average precision, recall, F1-score, and accuracy of the four constructed models are also shown in the aforementioned tables.

Table 3. Precision, recall, F1-score, and accuracy of CNN model
Table 4. Precision, recall, F1-score, and accuracy of transfer learning model
Table 5. Precision, recall, F1-score, and accuracy of CNN + SVM model
Table 6. Precision, recall, F1-score, and accuracy of CNN + XGBoost model

DISCUSSION

Experimental results show that the accuracy of correctly predicting an input image as healthy, Pneumonia, or COVID is similar for the constructed models, i.e., around 95%, except for the transfer learning model VGG Net (92.51%). The reason may be that the VGG Net model was originally developed to categorize 1000 different objects, and hence its weights might not be tuned to classify chest X-ray images. To improve the performance of the transfer learning model (VGG Net), the weights could be made trainable while retaining only the network structure. In identifying whether a person is affected by COVID or not, recall is the most important evaluation measure, because recall indicates the percentage of actual positive cases identified by the model. High precision indicates a low false positive rate, and high recall indicates a low false negative rate. The XGBoost model, which uses features extracted by the CNN model, gives the best recall compared to the other models. The SVM model with features extracted by the CNN model gives the best precision, F1-score, and accuracy. In comparison with the DarkNet model developed by [7], which achieved 87.02% for multi-class classification of healthy, Pneumonia, and COVID cases, the experimental results show that machine learning models with a deep learning algorithm for feature extraction can perform better. A comparison of the performance metrics of the CNN, VGG Net, CNN + SVM, and CNN + XGBoost models is shown in Table 7.

Table 7. Comparison of performance metrics of CNN, VGGnet, CNN + SVM, and CNN + XGBoost models

CONCLUSIONS

A comparative analysis of the precision, recall, F1-score, and accuracy of four different models was performed to classify the input chest X-ray image as healthy, Pneumonia, or COVID: a convolutional neural network (CNN) model, a transfer learning model (VGG Net), a Support Vector Machine with features extracted using the CNN, and XGBoost with CNN features. The results show that machine learning models with features automatically extracted by a deep learning model such as a CNN perform better, with CNN + SVM giving an average precision, recall, F1-score, and accuracy of 95.27, 94.52, 94.94, and 95.81%, respectively. Hence, this method of identifying COVID patients from chest X-ray images can be used for initial screening. The performance of the constructed models can be further improved by increasing the size of the dataset, performing data augmentation, and incorporating methods to handle data imbalance.