1 Introduction

Recent years have seen rapid advances in information technology (IT) and digital electronics, together with exponential growth in 5G-enabled Internet of Things (IoT) systems. These technologies can be employed for healthcare diagnosis in several ways. Smart healthcare for disease diagnosis and prevention focuses on advancing healthcare technology to improve human health at every level through smart technologies. Owing to these improvements in information technology, the idea of smart healthcare has progressively come to the fore. Smart healthcare uses the latest generation of information technologies, such as the Internet of Things (IoT), big data, cloud computing, and artificial intelligence, to transform the classical medical system comprehensively, making healthcare more efficient, more convenient, and more personalized. Within this concept of smart healthcare, the diagnosis of COVID-19 becomes essential.

COVID-19 is a severe disease caused by the SARS-CoV-2 virus, first identified in Wuhan, China. Initially, COVID-19 infects the human respiratory tract and causes fever, cough, shortness of breath, and other serious complications such as pneumonia [1]. Pneumonia is a disease characterized by lung inflammation. The SARS-CoV-2 virus, bacteria, fungi, and other pathogens can cause severe infections. Conditions such as weakened immunity, asthma, pre-existing infections, and advanced age increase the severity of pneumonia. Treatment of pneumonia depends on the infected organ; effective remedies include antibiotics, cough medications, antipyretics, and pain killers [2]. Based on the symptoms, a patient is admitted to hospital and, in critical cases, to the Intensive Care Unit (ICU). The COVID-19 outbreak is considered severe because of its high transmissibility and contagiousness. Moreover, the pandemic strongly affects patient outcomes, with many patients hospitalized in ICUs, long treatment times, and limited hospital resources. Therefore, early disease diagnosis is essential to save patients' lives and prevent mortality.

Computed Tomography (CT) and X-ray images are used in early disease diagnosis and to guide COVID-19 treatment accordingly [3]. X-ray imaging is inexpensive and robust, exposes patients to only moderate radiation, and is therefore often preferred over CT imaging [4]. However, manual analysis of pneumonia on X-rays is difficult: the white spots on X-ray images must be monitored and interpreted carefully by an expert, and such spots can be confused with tuberculosis or bronchitis, leading to an inaccurate diagnosis. Careful manual reading of X-ray images can provide a fairly precise diagnosis, and maximum performance is attained with manual interpretation of CT images, but both are time-consuming and expert-dependent. Artificial intelligence (AI) based solutions therefore offer an inexpensive and accurate diagnosis for COVID-19 and many other diseases [22, 23]. In addition, deep learning (DL) and AI methodologies are widely applied in biomedical applications. Diagnosing cardiac arrhythmia, brain damage, lung segmentation, breast cancer, skin cancer, epilepsy, and pneumonia using DL approaches has enhanced the popularity of such methods in the biomedical field [24,25,26,27].

In the absence of a smart diagnostic model, it is necessary to develop a robust and accurate prediction technique, since few contemporary approaches can predict the epidemic disease with high accuracy at reasonable cost. At present, three kinds of COVID-19 testing methods are in use: Reverse Transcription Polymerase Chain Reaction (RT-PCR), CT scan, and Chest X-Ray (CXR). DL has evolved from Machine Learning (ML) and performs automated training, learning subtle features from the data [5]. In clinical imaging, most DL models are built on Convolutional Neural Networks (CNNs). This is because deep methods such as the Stacked Auto-Encoder (SAE), Deep Belief Network (DBN), and Deep Boltzmann Machine (DBM) take vectors as input [6], and in clinical imaging such vectorization destroys the structural and configuration information carried by neighbouring pixels and voxels, which is significant. A CNN, in contrast, takes 2D or 3D images as input and exploits spatial and configurational details. Lung infection is a major symptom of COVID-19 and pneumonia; it can be identified while analyzing CXR and CT-scan images of the lungs and offers insight when examining COVID-19. Hence, DL models are applied to extract features from diverse radiology images and predict the presence of the disease. The newly presented approach applies deep transfer learning to CXR and CT-scan images of the lungs, using reports of confirmed COVID-19 patients.

This paper presents a new IoT-enabled Depthwise Separable Convolutional Neural Network (DWS-CNN) with a Deep Support Vector Machine (DSVM) for COVID-19 diagnosis and classification. Initially, patient data are collected in the data acquisition stage using IoT devices and sent to the cloud server via 5G networks. The newly developed DWS-CNN approach aims to predict both binary and multiple classes of COVID-19 by embedding a sequence of processes: Gaussian filtering (GF) based pre-processing, feature extraction, and classification. In the first stage, the GF approach is used to eliminate the noise present in an image. Next, the DWS-CNN technique replaces normal convolution to perform automated feature extraction. Finally, the DSVM framework is employed to determine the binary and multiple class labels of COVID-19. The DSVM model employs SVMs to learn higher-level feature representations from the input vectors, after which these features are given to the main SVM to perform the actual prediction. Besides, the effective regularization power of the main SVM mitigates overfitting. The diagnostic performance of the DWS-CNN approach is assessed on a CXR image dataset and the outcomes are measured with diverse performance metrics.

2 Related works

Many researchers have applied AI models and established new methods for diagnosing COVID-19. These works aim to develop automatic systems for predicting COVID-19, which can be employed to support clinical staff in the current epidemic scenario. Moreover, ML approaches are utilized to reduce the stress factors that affect medical experts during a pandemic and thereby improve workflow in clinical facilities. Ozturk et al. [7] present an automatic prediction model for COVID-19 using Deep Neural Networks (DNN) and CXR images. The newly developed framework relies on the DarkNet architecture for its convolutional layers. The study intends for radiologists to validate the scanning process: the heatmaps generated by the automatic model were evaluated by radiologists. The dataset employed is composed of a large number of images. The developers employed fivefold cross-validation to verify the performance of the newly developed approach, and high accuracy was reported for both binary and multiclass classification.

A DL scheme to enhance the accuracy of binary classification of COVID-19 is presented in [8]. The projected CNN was built on the VGG-19 classification network. The dataset employed comprises X-ray scans, and performance is verified using random sampling with an 80:20:20 split for training, validation, and testing. The outcomes demonstrate good accuracy, although the researchers note the restricted number of samples for COVID-19 cases. Apostolopoulos and Mpesiana [9] propose a transfer learning model applying VGG-19 and MobileNet v2 for automatic identification of patients with pneumonia and COVID-19. Two different datasets are used: the first contains numerous samples of normal, pneumonia, and COVID-19 images, and the second contains an alternative set of COVID-19, pneumonia, and normal patients. Tenfold cross-validation was utilized to evaluate the presented approaches. VGG-19 and MobileNet v2 yielded good accuracy for binary classification and maximum accuracy for multi-class classification on the first dataset. Moreover, MobileNet v2 achieved optimal accuracy on the second dataset for both binary and multiclass classification. The developers state that an in-depth analysis was performed using patient details.

In [10], a robust screening model for COVID-19 based on a DL neural network is proposed. The projected framework relies on nCOVnet and applies CXR images. The dataset used in this approach has many samples, including images from COVID-19 positive patients and images of healthy patients. The efficiency of the presented approach is determined by random sampling into training and testing datasets, and the proposed method achieves remarkable accuracy for binary classification. A new Artificial Neural Network (ANN) method for COVID-19 prediction is presented in [11]. The newly developed scheme depends on a Convolutional CapsNet and applies CXR images; the system supports both binary and multi-class classification. Moreover, the dataset utilized is composed of many samples of normal, pneumonia, and COVID-19 images. Tenfold cross-validation was employed to compute the performance of the presented technique, and the simulation outcome displays good accuracy for binary and multi-class classification. The constraints reported by the developers concern the hardware resources required to process a huge number of images and the computational time.

Nour et al. [12] developed a new clinical diagnostic approach for COVID-19 suited to medical practice. The method relies on deep features and Bayesian optimization. A CNN framework is used for automatic feature extraction, and the extracted features are classified by various ML approaches such as k-nearest neighbors (kNN), support vector machine (SVM), and decision tree (DT). The employed dataset has numerous instances, with images of COVID-19, normal, and pneumonia cases. The developers utilized data augmentation to enhance the number of instances in the COVID-19 class. The performance of the developed framework is determined on training and testing datasets, and the newly projected CNN yields maximum accuracy.

In [13], researchers utilized data augmentation based on stationary wavelets to enlarge the training dataset and compared diverse transfer learning CNN architectures. The dataset contains many instances of COVID-19 and normal images, and augmentation methods were used to maximize the number of instances in both classes. Samples are drawn for training and to validate the performance, and the presented models attain the best accuracy in testing the binary classifier when applying the ResNet-18 technique. Konar et al. [14] present a semi-supervised shallow neural network for the automatic diagnosis of COVID-19. This work uses two datasets: the first comprises many samples from COVID-19 positive and negative patients, and the second is composed of samples of COVID-19 patients. The presented framework is evaluated using random sampling with the larger share for training, and the proposed technique is also assessed under 5- and tenfold validation, obtaining the maximum accuracy. In summary, numerous approaches have been presented for the automated prediction of COVID-19. These works apply various images and datasets from several resources, and distinct evaluation protocols such as cross-validation and random sampling were utilized to estimate performance.

3 The proposed DWS-CNN model

Figure 1 illustrates the working process involved in the DWS-CNN model. As shown in the figure, the model first undergoes data collection by IoT devices, and the data are sent to the cloud via 5G networks. The DWS-CNN model then performs the classification process in two stages, namely the training stage and the testing stage. Both of these phases undergo three major processes, namely preprocessing, feature extraction, and classification, which are discussed in the subsequent sections.

Fig. 1

Overall process of proposed DWS-CNN model

3.1 Preprocessing

Initially, the CXR image is provided as input to the GF technique to pre-process the image. GF is a linear smoothing filter whose weights are selected according to the shape of the Gaussian function. In the spatial domain, the Gaussian smoothing filter acts as an efficient low-pass filter, particularly for eliminating noise that follows a normal distribution, and it is therefore widely used in image processing [15]. The one-dimensional Gaussian function with zero mean is written as follows:

$$g\left( x \right) = e^{{ - \frac{{x^{2} }}{{2\sigma^{2} }}}}$$
(1)

Here, the Gaussian distribution parameter \(\sigma\) defines the width of the Gaussian function. In image processing, one frequently applies the 2-D separable Gaussian function of zero mean as a smoothing filter; the equivalent function is written as:

$$g\left[ {i,j} \right] = e^{{ - \frac{{i^{2} + j^{2} }}{{2\sigma^{2} }}}}$$
(2)

Additionally, the Gaussian function has several essential properties, including the following:

  • The 2-D Gaussian function is rotationally symmetric. This rotational symmetry implies that the GF does not favour any particular direction in subsequent image processing.

  • The Gaussian function is a single-valued function. The GF replaces the current pixel with a weighted mean of its neighborhood, and the weight decreases monotonically with distance from the center point. Thus, pixels far from the center have little effect on the result, and the image is not distorted.

  • The Fourier transform of a Gaussian is again a Gaussian, with a single lobe in the spectrum. With this property, the filter removes high-frequency noise from images while retaining most of the useful signal.

The width of the GF is characterized by the parameter \(\sigma\): the larger \(\sigma\) is, the stronger the smoothing and the wider the frequency band of the GF. By tuning this smoothness parameter, a compromise is obtained between over-smoothing and insufficient smoothing.

Because the Gaussian function is separable, a larger GF can be realized efficiently: the 2-D Gaussian convolution is split into two 1-D passes, so the computational cost of the 2-D GF grows only linearly as the width of the filter template increases.
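As a concrete, hedged illustration of this preprocessing step, the sketch below applies a Gaussian smoothing filter to a CXR image with OpenCV (one of the tools listed in Sect. 4). The file name, the choice to derive the window size from \(\sigma\), and the \(\sigma\) value itself are illustrative assumptions, since the paper does not specify them.

```python
# Minimal sketch of the GF-based preprocessing (Sect. 3.1), assuming OpenCV.
import cv2

def preprocess_cxr(path, sigma=1.5):
    """Read a CXR image in grayscale and suppress noise with a Gaussian
    smoothing (low-pass) filter. 'path' and 'sigma' are illustrative."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # A larger sigma widens the filter and increases smoothing;
    # ksize=(0, 0) lets OpenCV derive the window size from sigma.
    return cv2.GaussianBlur(img, ksize=(0, 0), sigmaX=sigma, sigmaY=sigma)

smoothed = preprocess_cxr("sample_cxr.png")  # hypothetical file name
```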

3.2 Feature extraction

The CNN is a commonly employed DL model used in hierarchical classification tasks such as image classification. CNNs were originally designed for images and computer vision, inspired by the visual cortex. An image tensor is convolved with a group of \(d\times d\) kernels, and the resulting convolutions ("feature maps") are stacked to represent the different features identified by the filters in that layer. The process producing a single output matrix is characterized as follows:

$$A_{j} = f\left( {\mathop \sum \limits_{i = 1}^{N} I_{i} *K_{i,j} + B_{j} } \right)$$
(3)

Each input matrix \(I_{i}\) is convolved with its corresponding kernel matrix \({K}_{i,j}\), with bias \({B}_{j}\), and finally an activation function is applied element-wise. The biases and weights are adjusted during the back-propagation (BP) phase of CNN training so that the filters become capable feature detectors [16]. Each feature map's filters are applied across all input channels (for instance, the three colour channels). Figure 2 shows the structure of the CNN model.

Fig. 2

Convolution neural networks

To reduce computational complexity, a CNN uses pooling layers that shrink the size of each layer's output relative to its input as one moves from one layer to the next in the network. Typical pooling operations decrease the output size while safeguarding the important features. One of the best-known pooling methods is \(\max\)-pooling, in which the largest activation within the pooling window is selected.

The CNN performs a discriminative function trained with a BP technique, using sigmoid (Eq. (4)) or Rectified Linear Unit (ReLU) (Eq. (5)) activation functions. The last layer has one node with a sigmoid activation function for binary classification, or a softmax activation function for multi-class problems (as illustrated in Eq. (6)).

$$f\left( x \right) = \frac{1}{{1 + e^{ - x} }} \in \left( {0,1} \right)$$
(4)
$$f\left( x \right) = {\text{ max }}\left( {0,x} \right)$$
(5)
$$\begin{gathered} \sigma (z)_{j} = \frac{{e^{{z_{j} }} }}{{\Sigma_{k = 1}^{K} e^{{z_{k} }} }} \hfill \\ \forall j \in \left\{ {1, \ldots ,K} \right\} \hfill \\ \end{gathered}$$
(6)
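For concreteness, the three activation functions of Eqs. (4)-(6) can be sketched in a few lines of NumPy; the example input values below are arbitrary and not taken from the paper.

```python
# Hedged illustration of Eqs. (4)-(6) with NumPy; inputs are arbitrary examples.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))    # Eq. (4), output in (0, 1)

def relu(x):
    return np.maximum(0.0, x)          # Eq. (5)

def softmax(z):
    e = np.exp(z - np.max(z))          # subtract max for numerical stability
    return e / e.sum()                 # Eq. (6), sums to 1 over the K classes

logits = np.array([2.0, 1.0, 0.1])
print(sigmoid(logits), relu(logits - 1.0), softmax(logits))
```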

The DWS convolutions are employed for image classification. This is a type of factorized convolution that splits a standard convolution into a depth-wise and a point-wise convolution. As illustrated in Fig. 3, the depth-wise convolution carries out lightweight filtering by applying a single filter per input channel [17]. The point-wise convolution then applies a \(1\times 1\) convolution to linearly combine the input channels. A DWS convolution thus replaces the standard convolution with a factorized two-layer convolution: one layer for spatial filtering and one layer for combining channels. As a result, depth-wise separable convolutions drastically reduce unnecessary computation and model size.

Fig. 3

Structure of DWS-CNN
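A minimal sketch of one depthwise separable block is shown below, assuming a Keras/TensorFlow implementation (both are listed among the paper's tools). The exact layer configuration of the DWS-CNN is not published, so the channel counts, input size, and number of blocks are illustrative only.

```python
# Hedged sketch of a depthwise separable convolution block in Keras.
import tensorflow as tf
from tensorflow.keras import layers, models

def dws_block(x, out_channels):
    # Depth-wise step: one 3x3 filter per input channel, as in Eq. (7).
    x = layers.DepthwiseConv2D(kernel_size=3, padding="same", use_bias=False)(x)
    x = layers.ReLU()(x)
    # Point-wise step: 1x1 convolution linearly combining channels, as in Eq. (8).
    x = layers.Conv2D(out_channels, kernel_size=1, padding="same", use_bias=False)(x)
    return layers.ReLU()(x)

inputs = tf.keras.Input(shape=(224, 224, 1))        # grayscale CXR, size assumed
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = dws_block(x, 64)
x = layers.MaxPooling2D()(x)
x = dws_block(x, 128)
features = layers.GlobalAveragePooling2D()(x)        # feature vectors for the DSVM
extractor = models.Model(inputs, features)
extractor.summary()
```

Keras also provides `layers.SeparableConv2D`, which fuses the depth-wise and point-wise steps into a single layer; the explicit two-layer form above is used only to mirror Eqs. (7)-(8).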

The standard convolution layer takes an \(h\times w\times {c}_{in}\) input feature map \(I\) and applies a \(k\times k\times {c}_{in}\times {c}_{out}\) convolution kernel \(K\) to produce an \(h\times w\times {c}_{out}\) output feature map \(O\), where \(h\) and \(w\) are the height and width of the input feature maps, \(k\) is the spatial dimension of the kernels (assumed square), \({c}_{in}\) is the number of input channels, and \({c}_{out}\) is the number of output channels. The output feature map is assumed to have the same spatial size as the input feature map through zero-padding. A standard convolution has a computational complexity (the number of Multiply-and-Accumulate operations, MACs) of \({k}^{2}\cdot {c}_{in}\cdot {c}_{out}\cdot h\cdot w\).

The difference between standard and depth-wise separable convolutions is shown in Fig. 4 [18], and the layered details are given in Fig. 5 [19]. The depth-wise separable convolution has two parts: a depth-wise and a point-wise convolution. The depth-wise convolution applies a single filter to each channel of the input feature maps and is expressed as Eq. (7).

$$G\left( {y, x,j} \right) = \mathop \sum \limits_{u = 1}^{k} \mathop \sum \limits_{v = 1}^{k} K\left( {u, v,j} \right) \times I\left( {y + u - 1, x + v - 1,j} \right)$$
(7)
Fig. 4

a Standard CNN b Depthwise Separable CNN

Fig. 5

Layers a Standard CNN b Depthwise Separable CNN

in which \(K\) denotes the depth-wise convolution kernel of size \(k\times k\times {c}_{in}\). The \(j\)th filter in \(K\) is applied to the \(j\)th channel of \(I\) to produce the \(j\)th channel of the filtered output feature map \(G\).

A point-wise convolution computes a linear combination of the outputs of the depth-wise convolution using a \(1\times 1\) convolution to construct new features. The point-wise convolution is expressed as Eq. (8).

$$O\left( {y, x, l} \right) = \mathop \sum \limits_{j = 1}^{{c_{in} }} G\left( {y, x,j} \right) \times P\left( {j, l} \right)$$
(8)

in which the size of the \(1 \times 1\) convolution kernel \(P\) is \(1\times 1\times {c}_{in}\times {c}_{out}\). By altering \({c}_{out}\), the number of channels in the output feature map can be changed. Compared with \(k\times k\) \((k>1)\) convolution operations, the dense \(1\times 1\) convolution has no locality restriction, so it does not need to rearrange parameters in memory and can be performed directly with highly optimized general matrix multiplication routines. The computational complexity of a DWS convolution is expressed as Eq. (9):

$$C_{ds} = k^{2} \cdot c_{in} \cdot h \cdot w + c_{in} \cdot c_{out} \cdot h \cdot w$$
(9)

This is the total computational cost of the depth-wise convolution and the \(1\times 1\) point-wise convolution. Compared with the standard convolution, a DWS convolution reduces the computational complexity by a factor \(\eta\), written as Eq. (10):

$$\eta = \frac{{C_{ds} }}{{C_{std} }} = \frac{{k^{2} c_{in} hw + c_{in} c_{out} hw}}{{k^{2} c_{in} c_{out} hw}} = \frac{1}{{c_{out} }} + \frac{1}{{k^{2} }}$$
(10)

Generally, the value of \({c}_{out}\) is comparatively large, so the factor \(\eta\) is approximately equal to \(1/{k}^{2}\). This paper utilizes \(3\times 3\) DWS convolutions, so the computational complexity and the number of parameters of the equivalent convolutional layers are \(7\sim 8\) times smaller than those of standard convolutions.
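A short worked example makes the saving of Eqs. (9)-(10) concrete; the channel and feature-map sizes below are arbitrary choices, not values taken from the paper.

```python
# Worked check of Eqs. (9)-(10): MAC counts of a standard vs. a depthwise
# separable 3x3 convolution. The layer sizes are illustrative.
k, c_in, c_out, h, w = 3, 64, 128, 56, 56

std_macs = k**2 * c_in * c_out * h * w                  # standard convolution
dws_macs = k**2 * c_in * h * w + c_in * c_out * h * w   # Eq. (9)

eta = dws_macs / std_macs                               # Eq. (10)
print(eta, 1 / c_out + 1 / k**2)  # both ~0.119, i.e. roughly 8x fewer MACs for k = 3
```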

3.3 Image classification

Once the DWS-CNN model has extracted a useful set of feature vectors, the DSVM model performs the classification process [20]. An SVM solves a binary problem using a linear hyperplane by considering a training set with \(n\) training samples, namely \(\left({x}_{1},{y}_{1}\right),\left({x}_{2}, {y}_{2}\right), \dots ,({x}_{n}, {y}_{n})\), where \({x}_{i}\in {\mathfrak{R}}^{N}\) is an \(N\)-dimensional vector belonging to one of the classes \({y}_{i}\in \{-1, +1\}\). The binary classification problem is separated using a linear decision function,

$$f\left( x \right) = w \cdot x + b$$
(11)

where \(w\in {\mathfrak{R}}^{N}\) is a vector defining the orientation of the separating hyperplane, and \(b\in \mathfrak{R}\) is known as the "bias". A good hyperplane must separate the two classes such that,

$$y_{i} \left( {w \cdot x_{i} + b} \right) \ge 1$$
(12)

The solution is obtained by solving the constrained optimization problem (or primal problem),

$$minimise\frac{1}{2}w \cdot w + C\mathop \sum \limits_{i = 1}^{n} \xi_{i}$$
(13)

subject to: \({y}_{i}(w\cdot {x}_{i}+b)\ge 1-{\xi }_{i},{\xi }_{i}\ge 0\), for \(\forall i=1,\dots ,n\), where \(C\), \(0<C<\infty\), is known as the penalty value or regularization parameter, and \({\xi }_{i}\) is a slack variable. In the nonlinear scenario, the optimization problem is expressed in its dual form as,

$$maximise\mathop \sum \limits_{i = 1}^{n} \alpha_{i} - \frac{1}{2}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \alpha_{i} \alpha_{j} y_{i} y_{j} K\left( {x_{i} , x_{j} } \right)$$
(14)

subject to: \({\sum }_{i=1}^{n}{\alpha }_{i}{y}_{i}=0\) and \(0\le {\alpha }_{i}\le C\), for \(i=1, \dots ,n\). The output decision function is,

$$f\left( x \right) = sign\left[ {\mathop \sum \limits_{i = 1}^{n} y_{i} \alpha_{i}^{0} K\left( {x_{i} , x} \right) + b^{0} } \right]$$
(15)

where \({\alpha }_{i}^{0}\) are the coefficients of the support vectors and \(K({x}_{i}, x)\) denotes the kernel function (kernel trick). The DSVM is formulated using a multi-layer structure that has several hidden layers.
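The kernel decision function of Eq. (15) corresponds, for instance, to an RBF-kernel `SVC` in scikit-learn (one of the libraries listed in Sect. 4). The sketch below uses synthetic data and placeholder hyperparameters, not the feature vectors or settings of the paper.

```python
# Hedged sketch of the kernel SVM decision function (Eq. (15)) via scikit-learn.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                 # stand-in for DWS-CNN feature vectors
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # synthetic binary labels {0, 1}

clf = SVC(C=1.0, kernel="rbf", gamma="scale")  # K(x_i, x) is the RBF kernel trick
clf.fit(X, y)
print(clf.decision_function(X[:3]))            # signed scores; their sign gives Eq. (15)
print(clf.predict(X[:3]))
```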

\({X}_{1}, {X}_{2},\dots , {X}_{n}\) denote the input layer data points. The hidden layers consist of \(SV{M}_{11}, SV{M}_{12}, \dots ,SV{M}_{1k}\), \(SV{M}_{21}, SV{M}_{22}, \dots ,SV{M}_{2k}\), and \(SV{M}_{n1}, SV{M}_{n2}, \dots ,SV{M}_{nk}\), while \({F}_{1}\left(X\right), {F}_{2}\left(X\right),\dots , {F}_{n}(X)\) indicate the output layer points. For \({X}_{1}\), the output of the trained \(SV{M}_{11}, SV{M}_{12}, \dots ,SV{M}_{1k}\) is \({F}_{1}(X)\); for \({X}_{2}\), the output of the trained \(SV{M}_{21}, SV{M}_{22}, \dots ,SV{M}_{2k}\) is \({F}_{2}(X)\); and for \({X}_{n}\), the output of the trained \(SV{M}_{n1}, SV{M}_{n2}, \dots ,SV{M}_{nk}\) is \({F}_{n}(X)\). The network weights are represented by the functions \(f(x)\). Every \(f(x)\) is estimated in the hidden layers, with multiple layers connecting every input neuron to the final neurons. The net input of the hidden layer neurons is written as,

$$\begin{array}{c}ne{t}_{h1}={f}_{1{1}_{1}}\left(x\right)\cdot {X}_{1}+{f}_{1{1}_{2}}\left(x\right)\cdot {X}_{2}+\cdots +{f}_{1{1}_{n}}\left(x\right)\cdot {X}_{n}+{b}_{1}\\ ne{t}_{h2}={f}_{1{2}_{1}}\left(x\right)\cdot {X}_{1}+{f}_{1{2}_{2}}\left(x\right)\cdot {X}_{2}+\cdots +{f}_{1{2}_{n}}\left(x\right)\cdot {X}_{n}+{b}_{1}\\ \begin{array}{c}\vdots \\ ne{t}_{hn}={f}_{1{n}_{1}}\left(x\right)\cdot {X}_{1}+{f}_{1{n}_{2}}\left(x\right)\cdot {X}_{2}+\cdots +{f}_{1{n}_{n}}\left(x\right)\cdot {X}_{n}+{b}_{1}\end{array}\end{array}$$
(16)

The logistic activation function is utilized to compute the output of each hidden neuron as,

$$\begin{array}{c}ou{t}_{h1}=\frac{1}{1+{e}^{-ne{t}_{h1}}}\\ ou{t}_{h2}=\frac{1}{1+{e}^{-ne{t}_{h2}}}\\ \begin{array}{c}\vdots \\ ou{t}_{hn}=\frac{1}{1+{e}^{-ne{t}_{hn}}}\end{array}\end{array}$$
(17)

The outputs of the hidden layer neurons are then used as inputs to compute the output layer neurons \(ne{t}_{o{1}_{1}}\dots ,ne{t}_{o{1}_{n}},ne{t}_{o{2}_{1}}\dots ,ne{t}_{o{2}_{n}},\) and \(ne{t}_{o{n}_{1}}\dots ,ne{t}_{o{n}_{n}}\) as,

$$\begin{array}{c}ne{t}_{o{1}_{1}}={f}_{2{1}_{1}}\left(x\right). ou{t}_{h1}+{f}_{2{1}_{2}}\left(x\right). ou{t}_{h2}+\cdots +{f}_{2{1}_{n}}\left(x\right). ou{t}_{hn}+{b}_{2}\\ \vdots \\ ne{t}_{o{1}_{n}}={f}_{k{1}_{1}}\left(x\right). ou{t}_{h1}+{f}_{k{1}_{2}}\left(x\right). ou{t}_{h2}+\cdots +{f}_{k{1}_{n}}\left(x\right). ou{t}_{hn}+{b}_{2}\end{array}$$
(18)
$$\begin{array}{c}ne{t}_{o{2}_{1}}={f}_{2{2}_{1}}(x) . ou{t}_{h1}+{f}_{2{2}_{2}}(x) . ou{t}_{h2}+\cdots +{f}_{2{2}_{n}}(x)\cdot ou{t}_{hn}+{b}_{2}\\ \vdots \\ ne{t}_{o{2}_{n}}={f}_{k{2}_{1}}(x) . ou{t}_{h1}+{f}_{k{2}_{2}}(x) . ou{t}_{h2}+\cdots +{f}_{k{2}_{n}}(x) . ou{t}_{hn}+{b}_{2}\end{array}$$
(19)
$$\begin{array}{c}ne{t}_{o{n}_{1}}={f}_{2{n}_{1}}\left(x\right). ou{t}_{h1}+{f}_{2{n}_{2}}\left(x\right). ou{t}_{h2}+\cdots +{f}_{2{n}_{n}}\left(x\right). ou{t}_{hn}+{b}_{k}\\ \vdots \\ ne{t}_{o{n}_{n}}={f}_{k{n}_{1}}\left(x\right). ou{t}_{h1}+{f}_{k{n}_{2}}\left(x\right). ou{t}_{h2}+\cdots +{f}_{k{n}_{n}}\left(x\right)\cdot ou{t}_{hn}+{b}_{k}\end{array}$$
(20)

Consider the case of \(ne{t}_{o{1}_{1}}\dots ,ne{t}_{o{1}_{n}}\). The output is calculated with the logistic activation function as,

$$\begin{array}{c}ou{t}_{o{1}_{1}}=\frac{1}{1+{e}^{-ne{t}_{o{1}_{1}}}}\\ \vdots \\ ou{t}_{o{1}_{n}}=\frac{1}{1+{e}^{-ne{t}_{o{1}_{n}}}}\end{array}$$
(21)

The error of the output \(outpu{t}_{o1}\) for \({X}_{1}\) is computed by subtracting the calculated output \(outpu{t}_{o1}\) from the known value of \({F}_{1}(X)\) as,

$$E_{o1} = \mathop \sum \limits_{i = 1}^{n} \frac{1}{2}\left( {F_{1} \left( X \right) - output_{{o1_{i} }} } \right)^{2}$$
(22)

Likewise, the total error of the model is calculated by summing the individual errors \(E_{o1} , E_{o2} , \ldots ,E_{on}\) as,

$$E_{total} = E_{o1} + E_{o2} + \cdots + E_{on}$$
(23)

Using back-propagation (BP), every \(f(x)\) in the network is adjusted so that the actual output approaches the target output \(F(X)\), thereby reducing the error of each output neuron and of the whole network. For example, the gradient of \({E}_{total}\) with respect to \({f}_{1{1}_{1}}(x)\) is calculated using the chain rule as,

$$\frac{{\partial E_{total} }}{{\partial f_{{11_{1} }} \left( x \right)}} = \frac{{\partial net_{{o1_{1} }} }}{{\partial f_{{11_{1} }} \left( x \right)}}*\frac{{\partial out_{{o1_{1} }} }}{{\partial net_{{o1_{1} }} }}*\frac{{\partial E_{total} }}{{\partial out_{{o1_{1} }} }}$$
(24)

The updated function \(f_{{11_{1} }}^{{\left( {new} \right)}} \left( x \right)\) is calculated as

$$f_{{11_{1} }}^{{\left( {new} \right)}} \left( x \right) = f_{{11_{1} }} \left( x \right) - \lambda *\frac{{\partial E_{total} }}{{\partial f_{{11_{1} }} \left( x \right)}}$$
(25)

where \(\lambda\) indicates the learning rate used to adjust the network weights. In the same manner, every weight \(f(x)\) in the network is updated, and the procedure is repeated iteratively from Eq. (16) until \({E}_{total}\) approaches zero or no longer decreases.
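Since no reference implementation of the DSVM is given, the sketch below only approximates the idea of Sect. 3.3: a hidden layer of SVMs produces higher-level features (squashed with the logistic function of Eq. (17)) that a final SVM then classifies. Training each hidden SVM on a random feature subset, rather than with the BP-style update of Eqs. (16)-(25), is a simplifying assumption made purely for illustration.

```python
# Rough, hedged sketch of a stacked (deep) SVM: hidden SVMs -> logistic
# squashing -> final predicting SVM. Not the paper's exact training scheme.
import numpy as np
from sklearn.svm import SVC

def fit_dsvm(X, y, n_hidden=5, seed=0):
    rng = np.random.default_rng(seed)
    hidden = []
    for _ in range(n_hidden):
        cols = rng.choice(X.shape[1], size=max(2, X.shape[1] // 2), replace=False)
        svm = SVC(kernel="rbf", gamma="scale").fit(X[:, cols], y)
        hidden.append((cols, svm))
    # Hidden-layer features: logistic squashing of each SVM's decision value (cf. Eq. (17)).
    H = np.column_stack([1.0 / (1.0 + np.exp(-svm.decision_function(X[:, cols])))
                         for cols, svm in hidden])
    top = SVC(kernel="rbf", gamma="scale").fit(H, y)   # final SVM does the prediction
    return hidden, top

def predict_dsvm(hidden, top, X):
    H = np.column_stack([1.0 / (1.0 + np.exp(-svm.decision_function(X[:, cols])))
                         for cols, svm in hidden])
    return top.predict(H)
```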

4 Performance evaluation

This section examines the diagnostic performance of the DWS-CNN model for COVID-19 on the CXR dataset [21]. It includes a set of images under diverse class labels: COVID-19, SARS, ARDS, and Streptococcus. A few sample test images are depicted in Fig. 6. The dataset comprises 27 images under the Normal class, 220 images under the COVID-19 class, and 15 images under the ARDS class. For experimentation, a tenfold cross-validation process is employed. The proposed model is implemented on an Intel i5 8th-generation PC with 16 GB RAM, an MSI L370 Apro, and an Nvidia 1050 Ti with 4 GB. Python 3.6.5 is used for experimentation along with pandas, sklearn, Keras, Matplotlib, TensorFlow, OpenCV, Pillow, seaborn, and pycm. The parameter settings used in the experimentation are: learning rate 0.0001, momentum 0.9, batch size 128, and epoch count 140.
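Under these stated settings, a tenfold cross-validation run could be organised as sketched below. The data arrays and the `build_model()` constructor are hypothetical placeholders; only the optimizer settings, batch size, and epoch count come from the paper, and even the choice of SGD as the optimizer is an assumption based on the momentum value reported.

```python
# Hedged reconstruction of the experimental protocol (Sect. 4): 10-fold CV with
# lr = 0.0001, momentum = 0.9, batch size = 128, 140 epochs.
import numpy as np
from sklearn.model_selection import StratifiedKFold
import tensorflow as tf

def run_tenfold(images, labels, build_model):          # build_model() is hypothetical
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    scores = []
    for train_idx, test_idx in skf.split(images, labels):
        model = build_model()
        model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.0001,
                                                        momentum=0.9),
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(images[train_idx], labels[train_idx],
                  batch_size=128, epochs=140, verbose=0)
        _, acc = model.evaluate(images[test_idx], labels[test_idx], verbose=0)
        scores.append(acc)
    return float(np.mean(scores))
```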

Fig. 6

Sample Images a Normal b COVID-19 c SARS d ARDS e Streptococcus

To demonstrate the classification performance of the DWS-CNN model, a detailed visualization of the results is provided. Figure 7a depicts CXR images that the DWS-CNN model has classified into the class 'Normal', and Fig. 7b illustrates the set of CXR images classified into the class 'COVID-19'.

Fig. 7

Visualization of classified results

Table 1 and Fig. 8 analyze the performance of the DWS-CNN model on binary classification under varying folds. Under F1, the DWS-CNN model attains a sensitivity of 98.10%, specificity of 98.16%, accuracy of 98.16%, and F-score of 98.10%. Under F2, it provides a sensitivity of 98.32%, specificity of 97.89%, accuracy of 98.27%, and F-score of 98.76%. Under F3, it offers a sensitivity of 98.53%, specificity of 99.10%, accuracy of 99.06%, and F-score of 99.09%. Under F4, it yields a sensitivity of 98.64%, specificity of 98.84%, accuracy of 98.82%, and F-score of 98.81%. Finally, under F5, it produces a sensitivity of 98.28%, specificity of 98.45%, accuracy of 98.39%, and F-score of 98.41%.

Table 1 Results of proposed DWS-CNN model for binary class in terms of different measures
Fig. 8

Binary class a sensitivity b specificity c accuracy d F-score

Table 2 and Fig. 9 examine the performance of the DWS-CNN technique on multi-class classification under various folds. Under F1, the DWS-CNN approach provides a sensitivity of 99.32%, specificity of 99.17%, accuracy of 99.12%, and F-score of 99.06%. Under F2, it attains a sensitivity of 99.20%, specificity of 99.45%, accuracy of 99.36%, and F-score of 99.28%. Under F3, it presents a sensitivity of 98.96%, specificity of 98.91%, accuracy of 98.90%, and F-score of 98.87%. Under F4, it provides a sensitivity of 98.70%, specificity of 98.85%, accuracy of 98.52%, and F-score of 98.43%. Finally, under F5, it yields a sensitivity of 99.45%, specificity of 99.62%, accuracy of 99.41%, and F-score of 99.22%.

Table 2 Results of proposed DWS-CNN model for multi class in terms of different measures
Fig. 9

Multi class a sensitivity b specificity c accuracy d F-score

Figure 10 presents the average results of the DWS-CNN model on binary and multi-class classification. The figure shows that the DWS-CNN model classifies the images into binary class labels with a sensitivity of 98.37%, specificity of 98.59%, accuracy of 98.54%, and F-score of 98.63%. Similarly, the DWS-CNN model classifies the CXR images into multiple class labels with a sensitivity of 99.13%, specificity of 99.20%, accuracy of 99.06%, and F-score of 98.97%.
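For reference, the reported measures can be computed from a binary confusion matrix as sketched below; the label arrays are illustrative placeholders, not results from the paper.

```python
# Hedged sketch of computing sensitivity, specificity, accuracy, and F-score
# from a confusion matrix with scikit-learn; labels below are toy examples.
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # 1 = COVID-19, 0 = Normal (example)
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                   # recall on the positive class
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + tn + fp + fn)
f_score = f1_score(y_true, y_pred)
print(sensitivity, specificity, accuracy, f_score)
```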

Fig. 10

Average analysis of proposed DWS-CNN

Table 3 and Figs. 11 and 12 present a brief comparative study of the DWS-CNN method for the detection and classification of COVID-19. The accuracy analysis of the different models shows that the DT model is the least effective diagnosis model, attaining a minimum accuracy of 86.71%. The Inception v3 and kNN models result in slightly higher accuracies of 88.74% and 88.91%, respectively. The CapsNet and ResNet-50 models obtain better and closer accuracies of 89.19% and 89.61%, respectively. Similarly, the AlexNet and DTL models demonstrate moderate accuracies of 90.5% and 90.75%, respectively. The CovxNet, LR, and MLP models show manageable results with accuracies of 91.7%, 92.12%, and 93.13%, respectively. The VGG-19 and FR-CNN showcase competitive outcomes with accuracies of 96.33% and 97.36%, respectively. However, the DWS-CNN model demonstrates the best results on binary and multi-class classification, with accuracies of 98.54% and 99.06%, respectively.

Table 3 Comparative analysis of existing with proposed methods
Fig. 11

Result analysis of existing with proposed DWS-CNN in terms of sensitivity and specificity

Fig. 12

Result analysis of existing with proposed DWS-CNN in terms of accuracy and F-score

The sensitivity analysis of the diverse methods shows that the CapsNet method is the weakest diagnosis approach, with the lowest sensitivity of 84.22%. The DT and kNN methodologies provide better sensitivities of 87% and 89%, respectively. In line with this, the DTL and CovxNet schemes accomplish considerable sensitivities of 89.61% and 90.50%, respectively. Likewise, the Inception V3 and AlexNet frameworks illustrate better sensitivities of 91% and 92.50%, respectively. The ResNet-50, MLP, and LR approaches give acceptable outcomes with an identical sensitivity of 93%. The VGG-19 and FR-CNN demonstrate competing results with sensitivities of 97.05% and 97.65%, respectively. Finally, the DWS-CNN framework showcases the best results on binary and multi-class classification, with sensitivities of 98.37% and 99.13%, respectively.

The specificity analysis of the diverse methods shows that the ResNet-50 approach is the poorest diagnosis method, reaching the lowest specificity of 67.74%. The AlexNet and Inception V3 methodologies provide moderate specificities of 71.43% and 74.19%, respectively. In line with this, the MLP and DT methods achieve considerable specificities of 87.23% and 88.93%, respectively. The LR and kNN frameworks illustrate even better specificities of 90.34% and 90.65%, respectively. The CapsNet, DTL, and FR-CNN schemes display acceptable outcomes with specificities of 91.79%, 92.03%, and 95.48%, respectively. The CovxNet and VGG-19 show competing results with specificities of 95.80% and 96%, respectively. Thus, the DWS-CNN scheme attains the best results on binary and multi-class classification, with specificities of 98.59% and 99.20%, respectively.

The F-score investigation of the distinct methods indicates that the CapsNet approach is the weakest diagnosis approach, with the lowest F-score of 84.21%. The DT and kNN frameworks accomplish better F-scores of 87% and 89%, respectively. In line with this, the DTL and CovxNet approaches achieve considerable F-scores of 90.43% and 91.10%, respectively. Likewise, the LR and MLP models showcase acceptable F-scores of 92% and 93%, respectively. The Inception V3, ResNet-50, and VGG-19 technologies illustrate reasonable outcomes with closer F-scores of 93.33%, 93.94%, and 94.24%, respectively. The AlexNet and FR-CNN represent competing outcomes with F-scores of 94.63% and 98.46%, respectively.

From the experimental results, the DWS-CNN framework exhibits the best results on binary and multi-class classification, with F-scores of 98.63% and 98.97%, respectively. The application of DSVM helps to avoid overfitting. In addition, the preprocessing eliminates noise and enhances image quality, which further improves classification accuracy. Hence, the model can be employed as an appropriate diagnosis model for COVID-19 in hospitals, healthcare centres, telemedicine, and rural and remote areas.

5 Conclusion

This paper has developed a DWS-CNN model with DSVM for COVID-19 diagnosis and classification. The goal of the DWS-CNN model is to determine the binary and multiple class labels of COVID-19 using CXR images. The DWS-CNN model performs the classification process in two stages, namely the training stage and the testing stage. Both of these phases undergo three major processes, namely GF-based preprocessing, feature extraction, and classification. The DWS-CNN model is employed to extract a useful group of feature vectors, and finally the DSVM is used as the classifier. The DWS-CNN model is evaluated on the CXR image dataset, and the experimental results confirm its superiority, attaining maximum classification performance with accuracies of 98.54% and 99.06% on binary and multiclass classification, respectively. In the future, hyperparameter tuning of the DWS-CNN model can be performed using bio-inspired algorithms to attain even better classification results.