1 Introduction

SARS-CoV-2, commonly known as the coronavirus, is a highly potent virus that causes COVID-19. It belongs to the Coronaviridae family (Yang et al. 2020). The disease is usually mild, and the symptoms include myalgia, cough, sore throat, fever and shortness of breath. However, hypoxia, chest pain and sudden confusion are observed in serious cases. As the disease progresses, it might also cause Acute Respiratory Distress Syndrome (ARDS) in severe cases. The virus’s primary source is unknown; however, analysis of its genome sequence has indicated that it belongs to the Beta-CoV genus of the coronavirus group (Yuki et al. 2020). The virus usually uses rodents and bats as hosts (Fauci et al. 2020). The primary mode of disease transmission is through air or physical contact. It enters the respiratory system by binding to the angiotensin-converting enzyme 2 (ACE2) receptor (COVID GA 2020). The virus was initially discovered in Wuhan, China, in late December 2019. Many sources claim that the coronavirus emerged from the wet markets in Wuhan (Kumar et al. 2021). It has continued to spread around the globe since then and is also known for mutating quickly from one strain to another. As the disease has spread, it has caused numerous problems, with new issues emerging as time passes. Vaccines have been rolled out in several countries to prevent the infection from spreading. However, there is an acute shortage of vaccines in smaller and developing nations (Jeyanathan et al.). These vaccines have proven to be highly effective in combating the severe effects of SARS-CoV-2. According to recent studies, a bad prognosis is prevented most of the time in vaccinated COVID-19 patients (Arevalo-Rodriguez et al. 2020). However, a part of the population still succumbs to this infection. This group includes the elderly and people with existing comorbidities such as diabetes and hypertension. 
It is essential to diagnose this disease early so that appropriate treatment can be provided and sudden in-hospital deaths can be avoided. COVID-19 is normally diagnosed using the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test. However, many studies have shown that its results are sometimes prone to false negatives (Arevalo-Rodriguez et al. 2020; Pecoraro et al. 2022). It also takes a considerable amount of time to give results (Fu et al. 2022). A false negative is dangerous since the patient will not be given appropriate treatment and also becomes a potential carrier of this contagious infection. Moreover, researchers have claimed that the RT-PCR test might not diagnose the newer strains of this viral infection (Phan et al. 2022).

According to many studies, the coronavirus can also be diagnosed using other modalities such as CT scans, chest X-rays, clinical and laboratory markers, MRIs, ultrasound, audio analysis and the Rapid Antigen Test (RAT) (Yüce et al. 2021). These procedures can be used in parallel with the RT-PCR test to improve the trustworthiness of the results. The modalities can also be utilized in situations such as a pandemic peak, when there is an acute shortage of medical resources in developing countries. Various digital technologies such as AI, data science, drone engineering, Internet of Things (IoT), Virtual Reality (VR) and many more have been successfully deployed in battling this dangerous virus (Singh and Agarwal 2022). AI, in particular, has been very successful in the medical domain in the past. It is the development and study of algorithms that mimic human intelligence. It has been successfully used in domains such as computer vision, robotics, online advertising, stock market price prediction and many more (Huo et al. 2021; Castiglioni et al. 2021; Nti et al. 2021). With its success in fields such as therapy, diagnosis, drug development, patient monitoring and others, there is optimism that it will become a thriving area of research to address the present issues caused by the coronavirus (Almars et al. 2022). It is also proposed that AI will be critical in assisting academic and clinical researchers who fight the novel COVID-19 (Khamis et al. 2022). During the first pandemic wave, China took several steps to prevent the coronavirus from spreading using AI-based technologies (Zhang et al. 2022a). These included deep learning-based facial recognition to track infected patients, drones to inspect areas, and robots to deliver medicine, food and fuel, among others. There are various diagnostic applications for which AI can be adapted to manage this disease. 
This research tries to organize the literature around various modalities that detect the virus. These applications include pharmaceutical studies, clinical applications, data pre-processing procedures and epidemiology.

The design and development of novel techniques that detect the deadly virus quickly and with high accuracy are essential. CT and X-ray images can be utilized for diagnosis since they use easily accessible equipment (Kumar et al. 2022). Even before the onset of symptoms such as sore throat, cough and fever, one can diagnose the patient by processing these images. COVID-19 detection using imaging modalities involves three crucial steps: (1) data preparation, (2) acquisition of images and (3) diagnosis of the disease. These images can be easily analyzed using AI and image processing-based approaches (Basu et al. 2022). Mathematical models have been implemented to predict the diagnosis using various demographic, laboratory, epidemiological and clinical markers (Chadaga et al. 2022). A combination of these markers can be used effectively to determine a patient’s COVID-19 status. The levels of bioclinical markers such as D-dimer, C-reactive protein (CRP), ferritin, lactate dehydrogenase (LDH) and the neutrophil-to-lymphocyte ratio (NLR) change drastically in COVID-19 patients (Chadaga et al. 2022). These tests can be further analyzed to predict the severity of a patient’s condition (Kocadagli et al. 2022). Coughing is a symptom not only of SARS-CoV-2 but also of various other infections (Landt et al. 2022). As a result, diagnosis based solely on cough sounds is challenging. Different sounds such as moaning, breathing, chewing and heartbeats can be combined with a Natural Language Processing (NLP) model for accurate detection (Akbari and Unland 2022). In various hospitals, auscultation is already used to collect the above sounds. All the above methods can be beneficial in predicting COVID-19. The main contributions of this research article are provided below:

  • A comprehensive review of existing literature that uses AI to battle the deadly COVID-19.

  • A review of various applications and research that use image-based modalities such as CT-scan, chest X-rays, MRIs and ultrasound for accurate diagnosis.

  • A review of different clinical studies that use laboratory and haematological markers for predicting COVID-19 diagnosis.

  • A review of various applications and articles that use AI-based voice applications to diagnose SARS-CoV-2.

  • Challenges and future research directions are provided. This can help enthusiastic medical and machine learning researchers to understand the current issues and possible future trends in diagnosing the coronavirus.

In this review, a comprehensive non-systematic review methodology has been used. COVID-19 diagnosis articles that use AI and were published in the last three years (2020–2022) have been considered. Articles from various important databases such as Scopus, Google Scholar, PubMed, MEDLINE, Embase and Crossref have been included in this review. The search string made use of the following key words, among others: ‘COVID-19’, ‘Sars-CoV-2’, ‘Coronavirus’, ‘Artificial Intelligence’, ‘Machine learning’, ‘Deep Learning’, ‘Diagnosis’, ‘Prognosis’, ‘CT scans’, ‘X-rays’, ‘Clinical markers’, ‘haematology’, ‘NLP’, ‘sound analysis’, ‘MRI’, ‘Ultrasound’, ‘Triage’, ‘Medical Imaging’, ‘Healthcare’, ‘Genomic Data Science’ and ‘AI for COVID-19’. In addition, the above key words were combined in various permutations and combinations. This proved to be very effective since some of the combined keywords returned a large amount of related research. The most important search string, which delivered the most related literature, was a combination of two key words: ‘COVID-19 Artificial Intelligence’. We were able to obtain research articles, reviews, commentaries, letters to the editor and other articles using this search string. ‘COVID-19 machine learning’ also returned a large number of articles. The search string ‘COVID-19 deep learning’ was useful in finding research papers related to deep learning. The largest number of articles used CT scans and X-rays as modalities. It was therefore necessary to use other key words to find papers related to the remaining modalities. The search strings ‘COVID-19 blood markers machine learning’, ‘COVID-19 voice diagnosis NLP’, ‘COVID-19 machine learning MRI’ and ‘COVID-19 ultrasound machine learning’ were all able to deliver subject-specific articles. All the above search strings and others were highly useful in finding articles for the study’s research questions. 
Replacing the word ‘COVID-19’ with either ‘Sars-CoV-2’ or ‘coronavirus’ gave similar results. Further, only original research published in American or British English has been considered in this review.
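The pairing of a disease term with a method term described above can be sketched as a Cartesian product of two keyword lists. This is only an illustration: the lists are a subset of the key words named in this section, and the function name `build_search_strings` is hypothetical and does not correspond to any database's search interface.

```python
from itertools import product

# Subset of the key words listed above (disease synonyms and method terms).
DISEASE_TERMS = ["COVID-19", "Sars-CoV-2", "Coronavirus"]
METHOD_TERMS = ["Artificial Intelligence", "machine learning", "deep learning"]


def build_search_strings(disease_terms, method_terms):
    """Return every 'disease method' pairing as a plain search string."""
    return [f"{d} {m}" for d, m in product(disease_terms, method_terms)]


search_strings = build_search_strings(DISEASE_TERMS, METHOD_TERMS)
for s in search_strings:
    print(s)
```

Modality-specific strings such as ‘COVID-19 ultrasound machine learning’ could be produced in the same way by adding a third list of modality terms to the product.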

Inclusion Criteria:

  1. Full-text articles on COVID-19 diagnosis with AI, machine learning, deep learning, NLP, ultrasound, MRI, clinical markers or sound analysis.

  2. Original articles on COVID-19 that emphasize healthcare, diagnosis, accessibility and affordability using AI.

Under the inclusion criteria, the articles chosen from the various databases were as follows: Scopus (212), Google Scholar (102), MEDLINE (25), Embase (25), PubMed (110) and Crossref (104). Many of the articles were available in more than one database, and these duplicates were systematically removed. Further, a few articles did not use AI and were removed. The exact identification, screening, eligibility and inclusion of the articles are depicted in the PRISMA diagram. Figure 1 represents the PRISMA architecture used to choose the final articles for review.

Fig. 1 PRISMA architecture for choosing the final articles to review
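The identification and duplicate-removal steps can be sketched as follows. The per-database counts are the ones reported above; the matching rule (a normalised title) and the three sample records are hypothetical, since the text does not state how duplicates were detected.

```python
# Records identified per database, as reported in the text.
records_per_database = {
    "Scopus": 212, "Google Scholar": 102, "MEDLINE": 25,
    "Embase": 25, "PubMed": 110, "Crossref": 104,
}
total_identified = sum(records_per_database.values())


def deduplicate(records):
    """Keep the first record seen for each normalised title (assumed rule)."""
    seen, unique = set(), []
    for rec in records:
        # Lower-case and collapse whitespace so trivial variants match.
        key = " ".join(rec["title"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records, for illustration only.
sample = [
    {"title": "Deep learning for COVID-19 CT", "source": "Scopus"},
    {"title": "deep  learning for covid-19 ct", "source": "PubMed"},
    {"title": "Cough sound analysis", "source": "Crossref"},
]
unique_sample = deduplicate(sample)
print(total_identified, len(unique_sample))
```

The counts sum to 578 records before duplicates and non-AI articles are removed.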

Exclusion Criteria:

  1. Commentaries, letters to the editor, case reports and articles with no full-text content.

  2. Animal studies.

  3. Pure medical diagnostic studies which do not involve AI.

A total of 74 articles were chosen for the final review. These cover COVID-19 diagnosis based on various modalities such as CT scans, X-rays, MRIs, ultrasound, biomarkers and cough sound analysis.

This article aims to conduct an extensive review of different AI methodologies that diagnose COVID-19 using various modalities. This review is also meant to help researchers from the medical and computer science domains. Combined knowledge of these fields is vital to conducting subsequent research on COVID-19. The paper is structured as follows. Section 2 describes other literature that uses AI and other technologies to tackle the coronavirus. Section 3 covers studies that apply AI to diagnose SARS-CoV-2 using various modalities. Threat to validation is explained in Sect. 4. Key challenges and the directions for future research are explained in Sect. 5. Finally, the article concludes in Sect. 6. The entire structure of this review article is pictorially represented in Fig. 2.

Fig. 2 Structure of this review article

2 Related work

Several studies have already attempted to survey and review various applications that use AI to combat this pandemic. This section covers the reviews conducted by other authors around the world. Chadaga et al. (2021) examined applications of machine learning used to battle the deadly COVID-19 infection. Open-source COVID-19 datasets (images, tabular, voice) were explored first. Afterwards, COVID-19 detection using X-rays, blood tests, CT scans and sound analysis was explained in detail. This article also emphasized the use of machine learning in drug development. The volume of research on COVID-19 is increasing every year. Figure 3 shows the number of studies on COVID-19 using AI, machine learning and deep learning.

Fig. 3 Number of papers on COVID-19 (AI, ML, DL) by year (2019–2022*) according to the PubMed database: (a) COVID-19 and artificial intelligence, (b) COVID-19 and machine learning, (c) COVID-19 and deep learning

Chamola et al. (2020) reviewed the utilization of various trending technologies such as blockchain, AI, 5G and IoT to mitigate the drastic effects of COVID-19. Many diagnostic procedures, pandemic management, clinical processes and transmission mechanisms were included in this review. COVID-19’s impact on the global economy was also highlighted. In another systematic review by Mohamadou et al. (2020), the use of AI and mathematical modelling for early COVID-19 diagnosis was explored. Several studies that used CNNs on chest X-rays and CT images were compiled in this article. This study focused on demography, management strategies, medical images, case reports, the healthcare workforce and patient mobility. The application of technologies such as data science and machine learning (ML) to tackle the coronavirus was discussed in Kumar et al. (2020). Patient diagnosis using radiology, computational biology, social control, patient epidemiology and other related topics were further explained in this review article. COVID-19 mitigation using edge computing and deep transfer learning was surveyed in Sufian et al. (2020). Many deep learning applications used in disease diagnosis, patient treatment and drug discovery were explored in this article. In addition, the study claims that devices such as web cameras, drones, robots and IoT are instrumental in managing this pandemic.

Lalmuanawma et al. (2020) reviewed the utilization of AI and machine learning for the SARS-CoV-2 pandemic. AI applications meant for screening and treatment, contact tracing, forecasting and vaccinations were thoroughly covered in this article. These technologies will help policy makers and medical experts, according to the study. Pham et al. (2020) looked into how big data and AI were being used to combat the pandemic. The study stressed the need to employ these measures to prevent the pandemic’s devastating impacts. The areas of diagnosis, disease tracking, infoveillance and infodemiology, biomedicine, pharmacotherapy and vaccine/drug discovery were all explored in this article. In yet another review, Tsikala et al. (2020) studied emerging technologies such as big data, IoT and AI for the diagnosis and treatment of COVID-19 patients. A large number of interesting subjects such as nanotechnology, 3D printing, telemedicine and robotics were discussed in this review article. The paper concluded that the new technologies can efficiently combat the coronavirus. Finally, Vaishya et al. (2020) reviewed a variety of AI technologies relevant to the current pandemic. The review article focused on careful screening, forecasting future patients, and early detection of this potent virus. The use of machine learning to reduce the strain on health care personnel was also highlighted. Emphasis was placed on developing countries. The rest of the review articles along with their key features are discussed in Table 1.

Table 1 An overview of existing articles that use AI to battle COVID-19

3 COVID-19 Diagnosis using Artificial Intelligence

According to several studies, despite there being hundreds of viruses in the coronavirus family, only seven of them have been declared hazardous to humans (Yuki et al. 2020). When the virus enters the respiratory tract, the symptoms show up anywhere from two to fourteen days later. The lungs become inflamed, and fluid builds up in the alveoli, leading to severe pneumonia. At this stage, it becomes hard to breathe. If the virus can be diagnosed early and efficiently, the patient can start treatment early on. The presence of antigens in the bloodstream can be detected three to eight days after initial contact. Based on the differentiating features of this virus, COVID-19 can be diagnosed by various methods such as RT-PCR, CT scans, CXR (chest X-rays), lung ultrasound, MRI, voice-based analysis, blood testing, biomarker analysis and many more. The RT-PCR test has been continuously used in recent times. However, there have been many incidents where the test failed to diagnose accurately. False negatives become very threatening, especially in asymptomatic cases. Hence, a more reliable and quick real-time diagnosis is required, which can deliver results with high accuracy and reliability. These diagnostic tests can also be used in conjunction with the RT-PCR test. The following subsections discuss the various COVID-19 diagnosis techniques using artificial intelligence.

3.1 Diagnosis of COVID-19 using Chest X-rays

During the ongoing pandemic, chest X-rays were utilized more often, owing to their availability and lower cost. On these grounds, many studies reporting extremely high accuracies were published. Machine learning classifiers, specifically convolutional neural network (CNN) based models, were widely used.

Panwar et al. (2020) proposed “nCOVnet”, a supervised deep learning network in which visual indications were extracted by analyzing chest images. CNNs have demonstrated their ability to efficiently map image data to a predictable and reliable output. Because the lungs are the primary target of the virus, the study concentrated on them, with the model predicting and detecting the presence of the virus. Aiming at accurate separation of healthy and virus-infected lungs, they trained the model to search for particular features such as grey or shadowed areas. “nCOVnet” obtained an accuracy of 97.62% in detecting true positives. Further, Hemdan et al. (2020) created a framework named COVIDX-Net, which analyses CXR images and assists radiologists in decision-making. The dataset comprised 50 X-ray scans, of which 25 were confirmed COVID-19 cases, and seven different deep CNN models were trained. VGG19 and the second version of Google MobileNet were among the models utilized in building COVIDX-Net. Each neural network categorized images into positive or negative cases by evaluating the normalized intensities in each image. The model had an accuracy of 90%. The F1-scores for non-COVID-19 and COVID-19 patients were 0.89 and 0.91, respectively. The results produced by the proposed model were similar to those obtained using VGG16 along with DenseNet models. Narin et al. (2021) proposed a study where multiple pre-trained CNN models were evaluated for their diagnostic efficiency. ResNet (50, 101 and 152) along with InceptionV3 and Inception-ResNetV2 were all considered. The model was able to classify the data into COVID-positive patients, healthy patients and those affected by viral or bacterial pneumonia. After extensive evaluation, it was observed that the ResNet50 pre-trained model obtained maximum accuracies of 96.1, 99.55 and 99.70% on the three different datasets used. 
The novel approach taken by this study consisted of a comprehensive framework in which no manual feature extraction was required, and it succeeded as a high-accuracy prediction model.

COVID-19 can manifest in many ways, and one of the most dangerous scenarios is when COVID-19-induced pneumonia is seen in a patient. Urgent care is needed to save the patient’s life. For specific detection of this problem, Heidari et al. (2020) built a classifier based on the VGG16 CNN, which was designed to detect the severity of the disease. In total, 8474 cases were used, and the model could successfully distinguish COVID-19-infected pneumonia, other pneumonia and normal non-COVID-19 cases. While pre-processing the chest X-rays, the diaphragm region was eliminated. The original images were pre-processed using a histogram-equalization algorithm followed by a bilateral filter. After pre-processing, the CNN model was deployed. This study used a transfer learning method which obtained 98.1% accuracy and 98.4% sensitivity, suggesting that pre-processing the images to produce a pseudo-colour image can support a deep learning computer-aided diagnosis (CAD) system for accurate and reliable diagnosis of pneumonia. Machine learning can also be utilized to track COVID-19 prognosis. Cohen et al. (2020) conducted a study to generate a severity score based on CXR imaging. To estimate the severity, a DenseNet classifier was deployed. For accurate prediction, 94 COVID-19 cases were considered. The extent of lung involvement and the degree of opacity were calculated. Figure 4 shows the COVID-19 prognosis of the patient using the DenseNet model. The lung involvement was calculated using a measure of ground-glass opacity and consolidation. The resulting score can be used to estimate the disease severity. The effectiveness of the care provided can also be tracked.

Fig. 4 DenseNet model which classifies COVID-19 severity [188]
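The pre-processing reported for Heidari et al. (2020) above begins with histogram equalization. The following is a minimal pure-Python sketch of that one step on a tiny grayscale image stored as a list of rows. It is not the authors' implementation, the helper name `equalize_histogram` is hypothetical, and the bilateral filter and pseudo-colour stacking that follow in their pipeline are omitted.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalise a grayscale image given as rows of ints in [0, levels)."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]

    # Map each intensity so the output spans the full range [0, levels - 1].
    scale = (levels - 1) / (total - cdf_min)
    lut = [round((c - cdf_min) * scale) if c >= cdf_min else 0 for c in cdf]
    return [[lut[p] for p in row] for row in image]


low_contrast = [[100, 101], [102, 103]]
equalized = equalize_histogram(low_contrast)
print(equalized)
```

In this toy case, four nearly identical grey levels are spread across the whole 0 to 255 range, which is the contrast-stretching effect that makes faint lung patterns easier for a downstream classifier to pick up.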

Further, Paul et al. (2022) created a deep learning model in which several transfer learning models were combined to classify CXR images. Two databases of CXR images were considered for this research. Various activation functions such as ReLU, softmax and sigmoid were considered. Their evaluation concluded that ReLU was superior to the sigmoid activation function because it reduces the vanishing-gradient problem and is also computationally cheap. Standard CNN models such as VGG-16, DenseNet-161 and ResNet-18 were also employed. A few models pre-trained on the ImageNet dataset were considered and retrained on the CXR datasets using transfer learning. With this method, they achieved accuracies of 99.66% and 99.84% on the two separate datasets. Brunese et al. (2020) proposed an approach to distinguish the lungs of COVID-19 patients from those of healthy patients. They followed a multi-stage approach. In the first phase, classification of healthy and pneumonia patients was performed. Further, COVID-19 patients were distinguished from regular influenza patients. In the second phase, they extracted the area of interest for further analysis. The uniqueness of this study is that it also offers an explanation by highlighting the areas of interest in the CXR images that are indicative of the presence of COVID-19. After the analysis of 6113 images, the model achieved an accuracy of 99% with a time window of approximately 2.5 s. In this research, they distinguished between pneumonia and COVID-19 accurately. SARS-net, proposed by Kumar et al. (2022), was another architecture, which combined a CNN with a graph convolutional network (GCN). A CNN was used as the base network for SARS-net. Several nets were designed by adding various features such as coordinate convolution, anti-aliased CNNs, rank-based stochastic pooling and others. 
Finally, the fifth net was formed by combining the first net with a 2-layer graph convolution network (GCN). This newly created net performed best. It obtained an accuracy of 97.60% and a sensitivity of 92.90%.
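The vanishing-gradient argument for preferring ReLU over sigmoid, mentioned above for Paul et al. (2022), can be illustrated numerically. The sketch below is an idealised toy, not taken from the cited paper: it assumes every layer sees the same pre-activation and all weights equal 1, so the back-propagated gradient is simply the local derivative raised to the network depth.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # never exceeds 0.25, reached at x = 0


def relu_grad(x):
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation, depth):
    """Product of `depth` identical local derivatives (weights fixed to 1)."""
    return grad_fn(pre_activation) ** depth


depth = 10
# Best case for sigmoid: x = 0, where its derivative is largest.
sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, depth)
# Active ReLU unit: derivative is exactly 1 at every layer.
relu_chain = chained_gradient(relu_grad, 1.0, depth)
print(sigmoid_chain, relu_chain)
```

Even in the sigmoid's most favourable case the gradient shrinks by a factor of four per layer, whereas an active ReLU passes it through unchanged. ReLU units with negative input pass no gradient at all, which is a separate limitation not shown here.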

According to Falco et al. (2022), approaches using deep neural networks (DNNs) frequently achieve very high classification performance, as evaluated by metrics such as accuracy and F1-score. However, the difficulty with DNNs is that they operate like black boxes, which means they do not explain why they allocate an image to a specific class. There is a surge in demand for knowledge on this topic, known as “explainable artificial intelligence”. This is true for both experts who deploy DNN-based classification systems and patients whose lives are impacted by the model's outcome. This study’s main goal was to create interpretable classification tools, which generate an explicit model directly so that users may get explicit information about the problem and the rationale for categorizing patients. To provide the user with interpretability, they built a two-step method. The first step is a filtering strategy using content-based image retrieval. The second is an evolutionary algorithm capable of automatically classifying and extracting explicit information in the form of a set of “IF–THEN” rules. The models were then used for COVID-19 diagnosis. Jain et al. (2020) developed a four-way classifier that distinguished bacterial pneumonia, regular viral pneumonia, COVID-19 and healthy patients. Two phases were employed to create a suitable model: data augmentation and pre-processing. This model was applied to a dataset of 1215 images. The model was trained, tested and validated using fivefold cross-validation, which makes the performance estimate more robust. The accuracy and sensitivity obtained were 98.93% and 99%, respectively. Khuzani et al. (2021) developed a model with an optimum collection of synthetic features that diagnoses COVID-19 with 94% accuracy using a dimensionality reduction approach. Without any need for manual extraction, the model can identify and successfully extract features from the CXR images. 
Not only does this new quantitative marker help prevent segmentation errors, but it also helps to lower the computing cost of the final model. It can be concluded that AI-based models can aid in the differential diagnosis of COVID-19 using a patient's chest X-rays. The rest of the related studies are described in Table 2.
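The fivefold cross-validation mentioned above for the 1215-image dataset of Jain et al. (2020) can be sketched as an index split. This is a generic illustration, not the authors' code: the function name `k_fold_indices` is hypothetical, and shuffling and class stratification, which are usually applied in practice, are left out for brevity.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


folds = list(k_fold_indices(1215, k=5))
for fold_number, (train_idx, val_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), len(val_idx))
```

With 1215 images each fold holds 243 images for validation and 972 for training, and the reported metric would be the average over the five runs.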

Table 2 COVID-19 diagnosis using artificial intelligence and chest X-rays

3.2 Diagnosis of COVID-19 using CT Scans

CT scans are beneficial in COVID-19 diagnosis. According to several studies, the coronavirus is known to cause peripheral consolidations and ground-glass opacities (Aytaç et al. 2022). This can even happen before the onset of initial symptoms such as myalgia, cough and fever. As a result, when many patients are tested, medical experts can use this method for diagnosis. The scan also reveals the extent of the lungs that has been affected and how the disease might progress. COVID-19 related lung abnormalities are distinct and easy to detect. Even though the entire procedure can take up to 30 min, AI classifiers can diagnose within seconds. By training on a massive amount of data, CT scan diagnosis can be implemented efficiently. However, knowledge of both image processing and deep learning is required. CT scan diagnosis using AI has three essential steps: segmenting the region of interest (ROI), extracting the pulmonary tissue, and classifying COVID-19 using deep learning models. The combination of ROIs and lung images is highly beneficial for the model output. Once the CT images have been prepared, various deep learning models such as CNN, U-Net, RPN, V-Net and others can be deployed. The models distinguish the images based on several key characteristics.

Focusing on these aspects, Hu et al. (2022a) published a study mainly exploring the similarities and differences between various CT images. Apart from this, factors such as interpretability, class imbalance, data dimension and sample size were considered. Their proposed framework consists of feature extraction, wherein the scans were first subjected to segmentation. Data augmentation was performed later, and oversampling was done to reduce the class imbalance. Since the sample size was small, a support vector machine was used. Two datasets were employed, one real and one synthetic. The real dataset consisted of 422 cases, with 322 images of patients. The models yielded an overall accuracy of 92.57%. The sensitivity and specificity obtained were 91.44% and 94.72%, respectively.
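Oversampling to reduce class imbalance, as mentioned above, can be sketched with simple random duplication of minority-class samples. The text does not say which oversampling variant Hu et al. (2022a) used, so random oversampling is an assumption here, and the function name `random_oversample` and the toy data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies (with replacement) only for under-represented classes.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy feature identifiers: four negatives (0) and two positives (1).
samples = ["a", "b", "c", "d", "e", "f"]
labels = [0, 0, 0, 0, 1, 1]
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Oversampling is normally applied to the training split only; duplicating samples before splitting would leak copies of the same scan into the test set and inflate the reported accuracy.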

It is known that CNN models need an immense amount of computing power. Therefore, Krishnaswamy et al. (2022) explored existing lightweight CNN models. The SqueezeNet and ShuffleNet models were extensively utilized in this research. The data contained 1252 COVID-19 and 1230 non-COVID-19 scans from 120 patients. SqueezeNet obtained an accuracy of 86.4%, and ShuffleNet obtained an accuracy of 95.8%. Further, the two models were combined in an ensemble. This combined model achieved an overall accuracy of 97%. It also obtained a sensitivity and specificity of 96.15% and 96.08%, respectively. This study suggests that combining models can improve performance.
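One common way to ensemble two classifiers is soft voting, in which the predicted probabilities are averaged before thresholding. The text does not state how Krishnaswamy et al. (2022) combined SqueezeNet and ShuffleNet, so soft voting is an assumption, and the per-scan probabilities below are invented for illustration.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-model COVID-19 probabilities for each scan."""
    n_models = len(prob_lists)
    weights = weights or [1.0 / n_models] * n_models
    n_scans = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists))
        for i in range(n_scans)
    ]


# Hypothetical outputs of two models on three scans.
squeezenet_probs = [0.9, 0.4, 0.2]
shufflenet_probs = [0.7, 0.8, 0.1]

ensemble_probs = soft_vote([squeezenet_probs, shufflenet_probs])
predictions = [int(p >= 0.5) for p in ensemble_probs]
print(ensemble_probs, predictions)
```

On the second scan the two models disagree (0.4 versus 0.8) and the averaged probability of about 0.6 settles the decision. This shows how an ensemble can correct an individual model's error, although it offers no guarantee of doing so.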

The radiation dosage is a significant problem in radiologic patient evaluation. Large amounts of continuous radiation can lead to cancer. Hence, low-radiation CT scans have been extensively studied for diagnosing pulmonary lesions. Bahrami-Motlagh et al. (2022) compared the performance of ultra-low-dose CT with low-dose CT for the accurate diagnosis of viral pneumonia. A total of 167 patients were chosen for this research and underwent both types of chest CT scan. Ground-glass opacity (GGO), nodular infiltration and various differentiating factors were documented. Both CT scans diagnosed 44 patients as COVID-19 positive. A specificity of 98.4% and a sensitivity of 100% were obtained by both when viral pneumonia had to be detected. Although two patients received a false positive pneumonia diagnosis from the ultra-low-dose CT scan, the positive predictive value was 95.7%, and the negative predictive value was 100%. This research indicated that ultra-low-dose CT scans could effectively be used for COVID-19 diagnosis. Huang et al. (2022) suggested a combination of sparse learning and a decision fusion strategy for COVID-19 diagnosis. The data obtained for this research was multi-centric as this model was aimed at heterogeneous diagnosis. Five centres were chosen for data collection: two branches of Keting Hospital, Wuhan Hospital, Wuhan Shelter Hospital, and Zhongnan Hospital of Wuhan University, all located in China. Collecting data from these centres led to data inconsistency. To overcome this problem, image data was converted to histograms. A 3D CNN model extracted the features, and relevant information from these images was fed to the classifier. The sparse learning technique was used to learn the inherent structure between the images of each centre. After deploying these models, the average accuracy and sensitivity obtained were 98.03% and 95.89%, respectively. 
With recent advancements, deep learning models are often preferred over traditional machine learning models since they tend to be more accurate. However, a deep learning model requires a vast dataset to provide optimal results. When the dataset is smaller, the models tend to overfit. Transfer learning can reduce this risk because the model starts from weights learned on a larger dataset. Hence, Kaur et al. (2022) evaluated several pre-trained network architectures for COVID-19 diagnosis using transfer learning. ResNet18, ResNet50 and ResNet101 were evaluated on a publicly available chest CT dataset. ResNet50 obtained the best results with a precision of 98.02% and an F1-score of 98.41%.
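The diagnostic figures reported above for the ultra-low-dose CT comparison (Bahrami-Motlagh et al. 2022) can be checked with the standard confusion-matrix formulas. The counts below are a reconstruction, not values stated in the text: they assume the low-dose scan is the reference standard, with 44 true positives, 2 false positives, no false negatives and the remaining 121 of the 167 patients as true negatives.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Assumed counts consistent with 167 patients (44 + 2 + 0 + 121).
metrics = diagnostic_metrics(tp=44, fp=2, fn=0, tn=121)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Under these assumed counts the formulas give a sensitivity of 100%, a specificity of 98.4%, a positive predictive value of 95.7% and a negative predictive value of 100%, matching the reported figures.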

Shaik et al. (2022) implemented eight pre-trained models for COVID-19 classification, namely VGG (16, 19), ResNet (50, 50V2), InceptionV3, InceptionResNetV2, Xception and MobileNet. The above models were deployed using a frozen convolutional base and a trainable DNN head. Two datasets were chosen for the COVID-19 diagnosis. They developed a Composite Neural Network configuration that combined multiple decision probabilities. ResNet50V2 outperformed all other models with an F1-score of 97.78%, AUC of 97.84% and accuracy of 97.79% for the above datasets. Elharrouss et al. (2022) proposed a multi-task deep-learning-based approach to segment pulmonary lesions. This study uses multi-task learning to overcome the shortage of labelled data. Furthermore, owing to the multi-input stream, the model may learn from a range of features that might help it improve its performance. According to the study, the lack of data does not hinder the overall accuracy of image segmentation, and the region of interest is still extracted effectively using an “encoding–decoding” method. The encoder block consisted of convolution, batch normalization, parametric rectified linear unit and pooling layers. In contrast, the decoder block consisted of upsampling, convolution, batch normalization and parametric rectified linear unit layers. A sensitivity of 71.1%, specificity of 99.3%, precision of 85.6% and mean absolute error of 0.062 were obtained by this technique, indicating its suitability for lung infection segmentation. One of the most effective ways to identify infection in patients is through a CT scan. Gaur et al. (2022) described a new approach for pre-processing CT images and identifying the presence of COVID-19. A technique named the empirical wavelet transformation (EWT) was used during the data pre-processing phase. Before EWT was applied, the image was split into three colour channels: red, green and blue. 
The EWT was used to determine the frequency content of the particular channel that was primarily affected by the virus. To classify COVID-19 positive and negative cases, the network was trained on the best of the three component channels. CT scans of 2252 confirmed positive cases and 1230 negative cases were chosen for the final model. The final accuracy, F1-score and AUC obtained by the classifiers were 85.5, 85.28 and 96.56%, respectively. An algorithm from RADLogics (Scudellari 2020) was utilized to track COVID-19 prognosis using CT-scan imaging. This model generates a “corona score” which describes the extent of lung damage. This model is described in Fig. 5. A higher score indicates a more severe prognosis.

Fig. 5 RadLogics algorithm, which predicts COVID-19 prognosis using lung CT images (Scudellari 2020)

It is still difficult to precisely segment COVID-19 infected lesions on CT scans because of their irregular shapes, varying diameters, and the indistinct borders between healthy and infected lung tissue. Hu et al. (2022b) proposed a novel segmentation strategy for COVID-19 infections that increases the supervised information and incorporates an encoder-decoder network with multi-scale feature mapping. A collaborative-supervision technique helps the network learn edge and semantic characteristics. The edge supervision module emphasizes low-level border features by injecting edge supervision information. For semantic characteristics, a semantic supervision module incorporates mask supervision information into the later stages of the network. To bridge the gap between high-level and low-level feature maps, the authors proposed a third module that fuses feature maps from different levels. Using these three modules individually boosted the Dice metric by 1.12%, 1.95%, and 1.63%, respectively. ResNet was used as the baseline classifier for this research.

Several deep learning models have also been ensembled to obtain higher accuracy and make the predictions more reliable. Following this strategy, Shah et al. (2021) created a custom CNN for COVID-19 diagnosis named “CT-net-10”. 349 images were obtained from chest CT scans of 216 coronavirus patients, and 463 images were chosen from the healthy group. Several deep learning models were used for prediction, such as VGG-16, VGG-19, DenseNet-169, ResNet-50 and others. VGG-19 obtained the highest accuracy of 94.52%. Finally, Serte et al. (2021) proposed a deep-learning model that uses ResNet-50 for accurate prediction. This model fuses image-level predictions to diagnose COVID-19 on a 3D CT volume. The proposed deep learning model obtained an AUC of 96%. The diaphragm region was eliminated in this study. The original images were pre-processed using histogram equalization followed by a bilateral filter. The results were fed to the CNN model. This study used a transfer learning method that obtained 98.1% accuracy and 98.4% sensitivity, suggesting that pre-processing images into pseudo-colour images can support an effective and reliable diagnostic system for pneumonia caused by the coronavirus. The rest of the research articles are described in Table 3.
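The Dice metric used to score the segmentation modules above can be illustrated with a minimal sketch. The function name and the two toy masks are hypothetical and are not taken from any of the cited studies; the code only shows how the overlap between a predicted and a reference lesion mask is usually computed.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |overlap| / (|pred| + |truth|) for two binary masks of equal shape."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    overlap = pred_sum = truth_sum = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            overlap += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    if pred_sum + truth_sum == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * overlap / (pred_sum + truth_sum)


# Toy 3x4 masks (1 = lesion pixel); values are made up for illustration.
reference_mask = [[0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]]
predicted_mask = [[0, 1, 1, 1],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]]
print(dice_coefficient(predicted_mask, reference_mask))
```

Because the score is driven by the overlap relative to the combined mask sizes, small lesions with indistinct borders are penalized heavily for a few misplaced pixels, which is one reason edge supervision can help.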

Table 3 Diagnosis of COVID-19 using AI and CT scans

3.3 COVID-19 diagnosis using clinical and laboratory markers

In most scenarios, RT-PCR testing is used to diagnose COVID-19 infection. Unfortunately, this procedure takes a long time to produce results (2–3 h) (Tahamtan and Ardebili 2020). It also requires qualified medical personnel to handle the patient samples. RT-PCR tests are also known to produce false results (Tahamtan and Ardebili 2020). As discussed in the above subsections, AI-based diagnosis from X-rays and CT scans has been deployed efficiently. However, CT scans are comparatively expensive, and the scanners are not available in all medical facilities, especially in developing countries. Repeated exposure to CT radiation can also increase the risk of cancer (Schultz et al. 2020). X-rays are generally available in all hospitals; however, false-negative results are observed in X-rays, too (Vaid et al. 2020). Clinical and laboratory tests are available in most medical facilities, and their results can be obtained within 20 min. During a pandemic peak, these tests can be used effectively to screen patients. Conducting laboratory tests in parallel can help catch false-negative results from RT-PCR. Besides, markers such as CRP, D-dimer, NLR, LDH and ferritin levels are abnormal in coronavirus patients (Solis et al. 2022). Hence, this diagnostic method can be beneficial in the battle against this deadly virus.

Li et al. (2021b) deployed a logistic regression model to identify the factors associated with fatality when an individual contracts the coronavirus. Seventy-seven clinical markers were chosen to predict the COVID-19 prognosis. The LASSO algorithm was applied, and the 55 variables with non-zero β-coefficients were analysed further. Patients with blood group A were observed to be more likely to have a fatal outcome than those with blood groups AB and B. The study also reported that patients diagnosed with HIV or viral pneumonia were at lower risk of COVID-19-associated death and that transmission is higher at lower temperatures.
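The idea of keeping only variables with non-zero coefficients can be sketched as follows. This is a simplified illustration, not the pipeline of Li et al.: it assumes a least-squares LASSO solved by coordinate descent rather than a penalized logistic regression, and the feature matrix, outcome values and penalty are invented for the example.

```python
def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; values within the penalty become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=50):
    """Minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1 (no intercept, centred data assumed)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the residual that ignores feature j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, penalty) / (z / n) if z > 0 else 0.0
    return w


# Hypothetical toy data: three centred markers, outcome driven mainly by the first one.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.05, 1.95, -2.05, -1.95]
weights = lasso_coordinate_descent(X, y, penalty=0.1)
selected = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print(weights, selected)
```

The L1 penalty sets weak coefficients to exactly zero, so the surviving indices act as a feature selection step before further analysis.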

Tschoellitsch et al. (2021) studied a dataset of 1357 patients who underwent a series of blood tests. A random forest model was trained on 28 clinical markers to predict the RT-PCR results. The classifier predicted the presence of COVID-19 in a patient with 81% accuracy. The research indicated that machine learning techniques could be effective in predicting RT-PCR results from haematological markers. Feltes et al. (2022) showed that the CBC levels of COVID-19 patients could vary between populations around the world. Many public datasets contain blood test results of COVID-19 patients from different geographical areas. The authors compared blood test data from Brazil and Ecuador and observed a clear difference between the two. The data contained the details of 375 COVID-19 patients from these two countries. Feature selection was performed with a wrapper method based on Recursive Feature Elimination and an SVM. PCA and t-distributed stochastic neighbour embedding were then used, followed by the Mann–Whitney test, which gave insights into the actual differences between the dataset parameters. It was eventually concluded that there was a significant difference in the neutrophil and eosinophil counts among the male population, whereas a difference in eosinophil count was observed in the female population.
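The Mann–Whitney comparison mentioned above rests on a rank-based statistic that can be sketched in a few lines. The function name and the two samples are hypothetical (the numbers are not from Feltes et al.), and the sketch stops at the U statistic without computing a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using average ranks for ties."""
    combined = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of tied values so they all share one average rank.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Made-up marker values for two populations (illustrative only).
population_one = [1.2, 1.5, 1.1, 1.4]
population_two = [2.0, 2.3, 1.9, 2.2]
print(mann_whitney_u(population_one, population_two))
```

A U value near zero for one group means almost every value in that group ranks below the other group, which is the kind of separation the test flags as significant when samples are large enough.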

Chadaga et al. (2022) used AI techniques to diagnose the infection using routine blood tests. They used a Brazilian public dataset consisting of patients from Albert Einstein Hospital, Brazil. Four machine learning algorithms were utilized: logistic regression, K-nearest neighbours, random forest and XGBoost. Since the dataset was imbalanced, the synthetic minority oversampling technique (SMOTE) was used to balance it. Among all algorithms, random forest obtained the best performance with an accuracy of 92%. Feature importance techniques showed that eosinophil, platelet and monocyte counts contributed the most to the model output. In another study, Chadaga et al. (2021) used ensemble algorithms to predict COVID-19 mortality. Patients from 18 hospitals in Mexico were considered. Algorithms such as random forest, adaptive boosting, light gradient boosting machine, XGBoost and categorical boosting were trained. XGBoost obtained the highest accuracy of 96% in predicting a patient’s mortality, and age was the most critical parameter. Rahman et al. (2022) built a stacked model called “QCoVSML”, which diagnosed COVID-19 using biomarkers. Six datasets from three different countries were chosen for this research. The combined model obtained an accuracy of 91.45%. The article concluded that it is a cheap, reliable and fast diagnostic method. The study further claimed that the models could help prevent the false-negative results obtained by the RT-PCR test. The final model is described in Fig. 6.
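The core idea behind SMOTE, generating new minority-class samples by interpolating between existing ones, can be sketched as below. This is a simplified illustration that assumes purely numeric features; the function name, parameters and sample values are hypothetical, and a real study would more likely rely on an established implementation.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points on segments between minority points and their neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the point itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Made-up minority-class samples with two blood markers each (illustrative only).
minority_samples = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_samples = smote_like_oversample(minority_samples, n_new=6, k=2, seed=42)
print(len(new_samples))
```

Oversampling of this kind should be applied to the training split only; synthetic points in the test split would inflate the reported performance.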

Fig. 6 COVID-19 diagnosis using routine blood markers (Rahman et al. 2022)

In another study by Kukar et al. (2021), blood markers were used in COVID-19 diagnosis. 5333 patients from University Ljubljana, Slovenia, were considered for this research. According to the investigation, the most critical blood parameters were albumin, eosinophil, MCHC, INR, and prothrombin percentage. The models were combined with the RT-PCR test to increase the sensitivity. The maximum AUC obtained was 97%. Alves et al. (2021) used criteria graphs and decision trees to diagnose COVID-19 using clinical parameters. The public COVID-19 dataset from Brazil, consisting of 5644 patients, was considered for this research. The random forest model achieved the best results with an accuracy of 88%. Explainable AI was further used to interpret and explain the results obtained from the machine learning models. Arpaci et al. (2021) used six machine learning models to diagnose COVID-19 using 14 clinical features. 114 patients from Taizhou Hospital, China, were considered for this research. Logistic regression obtained the highest accuracy of 87.41%. The study claims that its models can be used in developing nations where there is an acute shortage of RT-PCR kits. However, the sample size was considerably small. The rest of the research articles that use clinical markers to diagnose COVID-19 are described in Table 4.

Table 4 Various studies that diagnose COVID-19 using AI and clinical markers

3.4 COVID-19 diagnosis using voice-based analysis

In various respiratory disorders, the voice and lung sounds change drastically. Auscultation is a method of detecting abnormal lung sounds such as crackling, wheezing, and high-pitched sounds that could help detect the presence of a lung disorder (Pasterkamp et al. 1997). It is primarily observed that during respiratory infections, the behaviour of the glottis varies abnormally, which results in the change in cough sounds. With the advent of AI, it is now possible to diagnose and predict COVID-19 by analyzing a patient's voice, breath and cough (respiratory sounds), vibration, and heart sounds. The change in pitch, frequency, rhythm, and sound volume is evaluated for disease detection. This method of diagnosis has tremendous potential for the future. However, due to lack of exploration, these tools and techniques are yet to be employed on a large scale in medical facilities.

Lella et al. (2022) used a crowdsourced dataset that contained the recorded cough sounds of 300 COVID-19 patients. The data also contained the cough sounds of patients suffering from asthma, bronchitis and pertussis. The study proposed a unique CNN model which took inputs from an audio encoder. This encoder utilized various techniques such as de-noising, gamma-tone, multi-frequency cepstral coefficient and others to modulate the voice data. The deep-CNN model obtained an accuracy of 94.45% in distinguishing COVID-19 from other respiratory infections. Sharma et al. (2022) proposed a binary classification method that distinguished COVID-19 and non-COVID-19 cases. The dataset contained heavy cough sounds and other voice modalities of 1040 patients. After balancing the dataset, classifiers such as logistic regression, Multilayer Perceptron, CNN, random forest and others were utilized. The multi-layer CNN model obtained the best results with a sensitivity and AUC of 95 and 87.09%, respectively. The model performed exceptionally well by employing the auto-regressive predictive coding technique (Harvill et al. 2021). Han et al. (2022) proposed a study that scrutinized the realistic performance of acoustics in the diagnosis of coronavirus. The dataset consisted of 5240 audio samples from 2478 participants. 514 samples were collected from confirmed Sars-CoV-2 patients. In this research, analysis of the breath, cough and voice sounds was extensively performed. The study tried to establish a relationship between the cough sounds and the patient's symptoms. For COVID-19 symptomatic patients, the models achieved a sensitivity of 67%. However, the sensitivity obtained for asymptomatic cases was a mere 56%.

Hemdan et al. (2022) proposed a model named “CR19”, which utilized genetic ML algorithms. These classifiers could potentially assist patients with their initial diagnosis using cough sounds. An online data source was used for this research, and various ML models such as logistic regression, KNN, support vector machine and decision trees were employed. KNN obtained the best results with a precision and F1-score of 97% and 98%, respectively. The study also proposed an IoT cloud framework that could further assist doctors in making decisions. Aly et al. (2022) used ML and IoT to diagnose COVID-19 with the help of cough sounds. In this research, models were trained on two public datasets. The article claims that special voice patterns can be utilized instead of regular respiratory sounds. The models were further ensembled to enhance the performance. The final model obtained an accuracy and AUC of 96% and 96.4%, respectively.

Tena et al. (2022) proposed a novel method that extracts time–frequency features of cough sounds for automated COVID-19 diagnosis. Five public datasets were included in this research. After data collection and pre-processing, the “YAMNet” audio classifier was applied to classify the audio fragments of various sounds using the “AudioSet” ontology (Gemmeke et al. 2017). Further, the confidence score of each class was obtained. After post-processing, time–frequency representations were also computed for the datasets. The models utilized were random forest, support vector machine, logistic regression, and linear discriminant analysis (LDA). Random forest obtained the best results with an accuracy of 83% in distinguishing COVID-19 from non-COVID-19 cases.

Researchers from the University of Massachusetts, Amherst, created a portable device called “FluSense” to diagnose COVID-19 based on voice analysis (HospiMedica 2020; Al Hossain et al. 2020). The architecture of FluSense is depicted in Fig. 7. It is controlled by an AI-based deep learning network that can detect cough and other sounds. Further, it can analyse the voice data and diagnose various diseases, including COVID-19. The device uses a thermal camera, a microphone array and an AI-based engine to process cough and speech signals. The developers of this model believe that it can be used efficiently in medical centres to predict influenza, SARS-CoV-2 and other viruses.

Fig. 7 FluSense device, which can predict respiratory diseases using cough sounds, IoT and AI (HospiMedica 2020; Al Hossain et al. 2020)

Islam et al. (2022) used deep neural networks for COVID-19 diagnosis. The study claimed that the behaviour of the glottis differs predictably under various pathological conditions. The algorithm consisted of three stages. In the first stage, acoustic features were extracted from cough sound samples. In the second stage, a feature vector was created, and in the last stage, COVID-19 was diagnosed. Time-domain, frequency-domain and mixed-domain feature vectors were considered. The accuracies obtained with the three vectors were 89.2%, 97.5% and 93.8%, respectively. Kumar et al. (2022) proposed a lightweight CNN to detect COVID-19 from acoustic sounds generated by the respiratory system. The “Modified-Mel-frequency-cepstral coefficient” technique was used to classify diseases such as pertussis, asthma, SARS-CoV-2 and bronchitis. The model obtained an accuracy and F1-score of 92.32% and 93.48%, respectively, in diagnosing COVID-19. Pahar et al. (2022) used a deep transfer learning algorithm to diagnose COVID-19 using breath, cough, and speech. Three neural network models were deployed: LSTM, ResNet50 and CNN. The ResNet50 classifier outperformed the other models with AUCs of 98%, 94%, and 92% for the cough, breath, and speech classes, respectively. This suggests that cough sounds are more informative for diagnosing COVID-19 than the other sounds.
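To make the distinction between time-domain and frequency-domain features concrete, the sketch below computes three common descriptors from a short signal. The choice of descriptors (zero-crossing rate, RMS energy, spectral centroid), the function names and the synthetic test tone are assumptions for illustration; the cited studies use richer feature sets such as cepstral coefficients.

```python
import cmath
import math


def zero_crossing_rate(signal):
    """Time domain: fraction of consecutive sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)


def rms_energy(signal):
    """Time domain: root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def spectral_centroid(signal, sample_rate):
    """Frequency domain: magnitude-weighted mean frequency from a naive DFT."""
    n = len(signal)
    weighted = total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total else 0.0


# Synthetic 50 Hz tone sampled at 400 Hz for one second (stand-in for a recorded sound).
sample_rate = 400
tone = [math.sin(2 * math.pi * 50 * t / sample_rate + 0.3) for t in range(sample_rate)]
feature_vector = [zero_crossing_rate(tone), rms_energy(tone), spectral_centroid(tone, sample_rate)]
print(feature_vector)
```

A mixed-domain vector in this spirit simply concatenates the time-domain and frequency-domain entries before they are passed to a classifier.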

3.5 COVID-19 diagnosis using MRIs and ultrasound

Magnetic resonance imaging (MRI) is a widely used imaging modality that images the body’s internal structures using a strong magnetic field and radio waves. During the procedure, a static magnetic field is constantly present, and radio-frequency pulses excite the hydrogen nuclei in the body’s water molecules. The signals emitted by these nuclei are then received and translated into images for accurate diagnosis (Chaka et al. 2022). These images are considered superior to CT or X-ray images because of the level of detail that MRI can capture. Besides, these tests are comparatively safer than CT scans and X-rays (Berger 2002), since MRI does not use ionizing radiation.

In a study by Ates et al. (2020), thoracic MRIs were used as an alternative diagnostic tool to detect COVID-19. 32 patients diagnosed with COVID-19 underwent both a chest CT scan and an MRI within 24 h. Several factors were critically analysed, such as ground-glass opacities, the area of the lung lobes, and the central, peripheral and diffuse distribution patterns of lesions. The presence of nodular infiltration and pleural effusion was also checked. After multiple evaluations, MRI achieved a sensitivity and specificity of 91.67% and 100%, respectively, in COVID-19 diagnosis. Torkian et al. (2021) proposed that MRIs could be a potential diagnostic alternative to CT scans. This study was conducted on eight patients. For ground-glass opacities and lung consolidations, similar results were observed in five cases. Although the remaining three had atypical features, the MRI could identify all the parameters, supporting an accurate diagnosis. Although the dataset was small, this research demonstrated the use of MRIs in COVID-19 detection. Generally, COVID-19 imaging is restricted to lung and chest CT scans. However, Saleh et al. (2021) reported a correlation between lung scans and brain MRIs, indicating neurological manifestations of this respiratory syndrome. Seventy patients were part of this study. Headaches, fatigue, anxiety, depression, and taste impairment are common neurological symptoms observed in COVID-19 patients. Brain MRIs detected hematoma, vasculitis, gyral oedema, micro haemorrhage and olfactory dysfunction. When the lung and brain images were analysed, cerebrovascular diseases and demyelinating lesions were found to be prominent in SARS-CoV-2 patients.

Anatomically, the lungs consist of air-filled sacs, and a pleural sac covers each lung. In the ultrasound technique, sound waves with frequencies above 20 kHz (in the megahertz range for diagnostic imaging) are directed at the target site. The waves transmitted by a transducer bounce back from interfaces between tissues of different densities and are received as echoes (Dinsmore and Venkatraghavan 2022). Air-filled regions usually act as barriers to the waves. In a healthy lung, the waves are reflected directly from the visceral pleura. However, in the case of severe lung infection, the waves no longer give clean echoes from the pleura. Instead, echoes are reflected irregularly, and these patterns can be used to diagnose the lung disease. Pneumonia occurs in severe cases of COVID-19 and shows up as a fluid build-up in scans, which can be identified by chest ultrasound. This diagnostic method can be used for COVID-19 since it is reliable, safe and precise (Gil-Rodríguez et al. 2022). Besides, lung ultrasound has been reported to be even more accurate than MRI.
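Why air acts as a barrier while fluid remains visible can be illustrated with the standard intensity reflection coefficient at a flat interface, R = ((Z2 - Z1) / (Z2 + Z1))^2, where Z is the acoustic impedance of each medium. The impedance values below are approximate textbook figures in MRayl and are assumptions for this sketch, not numbers from the cited studies.

```python
def intensity_reflection_coefficient(z1, z2):
    """Fraction of incident ultrasound intensity reflected at a flat interface (normal incidence)."""
    return ((z2 - z1) / (z2 + z1)) ** 2


# Approximate acoustic impedances in MRayl (assumed, rounded textbook values).
SOFT_TISSUE = 1.63
AIR = 0.0004
WATER_LIKE_FLUID = 1.48

tissue_to_air = intensity_reflection_coefficient(SOFT_TISSUE, AIR)
tissue_to_fluid = intensity_reflection_coefficient(SOFT_TISSUE, WATER_LIKE_FLUID)
print(round(tissue_to_air, 4), round(tissue_to_fluid, 4))
```

Under these assumed values almost all of the energy is reflected at a tissue-air boundary, whereas only a small fraction is reflected at a tissue-fluid boundary, so fluid-filled regions let the beam through and appear differently from aerated lung.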

Xing et al. (2022) used lung ultrasound (LUS) for the auxiliary detection of coronavirus. A LUS scoring device was built, which used a cascaded deep learning classifier. 18,330 images were collected from 26 COVID-19 patients, and two medical experts assigned scores to each image. In the first stage, 12,949 images were chosen for model training. During the second stage, three models (ResNet-50, VGG-19 and GoogLeNet) were combined to predict the results using a voting mechanism. The final model obtained an accuracy and F1-score of 96.1% each. This research indicated that LUS imaging has great scope in COVID-19 diagnosis. In another study, Born et al. (2021) used LUS imaging in preference to other modalities to diagnose COVID-19. The authors released the largest LUS dataset, which consisted of four classes (healthy controls, bacterial pneumonia, viral pneumonia and COVID-19). On the independent dataset, the model achieved a sensitivity and specificity of 81% and 96%, respectively. Dastider et al. (2021) used a CNN-LSTM based classifier to predict COVID-19 severity from LUS frames. A score from one to four was predicted for each image (one being mild and four being severe). The DenseNet-201 model was used, with separate convolutional branches and an autoencoder. LSTM layers were then appended, which resulted in higher accuracy. The model obtained an accuracy of 80%.

Chen et al. (2021) proposed a neural network model for detecting COVID-19 induced pneumonia by scoring the excessive build-up of fluid seen on ultrasound. It was observed that the reliability of lung ultrasound scores depended significantly on the physician's experience. Hence, to ensure high sensitivity, an automated system was developed in this study. 1527 lung ultrasound images were collected from 31 patients diagnosed with COVID-19. The images were processed with various techniques such as converting curves to linear equations, detecting the lung pleural lines, selecting the relevant region of interest and extracting features. A total of 28 features were extracted to simulate the B-line scores assigned by the physicians. Classification models such as support vector machine and decision trees were then deployed. Eventually, an accuracy of 87% was obtained, supporting the reliability of ultrasound scoring for detecting COVID-19 induced pneumonia.
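A voting mechanism of the kind used to combine several LUS classifiers can be sketched as a simple majority vote over per-image labels. The function name, the tie-breaking rule and the example predictions are assumptions for illustration and do not reproduce the scheme of any cited study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-image class labels from several models by majority vote.

    Ties are resolved in favour of the label that appears first in model order
    (Counter.most_common keeps insertion order for equal counts).
    """
    n_items = len(predictions_per_model[0])
    if any(len(preds) != n_items for preds in predictions_per_model):
        raise ValueError("every model must predict the same number of items")
    voted = []
    for i in range(n_items):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical severity labels (0 to 3) predicted by three models for four images.
model_a = [0, 1, 2, 3]
model_b = [0, 1, 1, 3]
model_c = [1, 1, 2, 2]
print(majority_vote([model_a, model_b, model_c]))
```

Hard voting like this needs only the predicted labels; averaging predicted probabilities is a common alternative when the models expose calibrated scores.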

In this comprehensive review, various modalities that diagnose COVID-19 with the help of AI were examined. A comparison of the diagnostic techniques is given in Table 5.

Table 5 Comparison of various modalities that diagnose COVID-19

Ever since the dawn of the pandemic, COVID-19 articles have been published continuously. It is therefore essential to compile the latest research, since new information about various aspects of the virus is being released worldwide. This state-of-the-art review focuses on the latest COVID-19 diagnostic articles that use AI to detect the virus accurately. Further, the article emphasizes the various modalities rather than different AI methodologies and algorithms. COVID-19 diagnosis using clinical markers and sound analysis has not been fully explored and has tremendous potential in the coming years. The review also looks into various articles that diagnose SARS-CoV-2 using MRIs and ultrasounds. These modalities have rarely been explored and presented in other review articles. The technology, time for diagnosis, patient category, performance, availability, advantages and disadvantages of all the modalities have been compared for in-depth analysis. This comprehensive review is intended for researchers from both medical and engineering backgrounds and helps them understand the recent trends in COVID-19 diagnosis using machine learning and deep learning.

4 Threats to validity

For any machine learning use case, it is essential to understand the two types of validity: internal and external. The obtained results must not be skewed, and the models must not be prone to underfitting or overfitting. The models must also work efficiently in real-world settings (in hospitals and medical facilities). Internal validity is the extent to which a study has effectively analysed its own situation (Sirmen and Üstündağ 2022). The ML models must be checked for accuracy, precision and recall, among other factors. Data balancing is also a critical issue, and the metrics used must be appropriate for the chosen problem. If the data are imbalanced, accuracy is not the right metric since it is biased towards the majority class. Similarly, accuracy alone can be misleading in multi-class problems. Evaluation metrics such as precision, sensitivity, AUC, F1-score, log loss, Cohen's kappa and others must be utilized. Internal validation is extremely important since classification results can be easily misinterpreted.
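The effect of class imbalance on accuracy can be shown with a small worked example. The sketch below assumes a binary problem with labels 1 (positive) and 0 (negative); the function name and the 5-in-100 prevalence are hypothetical.

```python
def imbalance_aware_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity), F1 and Cohen's kappa for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
    # Agreement expected by chance, from the marginal totals of the confusion matrix.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa}


# Hypothetical cohort: 5 positive and 95 negative patients.
labels = [1] * 5 + [0] * 95
always_negative = [0] * 100  # a useless model that never flags anyone
print(imbalance_aware_metrics(labels, always_negative))
```

The model that never predicts a positive case still reaches 95% accuracy, while recall, F1 and kappa all drop to zero, which is why these metrics are preferred for imbalanced medical data.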

External validity, on the other hand, concerns whether the findings of a particular study will be applicable in other situations (Austin et al. 2021). The COVID-19 models must be effective in handling datasets from various countries. The model evaluation must be independent of geographical and other external factors. Hardware and software requirements must also not be an issue since they vary between facilities. There are numerous reasons why a study’s findings may not apply to a new situation. The research population may differ from the new context’s population. The symptoms and treatments might differ between the study and the new context. Data scaling can also be a critical factor. For any research to be successful, it is crucial to maximize both internal and external validity.

5 Challenges and future directions

5.1 Challenges

This section discusses the various issues during the automated diagnosis of COVID-19 using AI.

  • Unavailability of large-scale datasets: a considerable amount of data is required for any ML algorithm to thrive. When the training sample is small, the result might be overfitting. Moreover, qualified medical specialists might be required to validate the medical data. However, large datasets can be collected from various hospitals all around the world with appropriate verification from doctors. This can make the models more trustworthy.

  • Noisy datasets: noisy or invalid data are a significant concern. A large amount of meaningless information is present in many public datasets, which reduces the accuracy of the model. Machine learning engineers can use appropriate data preprocessing techniques to remove duplicate and redundant data, making the remaining data more useful.

  • Insufficient knowledge in the combined disciplines of computing and medicine: most AI researchers have a computer science background. However, knowledge of several domains such as virology, bioinformatics, clinical biology and medical imaging is required. Experts from various disciplines must collaborate to conduct extensive research on COVID-19. Interdisciplinary research, in which medical researchers and machine learning engineers work together, can overcome this knowledge barrier.

  • Preserving data privacy: collecting individual data is inexpensive in the modern world, and many administrations want to gather a range of personal information such as phone numbers, IDs and patient history. This confidential information must be protected at all costs, and substantial effort must be made to ensure that the data are not leaked. Cryptography and blockchain techniques can be used for authenticity and confidentiality.

  • Unstructured data: working with ambiguous and erroneous data in unlabelled texts is difficult. A huge amount of data from numerous sources may be incorrect. Furthermore, too much data makes it challenging to extract useful information. Medical professionals can label the data appropriately before the dataset is released, and natural language processing techniques can be utilized to detect incorrect data.

  • Data from a single source: it is incredibly critical to acquire data from various geographical locations for extensive validation. However, heterogeneous data is not readily available. Hospitals from various locations can collaborate to release a universal dataset which contains details of patients from all around the world.
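As one possible illustration of the cryptographic protection mentioned under data privacy in the list above, patient identifiers can be replaced with keyed hashes before a dataset is shared. This is a minimal sketch under stated assumptions: the function name, the example identifiers and the key are hypothetical, and keyed hashing is only one of many techniques that could serve this purpose.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Replace an identifier with a keyed SHA-256 digest (HMAC); the key must stay private."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and patient identifiers, for illustration only.
site_key = b"replace-with-a-randomly-generated-secret"
token_one = pseudonymize("patient-0001", site_key)
token_two = pseudonymize("patient-0002", site_key)
print(token_one[:16], token_two[:16])
```

The same identifier always maps to the same token, so records can still be linked across tables, while someone without the key cannot recompute the mapping from a list of candidate identifiers.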

5.2 Directions for the future

This subsection discusses various ideas for future research.

  • Automatic diagnosis: the entire diagnosis process can be carried out remotely using several automation techniques. Unnecessary contact with radiologists and medical personnel can be avoided.

  • Medical validation: these algorithms can be utilized in a range of treatment centres in the near future after extensive validation by medical experts.

  • Combining multiple modalities: the models for different modalities can be combined to increase overall accuracy. The number of false-negative and other incorrect results can be drastically reduced.

  • Simulation: AI-based ML systems could be utilized in various virtual simulations. Various aspects related to COVID-19 diagnosis can be easily monitored.

6 Conclusion

The COVID-19 pandemic has had a devastating impact on people’s well-being across the globe. Technology is advancing at a rapid pace, especially in the fields of AI and data science. AI has already made a significant contribution to people’s struggle against the virus. This comprehensive review emphasizes several diagnostic procedures that use machine learning and deep learning techniques to detect the novel coronavirus. COVID-19 diagnosis using modalities such as CT scans, X-rays, clinical markers, voice-based detection, MRI and ultrasound is explored in depth. Further, all these techniques are compared to understand their advantages and disadvantages. Finally, the key issues and future scope of this research are provided for machine learning enthusiasts and medical researchers. This extensive review provides a complete analysis of the state-of-the-art research on COVID-19 diagnosis using AI for the entire health community and researchers.