Deep Learning and Federated Learning for Screening COVID-19: A Review

Since December 2019, a novel coronavirus disease (COVID-19) has infected millions of individuals. This paper conducts a thorough study of the use of deep learning (DL) and federated learning (FL) approaches to COVID-19 screening. To begin, an evaluation of research articles published between 1 January 2020 and 28 June 2023 is presented, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The review compares various datasets on medical imaging, including X-ray, computed tomography (CT) scans, and ultrasound images, in terms of the number of images, COVID-19 samples, and classes in the datasets. Following that, a description of existing DL algorithms applied to various datasets is offered. Additionally, a summary of recent work on FL for COVID-19 screening is provided. Efforts to improve the quality of FL models are comprehensively reviewed and objectively evaluated.


Introduction
The novel coronavirus disease (COVID-19) pandemic [1][2][3] began in December 2019, killing millions of people [4][5][6][7][8]. Table 1 compares different pandemics in terms of the number of deaths. COVID-19 causes symptoms such as fever, lung problems, and dyspnea [9]. In severe cases, patients need admission to intensive care units (ICUs) [10]. In many countries, COVID-19 has placed a massive burden on the healthcare system because the number of ICUs is limited. The World Health Organization (WHO) recommended maintaining social distancing and wearing face masks. From late December 2020, vaccines were administered to the general population. Since January 2023, the occurrence of COVID-19 has been steadily decreasing, allowing most nations to resume their pre-pandemic normal lives. However, new variants of COVID-19 may emerge in the near future, against which the existing vaccines may not fully protect us. We will continue to face a public health threat as a result of the evolution of the virus, its global dissemination, and persisting vulnerabilities within our communities. COVID-19 may therefore remain a global health threat, and predicting its spread is still important for planning medical facilities and other issues.

Pandemics      Number of Deaths
Spanish Flu    40-50 million

... systematic review process, reduce bias, and improve reporting quality, and consequently, popular in medical and healthcare research. We considered papers published from 1 January 2020 to 28 June 2023. Only English-language papers were considered; papers in other languages were excluded. Two Boolean operators, "AND" and "OR", were used to combine the necessary keywords. The keywords searched were coronavirus, COVID-19, artificial intelligence, deep learning, transfer learning, federated learning, X-ray, ultrasound imaging, and computed tomography. We ignored the clinical, epidemiological, and basic science aspects of COVID-19. We obtained a total of 11,700 papers on the application of DL and FL to COVID-19 from various publishers and preprints. The search criteria that resulted in 11,700 papers are shown below.
We manually excluded 11,260 of the 11,700 articles because they were outside the main research interest of this paper, and reviewed the remaining 440 articles. Of these, the 95 most relevant articles were cited in the paper and 72 were used for result synthesis. These 72 papers were selected because they reported a description of COVID-19 datasets and performance results of the application of DL and FL to the diagnosis of COVID-19 patients. Figure 1 illustrates the summary of the PRISMA technique applied in this paper.
The results of different algorithms depend on the datasets considered; hence, the performance of different DL algorithms reported in the literature needs a careful comparison. There are several survey or review papers [8,19,20,21,27,28,29,33,35] on COVID-19 that focus mainly on either ML/DL or FL. Table 2 summarizes the aspects that these existing papers cover and how our work differs from them and contributes to the survey on this topic. From Table 2, it can be seen that little work reviews DL models and FL techniques simultaneously in the context of COVID-19 detection. Because of the many recent studies on the application of DL and FL to combating COVID-19, there is a need for an up-to-date survey of the available datasets and the algorithms applied to these datasets. This paper is motivated by the need to present a literature survey on the data-driven diagnosis of COVID-19 using DL models and FL techniques. The main contributions of this paper are as follows: A systematic, detailed review is provided of different COVID-19 datasets and the DL algorithms that are applied to them. Moreover, the application of FL to COVID-19 screening is also discussed. The review was performed for research articles published from 1 January 2020 to 28 June 2023 using PRISMA guidelines. The remaining sections of this work are structured as follows: Section 2 discusses the datasets that are widely used for COVID-19 detection. Section 3 discusses several performance metrics. Section 4 describes the comparative performance of various DL algorithms in COVID-19 diagnosis. Section 5 provides a survey of FL for COVID-19. Section 6 discusses the new hybrid dataset, and Section 7 describes the existing challenges and future work. Finally, Section 8 provides concluding remarks.

Datasets for DL
... folders for training, testing, and validation. Furthermore, the dataset had two subfolders, one for pneumonia and the other for normal cases [36]. Another dataset of 319 radiographic X-ray images created by the authors of [37] is available on GitHub [38]. These images were a collection of COVID-19, MERS, SARS, and ARDS cases, of which 250 belonged to COVID-19-positive patients. A dataset termed COVIDx containing 104,009 CT images of 1489 patients is available on GitHub [39]. Furthermore, a dataset called COVNet data containing 4563 CT images of 3506 patients from six medical centres is available on GitHub [40]. A dataset of pneumonia chest X-ray images is available in [57]. A dataset of 306 X-ray images is available in [42]. This dataset was organized in training and testing portions and categorized into patients marked as normal, bacterial pneumonia, COVID-19, and viral pneumonia. A dataset of 219 X-ray images of COVID-19 patients is available on the Kaggle repository [44]. In a study [58], a dataset of 295 chest images was generated by combining two individual datasets. This dataset is available in [43] and contains images of normal as well as 53 viral and bacterial pneumonia patients. Another dataset contains X-ray images of normal cases, COVID-19 patients, SARS patients, Streptococcus, and Acute Respiratory Distress Syndrome (ARDS) [45]. The research work in [59] presents Twitter datasets [46,53] containing tweet IDs on COVID-19. A database of SIRM COVID-19 [48] contains 94 chest X-ray and 290 lung CT scan images of 71 COVID-19 patients as of 10 May 2020.

18. CO-IRv2, CT images [55]
19. X-ray images three levels, X-ray [56]

A dataset comprises 32 chest X-ray images from Radiopedia [49]. A study presented 103 chest images of 50 cases in Spain [50]. In the Radiology Society of North America (RSNA) database, X-ray images of patients having traditional pneumonia and people without lung infection were available [60]. CT image samples of COVID-19 patients were reported in another study [61]. The dataset contained images of lungs and noted the type of infection. A research group in Norway created two datasets from more than 60 suspected patients; the two datasets had 100 and 829 CT scan slices [51]. The research work in [61] showed that DL is effective in detecting COVID-19 in CT scan images using a dataset [52]. DL can be applied to heterogeneous datasets that include CT scan images of COVID-19 and non-COVID-19 cases [51]. Table 4 provides a comparison of different X-ray image datasets [62][63][64][65][66][67][68][69][70][71][72][73][74], while Table 5 compares the CT image datasets [75][76][77][78][79][80][81][82][83][84][85][86][87][88][89].

Table 5 (fragment). Classes of the dataset in [89]: no pneumonia, common pneumonia, and COVID-19.

Apart from the X-ray and CT images, there are datasets of other samples as well. Data from several medical imaging modalities can be combined to create multimodal imaging datasets for COVID-19 detection. Multimodal datasets combine the benefits of several imaging techniques and can provide a more comprehensive and accurate representation of the disease. For instance, a multimodal dataset can improve the model's robustness to changes in data quality or to missing information in a single modality. In creating multimodal datasets, preprocessing the images so that they have a consistent format, resolution, and orientation is a critical step. Table 6 briefly presents multimodal datasets [90,91] that were used for COVID-19 diagnosis. In addition, a dataset of ultrasound images is also shown. Table 7 shows a comparative summary of different preprocessing methods applied to different datasets of X-ray and CT scan images. The number of positive and negative samples in the dataset, and different data preprocessing techniques, for example, data segmentation, augmentation, rescaling, and normalization, are discussed. There are some clinical datasets, and there are also hybrid datasets created by the fusion of individual datasets. Classification is performed into different numbers of classes: binary, ternary, and quaternary. For example, studies [62,64,65,91,92,93,94,95,96] report datasets of X-ray images, while studies [97][98][99] focus on CT images. Some works [97,98] consider segmentation as data preprocessing and perform binary classification of COVID-19 and non-COVID-19 cases. A recent work [99] performs data augmentation and normalization as preprocessing, and then performs a binary classification of COVID-19 cases and conventional lung disease cases. The majority of the datasets described above consist of CT or X-ray images; however, some are multimodal. The datasets include ternary and quaternary classes in addition to binary data. The quality and diversity of the data, together with appropriate validation, are crucial components in creating trustworthy DL models for the datasets indicated above. These datasets are used by researchers to develop and improve DL algorithms that can diagnose COVID-19 effectively and accurately.

Performance Metrics Used for DL
Several performance metrics are considered when classifying and diagnosing COVID-19 [121][122][123][124]. These metrics are defined in terms of the numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). TP denotes positive patients that are correctly detected. FP denotes negative or normal patients that are incorrectly recognized as positive. TN denotes normal patients that are correctly recognized as negative, while FN denotes positive patients that are incorrectly recognized as normal. Classification accuracy is the fraction of all samples that are classified correctly, that is, normal samples as normal and abnormal samples as abnormal.
Recall, also known as sensitivity, is the ratio of correctly classified COVID-19-positive patients to the total number of patients who are actually positive (TP + FN). Specificity is the corresponding ratio for negative or normal cases: correctly classified negatives to all actual negatives (TN + FP). Precision, also known as positive predictive value (PPV), is the proportion of correctly identified positive instances among all predicted positive cases. The negative predictive value (NPV) is the ratio of correctly identified negative samples to all predicted negative samples. The F-measure is the harmonic mean of precision and recall. The area under the receiver operating characteristic curve (AUC) measures how well the model separates positive and negative instances across all decision thresholds. The mean absolute error (MAE) is the average of the absolute differences between the predicted and actual sample values.
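The definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than code from any of the reviewed studies; the function names and the example counts are assumptions chosen only to make the formulas concrete.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the screening metrics defined above from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Return NaN instead of raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    recall = ratio(tp, tp + fn)          # sensitivity
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)       # positive predictive value (PPV)
    npv = ratio(tn, tn + fn)             # negative predictive value
    f1 = ratio(2 * precision * recall, precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
mae = mean_absolute_error([1, 0, 1], [0.5, 0, 1])
```

AUC is omitted here because it requires ranked prediction scores rather than the four counts alone.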
The performance metrics mentioned above offer a quantitative way to assess the efficacy of DL models used to analyze imaging datasets for COVID-19 detection. The next section shows how these measures are used to assess DL models and to decide which one to deploy in practical situations.
Although many research papers have reported the use of DL for COVID-19, only the very recent studies [155][156][157][158][159][160][161][162][163][164][165] are described in the following, while the others are summarized in Table 8. A framework for detecting COVID-19 from X-ray images consists of a modified version of DenseNet-121, an image data loader for batch separation, a loss function, and a weighted random sampler for balanced training [155]. The framework [155] achieves a strong diagnostic performance with an accuracy of 99.81%. Another study [156] creates an automated system for categorizing COVID-19 cases into two groups, while addressing concerns such as the limited and unbalanced dataset and model overfitting. The proposed VGG-16-based DL technique [156] obtains 99.86% accuracy and 99.9% recall. A stacking ensemble model is created by combining the outputs of multiple pretrained models, namely multi-layer perceptron (MLP), recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU), for two datasets of COVID-19 symptoms [157]. The COVID-CheXNet system [158] uses contrast-limited adaptive histogram equalization to improve the contrast of the X-ray images and Butterworth bandpass filters to decrease the noise level. Using the weighted sum rule at the score level, the COVID-CheXNet scheme diagnoses COVID-19 patients with a detection accuracy of 99.99%, a sensitivity of 99.98%, and a specificity of 100% [158]. In one study [159], a CNN is employed to screen for COVID-19 using chest CT images and ResNet50, VGG16, VGG19, DenseNet121, InceptionV3, and Xception. In comparison with the other models, VGG16 is found to be 98% accurate for the cases considered [159]. The detection of COVID-19 and other pneumonia cases is proposed using a new DL-based model, COVIDWNet-GB [160], that is based on depth-wise dilated CNNs as well as feature reuse residual blocks. The proposed COVIDWNet-GB achieves a 96.81% accuracy on multi-class (COVID-19, Lung Opacity, Normal, and Viral Pneumonia) X-ray images [160]. An essential weights-only transfer learning method is reported for devices with low runtime resources [161]; it works by reducing the model's less important weight parameters. The empirical results show that the pruned ResNet34 model can achieve 95.47% accuracy, 0.9216 sensitivity, 0.9567 F1-score, and 0.9942 specificity on the CT-scan dataset with 20.64% fewer weight parameters [161]. One study [162] considers the DL models ResNet50, ResNet101, DenseNet121, DenseNet169, and InceptionV3. These models are reported to perform satisfactorily, with ResNet101 outperforming the others and achieving 96% accuracy [162]. The DL models VGG-19, ResNet-50, Inception v3, and Xception are optimized and compared after preprocessing the data [163]. The VGG-19 model, with fine-tuning, can detect COVID-19 with an accuracy of up to 94.17% for chest X-rays and 93% for CT scans [163]. A DL model termed SCovNet classifies COVID-19 when applied to 17,599 images, with an accuracy of nearly 99% and 98% on chest CT images and X-ray images, respectively [164]. Explainable AI is gaining interest in the field of DL. According to one recent study on explainable AI, an encoder-decoder-encoder architecture-based DL framework is reported to provide a thorough, high-resolution, visual explanation of the classification outcomes for a dataset with four classes [165].
Table 8 presents the comparative performance of different DL methods reported in the literature. The methods are compared in terms of a number of metrics: classification accuracy, sensitivity, precision, F1-score, AUC, and specificity. It can be seen that an excellent classification accuracy of 99.58% is obtained by the COVID-Lite algorithm. The recall value of COVID-Lite is also 99.58% [116]. The precision, F1-score, AUC, and specificity values of COVID-Lite are reported to be 100%, 99.79%, 99.34%, and 100%, respectively. The second-highest classification accuracy of 99.56% is reported for the Xception and ResNet 50V2 schemes [94]. In one study [125], the EDL-COVID algorithm is shown to have an accuracy of 99.05% with a recall value of 99.05%. The reported EDL-COVID scheme has an F1-score of 98.59% and a specificity of 99.60%. However, ResNet 50 [101], ResNet 101 [126], VGG 16, and VGG 19 [108] have recall values of 100%. Furthermore, ResNet 101 has an accuracy value of 98.75%, which is better than that of ResNet 50, VGG 16, and VGG 19. Among all the systems considered, the best AUC value of 99.40% is found for Inception V1, which has an accuracy of 95.78% [111]. On the other hand, the highest F1-score of 100% is achieved by ResNet 50 [101]. Hence, from Table 8, it can be seen that no single algorithm has the best value for every metric.
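The weighted sum rule at the score level, mentioned above for COVID-CheXNet [158], can be illustrated as follows. This is a minimal sketch under assumptions: it fuses the positive-class scores of two hypothetical models, and the weight of 0.6 and threshold of 0.5 are illustrative values, not settings reported in [158].

```python
def weighted_sum_fusion(scores_a, scores_b, weight_a=0.5, threshold=0.5):
    """Fuse two models' positive-class scores with a weighted sum.

    weight_a and threshold are illustrative assumptions, not values from [158].
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must score the same samples")
    # Score-level fusion: combine per-sample scores before thresholding.
    fused = [weight_a * a + (1.0 - weight_a) * b
             for a, b in zip(scores_a, scores_b)]
    # 1 = predicted COVID-19 positive, 0 = predicted negative.
    labels = [1 if score >= threshold else 0 for score in fused]
    return fused, labels


# Hypothetical scores from two models for three images.
fused_scores, fused_labels = weighted_sum_fusion(
    [0.9, 0.4, 0.2], [0.7, 0.7, 0.1], weight_a=0.6
)
```

In this toy example the second image is classified as positive only because the second model's higher score outweighs the first model's lower one, which is the kind of disagreement score-level fusion is meant to resolve.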
As shown above, there are several stand-alone DL models and hybrid ones that are applied to different imaging datasets for COVID-19 detection. Researchers combine the concepts of individual DL algorithms and create new frameworks. Some of these hybrid architectures are called CoroNet, CoroDet, EDL-COVID, MKs-ELM-DNN, COVIDWNet-GB, etc., in the literature. The main DL models that are applied for the data-driven diagnosis of COVID-19 are illustrated in Figure 2.

Survey on FL for COVID-19
This section explores the literature on the application of FL in the context of COVID-19.
Existing studies [133][134][135][136][137][138] indicate that DL models can be trained using FL in a distributed, private manner. This is crucial in situations where data collection and centralization are neither possible nor desirable because of privacy concerns, transmission bandwidth restrictions, or legal restrictions. FL safeguards organizations' data privacy by building a global DL model from local models, each of which is less accurate on its own [133][134][135][136][137][138]. Clients train their own models locally, while the server employs an incentive mechanism and aggregates the local models until convergence occurs.
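The server-side aggregation step described above can be sketched as a sample-weighted average of client parameters. The text does not name a specific aggregation rule, so the FedAvg-style weighting below is an assumption, as are the function name and the toy values; local training and the incentive mechanism are left out.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client parameter vectors into one global vector.

    Each client's contribution is weighted by its number of local samples
    (a FedAvg-style rule, assumed here for illustration).
    """
    total_samples = sum(client_sizes)
    n_params = len(client_weights[0])
    global_weights = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        share = size / total_samples
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


# Two hypothetical hospitals with 100 and 300 local images.
global_model = federated_average(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[100, 300],
)
```

Only the parameter vectors leave each client in this scheme; the raw images never do. In a full training loop this step would be repeated every round until the global model converges.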
The FL frameworks described in [133] are reported to collect data, train an intelligent model, and then distribute this model throughout the public network in a decentralized fashion. By utilizing FL, hospitals are able to keep their patient data private and simply share weights and gradients, while the data are distributed around the facilities via blockchain [133]. The authors of [133] propose the use of blockchain to authenticate data and FL to train the model globally while still protecting each organization's confidentiality. In particular, COVID-19 patients are identified using a capsule network segmentation and classification algorithm [133] that first deals with the heterogeneity of the data. The proposed federated capsule network is compared with the DL models VGG, ResNet, MobileNet, and DenseNet. The proposed federated blockchain-based capsule network achieves an accuracy of 98.68% when applied to CT images [133].
Using FL for COVID-19 data training and deployment is suggested in another work [134]. Well-known DL models, namely MobileNet, ResNet, ResNeXt, and COVIDNet, are reported [134] to be tested with and without the FL framework to see which performs better. Training with the FL framework and training without it are compared in trials. According to the results, ResNet performs well in training both with and without FL. With COVID-19 labels, ResNeXt performs the best. MobileNet has the smallest number of parameters. As a result, the research shows that ResNeXt and ResNet are the best models for identifying COVID-19 for the datasets considered in the study [134]. Communication efficiency and model correctness are not considered in the works in [133,134]. Model performance cannot be guaranteed with FL's default setting, which involves a high communication overhead for sending model updates. Medical diagnostic image analysis for COVID-19 detection is the subject of another research work [135], which presents a dynamic fusion-based FL approach with an emphasis on enhancing model performance and communication efficiency. An architecture for medical diagnostic image analysis based on dynamic fusion-based FL systems is first designed. The participating clients are dynamically selected according to the performance of their local models, and the model fusion is scheduled according to the participating clients' training time. This strategy is reported to be viable and to have superior model performance, communication efficiency, and fault tolerance compared with FL's default setting [135].
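The two decisions described for dynamic fusion-based FL, which clients take part and when fusion happens, might look like the sketch below. The exact rules used in [135] are not given here, so both rules are assumptions: a client is kept if its local accuracy is at least the current global accuracy, and the server waits for the slowest selected client up to a hypothetical `max_wait` limit. All names and values are illustrative.

```python
def select_clients(local_accuracy, global_accuracy):
    """Keep clients whose local model is at least as accurate as the
    current global model (assumed selection rule)."""
    return sorted(
        client for client, accuracy in local_accuracy.items()
        if accuracy >= global_accuracy
    )


def fusion_deadline(training_time, selected, max_wait):
    """Wait for the slowest selected client, but never longer than max_wait
    seconds (assumed scheduling rule)."""
    if not selected:
        return 0.0
    slowest = max(training_time[client] for client in selected)
    return min(slowest, max_wait)


# Hypothetical per-client validation accuracy and training time in seconds.
local_accuracy = {"hospital_a": 0.91, "hospital_b": 0.84, "hospital_c": 0.88}
training_time = {"hospital_a": 120.0, "hospital_b": 300.0, "hospital_c": 95.0}

selected = select_clients(local_accuracy, global_accuracy=0.86)
deadline = fusion_deadline(training_time, selected, max_wait=100.0)
```

Capping the wait is one simple way to obtain the fault tolerance mentioned above: a client that stalls cannot block the round indefinitely.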
The non-independent and identically distributed (non-IID) and imbalanced data distributions that naturally occur in the FL environment are investigated in one study [136]. The work [136] proposes two distinct FL model designs, VGG16 and ResNet50, and the studies show that the proposed models outperform centralized models. The FL approach in [136] may also be applicable beyond COVID-19 screening, to other medical imaging applications with massive, distributed, and privacy-sensitive data. Using an FL technique to detect COVID-19-related CT anomalies in patients from a global study has been shown to be feasible [137]. Three Hong Kong hospitals were used as training and testing facilities, while four more independent datasets were sourced from mainland China or Germany to ensure model generalizability. Longitudinal scans of COVID-19 patients in the hospital are used to test automated lesion burden estimation. FL algorithms are examined in order to construct a privacy-preserving AI model for COVID-19 medical image diagnostics. In the study, a CNN-based DL model is considered for training on the decentralized multicentre CT imaging data [137]. To deal with the unpredictability of the data and annotations, one paper [138] suggests a new federated semi-supervised learning technique. The performance disparity between training a model with one dataset and applying it to another is studied using a multi-national database made up of 1704 scans from three different countries. A total of 945 COVID-19 results are manually annotated by radiologists. Semi-supervised learning may lessen the burden of annotation in a distributed setting by avoiding the requirement for sensitive data sharing. The framework suggested in [138] is shown to be more effective than a fully supervised scenario with traditional data sharing rather than model weight sharing. FL has attracted the attention of researchers who work on topics such as training data clustering [139], multistage local training [140], and multitask learning [141]. Additionally, certain studies [142,143] are concerned with the design of incentive mechanisms to encourage clients to participate in FL tasks.
One study trains an FL model named electronic medical record chest X-ray AI model (EXAM) using data from 20 different institutions around the world, utilizing vital signs, laboratory results, and X-rays from symptomatic COVID-19 patients [144]. Using data from all participating sites, EXAM is reported to achieve an AUC greater than 0.92 in predicting outcomes 24 and 72 h after the patient first presented to the emergency room. Compared with models trained using only data from a single site, it improves the average AUC by 16% and increases generalizability by 38% on average. At the largest independent test site, EXAM is shown to have a sensitivity of 0.950 and a specificity of 0.882 for predicting the need for mechanical ventilation or death within 24 h. The FL in [144] ensures faster data science collaboration and the creation of a model capable of predicting clinical outcomes in patients with COVID-19. Based on 231 major papers, one study [145] examines FL systems from a software engineering standpoint. The full lifecycle of FL system development is covered in the data synthesis stage, which includes background understanding, requirement analysis, architectural design, implementation, and assessment. It is reported that benchmarking approaches are necessary to evaluate FL system development in real-world COVID-19 situations with rich datasets and representative workloads [145]. In a recent study, an FL framework is introduced for the interpretation of medical diagnostic images to separate COVID-19 from four different chest illnesses [166]. The proposed FL framework with the DenseNet-169 DL model [166] safeguards client privacy while separating COVID-19 from four chest diseases with an accuracy of 98.45%. The main aspects and the main outcomes of some of these prominent studies on FL-based COVID-19 detection are presented in Table 9. The findings of the above papers should encourage researchers to use FL to build a powerful model for COVID-19 screening. While several works have applied FL to COVID-19, the application of FL is still in its infancy, and numerous related challenges remain unresolved. Enhancing the quality of FL models is a current research focus and a difficult task.

Hybrid Dataset
In Section 2, several COVID-19 datasets are reviewed, most of which consist of either X-ray or CT scan images from a single source. In order to increase the number of samples, samples from different datasets can be combined to form a hybrid dataset. There are several ways to form hybrid datasets; as one instance, the formation of a new hybrid dataset is described below.
The new hybrid dataset is formed by the fusion of two individual datasets found at [51,146]. One of the datasets [51] consists of 929 images, of which 473 are COVID-19-positive samples and 456 are of normal patients. In [51], image segmentation is already done, and the images and their associated masks are separately available. The other dataset [146] has 2482 images, of which 1252 are of COVID-19 patients and 1230 are of patients who are not infected with the virus. The dataset in [146] has been collected from hospitals in Sao Paulo, Brazil, and these data samples do not have the masks separated. In order to combine the two datasets, the images of [146] are first segmented and then separated into the actual images and their masks. Next, the images and their associated masks for the two datasets are combined to form the hybrid dataset. There are a total of 3411 CT scans included in this dataset. There are 1726 CT scans that are positive for COVID-19. In contrast, there are 1685 normal CT scans, all of which test negative for SARS-CoV-2. The resultant dataset is available at [167].
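As a quick arithmetic check, the per-source counts quoted above can be tallied as follows. The dictionary keys are hypothetical labels for the two source datasets [51] and [146]. The tally reproduces the stated total of 3411 scans, while the per-class sums computed from the source counts come to 1725 and 1686, each one away from the 1726 and 1685 quoted for the hybrid dataset; the released dataset at [167] is presumably the reference for the final split.

```python
# Per-source counts as quoted in the text; the keys are hypothetical labels.
source_counts = {
    "dataset_51": {"covid": 473, "normal": 456},
    "dataset_146": {"covid": 1252, "normal": 1230},
}

# Per-source totals (929 and 2482 in the text).
source_totals = {
    name: counts["covid"] + counts["normal"]
    for name, counts in source_counts.items()
}

# Per-class totals across both sources, and the overall size of the fusion.
covid_total = sum(counts["covid"] for counts in source_counts.values())
normal_total = sum(counts["normal"] for counts in source_counts.values())
grand_total = covid_total + normal_total
```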
In this research, the images are processed in two steps. In the first step, all the original images are examined, and the minimum dimensions of the images are found to be 224 by 224 pixels. After that, all the images are resized to these minimum dimensions. Considering RGB images, the input dimension to CO-DenseNet is set as 224 × 224 × 3. In the second step, the images are preprocessed in the same way as the ImageNet dataset, which contains thousands of image classes and millions of image samples. The CO-DenseNet is trained using a pretrained network that is already trained on the ImageNet dataset. All the images are preprocessed by dividing each pixel intensity value by 255, subtracting the ImageNet mean, and dividing the result by the ImageNet standard deviation. After that, data augmentation is applied to increase the number of image samples and prevent overfitting of the model. As part of augmentation, rotation, horizontal flipping, vertical flipping, and zooming are performed, where the zoom range of 0.2 means zooming in and out by up to 20%.
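The normalization and flipping steps above can be sketched in plain Python. This is an illustration under assumptions: the per-channel mean and standard deviation are the values commonly used for ImageNet-pretrained networks, not values stated in the text, and images are represented as nested lists of RGB tuples. Resizing, rotation, and zooming are omitted.

```python
# Commonly used ImageNet channel statistics (RGB, on a 0-1 scale).
# Assumed here; the text does not state the exact values.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale one RGB pixel from 0-255 to 0-1, subtract the mean, divide by std."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def normalize_image(image):
    """Normalize an image given as rows of RGB pixels (height x width x 3)."""
    return [[normalize_pixel(pixel) for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return list(reversed(image))


# A hypothetical 2 x 2 RGB image.
tiny_image = [[(255, 0, 128), (0, 0, 0)],
              [(10, 20, 30), (255, 255, 255)]]
normalized = normalize_image(tiny_image)
flipped = horizontal_flip(tiny_image)
```

In practice an image library or the pretrained network's own preprocessing routine would perform these steps; the point here is only the order of operations.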
Given the scarcity of large COVID-19 datasets, hybrid datasets can aid in the generation of large samples and the appropriate evaluation of DL models.

Research Implications and Future Work
This section discusses the implications of this research work, existing challenges, and future scopes of work. This paper follows PRISMA guidelines to provide a systematic review of the application of DL methods in the data-driven diagnosis of COVID-19. There are several review papers [150][151][152][153][154] in the literature that focus on DL-based COVID-19 management; however, this paper considers the latest research work on DL and FL, followed by the introduction of a proposed new DL method to manage COVID-19. Since the study and research outcomes of COVID-19 are changing rapidly, the findings of our review work will need to be adjusted in the future. Issues related to the epidemiology, clinical aspects, treatment, and vaccines of COVID-19 are out of the scope of this review. The effectiveness of any classification algorithm varies depending on the dataset considered. Some datasets may have erroneous or incomplete labels that may lead to misleading results from the DL algorithm. Moreover, there is a lack of benchmark datasets of COVID-19 patients. The algorithms should be applied to a number of large and correctly labeled COVID-19 datasets.
Many countries apply social distancing, lockdown, and similar approaches to mitigate the spread of the virus. These measures also raise challenges regarding the data privacy of the public. In the future, governments may create formal regulations when providing guidelines on applying social distancing and ensuring data privacy. AI, IoT, and wireless sensor networks can be used together to examine the presence of the virus in the environment. AI can be used to trace the spread of the virus, identify susceptible patients, and control the infection to some extent.
Moreover, AI can also predict the possibility of death cases based on the data of previous cases. DL methods can contribute to devising treatment plans and vaccines for COVID-19. DL can be used along with image processing techniques to record the severity of chest and lung lesions and to measure the shape, length, and comparative changes in the lesions. The examination of the lesions may assist medical practitioners in making quick and efficient decisions regarding the stage of the disease. For effective data-driven diagnosis using DL, large and reliable datasets on COVID-19 should be assembled from different hospitals. To provide a more thorough understanding of the disease, multimodal approaches should combine various modalities, such as medical images (for example, X-rays and CT scans as multimodal datasets), clinical data (for example, symptoms and patient history), and laboratory tests (for example, PCR results). AI, along with other emerging technologies such as blockchain and the Internet of Drone Things [168], may be able to contribute to the management of pandemics.
In the future, there should be strong collaboration among experts in different disciplines, such as medicine, biology, image processing, and computer science. This collaboration will greatly help in devising new ways to fight COVID-19.

Conclusions
This research examines the use of DL approaches in the fight against COVID-19. To begin, several X-ray, CT scan, and ultrasound image collections are described. The description lists the overall number of images, the number of positive COVID-19 samples, and the classes contained within each dataset. Following that, a comparative description of several preprocessing methods applied to various datasets is presented, where the preprocessing techniques include data normalization, segmentation, rescaling, and augmentation. This review also shows that DL approaches such as CNN, VGG-16, ResNet, VGG-19, COVIDNet, and hybrid neural networks may successfully diagnose COVID-19 when employed on X-ray and CT scan images. The review also covers the most recent advances in FL research and application, as well as a comparison of the effects of FL and non-FL approaches, with a particular emphasis on the use of FL in the diagnosis of COVID-19. In the future, studies should be carried out to collect high-quality X-ray and CT images for a significant number of suspected patients. To improve COVID-19 identification in realistic scenarios, new DL algorithms should be developed.

Figure 2. Illustration showing the DL models that are applied for COVID-19 diagnosis.

Table 1. Comparison of fatalities of selected pandemics.

Table 2. Summary of some of the existing survey/review works on COVID-19.

Table 4. Comparison of different X-ray datasets.

Table 5. Comparison of different CT datasets.

Table 6. Comparison of different ultrasound and multimodal datasets.

Table 7. Comparative summary of the datasets and DL algorithms reported in the literature.

Table 8. Performance results of different DL algorithms.

Table 9. Summary of different FL methods for COVID-19 diagnosis.