Applications and Challenges of Federated Learning Paradigm in the Big Data Era with Special Emphasis on COVID-19

Abstract: Federated learning (FL) is one of the leading paradigms of modern times with higher privacy guarantees than any other digital solution. Since its inception in 2016, FL has been rigorously investigated from multiple perspectives, including extensions of FL's applications to different sectors, communication overheads, statistical heterogeneity problems, client dropout issues, the legitimacy of FL system results, and privacy preservation. Recently, FL has been increasingly used in the medical domain for multiple purposes, and many successful applications exist that are serving mankind in various ways. In this work, we describe the novel applications and challenges of the FL paradigm with special emphasis on the COVID-19 pandemic. We describe the synergies of FL with other emerging technologies to accomplish multiple services to fight the COVID-19 pandemic. We analyze the recent open-source developments of FL, which can help in designing scalable and reliable FL models. Lastly, we suggest valuable recommendations to enhance the technical persuasiveness of the FL paradigm. To the best of the authors' knowledge, this is the first work that highlights the efficacy of FL in the era of COVID-19. The analysis enclosed in this article can pave the way for understanding the technical efficacy of FL in the medical field, specifically COVID-19.


Introduction
In recent years, artificial intelligence (AI) techniques have been highly successful in assisting mankind in various ways, such as improved healthcare, ambient assisted living, smart services, and awareness/forecasting of future events. Three major elements have significantly contributed to the success of AI developments in real-life scenarios: (i) the availability of big data stemming from diverse sources, (ii) advancements in newer learning models as well as computational power, and (iii) the evolution of deep learning (DL) models and high-performance computing infrastructures [1]. FL has various market use cases and commercial applications focusing on data science, healthcare, industry, and education [2]. Despite their many benefits, AI techniques face multiple challenges due to poor-quality and unstructured data, the non-availability of data for certain tasks, and/or the inability to handle and process data originating from the real-time domain. Although AI has been highly successful and many remarkable developments exist worldwide, most domains (e.g., real-time and personal data-driven applications) are still not in a position to leverage AI techniques commercially due to the following major concerns:
• Users are highly concerned about their data privacy, and therefore, acquiring and using personal data is very challenging.
• The confidentiality of personal data (also known as users' data) can be compromised because the data are mostly collected in some central place (e.g., a server) for central learning (CL).
• Most processing in CL-based environments is performed in a black-box manner. Hence, privacy violations cannot be restricted.
related FL developments, which remained unexplored in the current literature. In addition, none of the previous surveys have highlighted the synergies of FL with other emerging technologies to fulfill its promises in the context of COVID-19.
The major contributions of this paper are given as follows:
• A review of the applications of FL to COVID-19.
To the best of the authors' knowledge, this is the first work centering on FL with regard to COVID-19, and we believe this could pave the way to understanding FL's role in the COVID-19 era.
The rest of this paper is structured as follows: Section 2 discusses the background of the FL paradigm, including its working methodology (e.g., client and server responsibilities), main types, and emerging research areas. Section 3 presents technical applications of FL in the context of COVID-19. Section 4 presents the latest synergies of FL with other emerging technologies to enhance the privacy level as well as the application horizon of FL technology. Section 5 discusses the open-source implementations of FL with a special focus on medical-related developments. Section 6 highlights the challenges of FL in modern times and suggests valuable recommendations to address those challenges. Section 7 compares this work with the existing works. We conclude this paper in Section 8.

Background of Federated Learning
In this section, we describe the background of FL. Specifically, we describe the working mechanism of FL, its types, and the hot research areas centering on the FL paradigm.

Federated Learning: A State-of-the-Art (SOTA) Development for Privacy-Preservation
A typical FL paradigm includes one central server denoted by S, N clients/participants, and a training protocol/algorithm that works in multiple rounds. The FL paradigm does not centralize data, and therefore, it is a privacy-preserving solution. Because raw data stay with the clients, the FL paradigm is regarded as legally compliant, and the rigorous application of privacy regulations is not needed. The FL paradigm is the solution for data islands/silos and the data winter problem (https://redasci.org/, accessed on 10 August 2022). Figure 1a presents a high-level overview of the FL paradigm. There are M rounds in the FL paradigm. As shown in Figure 1a, in each round, the performance of the global model, ∆W, is checked, and the model is updated and shared among the participating entities (e.g., clients and S). The FL process is repeated until some defined criterion/condition is met. Figure 1b illustrates the detailed procedure of one round in the FL paradigm.
In Figure 1b (a), clients obtain a global model update (e.g., ∆W) from the central server. Afterward, each client computes a local model weight independently based on the local data in Figure 1b (b). Subsequently, in Figure 1b (c), all local updates of each client are sent back to the central server for joint analysis (i.e., aggregation). The aggregator function, F, employed at the server side for federated averaging at epoch/time t is given in Equation (2):

$$F(t) = \frac{1}{N} \sum_{i=1}^{N} \Delta W_i^t \tag{2}$$

where N denotes the total number of clients, F(t) is the global weight at time/epoch t, and $\Delta W_i^t$ represents the gradient update for client i at epoch t. In the FL paradigm, clients and servers carry out a variety of functions to accomplish collaborative learning tasks. The generic overview of tasks performed by each participating entity is given in Table 1. In some cases, the number of tasks can vary depending upon the target domain/scenario.

Table 1. Functions/activities performed by clients and server in the FL paradigm.

Clients: (i) Obtaining parameters from the central server. (ii) Training the AI model with parameters obtained from the central server and local data. (iii) Uploading local gradients to the server for aggregation.
Server: (i) Sharing parameters with all participants/clients. (ii) Acquiring local gradients from all participants. (iii) Computing the aggregated global model (F) utilizing local gradients. (iv) Updating model parameters with the newly aggregated model in each t. (v) Filtering malicious gradients/updates using anomaly detection or any other method.
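The aggregation step performed by the server can be illustrated with a minimal sketch. It assumes that every client contributes equally (an unweighted mean of the client updates, which is one common reading of federated averaging) and that an update is a flat list of numbers that the server adds to the current global weights. The function names `federated_average` and `run_round` and the `server_lr` parameter are hypothetical and are not taken from any FL framework.

```python
from typing import List


def federated_average(client_updates: List[List[float]]) -> List[float]:
    """Unweighted mean of the client updates: (1/N) * sum over clients."""
    if not client_updates:
        raise ValueError("at least one client update is required")
    n_clients = len(client_updates)
    dim = len(client_updates[0])
    if any(len(update) != dim for update in client_updates):
        raise ValueError("all client updates must have the same length")
    return [sum(update[j] for update in client_updates) / n_clients
            for j in range(dim)]


def run_round(global_weights: List[float],
              client_updates: List[List[float]],
              server_lr: float = 1.0) -> List[float]:
    """One server-side round: aggregate, then apply the aggregate.

    Assumption: updates are deltas that are added to the global weights.
    """
    aggregate = federated_average(client_updates)
    return [w + server_lr * a for w, a in zip(global_weights, aggregate)]


if __name__ == "__main__":
    updates = [[1.0, 2.0], [3.0, 4.0]]      # two clients, two parameters
    print(run_round([0.0, 0.0], updates))
```

In practice, many implementations weight each client by its number of local samples; the unweighted form is used here only because the surrounding text defines the aggregate in terms of N clients.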
The data on which AI models are trained on the client side can be independent and identically distributed (i.i.d.) or non-i.i.d., depending upon the scenario. The latter case is more challenging as it can slow model convergence and result in poor accuracy.
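The difference between the two cases can be made concrete with a small sketch. It assumes a labeled dataset and simulates non-i.i.d. clients through label skew (sorting by label and cutting contiguous shards), which is only one of several ways heterogeneity can arise. The helper names are hypothetical.

```python
import random
from collections import Counter


def iid_partition(labels, n_clients, seed=0):
    """Shuffle sample indices and deal them out evenly (i.i.d. clients)."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    return [indices[c::n_clients] for c in range(n_clients)]


def label_skew_partition(labels, n_clients):
    """Sort indices by label and cut contiguous shards (non-i.i.d. clients)."""
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    n = len(indices)
    return [indices[c * n // n_clients:(c + 1) * n // n_clients]
            for c in range(n_clients)]


def label_histogram(labels, client_indices):
    """Count how many samples of each label a client holds."""
    return dict(Counter(labels[i] for i in client_indices))


if __name__ == "__main__":
    labels = [0] * 6 + [1] * 6              # toy binary labels
    for part in label_skew_partition(labels, 2):
        print("skewed client:", label_histogram(labels, part))
    for part in iid_partition(labels, 2):
        print("i.i.d. client:", label_histogram(labels, part))
```

Under label skew, each client sees only part of the label space, so its local update pulls the global model toward its own classes; this is the mechanism behind the slower convergence mentioned above.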

Classification/Types of the Federated Learning Paradigm
As shown in Figure 2, FL approaches can be classified into three main categories, namely, horizontal FL, vertical FL, and federated transfer learning (FTL) [36]. Horizontal FL applies to cases where the datasets of different clients are identical in the feature space but not identical in the sample space. For instance, the datasets originating from different hospitals can share the same feature space, i.e., the patients' information, but differ in the sample space, i.e., the data come from different patients. In vertical FL, the clients hold data with an identical sample space but a non-identical feature space. An example of vertical FL is the bank statements and the online shopping histories of the same group of users. FTL applies to multiple datasets that are non-identical with regard to both the sample space and the feature space.
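The horizontal and vertical settings can be illustrated on a toy table of records. The sketch below assumes each record is a dictionary with a shared identifier; the field names (`id`, `age`, `balance`, `purchases`) and both helper functions are hypothetical.

```python
def horizontal_split(records, n_clients):
    """Horizontal FL: each client holds different rows with all columns."""
    return [records[c::n_clients] for c in range(n_clients)]


def vertical_split(records, feature_groups, id_key="id"):
    """Vertical FL: each client holds all rows but only its own columns.

    The shared identifier is kept so that samples can be aligned later.
    """
    return [
        [{id_key: r[id_key], **{f: r[f] for f in group}} for r in records]
        for group in feature_groups
    ]


if __name__ == "__main__":
    records = [
        {"id": 1, "age": 30, "balance": 100.0, "purchases": 3},
        {"id": 2, "age": 41, "balance": 250.0, "purchases": 7},
        {"id": 3, "age": 25, "balance": 80.0, "purchases": 1},
        {"id": 4, "age": 58, "balance": 900.0, "purchases": 4},
    ]
    hospital_a, hospital_b = horizontal_split(records, 2)
    bank, shop = vertical_split(records, [["age", "balance"], ["purchases"]])
    print(hospital_a)   # rows 1 and 3, all features
    print(shop)         # all rows, only id and purchases
```

FTL corresponds to the remaining case, in which neither the rows nor the columns of the clients' tables coincide.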

Focus of Recent Studies on FL Paradigm
Research on the FL paradigm is underway from multiple perspectives. Figure 3 describes the hot research areas that are under investigation to enhance the efficacy of the FL paradigm.
In addition to the key research areas mentioned in Figure 3, FL has been extensively investigated from multiple perspectives (e.g., hyper-parameters optimization, personalized FL, attacks on FL systems) [37]. The compact overview of key areas can enable researchers to choose a niche area for further research. The rest of this paper highlights FL's significance/use in the COVID-19 context.

Technical Applications of Federated Learning in the Context of COVID-19
FL has demonstrated its effectiveness in many sectors, including supply chains, robotics, finance, smart cities, smart healthcare, natural language processing and modeling, the insurance sector, social networks, and the IoT, to name a few. In this paper, our focus is on healthcare and especially COVID-19, and therefore, we summarize the achievements of FL in the healthcare sector only. Before presenting the detailed applications of FL in the COVID-19 era, we present the overall applications of FL in the healthcare domain in Figure 4. As shown in Figure 4, FL has been contributing significantly to the healthcare sector with diverse applications. The inputs to these applications are patient data in the form of electronic health records (EHR), data from wearables, sensor readings, demographics, vital signs, images, X-rays, medical histories, and visuals, audio, and videos of various body organs. FL trains high-quality models for neurological (and other) disease diagnoses. Recently, FL and other AI-based developments have been used to assist doctors in performing various activities in hospitals. In the coming years, FL is expected to be one of the mainstream technologies in performing various operations/services. Some studies explored the use of FL in tumor identification using ultrasound images and compared the FL architecture with traditional AI architectures [39]. In this analysis, FL proved more effective than the traditional ML/DL-based training architectures. Some studies have explored the usage of FL in hearing aids, survival prediction in patients with lung cancer, and confidentiality-aware data processing [40][41][42]. Ngo et al. [43] developed a SOTA approach by combining DL and FL for diagnosing cerebellar ataxia (CA) using image data. The proposed approach yields higher diagnosis accuracy without feature engineering and ensures data privacy in real-life deployable scenarios. Islam et al. [44] proposed an FL-based secure data-collection method from IoT devices using drones and blockchain. The proposed approach yields better results in proof-of-concept experiments, highlighting multiple benefits in terms of data collection, storage, privacy preservation, security, and execution time.
Similarly, FL has also contributed to lowering the effects of this pandemic on the general public when vaccines were unavailable. At present, there are various commercial deployments of FL to control COVID-19 using a variety of data sources. FL can work with many digital technologies, and therefore, the application/use of FL is more dominant than that of traditional AI techniques. Through a detailed analysis of the SOTA published in the past three years, we summarize practical examples of FL in Table 2. As shown in Table 2, FL has many practical applications in the context of COVID-19. These applications have helped many entities in lowering the severe effects of COVID-19. Further information about the applications of FL in medical fields can be found in previous surveys [81][82][83][84]. Based on the extensive analysis, we found that most FL applications in the COVID-19 context concern detection, prediction, diagnosis, and forecasting. In addition, the most commonly used data types are X-rays, images, and data from wearables. From the AI model's point of view, CNN and common ML models were frequently used in experiments. This knowledge can assist researchers in customizing existing developments as well as proposing new models for enhancing accuracy, precision, recall, F1 score, etc. Apart from the data sources mentioned in Table 2, some FL applications have also used signal data to improve medical services focusing on COVID-19 [85][86][87]. The implementation of FL with these heterogeneous data sources helped in constraining the spread of the virus in a privacy-preserving way.
In addition to FL, many other digital technologies have also contributed to lowering the effects of the pandemic on the general public. In Figure 5, we summarize the key technologies that have helped mankind mostly in the pre-vaccine era.
As shown in Figure 5, many technical developments have been made across the globe to combat the virus. In addition, the developments of contactless services have also boosted AI-related developments across the globe. In the post-COVID-19 era, more disruptive technologies will further reshape the industry.

Recent Synergies of Federated Learning with Other Emerging Technologies in the Context of COVID-19
Due to its distributed nature, the invisibility of training data, and untrustworthy clients' behavior, FL could not unleash much of its potential [89]. For example, FL failed to fully protect training data from adversaries because gradient/parameter sharing can sometimes weaken the privacy of participants. Similarly, due to the open nature of training, any party (including malicious entities) can join the system and corrupt the training process with either wrong data or wrong models. Furthermore, FL cannot guarantee that the participants of the initial rounds will remain until the end of the training process. To overcome these challenges, FL has been extensively integrated with other emerging technologies. For example, to protect the privacy of training data, FL has been integrated with differential privacy [90]. To further protect personal data in industrial settings, FL has established synergy with the blockchain [91,92]. In Table 3, we highlight the main synergies of FL with other emerging technologies in the context of COVID-19.

FL + B5G + UAVs: data collection in a privacy-preserving manner (Nasser et al. [110])
FL + Computational intelligence (CI): enhancement of data quality and equality in CI (Peyvandi et al. [111])
FL + Case-based reasoning: solving concept drift issues in healthcare (Jaiswal et al. [112])
FL + CFmMIMO: improving convergence speed (Vu et al. [113])
FL + SMC (secure multi-party computation): preventing leakage of sensitive information in local models (Li et al. [114])

Apart from the analysis presented in Table 3, some recent surveys have highlighted the synergies in one or more aspects of the FL paradigm [115,116]. Furthermore, the synergies of FL are increasing day by day to improve various technical aspects of this technology [117,118]. These synergies have also extended the applicability of this technology to many commercial and industrial sectors.
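The integration of FL with differential privacy mentioned above is often realized by clipping each client update and adding random noise before it leaves the device. The following minimal sketch assumes the Gaussian mechanism with L2 clipping; it does not compute the resulting privacy budget, and the function names and the `noise_multiplier` parameter are hypothetical rather than taken from the cited works.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize_update(update, max_norm, noise_multiplier, rng):
    """Clip the update, then add Gaussian noise to every coordinate.

    Assumption: the noise standard deviation is noise_multiplier * max_norm.
    Calibrating it to a formal (epsilon, delta) budget is out of scope here.
    """
    clipped = clip_update(update, max_norm)
    sigma = noise_multiplier * max_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]


if __name__ == "__main__":
    rng = random.Random(42)                  # fixed seed for repeatability
    print(privatize_update([3.0, 4.0], max_norm=1.0,
                           noise_multiplier=0.5, rng=rng))
```

Clipping bounds the influence of any single client, and the noise hides the exact contribution; a larger `noise_multiplier` gives stronger protection at the cost of model accuracy.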
Furthermore, some of these integrations are made to lower the communication and computation overheads of this technology [119]. In addition, some integrations are improving the privacy aspects of this technology [120]. In the coming years, the synergies of this paradigm with emerging technologies are likely to expand to advance its capabilities.
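One privacy-oriented integration, secure aggregation in the spirit of secure multi-party computation, can be sketched with pairwise masks that cancel in the sum. The sketch assumes integer (quantized) updates, arithmetic modulo 2^32, and that each pair of clients already shares a secret seed; Python's `random` module stands in for a cryptographic generator and must not be used for real deployments. All names are hypothetical.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def mask_updates(updates, session_seed=0):
    """Add pairwise masks so that individual updates are hidden.

    For every pair (i, j) with i < j, client i adds a mask and client j
    subtracts the same mask, so all masks cancel in the aggregate.
    """
    n_clients = len(updates)
    dim = len(updates[0])
    masked = [[v % MODULUS for v in update] for update in updates]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            # Stand-in for a secret shared by clients i and j (not secure).
            pair_rng = random.Random(f"{session_seed}:{i}:{j}")
            for k in range(dim):
                mask = pair_rng.randrange(MODULUS)
                masked[i][k] = (masked[i][k] + mask) % MODULUS
                masked[j][k] = (masked[j][k] - mask) % MODULUS
    return masked


def aggregate_masked(masked_updates):
    """Server side: sum the masked vectors; the masks cancel modulo MODULUS."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]


if __name__ == "__main__":
    quantized = [[1, 2], [3, 4], [5, 6]]     # three clients, integer updates
    print(aggregate_masked(mask_updates(quantized)))
```

The server learns only the sum of the updates. This simple form breaks if a client drops out after masking, which is why practical protocols add a recovery step for the missing masks.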

Open Source Implementation Frameworks of Federated Learning
In this section, we discuss the open-source implementations of the FL paradigm that have been experimentally tested on some real-world datasets. Although many open-source frameworks have been developed in the recent past, we present only the main frameworks that are accessible for rapid validation and experimentation in Figure 6. Most frameworks listed in Figure 6 can work with any dataset, but only a few provide robust support against attacks (i.e., Privacy FL). The tutorials and documentation of most frameworks are incomplete/partial, except for OpenFL and PySyft. Only a few frameworks provide support for other libraries and data partitioning. Most frameworks run on traditional CPUs, and only a few can run on large-scale hardware such as graphics processing units (GPUs). In addition to these open-source developments, some proprietary frameworks such as IBM FL [121], Substra [122], and NVIDIA CLARA [123] have also been developed, which are not yet publicly available for rapid testing and validation. Moreover, there exists an open-source implementation of FL for some other emerging technologies (e.g., the IoT) [124]. Interestingly, only two frameworks (i.e., OpenFL and Fed-BioMed) provide support for medical applications. By using the FL frameworks listed in Figure 6, the possible risks of exposing patients' sensitive health-related information can be reduced. In addition, the FL strategy enhances the training performance on medical data by exploiting large-scale datasets and offloading most processing to the local devices in a network, which would not be possible with the centralized AI technique. To provide technical information concerning the two medical-related frameworks, we compare both frameworks on technical grounds in Table 4. The analysis presented in Table 4 can help to understand and further improve the implementation of these frameworks.
Lastly, these implementations have been used as the baseline in most studies and have been rigorously enhanced.

Challenges and Recommendations
When it comes to the actual deployment of FL in real-life healthcare settings, there exist multiple challenges. Although some challenges have been described in the previous research, a clear picture from all perspectives is still missing. In this work, we highlight most challenges of the FL paradigm and suggest valuable recommendations to address those challenges. As shown in Figure 7, we have categorized these FL challenges into nine main categories, which remained unexplored in the current literature.
Apart from the challenges cited in Figure 7, explainability, transparency, and fairness are also the main challenges of FL in the context of healthcare [125,126]. We will present each of these challenges in detail in the following paragraphs.
Client-related challenges: In the FL paradigm, clients are regarded as independent, which means they can perform most activities autonomously. Hence, they can leave the system at any time, which can prolong convergence and disturb the training process [127]. The prevention of client dropout is a longstanding challenge in FL. In addition, the selection of clients who can contribute good models/data is also a non-trivial task. In some cases, clients collude with each other (forming bots) to carry out malicious activities, which makes FL results unreliable or corrupts the training process. In addition, some clients hold back their data/models and thereby delay convergence. All these client-related challenges can degrade the performance of the FL paradigm.
Server-related challenges: In the FL paradigm, the server is responsible for orchestrating the local models, aggregating models, and sharing the global model. However, in some cases, multiple attacks can be executed on the server by adversaries, which makes the FL system untrustworthy. Since the server handles only the model weights without deep inspection, it cannot filter malicious clients, which degrades the performance of the FL paradigm. In some cases, gradients/parameters are exposed to adversaries during aggregation. All these server-related challenges can degrade the performance of the FL paradigm.
Training-data-related challenges: In the FL paradigm, training data are the most important element because the quality of FL models depends on the training data. There exist multiple challenges with regard to the quality of data. In addition, the privacy of the training data is one of the hot challenges in the FL paradigm [128]. Recently, the non-i.i.d. nature of the training data has posed various technical challenges in the FL paradigm, and solving them has become more urgent than ever. In addition, guaranteeing the quality of data and preventing poisoned data from entering the paradigm is also one of the main challenges [129]. To truly benefit from the potential of FL, training-data-related challenges need robust solutions.
Poisoning-attacks-related challenges: In the FL paradigm, two main attacks make the results of an FL system unreliable: data poisoning and model poisoning. In the former attack, wrong data are used in training the local model. In the latter attack, wrong local models are sent to the central server [130,131]. Both attacks have been investigated to enhance the trustworthiness of FL results. Furthermore, many strategies, including some that compromise privacy, have been suggested to eliminate these attacks [132]. To truly benefit from the potential of FL, both challenges need robust solutions from the research community.
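The effect of model poisoning on plain averaging, and one possible mitigation, can be illustrated with a small sketch. It swaps the mean for a coordinate-wise median, a well-known robust aggregator that is used here purely as an illustration and is not claimed to be the method of the cited studies. The function names and the toy values are hypothetical.

```python
from statistics import median


def mean_aggregate(updates):
    """Plain averaging: one extreme update can move every coordinate."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def median_aggregate(updates):
    """Coordinate-wise median: tolerates a minority of extreme updates."""
    return [median(column) for column in zip(*updates)]


if __name__ == "__main__":
    honest = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [1.1, 0.9]]
    poisoned = honest + [[100.0, -100.0]]    # one malicious local model
    print("mean:  ", mean_aggregate(poisoned))
    print("median:", median_aggregate(poisoned))
```

With four honest clients and one attacker, the mean is dragged far from the honest values, while the median stays within their range. The median protects only while attackers remain a minority, and it can discard useful information when honest clients hold non-i.i.d. data.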
COVID-19-related challenges: In the COVID-19 era, due to the rapid rise in the amount of data, processing large and high-dimensional datasets has become challenging [133]. Furthermore, due to privacy concerns, good-quality data cannot be obtained easily. In these circumstances, FL can contribute toward resolving the data winter problem. However, the lack of a well-defined method for deploying FL methods in real life hinders the progress of AI-related methods. In addition, privacy issues such as data reconstruction make the deployment of FL very hard. Furthermore, identifying clients that can contribute good data in the FL paradigm remains challenging. Processing heterogeneous sources of data and deriving knowledge from them is also very challenging. Lastly, studying all dynamics of COVID-19 is still challenging because good-quality data for some aspects of this pandemic are not available for research purposes.
Apart from the challenges discussed above, handling inference and training time vulnerabilities in the FL paradigm is also very challenging. Luo et al. [134] discussed the possibility of inference attacks on FL systems through which potential privacy leakages can occur in real-life scenarios. Through this approach, the authors highlighted the need to preserve the privacy of prediction outputs in the vertical FL. Qiu et al. [135] highlighted the possibility of relation leakage and node leakage, leading to severe privacy breaches from graph data in vertical FL. Ha et al. [136] highlighted the possibility of inference attacks on the client side in FL systems using the generative adversarial networks (GANs) model. The authors have shown that some DL models can learn "unintended" features that can expose personal information to adversarial participants/clients. Rassouli et al. [137] have shown that in FL systems, an adversary can perfectly reconstruct a substantial number of features when the number of predictions is large enough. These kinds of data reconstruction attacks enable full training data disclosure in most cases. Zhang et al. [138] proposed a GAN-enhanced method for launching a membership inference attack in FL systems. The authors achieved a 98% attack accuracy and identified two main reasons (i.e., diversity in training data and overfitted FL models) for the success of such attacks. To address these inference attacks, many defense strategies have also been developed [139][140][141]. Further information about inference attacks and their corresponding defense can be learned from a recent study [142]. Recently, security and privacy issues have been rigorously investigated by many researchers [143]. 
In the future, more defense mechanisms will be needed to provide a solid defense against many emerging inference attacks (e.g., feature detection, extraction, feature disclosure, label disclosure, data reconstruction, membership inference, unintended features, feature information, etc.). Recently, addressing statistical heterogeneity in training data across clients/devices has also become one of the hot challenges in the FL paradigm [144]. Concept drift makes the FL learning process more complicated because of the higher inconsistency between existing and upcoming data. Traditional concept-drift-handling techniques (e.g., chunk-based and ensemble-learning-based) are unsuitable for FL frameworks due to the heterogeneity of local devices [145]. Similarly, handling some data types, such as genome data, in FL environments poses various challenges [146]. Robust solutions are needed to address all of the above-mentioned challenges. In Table 5, we propose technical recommendations to address these challenges, based on an analysis of the existing open-source developments as well as a detailed synthesis of the published literature. The detailed guidelines presented in Table 5 can contribute to enhancing the technical effectiveness of this recent paradigm.

Clients: Development of multi-criteria (i.e., activeness, data quality, computing resources, etc.) incentive mechanisms to retain potential clients.
Server: Implementation of anomaly detection algorithms for filtering malicious clients/local models/updates.
Training data: (i) Analyzing the distributions of data concerning balance and adding synthetic samples for minority classes. (ii) Implementation of privacy solutions such as differential privacy or encryption for securing the data. (iii) Implementation of secure data sharing strategies to remove poisoned samples.
Network architecture/Performance issues: (i) Implementation of parallel computing algorithms for enhancing scalability. (ii) Implementation of algorithms that do not share local models frequently (i.e., partial information sharing methods). (iii) Implementation of edge/fog computing models to offload some computing to nearby devices. (iv) Implementation of computing offloading methods to prevent cold start problems. (v) Implementation of low-cost convergence criteria.
Models and parameters: (i) Implementation of secure methods for communication between clients and server. (ii) Implementation of clustering methods to share information in the clustered form. (iii) Implementation of methods for filtering wrong models.
Inference issues: (i) Implementation of secure methods for preventing data reconstruction attacks. (ii) Implementation of methods for hiding details of training data. (iii) Restricting access to data/results based on a sensitivity analysis of the queries.
Deployment issues: (i) Forming multidisciplinary teams to analyze the risks of deployment. (ii) Implementation of explainability, fairness, and trustworthiness functionalities for understanding results. (iii) Proposing GPU-based implementations to address scalability, communication, and computing issues.
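The recommendation on partial information sharing can be illustrated with top-k sparsification, in which a client transmits only the k largest-magnitude entries of its update. This is one well-known communication-reduction technique chosen here as an example; the recommendations above do not prescribe it. The function names are hypothetical.

```python
def top_k_sparsify(update, k):
    """Keep the k entries with the largest magnitude as {index: value}."""
    if k <= 0:
        return {}
    ranked = sorted(range(len(update)),
                    key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def densify(sparse_update, dim):
    """Server side: rebuild a dense vector, filling omitted entries with 0."""
    dense = [0.0] * dim
    for index, value in sparse_update.items():
        dense[index] = value
    return dense


if __name__ == "__main__":
    update = [0.1, -2.0, 0.05, 1.5]
    sparse = top_k_sparsify(update, k=2)     # only two entries are sent
    print(sparse)
    print(densify(sparse, dim=len(update)))
```

Sending k index-value pairs instead of the full vector lowers the upload cost per round. Implementations of this idea commonly accumulate the dropped entries locally and add them to the next update so that small but persistent signals are not lost.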

Comparisons and Discussion
In this section, we compare our work with the existing state-of-the-art (SOTA) studies in multiple aspects. Although many studies have highlighted the potential of the FL paradigm in the medical field, only a few studies have focused on the applications of the FL paradigm in the COVID-19 era. To compare our work, we selected seven recently published SOTA studies centering on the FL paradigm in the medical field. We have chosen various parameters for a fair comparison to show the significance of our work in the body of knowledge. Table 6 presents the in-depth analysis and comparison of our work with the existing SOTA studies. As shown in Table 6, our work has covered many more aspects of FL with regard to COVID-19 than the previous SOTA studies. In addition, this is the first work that has comprehensively covered FL's role in the recent pandemic. The contents enclosed in this article can pave the way for understanding the role of this leading technology in the medical field, especially in relation to COVID-19. In addition, our work is the first that highlights the open-source developments of FL, which can assist in understanding the development status of this paradigm. In recent years, FL has been fused with multiple technologies (e.g., the industrial internet of things (IIoT), blockchain, edge computing, etc.) to address the privacy and security issues in real-life domains [147]. Although FL can solve many cybersecurity-related issues, the FL paradigm is prone to multiple attacks due to its decentralized architecture. Therefore, more approaches are needed to address cybersecurity-related issues in FL-based systems.
The major contributions of this work compared to previous studies are: (i) a higher coverage of FL applications in terms of numbers (i.e., 36) in the era of COVID-19; (ii) a thorough discussion of the challenges faced by the FL paradigm, which are either ignored or barely discussed by previous studies; (iii) a systematic discussion of the data types that were used to lower the spread of COVID-19; (iv) a presentation of the open-source frameworks that have recently been developed along with their in-depth details; (v) a discussion and analysis of the open-source frameworks that were developed specifically for the medical domain; (vi) the first set of recommendations to address the technical deficiencies of the FL paradigm; (vii) the first discussion that pinpoints the synergies of FL with other emerging technologies; (viii) a systematic coverage of issues that can emerge in FL deployment; (ix) a discussion of other COVID-19-fighting digital technologies; and (x) a detailed discussion of hot research areas targeting the FL paradigm. Furthermore, our study has covered many FL applications in the COVID-19 era that remained unexplored in previous works, and it is the first study that discusses FL applications along with AI models and data details. This work can pave the way to understanding the recent status of FL developments in the COVID-19 era.

Conclusions and Future Work
In the big data era, there is a growing demand for the responsible use of data to draw fair, unbiased, and impartial decisions with the help of data science tools to improve the quality of many real-world services (e.g., healthcare, recommendation, navigation, smart cities, mobile doctors, etc.). Since data have a huge impact on the advancement of real-life services/decisions, they must be shared on a large scale with analysts/researchers. Unfortunately, data distribution at a wider scale is not possible due to privacy concerns, and many companies are reluctant to share aggregated personal data. Thanks to the rapid development of the FL paradigm, personal data no longer need to be collected at a central place, because the AI model can be trained on them locally. In this paper, we presented a technical overview, including applications and challenges, of the FL paradigm with a special emphasis on COVID-19 in the big data era. Although there are some review papers on FL applications in the medical domain, they paid less attention to FL applications in the context of the COVID-19 pandemic. To fill this gap, we presented an in-depth review of the FL applications and challenges in the context of COVID-19. In the future, we intend to cover the taxonomy of FL applications involving both independent and identically distributed (IID) and non-IID data in the medical field [148][149][150]. Finally, we intend to discuss the hardware and software challenges in the deployment of FL models in real-life scenarios.

Data Availability Statement: Data and studies that were used to support the findings of this study are included within this article.