Applications of Federated Learning; Taxonomy, Challenges, and Research Trends

Abstract: Federated learning (FL) supports the collaborative training of machine learning and deep learning models for edge network optimization. However, a complex edge network with heterogeneous devices subject to different constraints can degrade its performance, which remains an open problem in this area. Consequently, research has emerged on new frameworks and approaches to improve federated learning processes. The purpose of this study is to provide an overview of the FL technique and its applicability in different domains. The key focus of the paper is a systematic literature review of recent research studies that clearly describes the adoption of FL in edge networks. The search procedure was performed from April 2020 to May 2021, with an initial total of 7546 papers published between 2016 and 2020. The systematic literature review synthesizes and compares the algorithms, models, and frameworks of federated learning. Additionally, we present the scope of FL applications in different industries and domains. Careful investigation of the studies revealed that 25% of the studies used FL in IoT and edge-based applications, 30% implemented the FL concept in the health industry, 10% in NLP, 10% in autonomous vehicles, 10% in mobile services, 10% in recommender systems, and 5% in FinTech. A taxonomy is also proposed on implementing FL for edge networks in different domains. Moreover, another novelty of this paper is that the datasets used for the implementation of FL are discussed in detail to give researchers an overview of the distributed datasets that can be used for employing FL techniques. Lastly, this study discusses the current challenges of implementing the FL technique. We found that medical AI, IoT, edge systems, and the autonomous industry can adopt FL in many of their sub-domains; however, the challenges these domains can encounter are statistical heterogeneity, system heterogeneity, data imbalance, resource allocation, and privacy.


Introduction
The number of IoT and edge devices has increased significantly, resulting in extraordinary growth in generated data [1]. It is predicted that global data will reach 180 trillion GB and that 80 billion nodes will most probably be connected to the Internet in 2025 [1]. However, most of this data is privacy-sensitive; storing it in data centers risks privacy breaches and is expensive in terms of communication [2].
To preserve the privacy of edge data and to reduce the communication cost, a different category of machine learning (ML) approaches is needed, one that moves the processing to the edge nodes so that the clients' data can remain on the clients' devices. This is possible with a prevalent approach called federated learning (FL). FL is not only a specific algorithm but also a design framework for edge computing.
Federated learning is an ML method that trains an ML algorithm on local data samples distributed over multiple edge devices or servers without any exchange of data. The term was first introduced in 2016 by McMahan et al. [3].
Federated learning distributes deep learning by eliminating the need to pool the data in a single place [4], as shown in Figure 1. In FL, the model is trained at different sites over numerous iterations [5]. This method stands in contrast both to conventional ML techniques, where the datasets are transferred to a single server, and to more traditional decentralized techniques, which assume that local datasets are identically distributed. FL allows multiple nodes to form a joint learning model without exchanging their data samples [3], and it addresses critical problems such as access rights, access to heterogeneous data, privacy, and security. Applications of this distributed learning approach span several industries, including traffic prediction and monitoring [6], healthcare [7], telecom, IoT [8], transport and autonomous vehicles [9], pharmaceutics, and medical AI.
FL poses new challenges to existing privacy-preserving ML algorithms [10]. Beyond providing strong privacy guarantees, it is essential to develop computationally economical techniques that tolerate dropped devices while remaining communication-efficient and accurate (as represented in Figure 1).

Related Work
Federated learning (FL) is an evolving approach to solve privacy problems in distributed data. Many studies have been conducted to design new frameworks to improve this new paradigm of ML, but few survey studies and literature reviews have been performed to evaluate the research showing the new frameworks and approaches. These surveys are reviewed in this section.
Xu et al. surveyed the advancement of federated learning in healthcare informatics [7]. They summarized the general statistical challenges and their solutions, system challenges, and privacy issues in this regard. With the results of this survey, they hope to provide useful resources for computational research on machine learning techniques that manage extensive scattered data without ignoring its privacy in health informatics. However, the survey lacks a discussion of the datasets used for health informatics systems.
Electronics 2022, 11, 670
Yang et al. proposed frameworks for secure federated learning [11]. They introduced a secure federated learning framework that includes vertical and horizontal federated learning as well as federated transfer learning. They described the architecture and the applications of federated learning and provided a detailed survey of existing research in this area. In addition, based on federated mechanisms, they proposed building data networks among organizations to share data without compromising user privacy. However, they did not discuss a detailed taxonomy of the domains in which this technique can be applied.
Yang et al. surveyed and reviewed the current difficulties of implementing federated learning as well as their solutions [12]. The authors also presented mobile edge enhancements and then summarized the most vital challenges and problems for future research in FL. However, they did not discuss the datasets used for the implementation of federated learning in edge networks.
Recently, [5] presented a broad overview of the attributes and challenges of federated learning gathered from diverse published articles. However, it focuses mostly on cross-device FL, where the nodes are a very large number of IoT and mobile devices.
To the best of our knowledge, no systematic literature review that discusses FL datasets and implementation techniques has been published yet. All the surveys in this area are summarized in Table 1, and the detailed comparison is summarized in Table 2.
"They provided a survey on FL in optimizing resource allocation while preserving data privacy in wireless networks." | [5,12,18,19]
[17] | 2019 | arXiv preprint arXiv:1908.07873 | Survey on FL approaches and its challenges | "This paper provides a detailed tutorial on Federated Learning and discusses execution issues of FL." | [5,12,14,17,20]

Contribution and Significance
The key emphasis of this paper is to perform a systematic literature review (SLR) of current research studies that clearly describes the adoption of federated learning in multiple application areas.

• The main contribution of this research study is that it analyzes and investigates the state-of-the-art research on how federated learning is used to preserve client privacy.
• Furthermore, a taxonomy of FL algorithms is proposed to give data scientists an overview of this technique.
• Moreover, a complete analysis of the industrial applications that can benefit from FL implementation is presented.
• In addition, the research gaps and challenges are identified and explained for future research.
• Lastly, an overview of the available distributed datasets that can be used for this approach is presented.

Organization of the Article
This article is organized as follows. Section 2 presents the background of the areas related to this research study and provides basic knowledge to the reader. Section 3 describes the protocol and methodology for conducting the SLR by defining the research questions (RQs), search scheme, search procedure, inclusion and exclusion criteria, and results. Section 4 presents the execution of the systematic review for our problem. Section 5 discusses the findings and outcomes: Section 5.1 addresses the applications of FL, Section 5.2 explains the algorithms of FL and their advantages, Section 5.3 explains the datasets used in FL, and the last sub-section of the discussion explains the challenges of deploying FL on a large scale. The article is concluded in Section 6.

Background
Data plays an important role in machine learning-based systems because it drives effective model performance. The data produced hourly by a huge number of IoT devices poses a major resource-consumption challenge for the data science industry when the data is pooled. Moreover, the privacy of IoT data can be at risk. To mitigate these issues, the FL technique provides an adaptable platform to data scientists.
Federated learning poses novel challenges to current privacy-preserving methods and algorithms [10]. Beyond providing strong privacy guarantees, it is essential to develop computationally economical and communication-efficient methods that tolerate dropped devices without compromising accuracy.

Iterative Learning
To attain performance comparable to centralized machine learning, FL employs an iterative method containing multiple client-server exchanges, each of which is known as a federated learning round [21]. Each round starts by distributing the current global model state to the contributing nodes (participants); the local models are then trained on those nodes to yield candidate model updates, and these updates are processed and aggregated into a global update so that the central model can be updated accordingly (see Figure 1).
In this methodology, a server (the FL server) processes and aggregates the local updates into global updates. Local training is performed by the local nodes according to the commands of the FL server. The iterative learning of the model is performed in three major steps, i.e., initiation, iterative training, and termination, as shown in Figure 2. The details of these steps are as follows:
Initiation: A model is selected for training and initialized. The nodes are activated and wait for commands from the central FL server.
Iterative training: These steps are executed for numerous learning rounds [12]:
Selection: A subset of edge devices is chosen to train on their own data samples, each receiving the same current statistical model from the FL server [22], whereas passive devices wait for the next iteration.
Configuration: The FL server asks clients to train the current model on their local data in a stated manner [23].
Reporting: Every node returns the learned model to the FL server. The server aggregates and processes all results and stores the new model [21]. It also handles failures (such as a lost node connection). The process then returns to the selection step.
Termination: Upon reaching a stated termination criterion (such as a maximal number of rounds or the local accuracy of the nodes exceeding some target), the central server orders the termination of the iterative training. The FL server then considers the globally trained model to be robust because it has been trained on multiple heterogeneous sources.
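The initiation, selection, configuration, reporting, and termination steps can be illustrated with a minimal Python sketch. It is not the protocol of any particular FL framework: the `Client` class, the scalar "model", the `drop_prob` parameter, and the termination rule (a small change in the global model, used here in place of an accuracy target) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Client:
    """Hypothetical edge node holding private samples of a scalar quantity."""
    data: list
    drop_prob: float = 0.0  # assumed chance that the connection is lost in a round

    def train(self, global_model, lr=0.5):
        # Local update: move the received model toward the local data mean.
        local_mean = sum(self.data) / len(self.data)
        return global_model + lr * (local_mean - global_model)


def run_federated_training(clients, fraction=0.5, max_rounds=50,
                           tolerance=1e-3, seed=0):
    """Return (global_model, number_of_rounds_used)."""
    rng = random.Random(seed)
    global_model = 0.0  # Initiation: the model is initialized on the server.
    for round_no in range(1, max_rounds + 1):
        # Selection: only a fraction of the clients takes part in this round.
        k = max(1, int(fraction * len(clients)))
        selected = rng.sample(clients, k)
        # Configuration and reporting: selected clients train on local data and
        # report back; clients whose connection drops are skipped.
        reports = [(c.train(global_model), len(c.data))
                   for c in selected if rng.random() >= c.drop_prob]
        if not reports:
            continue  # every selected client dropped; try again next round
        # Aggregation: average the reports, weighted by local sample count.
        total = sum(n for _, n in reports)
        new_model = sum(w * n for w, n in reports) / total
        converged = abs(new_model - global_model) < tolerance
        global_model = new_model
        # Termination: stop once the global model barely changes.
        if converged:
            return global_model, round_no
    return global_model, max_rounds
```

Calling `run_federated_training([Client([2.0]) for _ in range(3)], fraction=1.0)` drives the global model toward the shared local mean, while clients created with `drop_prob=1.0` never report, so the loop runs until `max_rounds` without changing the model.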

How FL Works
FL is based on the federated averaging method, "FedAvg". FedAvg is Google's first vanilla federated learning algorithm for tackling federated learning challenges. Since then, numerous variants of FedAvg have been created to handle many of these challenges, including "FedProx", "FedOpt", "FedMA", and "SCAFFOLD" (outlined in Section 5.2).
The following is a high-level explanation of how the FedAvg algorithm works. The goal of each round of FedAvg is to reduce the global model's objective with respect to the model parameters w, which is simply the weighted average of the local device losses: $\min_w F(w) = \sum_{k=1}^{K} p_k F_k(w)$, where $F_k(w)$ is the loss on device k and $p_k \ge 0$ (with $\sum_k p_k = 1$) is the weight of device k, in FedAvg its share of the total training samples.
A random selection of clients/devices is taken. Each client receives the server's global model. The clients execute SGD (stochastic gradient descent) on their loss functions in parallel and send the learned models to the FL server for model aggregation. The server then uses the average of these local models to update its global model. The procedure is then repeated for n more rounds of communication.
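To make the round structure concrete, the sketch below runs federated averaging on a toy linear model in plain Python. It is an illustration under stated assumptions rather than a reference implementation: the model `y = w[0] * x + w[1]`, the squared loss, the learning rate, the synthetic client data, and the function names `local_sgd`, `fedavg_aggregate`, and `fedavg` are all hypothetical choices, and each client is weighted by its share of the samples held by the clients selected in that round.

```python
import random


def local_sgd(w, samples, lr=0.1, epochs=5, rng=None):
    """Run SGD on one client for the model y = w[0] * x + w[1] (squared loss)."""
    rng = rng or random.Random(0)
    w = list(w)              # copy so the server's model is not modified
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            err = w[0] * x + w[1] - y
            w[0] -= lr * err * x  # gradient of 0.5 * err**2 w.r.t. w[0]
            w[1] -= lr * err      # gradient of 0.5 * err**2 w.r.t. w[1]
    return w


def fedavg_aggregate(client_weights, client_sizes):
    """Average client models, weighting each by its share of the samples."""
    n = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n_k for w, n_k in zip(client_weights, client_sizes)) / n
            for i in range(dim)]


def fedavg(client_data, rounds=50, clients_per_round=2, seed=0):
    rng = random.Random(seed)
    w_global = [0.0, 0.0]
    for _ in range(rounds):
        # Random selection of clients; each receives the current global model.
        chosen = rng.sample(range(len(client_data)), clients_per_round)
        updates = [local_sgd(w_global, client_data[k], rng=rng) for k in chosen]
        sizes = [len(client_data[k]) for k in chosen]
        # The server replaces its model with the weighted average.
        w_global = fedavg_aggregate(updates, sizes)
    return w_global


# Synthetic, non-identically distributed clients: each one only sees a slice
# of the x-axis, but all labels follow y = 2x + 1 (assumed toy data).
clients = [
    [(x, 2 * x + 1) for x in [-1.0 + 0.07 * i for i in range(10)]],
    [(x, 2 * x + 1) for x in [-0.3 + 0.06 * i for i in range(10)]],
    [(x, 2 * x + 1) for x in [0.4 + 0.06 * i for i in range(10)]],
]
w = fedavg(clients)
print("learned slope and intercept:", w)
```

Only model parameters cross the client boundary in this sketch; the `(x, y)` samples stay inside `local_sgd`, which mirrors the privacy argument made for FL.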

Systematic Review Protocol
A literature review is typically carried out to identify critical gaps or overlooked areas of a research field that require more investigation or analysis. A systematic literature review (SLR), on the other hand, can be used to draw relevant conclusions or compile findings in a certain field. The SLR aids in identifying future research avenues and focusing on research gaps. Because an SLR evaluates all of the studies conducted on a subject so far, it requires a lot of labor and time. A consistent study approach, however, can demonstrate the completeness of the SLR.
The first step in this research project was to perform a literature review on the subject. Several pieces of research linked to the topic were identified during the initial search. As a result, the problem was refined in order to conduct the SLR. Analysis of the literature on federated learning showed that no such SLR had been published, possibly because FL is still a new paradigm. Therefore, an SLR can be used to create a framework for federated learning. This SLR is conducted using the reference manual adapted from Kitchenham (2007).

Research Objectives (RO)
The research objectives are as follows:
RO1. To explore the areas that can potentially benefit from using FL techniques.
RO2. To evaluate the practicality and feasibility of federated learning in comparison with centralized learning in terms of privacy, performance, availability, reliability, and accuracy.
RO3. To describe the datasets used in different studies of federated learning and to highlight their experimental potential.
RO4. To explore the research trends in applying federated learning.
RO5. To highlight the challenges that can be encountered when employing federated learning in edge devices.
Later, a search string including primary, secondary, tertiary, and additional keywords was selected to capture all the potential work for the SLR:
(FL) or (federated (deep or machine) learning) or (federated (application or framework or implementation)) or (federated (algorithm or method or approach)) or (federated learning in (edge computing or IoT or smart cities or NLP or healthcare or autonomous industry))
Figure 3 depicts the keywords. The core keywords for this research study are the basic phrases used for federated machine learning, as well as the application in edge and other sectors. The secondary keyword is the application in edge and other fields. The secondary and additional keywords are used to locate studies on alternative applications in various industries, as well as concerns or challenges discovered during the process. Figure 4 depicts the procedure for conducting the review.

Research Questions
This SLR aims to summarize and build an understanding of federated learning with respect to its usage, applications, and challenges in order to fulfill the research objectives. To this end, the research objectives are transformed into research questions (RQs), as shown in Table 3.

Table 3. Research questions.

RQ1
Which types of mobile edge applications and sub-fields can obtain advantage from FL?
The industries can obtain many benefits by deploying the FL, and these areas of interest need to be determined.

RQ2
Which algorithms, tools, and techniques have been implemented in edge-based applications using federated learning?
This would help to identify the implementations and advantages of deploying FL in mobile edge networks.

RQ3
Which datasets have been used for the implementation of federated learning?
To know about the details of datasets available to experiment in the field of FL.

RQ4
What are the possible challenges and gaps of implementing federated learning in mobile edge networks?
The implementation of FL in different fields may face some issues and challenges, which need to be discussed.

Search Strategy
In this section, the search strategy for obtaining literature to analyze and answer the aforementioned RQs is presented.

Database
The most popular and reliable literature sources are used for this SLR. The search process for this paper is based upon the digital libraries as shown in Table 4.

Study Selection
The study concentrates on high-quality scholarly research work in the area of federated learning. After the retrieval of the initial results, irrelevant papers were filtered out by applying a set of inclusion/exclusion criteria. These criteria select the most relevant and appropriate literature. Table 5 lists the criteria by which articles were selected or excluded for this research.
Table 5. Inclusion and exclusion criteria.

IC1
Papers that unambiguously examine federated learning and are accessible.

IC2
Papers that mention and investigate the implementation approaches and applications of federated learning.

IC3
Papers that are focused on presenting research trend opportunities and the challenges of adopting federated learning.

IC4
Papers that are published as technical reports and book chapters.

IC5
Papers that are focused on presenting research trend opportunities and challenges of adopting federated learning.

EC1
Papers that have duplicate or identical titles.

EC2
Papers that do not entail federated learning as a primary study.

EC3
Papers that are not accessible.

EC4
Papers in which the methodology is unclear.

Data Extraction
At this phase, an Excel spreadsheet was created following the guidelines by Kitchenham [24]. The main rationale behind the spreadsheet is to collect the publications and the data obtained from their critical examination and to track the information needed to answer the RQs. The gathered data comprise, for each accessible paper, the title, keywords, abstract and full text, year of publication, and type of research.
Papers marked Yes in all three filter columns were then moved to a new sheet to extract the information needed to address the RQs:
Applications: One or more applications of FL that are presented by a paper.
Implementation: One or more implementation approaches for FL or algorithms that are explained, described, compared, or discussed by a paper.
Dataset: Any distributed dataset that a paper provides or extends for FL.
Challenges: An array of issues pertinent to federated learning that need to be addressed by researchers in this area.
Recent research papers (published from 2016 to 2021) that were related to these factors were considered for the systematic literature review.

Systematic Review Execution
The search process started in April 2020 and continued until May 2021. Initially, a total of 7546 papers published between 2016 and 2020 were found through the digital libraries, of which 478 were shortlisted after applying the title and keyword filter. After reviewing the abstracts, 185 publications remained for the review. Subsequently, 80 papers were excluded following a detailed analysis of their content, and 105 papers were chosen as primary studies. The selection process is depicted in Figure 5, and Table 6 summarizes the detailed numbers for each phase of the filtration process.
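The filtration numbers above can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the stage labels and the `retention` helper are hypothetical names, while the four counts are the ones reported in the text.

```python
# Paper counts reported for each phase of the filtration process.
stages = [
    ("initial search", 7546),
    ("title and keyword filter", 478),
    ("abstract review", 185),
    ("full-text analysis", 105),
]


def retention(stages):
    """Return (stage, kept, removed, share kept of the previous stage) rows."""
    rows = []
    for (_, prev), (name, kept) in zip(stages, stages[1:]):
        rows.append((name, kept, prev - kept, kept / prev))
    return rows


for name, kept, removed, share in retention(stages):
    print(f"{name}: kept {kept}, removed {removed} ({share:.1%} retained)")
```

The last row confirms that excluding 80 of the 185 remaining publications leaves the 105 primary studies.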

Studies Related to the Research Question
The selected papers were organized in the form of clusters with respect to the research questions. Studies related to the question are represented as one cluster as shown in Table 7.

Discussions
This study examined 105 research studies to find the applications, algorithms, datasets, and challenges that can be encountered while employing FL. After careful filtration and examination of the shortlisted studies, a detailed analysis of the approaches and outcomes of selected studies was produced; it is presented in Table A1 of Appendix A. The number of published research articles and technical reports in the area of FL is increasing exponentially, as depicted in Figure 6, making FL an emerging technique of machine learning. In addition, the latest research articles were examined, and the discussion is divided according to the answers to the research questions. The RQ1-related studies are highlighted in Section 5.1, Section 5.2 discusses the studies focusing on RQ2, and Section 5.3 summarizes the datasets used, addressing RQ3. Finally, the challenges summarized in Section 5.4 address RQ4.

Applications of Federated Learning
FL is a promising distributed ML approach with the advantage of privacy preservation. It allows multiple nodes to build a joint learning model without exchanging their data. That is how it addresses critical problems such as data access rights, privacy, security, and access to heterogeneous data types.
Its applicability is claimed in a variety of fields such as autonomous vehicles, traffic prediction and monitoring, healthcare, telecom, IoT, pharmaceutics, industrial management [74], industrial IoT [75], and healthcare and medical AI [76]. The proportion of the use of FL in different fields is depicted in Figure 7. Its first application was in Google Gboard, where it showed tremendous results, some of which are summarized in Table 8. Nevertheless, its applicability in some other areas such as finance [77] and quantum computing [78] is still being explored.
Federated learning provides an extensive variety of possible applications in many areas such as NLP, IoT, etc. (as shown in Figure 7). FL is adopted particularly in circumstances where privacy concerns and the desire to develop algorithms collide. This diverse set of FL applications is organized into a taxonomy, as shown in Figure 8. The most well-known federated learning initiatives are now being carried out on smartphones (as shown in Table 8); however, the same approaches can be used for PCs, IoT, and other edge devices such as autonomous vehicles. Some of the current and potential FL applications include the following.

Google Gboard
Google first adopted a federated learning strategy in Gboard to preserve client privacy while providing better word recommendations. This was the first real-world application of FL: the model is trained on the words typed by the user, and the trained model is then sent to the server. The aggregated model is then used to enhance Google's predictive text feature [28]. This lets users continually receive better keyboard suggestions without sharing their data.

Healthcare
Modern healthcare systems entail collaboration among hospitals, research labs and institutes, and federal agencies for the betterment of healthcare nationwide [79]. Furthermore, collaborative research among nations is significant when worldwide health emergencies, such as COVID-19, are encountered [80].

Table 8. Applications of federated learning in different domains and industries (RQ1).
Domain | Applications | Related Studies
Edge computing | FL is implemented in edge systems using the MEC (mobile edge computing) and DRL (deep reinforcement learning) frameworks for anomaly and intrusion detection. | [8,22,81–85]
Recommender systems | To learn the matrix, federated collaborative filter methods are built utilizing a stochastic gradient approach and secured matrix factorization using federated SGD. | [86–92]
NLP | FL is applied in next-word prediction in mobile keyboards by adopting the FedAvg algorithm to learn CIFG [93]. | [28,94–97]
IoT | FL could be one way to handle data privacy concerns while still providing a reliable learning model. | [12,98–100]
Mobile service | The predicting services are based on the training data coming from edge devices of the users, such as mobile devices. | [28,94]
Biomedical | The volume of biomedical data is continually increasing. However, due to privacy and regulatory considerations, the capacity to evaluate these data is limited. By collectively building a global model for the prediction of brain age, the FL paradigm in the neuroimaging domain works effectively. | [101–108]
Healthcare | Owkin [31] and Intel [32] are researching how FL could be leveraged to protect patients' data privacy while also using the data for better diagnosis. | [7,79,109–113]
Autonomous industry | Another important reason to use FL is that it can potentially minimize latency. Federated learning may enable autonomous vehicles to behave more quickly and correctly, minimizing accidents and increasing safety. Furthermore, it can be used to predict traffic flow. | [9,34,114–117]
Banking and finance | FL is applied in open banking and in finance for anti-financial crime processes, loan risk prediction, and the detection of financial crimes. | [77,118–122]

In the healthcare industry, data privacy and security are extremely difficult to manage [123]. Many organizations hold large volumes of sensitive and valuable patient data, which attackers are eager to obtain. Nobody wants an unpleasant diagnosis to be made public [7]. The abundance of data available in these repositories is also valuable for frauds such as identity theft and insurance fraud. Because of the vast amounts of data and the significant threats that the health sector faces, most nations have enacted strong laws governing how healthcare data ought to be handled, such as the HIPAA standards in the United States. These restrictions are fairly tight, and an organization that breaks them faces severe penalties. This is often beneficial for patients who are concerned about their personal information being misused. These laws, on the other hand, make it harder to use certain sorts of data in studies that could lead to new medical advances. Because of this complicated legal situation, companies such as Owkin [31] and Intel [32] are looking at how FL may be used to preserve patients' privacy while still putting their data to good use.
Owkin is developing a platform that employs FL to secure patient data in experiments to identify drug toxicity, forecast disease progression, and estimate survivability rates for rare cancers. As a proof of concept, Intel teamed up with the University of Pennsylvania's Center for Biomedical Image Computing and Analytics in 2018 to show how federated learning may be used in medical imaging. According to the collaboration, a DL model trained with an FL methodology can be 99 percent as accurate as a model trained using conventional approaches.

Autonomous Vehicles
Federated learning has two primary applications for self-driving automobiles. The first is that it may safeguard user data privacy: many people are uncomfortable with the thought of their journey logs and other travel data being shared and evaluated on a central server. By updating the algorithms with only the necessary data rather than the complete user information, federated learning could improve user privacy.
Another important reason to use federated learning is that it has the potential to reduce latency. When there are many self-driving vehicles on the roads in the future, they will need to respond quickly during safety incidents.
Because conventional cloud learning entails huge transfers of data and slower learning, federated learning has the potential to permit autonomous vehicles to respond more quickly and correctly, minimizing accidents and increasing safety.
Many studies demonstrate that FL has the potential to revolutionize autonomous vehicles and the Internet of Things [124]. In [114], the authors claimed that, due to the large number of self-driving cars on the road, it is necessary to respond quickly to real-world circumstances. However, various security concerns arise with the standard cloud-based method. They argued that federated learning may be utilized to tackle this problem and eliminate such threats by limiting data transfer volume and speeding up the learning process of autonomous vehicles.

Algorithms and Models
The algorithms and models used to implement federated learning and improve its performance are summarized in Table 9. The table lists the most widely adopted algorithms together with their models and privacy mechanisms, applied areas, and related studies. Konečný et al. (2016) described algorithms in which each client independently computes an update to the current model based on its local data [50] and sends this update to a central server. The central server then computes a new global model by combining the updates from the clients. Mobile phones are the system's key clients, so communication efficiency is critical. The authors offered two approaches to reduce the cost of uplink transmission: structured updates and sketched updates. Chen et al. described an end-to-end tree boosting system named XGBoost [51]. This system is widely used by data scientists to obtain state-of-the-art results on many ML (machine learning) tasks. For tree learning, they proposed a weighted quantile sketch, and for sparse data, they proposed a novel sparsity-aware algorithm. The paper also provides insights on data compression and sharding to build a scalable tree boosting system. In conclusion, XGBoost scales beyond billions of examples while using far fewer resources than other systems [4].
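The client-update and server-aggregation loop described above for Konečný et al. can be illustrated with a minimal sketch. It assumes that each client sends a delta relative to the current global model and that the server weights each delta by the client's local dataset size, as FedAvg-style averaging does; the function name, the delta convention, and the example numbers are hypothetical and not taken from the cited work.

```python
from typing import List, Sequence


def aggregate_updates(
    global_model: Sequence[float],
    client_updates: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine client updates into a new global model.

    Assumption: each update is a delta (local model minus global model),
    and the server weights it by the client's number of local examples.
    """
    if not client_updates or len(client_updates) != len(client_sizes):
        raise ValueError("need exactly one dataset size per client update")
    total = sum(client_sizes)
    new_model = list(global_model)
    for update, size in zip(client_updates, client_sizes):
        weight = size / total  # larger local datasets count for more
        for i, delta in enumerate(update):
            new_model[i] += weight * delta
    return new_model


if __name__ == "__main__":
    model = [0.0, 1.0]
    updates = [[1.0, 0.0], [3.0, -2.0]]  # hypothetical client deltas
    sizes = [10, 30]                     # hypothetical local dataset sizes
    print(aggregate_updates(model, updates, sizes))
```

Only the deltas and the dataset sizes cross the network in this sketch; the raw local data never leave the clients.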
Nilsson et al. benchmarked three FL algorithms, comparing their performance against a baseline in which the data reside on the server [125]. The algorithms are federated averaging (FedAvg), CO-OP, and federated stochastic variance reduced gradient. They were evaluated on the MNIST dataset using both i.i.d. and non-i.i.d. partitioning of the data, and FedAvg achieved the highest accuracy among the three. Chawla et al. proposed an over-sampling technique named SMOTE (synthetic minority over-sampling technique), which generates synthetic minority-class records to rebalance the data sample [126]. Han et al. enhanced SMOTE by taking the data distribution of the minority classes into account [127], but their method also requires a larger dataset. Moreover, this method is not appropriate for federated learning because the clients' data are distributed. Other approaches, such as XGBoost [51] and AdaBoost, can reduce bias because they learn from misclassifications. However, these algorithms are sensitive to outliers and noise. Certain protocols and frameworks need to be implemented on edge networks to successfully apply the federated learning approach. Some of those are discussed in this section.
Wang et al. proposed integrating deep reinforcement learning (DRL) with federated learning to improve edge systems [179]. The aim of this integration was to optimize mobile edge computing, caching, and communication. To make use of edge nodes and collaboration among devices, they designed the "In-Edge AI" framework, which was demonstrated to reach near-optimal performance with relatively low training overhead. Finally, they discussed challenges and opportunities that point to a promising future for "In-Edge AI", such as AI acceleration in edge computing [6,81].
Xu et al. surveyed the progress of federated learning in healthcare informatics [7]. They summarized the general statistical challenges and their solutions, system challenges, and privacy issues in this area. The survey provides useful resources for computational research on machine learning techniques that manage large volumes of scattered health data without compromising privacy.
Yang et al. proposed frameworks for secure federated learning [11]. They introduced a comprehensive secure FL framework that includes horizontal FL, vertical FL, and federated transfer learning. They provided definitions, architectures, and applications for FL, along with a detailed survey of existing works in this area. In addition, based on federated mechanisms, they proposed building data networks among organizations to share data without compromising user privacy [9].
Basnayake developed a method to improve the reliability of sensor measurements in a mobile robot network [180]. The method considers the cost of repairing faulty sensors as well as inter-robot communication. In this work, a system was built to anticipate sensor failures using sensor features. The network-wide cost, which captures wireless communication and sensor replacement, was then minimized subject to the sensor measurement reliability constraint. For this task, convex optimization approaches were used to construct an algorithm that gives the optimal sensor selection and wireless information communication strategy. Federated learning was used to detect sensor failures and estimate sensor properties in a distributed manner. Finally, extensive simulations compared the proposed mechanism with existing state-of-the-art procedures to demonstrate its effectiveness.
Sattler et al. (2019) presented clustered federated learning (CFL) to address the suboptimal results that FL yields when the data distributions of the local clients diverge [169]. CFL is a federated multi-task learning (FMTL) framework that exploits the geometric properties of the federated learning loss surface to group the client population into clusters with jointly trainable data distributions. CFL requires no modifications to the FL communication protocol, applies to deep neural networks, and gives strong mathematical guarantees on the clustering quality. It is flexible enough to handle client populations that vary over time and can preserve privacy. CFL can be viewed as a post-processing mechanism that achieves performance equal to or better than conventional FL.
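The clustering idea behind CFL can be illustrated with a small sketch. It assumes that the similarity between two clients is measured by the cosine of the angle between their weight updates, and it uses a simplified seeding heuristic (the least similar pair of clients seeds the two clusters) rather than the bipartitioning procedure of the cited paper. All function names and example vectors are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two update vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def bipartition(updates: Sequence[Sequence[float]]) -> Tuple[List[int], List[int]]:
    """Split client indices into two clusters by update direction.

    Simplification (not the cited algorithm): the least similar pair of
    clients seeds the clusters, and every client joins the seed whose
    update it resembles more.
    """
    if len(updates) < 2:
        raise ValueError("need at least two clients to form two clusters")
    seed_a, seed_b = min(
        combinations(range(len(updates)), 2),
        key=lambda pair: cosine(updates[pair[0]], updates[pair[1]]),
    )
    cluster_a, cluster_b = [], []
    for i, update in enumerate(updates):
        if cosine(update, updates[seed_a]) >= cosine(update, updates[seed_b]):
            cluster_a.append(i)
        else:
            cluster_b.append(i)
    return cluster_a, cluster_b


if __name__ == "__main__":
    # Two clients pull the model one way, two pull it the other way.
    updates = [[1.0, 0.1], [0.9, 0.0], [-1.0, 0.2], [-0.8, -0.1]]
    print(bipartition(updates))
```

Because the split is computed from updates the server already receives, a step of this kind needs no change to the communication protocol, which matches the property stated above.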
Mohri et al. proposed agnostic federated learning, a framework in which the centralized model is optimized for any target distribution formed by a mixture of the client distributions [147]. They suggested that the framework yields a notion of fairness. To solve the corresponding optimization problems, they also proposed a fast stochastic optimization algorithm, for which they proved convergence bounds assuming a convex loss function and a convex hypothesis set. They also demonstrated the advantages of their approach on different datasets. Their framework may also be of interest for other learning scenarios such as drifting, cloud computing, and domain adaptation [12].
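The idea of optimizing for any mixture of client distributions can be made concrete by evaluating a model under the worst mixture in a candidate set. The sketch below assumes a finite, hand-picked set of mixture weights and precomputed per-client losses; both are illustrative assumptions, and this is not the stochastic optimization algorithm of the cited work.

```python
from typing import Sequence


def mixture_loss(client_losses: Sequence[float], weights: Sequence[float]) -> float:
    """Loss of a model under one mixture of the client distributions."""
    if len(client_losses) != len(weights):
        raise ValueError("need one mixture weight per client")
    return sum(w * loss for w, loss in zip(weights, client_losses))


def agnostic_loss(
    client_losses: Sequence[float],
    mixtures: Sequence[Sequence[float]],
) -> float:
    """Worst-case loss over a finite set of candidate mixtures (assumed set)."""
    return max(mixture_loss(client_losses, lam) for lam in mixtures)


if __name__ == "__main__":
    losses = [0.2, 0.8, 0.5]  # hypothetical per-client losses of one model
    candidates = [
        [1 / 3, 1 / 3, 1 / 3],  # uniform mixture
        [0.5, 0.5, 0.0],
        [0.0, 0.5, 0.5],
    ]
    print(agnostic_loss(losses, candidates))
```

A model that minimizes this worst-case value cannot trade a large loss on one group of clients for a small average loss, which is one way to read the fairness notion mentioned above.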
Han et al. discussed the importance of imbalanced datasets and their broad application areas in data mining [127]. They then summarized the evaluation metrics and the existing methods for solving the imbalance problem. The synthetic minority over-sampling technique (SMOTE) is one of the over-sampling techniques used to address this issue. Building on it, they proposed two new minority over-sampling methods, borderline-SMOTE1 and borderline-SMOTE2.
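The interpolation step at the core of SMOTE can be sketched as follows. This is a minimal version of plain SMOTE, not of the borderline variants, and it assumes numeric feature vectors and Euclidean distance; the function names and the example points are hypothetical.

```python
import random
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_sample(
    minority: List[List[float]],
    k: int,
    n_new: int,
    rng: random.Random,
) -> List[List[float]]:
    """Create n_new synthetic minority points.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    for point in smote_sample(points, k=2, n_new=3, rng=random.Random(0)):
        print(point)
```

The sketch needs all minority samples in one place to find neighbours, which illustrates why the text above considers this family of methods a poor fit for data that stay distributed across clients.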
In [181], the authors introduced DropConnect, a generalization of Dropout, to regularize large fully connected layers in neural networks. In contrast to Dropout, which sets a randomly selected subset of activations to zero within each layer, DropConnect sets a randomly selected subset of weights in the network to zero. They compared DropConnect with Dropout on a range of datasets. By aggregating multiple DropConnect-trained models, they showed state-of-the-art results on different image recognition benchmarks.
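The difference between the two regularizers can be shown on a single fully connected layer. The sketch below is deliberately reduced: it omits the activation function, bias, and any rescaling used in practice, and the function names and drop probability `p` are illustrative assumptions.

```python
import random
from typing import List, Sequence


def dense(x: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Fully connected layer without bias; each row of weights feeds one output."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def dropout_forward(x, weights, p: float, rng: random.Random) -> List[float]:
    """Dropout: zero each output activation with probability p."""
    return [0.0 if rng.random() < p else out for out in dense(x, weights)]


def dropconnect_forward(x, weights, p: float, rng: random.Random) -> List[float]:
    """DropConnect: zero each individual weight with probability p."""
    masked = [[0.0 if rng.random() < p else w for w in row] for row in weights]
    return dense(x, masked)


if __name__ == "__main__":
    x = [1.0, 2.0]
    w = [[0.5, -1.0], [2.0, 0.25]]
    print(dropout_forward(x, w, 0.5, random.Random(1)))
    print(dropconnect_forward(x, w, 0.5, random.Random(1)))
```

Dropout removes whole outputs, whereas DropConnect can remove part of the input to an output while keeping the rest, so its mask acts at a finer granularity.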

Datasets for Federated Learning
Numerous datasets are available for implementing federated learning. Some are accessible to the public, while others are not; this section provides an overview of the publicly available datasets.
Federated datasets are partitioned across different client devices. For experimentation, several existing datasets have been turned into federated datasets. The LEAF benchmark [66] provides some public federated datasets and an evaluation framework. Other datasets and models used in publications at top-tier conferences of the machine learning community over the past two years are summarized in Table 10 below.

Challenges and Research Scope
Several other issues in this field remain challenging problems to be addressed, such as resource allocation [183], data imbalance, and statistical heterogeneity, as depicted in Figure 9. All the stated challenges are described in this section and summarized in Table 11.

Table 11. Challenges in implementing federated machine learning.

| Ref | Year | Research Type | Problem Area | Contribution | Related Researches |
|---|---|---|---|---|---|
| [25] | 2018 | Experimental | Statistical heterogeneity | "They demonstrated a mechanism to improve learning on non-IID data by creating a small subset of data which is globally shared over the edge nodes." | [3,26,30,67,184,185] |
| [3] | 2017 | Experimental | Statistical and communication cost | "They experimented a method for the FL of deep networks, relying on iterative model averaging, and conducted a detailed empirical evaluation." | [3,25,26,67,186,187] |
| [67] | 2020 | Experimental | Convergence analysis and resource allocation | "They presented a novel algorithm for FL in wireless networks to resolve resource allocation optimization that captures the trade-off between the FL convergence." | [3,25-27, ...] |
|  |  |  |  | "They performed analysis of SGD with k-sparsification or compression and showed that this approach converges at the same rate as vanilla SGD (equipped with error compensation)." | [3,46,49,50] |
| [189] | 2020 | Experimental | Data bias | "They demonstrated that generative models can be used to resolve several data-related issues even when ensuring the data's privacy. They also explored these models by applying it to images using a novel algorithm for differentially private federated GANs and to text with differentially private federated RNNs." | [47,48] |

Imbalanced Data
Each edge node in FL trains a shared model using its local data. As a result, the distribution of data on each edge device depends on how that device is used. For example, cameras in a park capture more photographs of humans than cameras located in the wild. We divide these FL imbalances into three categories to make them easier to distinguish:

1. Size imbalance: the size of the data sample on each edge node is uneven.
2. Local imbalance: also known as non-i.i.d. (non-independent and identically distributed) data, because not all nodes have the same data distribution.
3. Global imbalance: the collection of data across all nodes is class imbalanced.
To explain the effect of imbalanced data on the training process, we will use the federated learning approach to train a CNN (convolutional neural network) on an imbalanced distributed dataset.
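The three imbalance categories can be quantified from per-node label counts. The measures below are illustrative choices rather than definitions from the reviewed literature: a max-to-min ratio for size and global imbalance, and the largest total variation distance between a node's label distribution and the global one for local imbalance. Every node is assumed to hold at least one example.

```python
from collections import Counter
from typing import Dict, Sequence


def size_imbalance(nodes: Sequence[Dict[str, int]]) -> float:
    """Ratio of the largest to the smallest node dataset size."""
    sizes = [sum(counts.values()) for counts in nodes]
    return max(sizes) / min(sizes)


def global_counts(nodes: Sequence[Dict[str, int]]) -> Counter:
    """Label counts pooled over all nodes."""
    total = Counter()
    for counts in nodes:
        total.update(counts)
    return total


def global_imbalance(nodes: Sequence[Dict[str, int]]) -> float:
    """Ratio of the most to the least frequent class over all nodes."""
    total = global_counts(nodes)
    return max(total.values()) / min(total.values())


def local_imbalance(nodes: Sequence[Dict[str, int]]) -> float:
    """Largest total variation distance between a node and the global label mix."""
    total = global_counts(nodes)
    n_all = sum(total.values())
    worst = 0.0
    for counts in nodes:
        n_node = sum(counts.values())
        distance = 0.5 * sum(
            abs(counts.get(label, 0) / n_node - total[label] / n_all)
            for label in total
        )
        worst = max(worst, distance)
    return worst


if __name__ == "__main__":
    # Hypothetical label counts on two edge nodes.
    nodes = [{"human": 90, "animal": 10}, {"human": 10, "animal": 40}]
    print(size_imbalance(nodes), global_imbalance(nodes), local_imbalance(nodes))
```

In this example all three measures are non-trivial at once, which is common in practice because the categories are not mutually exclusive.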

Expensive Communication
FL networks potentially comprise a massive number of devices [17] (such as millions of notebooks and hand-held devices), and communication in the network can be slower than local computation by many orders of magnitude. In such networks, communication is far more expensive than in traditional data center environments. To train a model on the data held by devices in an edge-based network, communication-efficient methods must be developed that iteratively send small messages or model updates as part of the training process, rather than transferring the complete dataset over the network.
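One way to keep messages small is to transmit only the largest entries of each model update. The sketch below shows top-k sparsification as an example technique; it is one possible approach and is not prescribed by the text above. The function names and example values are hypothetical.

```python
from typing import List, Sequence, Tuple


def top_k_sparsify(update: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Keep the k largest-magnitude entries as (index, value) pairs."""
    if not 0 < k <= len(update):
        raise ValueError("k must be between 1 and the update length")
    largest = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    return [(i, update[i]) for i in sorted(largest)]


def densify(pairs: Sequence[Tuple[int, float]], length: int) -> List[float]:
    """Rebuild a full-length update on the server; dropped entries become 0."""
    dense = [0.0] * length
    for index, value in pairs:
        dense[index] = value
    return dense


if __name__ == "__main__":
    update = [0.1, -2.0, 0.05, 1.5, -0.3]  # hypothetical model update
    message = top_k_sparsify(update, k=2)
    print(message)
    print(densify(message, len(update)))
```

The client sends k index-value pairs instead of the full vector. Practical schemes usually also accumulate the dropped entries locally and add them to later updates, which this sketch omits.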

Systems Heterogeneity
Federated networks are inherently heterogeneous due to differences in network connectivity (Wi-Fi, 3G, 4G, 5G), hardware (CPU, RAM), power (battery level), and the communication, storage, and computing capacities of nodes. Furthermore, due to device and network size-related limits, only a small fraction of end nodes is active at any given time. The devices may be unreliable, and an active device will frequently drop out at a given iteration. These system-level properties make fault tolerance considerably more challenging.
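A training round that tolerates partial participation and dropouts can be sketched as follows. The sketch assumes a scalar model, clients represented as callables, and a fixed drop probability; all of these are simplifying assumptions for illustration, and the names are hypothetical.

```python
import random
from typing import Callable, Dict


def run_round(
    global_value: float,
    clients: Dict[str, Callable[[float], float]],
    fraction: float,
    drop_prob: float,
    rng: random.Random,
) -> float:
    """Run one round with sampled clients, some of which may drop out.

    Only a fraction of clients is selected, each selected client fails to
    report with probability drop_prob, and the server averages whatever
    arrives. If nothing arrives, the global value is left unchanged.
    """
    n_selected = max(1, int(fraction * len(clients)))
    selected = rng.sample(sorted(clients), n_selected)
    reports = []
    for client_id in selected:
        if rng.random() < drop_prob:
            continue  # device went offline before reporting
        reports.append(clients[client_id](global_value))
    if not reports:
        return global_value
    return sum(reports) / len(reports)


if __name__ == "__main__":
    # Hypothetical clients that each nudge the scalar model differently.
    clients = {"phone": lambda g: g + 1.0, "tablet": lambda g: g + 3.0}
    print(run_round(0.0, clients, fraction=1.0, drop_prob=0.3, rng=random.Random(0)))
```

Averaging only the reports that arrive keeps the round from stalling on unreachable devices, at the price of a noisier and possibly biased update.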

Statistical Heterogeneity
Edge devices frequently generate and collect data in a non-i.i.d. manner across the network [12,25,139,184]. For example, in next-word prediction, mobile phone users use widely varying vocabulary. Furthermore, the amount of data on different devices may differ, and an underlying structure that reflects the relationship between devices and their associated distributions may exist. This data generation paradigm violates the i.i.d. assumptions widely used in distributed optimization, raises the likelihood of stragglers, and may increase the complexity of analysis, modeling, and assessment.
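Experiments often simulate statistical heterogeneity by partitioning a labelled dataset so that each client sees only a few classes. The sketch below uses a simple label-sorted split, which is one common way to do this and is an assumption here rather than a method taken from the text; the function name is hypothetical.

```python
from typing import List, Sequence


def label_sorted_partition(labels: Sequence[int], num_clients: int) -> List[List[int]]:
    """Assign example indices to clients in label order.

    Sorting by label before cutting contiguous shards gives each client
    a narrow slice of the classes, which imitates non-i.i.d. data.
    """
    if num_clients < 1:
        raise ValueError("num_clients must be positive")
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    n = len(order)
    return [
        order[n * c // num_clients : n * (c + 1) // num_clients]
        for c in range(num_clients)
    ]


if __name__ == "__main__":
    labels = [0, 1, 0, 1, 2, 2]  # hypothetical class labels
    for client, indices in enumerate(label_sorted_partition(labels, 3)):
        print(client, indices, [labels[i] for i in indices])
```

Shuffling the indices before cutting the shards would instead give an approximately i.i.d. split, which is a useful baseline for comparison.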

Privacy Concerns
Privacy is often a greater concern in FL applications than in learning on centralized data in data centers [12]. By sharing only model updates (such as gradient information) rather than the whole data, FL takes a step toward securing user data. However, transmitting local model updates throughout the training process may still divulge sensitive data to the central server or a third party. While current efforts try to increase the privacy of federated learning through mechanisms such as differential privacy [190] and secure multiparty computation [11], these approaches often provide privacy at the cost of reduced system efficiency or model accuracy. Recognizing and assessing these trade-offs, theoretically and empirically, is a significant task in achieving private federated learning systems.
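The trade-off between privacy and accuracy is visible in the basic building block used by many differentially private training schemes: clip each update to a bounded norm, then add Gaussian noise. The sketch below shows only this step, under the assumption of a flat list of floats as the update. It does no privacy accounting and so makes no formal privacy claim; the parameter names are illustrative.

```python
import math
import random
from typing import List, Sequence


def clip_update(update: Sequence[float], max_norm: float) -> List[float]:
    """Scale the update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def privatize(
    update: Sequence[float],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip the update, then add Gaussian noise scaled to the clipping bound."""
    sigma = noise_multiplier * max_norm
    return [v + rng.gauss(0.0, sigma) for v in clip_update(update, max_norm)]


if __name__ == "__main__":
    update = [3.0, 4.0]  # hypothetical local update with L2 norm 5
    print(clip_update(update, max_norm=1.0))
    print(privatize(update, max_norm=1.0, noise_multiplier=0.5, rng=random.Random(0)))
```

A larger `noise_multiplier` hides individual contributions better but perturbs the aggregated model more, which is the accuracy cost discussed above.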
All the challenges addressed in the state-of-the-art literature are summarized in Table 11, with respect to their methodologies and contributions.

Conclusions
Federated learning enables the collaborative training of machine learning and deep learning models for mobile edge network optimization. FL allows multiple nodes to form a joint learning model to address critical problems such as access rights, access to heterogeneous data, privacy, and security. Applications of this distributed learning approach span several industries, including traffic prediction and monitoring, healthcare, telecom, IoT, transport and autonomous vehicles, pharmaceutics, and medical AI. This paper summarized how federated learning is used to preserve client privacy through a detailed review of the literature. The search procedure was performed from April 2020 to May 2021, with a total initial number of 7546 papers published between 2016 and 2020. After careful screening and filtering, 105 papers were selected to adequately address the research questions of this study. The study provides a systematic literature review of FL across application domains, covering the algorithms, models, and frameworks of federated learning and its scope of application. Moreover, this study discusses the current challenges of implementing FL and proposes a taxonomy of FL implementations across a variety of domains. The survey reveals that healthcare and IoT offer vast opportunities for implementing FL models, as 30% and 25% of the selected studies used FL in healthcare and in IoT and edge applications, respectively. A growing, real-world trend of FL research is seen in NLP, with more than 10% of the total literature. The domains of recommender systems, FinTech, and the autonomous industry can adopt FL, but the challenges these domains can encounter are statistical heterogeneity, system heterogeneity, data imbalance, resource allocation, and privacy.

Future Directions
To mitigate data privacy concerns, along with providing a transfer learning paradigm, FL has emerged as an innovative learning platform that enables edge devices to train the model with their own data. With the growing storage and computation capacity of edge nodes, such as autonomous vehicles, smartphones, and tablets, and with fast communication such as 5G, FL has revolutionized machine learning in the modern era. Thus, the applications of FL are cross-domain. However, certain areas of FL require further development. For example, the convergence of its baseline aggregation algorithm, FedAvg, is application-dependent, and more refined aggregation methods are worth exploring. Similarly, given the heavy computation required for FL, resource management can play an important part, so the optimization of communication, computation, and storage costs for edge devices during model training needs to be refined and matured. In addition, most studies cover areas such as IoT and healthcare. However, more application areas can benefit from this learning paradigm, such as food delivery systems, VR applications, finance, public safety, hazard detection, and traffic control and monitoring.

Conflicts of Interest:
The authors declare no conflict of interest.

The applicability of DFNAS needs to be explored in real-world scenarios, such as text recommendation.

AST-GNN
[101] 2021 Medical imaging FL NA Reviews the latest research of FL to find its applicability in medical imaging.

Survey
Explains how patient privacy is maintained across sites using FL.
The technical presentation of the medical imaging can be illustrated by using a certain case study.