A Review on Federated Learning and Machine Learning Approaches: Categorization, Application Areas, and Blockchain Technology

Federated learning (FL) is a scheme in which several clients work collectively to solve machine learning (ML) problems, with a central aggregator coordinating the procedure. This design keeps the training data distributed, so that each individual device's data remain private. The paper systematically reviewed the available literature following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The study presents a systematic review of applicable ML approaches for FL, reviews the categorization of FL, discusses the FL application areas, presents the relationship between FL and Blockchain Technology (BT), and discusses some existing literature that has used FL and ML approaches. The inclusion criteria were (i) published between 2017 and 2021, (ii) written in English, (iii) published in a peer-reviewed scientific journal, and (iv) preprint published papers. Studies that were (i) unpublished theses and dissertations, (ii) conference papers, (iii) not in English, or (iv) did not use artificial intelligence models and blockchain technology were excluded from the review. In total, 84 eligible papers were examined in this study. In recent years, the amount of research on ML using FL has increased. Accuracy equivalent to standard feature-based techniques has been attained, and ensembles of many algorithms may yield even better results. We found that the best results were obtained from the hybrid design of an ML ensemble employing expert features. However, some difficulties and issues still need to be overcome, such as efficiency, complexity, and small datasets. In addition, novel FL applications should be investigated from the standpoint of the datasets and methodologies.


Introduction
The volume of data is no longer the focus of our consideration, because of the emergence of big data [1]. Data privacy and security are pressing issues that must be addressed. Data leakage is never a minor issue, and the public has recently become more concerned about data security [2][3][4]. Individuals, groups, and society as a whole are working to improve data security and privacy protection. The European Union's General Data Protection Regulation (GDPR) [5], which took effect on 25 May 2018, strives to safeguard users' privacy and data security. It requires operators to state user agreements clearly and prohibits operators from deceiving or inducing users to waive their privacy rights. Operators are also forbidden from training models without the user's authorization. It also enables users to have their personal information removed. Similarly, since 2017, China's Cyber Security Law of the People's Republic of China (PPC) [6] and the [...] it saves data on multiple working nodes in a distributed manner and distributes resources via a trustworthy central server. Compared to distributed ML, each worker node in federated learning is the sole owner of its data and a participant in model training.
Users have full control over their local data, which emphasizes the privacy protection of data owners. Ensuring privacy is the fundamental essence of FL. In a federated learning environment, there are two types of privacy protection mechanisms. The first is encryption, where methods such as homomorphic encryption and secure aggregation are often used. The second common method is adding differential-privacy noise to the model parameters. To maintain privacy, the federated learning scheme proposed by Google [8] uses a combination of secure aggregation and differential privacy. Other research [9] relies only on homomorphic encryption of the parameters to achieve privacy protection.
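The two protection mechanisms described above can be illustrated with a minimal Python sketch. It is not the protocol of [8] or [9]: the function names, the integer quantization of updates, and the centrally simulated pairwise masks are assumptions made for illustration only. In a deployed system the masks would be derived from keys agreed between each pair of clients, and the noise scale would be chosen from a privacy budget.

```python
import random


def clip_and_add_noise(update, clip_norm, noise_std, rng):
    """Differential-privacy style step: clip the update to L2 norm <= clip_norm,
    then add zero-mean Gaussian noise to every coordinate."""
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale + rng.gauss(0.0, noise_std) for x in update]


def mask_updates(updates, rng, modulus=2**32):
    """Secure-aggregation style step on integer (quantized) updates.
    For every pair i < j a random mask is added by client i and subtracted
    by client j, so the masks cancel in the sum but hide each single update.
    Assumption: masks are simulated here in one place for brevity."""
    n, dim = len(updates), len(updates[0])
    masked = [[v % modulus for v in u] for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(dim):
                m = rng.randrange(modulus)
                masked[i][k] = (masked[i][k] + m) % modulus
                masked[j][k] = (masked[j][k] - m) % modulus
    return masked


def aggregate(masked, modulus=2**32):
    """Server side: sum the masked updates; only the total is recovered."""
    dim = len(masked[0])
    return [sum(u[k] for u in masked) % modulus for k in range(dim)]
```

The server in this sketch only ever handles masked vectors, yet their modular sum equals the sum of the original updates, which is the property secure aggregation relies on.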
The following five research questions (RQ) were formulated to accomplish the aim and objective of the systematic review conducted.
RQ1: What are the applicable machine learning methods for FL?
RQ2: What is the categorization of federated learning?
RQ3: What are the FL application areas?
RQ4: What is the relationship between FL and BT concerning data sharing in distributed systems?
RQ5: What are the ML algorithms implemented with FL?
Therefore, the main contributions of this study are as follows:
1. Review the applicable ML approaches for FL.
2. Review the categorization of federated learning.
3. Discuss the FL application areas.
4. Present the relationship between FL and BT.
5. Discuss some existing literature that has used FL and ML approaches.
The remainder of this article is arranged as follows: Section 2 discusses the related literature on FL. Section 3 presents the materials and methods used in this investigation, covering the search strategy, eligibility criteria, information sources and search, study selection, data collection processes, and data extraction and analysis. The results, including the search strategy yield, study characteristics, and study limitations, are discussed in Section 4. The paper is concluded in Section 5.

Related Works
Many reviews have been conducted on FL, BT, and ML. A few of them are presented in this section, and their contributions are summarized in Table 1. Yang et al. [10] focused on the concept and application of federated ML, and Kairouz et al. [11] presented the recent advances and open problems in FL. Many researchers focused on the problems encountered in FL: Li et al. [12] surveyed FL system components in terms of privacy and protection. Nguyen et al. [13] gave an overview of the concepts and opportunities of the FL chain in mobile-edge computing (MEC). Mothukuri et al. [14] presented a comprehensive review of FL security and privacy that can assist in bridging the gaps in the present state of federated AI (FAI). Ali, Karinmipour and Tariq [15] discussed the integration of BT and FL for IoT in terms of privacy issues and preservation measures. Antunes et al. [16] conducted an SLR on FL for healthcare, focusing on recent studies of FL on electronic health records (EHR) for healthcare applications. Lee and Kim [17] focused on the trends in BT and FL for data sharing in distributed platforms such as industrial vehicles and healthcare applications. Khan et al. [18] presented the recent advances of FL towards enabling FL-powered IoT applications. Li, Yan and Lin [19] reviewed related studies of FL on the basis of a universal definition to give guidance for future work.
In summary, most of these reviews focused on FL privacy and protection [10,12,14,15], and others on problems and challenges in FL [11,15,18,20]. Most of the works examined were conventional reviews, and only one of the studies was a systematic literature review (SLR).
The motivation for this SLR is that few SLRs exist in the area of federated learning, and none has been conducted on the integration of FL and ML with BT. We therefore decided to conduct an SLR on the recent advances in FL and ML applications, as well as on the integration of BT and FL, which, judging from the related works shown in Table 1, has not been done before. We also discuss the FL application areas, as researchers have not examined that area recently.

| Authors | Title | Contribution |
|---|---|---|
| Li et al. [12] | A survey on FL systems: Vision, Hype, and Reality for Data Privacy and Protection | They conducted a comprehensive review of FL systems and analyzed the FL system components in terms of privacy and protection. |
| Nguyen et al. [13] | FL meets BT in edge computing: Opportunities and Challenges | The authors presented an overview of the fundamental concepts and explored the opportunities for the FL chain in MEC. |
| Mothukuri et al. [14] | A survey on security and privacy of federated learning | The authors provided a comprehensive study on FL security and privacy that can help to bridge the gaps in the present state of FAI. |
| Ali, Karinmipour & Tariq [15] | Integration of BT and FL for IoT: Recent advances and future challenges | The authors presented the notion of BT and its application in IoT systems. They discussed the privacy issues and preservation techniques in FL. |
| Antunes et al. [16] | FL for Healthcare: Systematic review and architecture proposal | The authors presented a systematic literature review of recent studies on FL in the context of electronic health records for healthcare applications. |
| Lee & Kim [17] | Trends in BT and FL for data sharing in distributed platforms | They reviewed FL and BT mechanisms and then surveyed the integration of BT and FL for data sharing in industrial vehicles and healthcare applications. |
| Khan et al. [18] | FL for IoT: Recent advances, taxonomy, and open challenges | |

Study Selected and Data Gathering Procedures
After the initial literature search, each article's title, keywords, and abstract were examined, and potentially relevant articles were then retrieved and assessed for eligibility using the full text. The PRISMA flow diagram (Figure 1) gives detailed information on the study selection process and depicts the entire procedure of the literature search and selection. Identification, screening, eligibility, and inclusion were the four stages of this procedure. In the identification stage, 4893 papers were gathered (IEEE Xplore = 1785, Taylor and Francis = 830, Sage = 492, SpringerLink = 274, WoS = 167). The total number of papers was 3424 after duplicates were removed. Following that, two independent reviewers undertook a coarse-to-fine evaluation of manuscript eligibility, with one screening titles, keywords, and abstracts and the other reading the complete texts. The exclusion criteria were unpublished theses and dissertations, conference papers, papers not published in a peer-reviewed journal, papers not in English, and papers not applying artificial intelligence models. As a result, screening eliminated 3145 articles and full-text evaluation eliminated 101 papers. Of the original 3548 papers, 84 remained and were found suitable for inclusion in this study.

Search Strategy
The authors performed an electronic search using five publishing databases: IEEE Xplore, Taylor and Francis, Sage, Springer, and WoS. The search was restricted to the English language. The upper limit of the publishing date was the time of the search (November 2021), with a lower limit of January 2017. Table 2 lists the terms used in the search; AND was used as the logical operator. A targeted search was performed to supplement the electronic search. This comprised a Google Scholar online search and a manual examination of the cited references of relevant publications found using the search approach. The relevant papers were then looked up in the ISI Web of Science (on 2 December 2021) to determine whether any additional publications cited them (forward citation search).

Eligibility Criteria
All papers that examined federated learning in a distributed environment and its applications were considered. The inclusion criteria were (i) published between 2017 and 2021, (ii) written in English, (iii) published in a peer-reviewed scientific journal, and (iv) preprint published papers. Studies that were (i) unpublished theses and dissertations, (ii) conference papers, (iii) not in English, or (iv) did not use artificial intelligence models and blockchain technology were excluded from the review.

Information Source and Search
The literature search was carried out via IEEE Xplore, Taylor and Francis, Sage, Springer, and Web of Science (WoS). Numerous searches of the stated e-databases were performed during December 2021 using the following search terms:  3 and distribution per publication source types is shown in Figure 2. Figures 2-4 show the outcomes of these processes. In the next section, the mentioned headings are used to summarize the identified studies and their distribution in research.

Selection Execution
The goal of the search was to compile a preliminary list of studies to be evaluated further. The papers were then examined to determine whether they were appropriate and could be used to answer the formulated research questions, within a time frame of five years between 2017 and 2021 (Figures 1-5). Tables 4-10 summarize some of the studies chosen based on the formulated research questions.

Table 9. Summary of FL application in healthcare.

| Authors | Applicable Domain | Objective | Contribution | Limitation |
|---|---|---|---|---|
| Brisimi et al. [39] | Predicting the number of times a patient will be admitted to the hospital in the future | Cluster Primal-Dual Splitting algorithm | Yields classifiers with a small number of features | Additional iterations are required for convergence |
| Silva et al. [40] | MRI examination | Establish a federated analytic framework that is compatible with ENIGMA's standard pipelines | Effortlessly deals with a variety of high-dimensional features | Only a small dataset was used for testing |
| | | Federated NLP approach with two stages | A pre-processing step has been included to increase accuracy | Not suited to small, suspect instances |
| Gao et al. [42] | Classification of EEG | Build a hierarchical heterogeneous horizontal FL framework | The first EEG classifier developed over heterogeneous EEG data | Works on just three separate datasets at a time |
| Li et al. [43] | Estimating the likelihood of death and the length of hospital stay | Introduce community-based FL and assess its effectiveness on non-IID ICU EMRs | Compared with the baseline FL model, achieves greater prediction accuracy in fewer communication rounds | Community model settings cause extra communication overhead |
| Pfohl et al. [44] | Medical forecasting | Determine the effectiveness of FL in comparison to centralized and local learning | Performs FL in a differentially private manner | The cost of privacy is undervalued |
| Huang et al. [45] | Predicting mortality based on drug use data | Adaptive boosting method | Introduces data-sharing technologies to alleviate the non-IID problem | Training on IID data outperforms training on non-IID data |
| Kim et al. [46] | Computational phenotyping | Computational phenotyping using federated tensor factorization for privacy | Patient data are not revealed since the information is summarized | Only accurate when the data are small or have a skewed distribution |
| Lee et al. [47] | Similar patient matching | Federated patient hashing framework | Reverse engineering is a security threat that should be avoided | Computational complexity is unavoidable |

As presented in Table 4, the systematic review of the literature summarized the number of related articles reviewed.

Results and Discussion
In this section, the data extraction and analysis, a summary of the reviews, the yield of the search strategy, and the limitations of the review study are presented.

Data Extraction and Analysis
The outcomes for each research question are discussed in the following parts, together with an appraisal of the existing works' strengths and limitations.

RQ1: What are the applicable ML methods for FL?
FL is gradually entering the mainstream ML paradigm, with the aim of ensuring privacy and efficiency in FL systems. We focus on three classes of methods that federated learning can support: linear models, decision trees, and neural networks (Table 5).

i. Linear methods
There are three main types of linear models: linear regression, ridge regression, and lasso regression. Du et al. [54] proposed training a linear model in a federated environment, which addresses the security concern of entity resolution and achieves the same accuracy as the non-private alternative. Nikolaenko et al. [55] built the best-performing ridge regression system using homomorphic encryption (see also Lindell and Pinkas [56]). Compared with other models, the linear method is straightforward to apply, which makes it a good model for adopting FL.
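One reason linear models fit a federated setting well is that ridge regression depends on the data only through the sums X^T X and X^T y, which can be computed per client and added up. The following Python sketch illustrates this idea in plain form. It is an illustration under stated assumptions and not the system of [54] or [55]: the statistics are sent unencrypted, whereas the cited works protect them cryptographically, and all function names are hypothetical.

```python
def local_statistics(X, y):
    """Client side: compute X^T X and X^T y on local rows only.
    Raw rows never leave the client in this sketch."""
    d = len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(d)] for a in range(d)]
    xty = [sum(r[a] * t for r, t in zip(X, y)) for a in range(d)]
    return xtx, xty


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (M[r][n] - tail) / M[r][r]
    return w


def federated_ridge(client_stats, lam):
    """Server side: add up client statistics and solve
    (sum X^T X + lam * I) w = sum X^T y."""
    d = len(client_stats[0][1])
    A = [[lam if a == b else 0.0 for b in range(d)] for a in range(d)]
    b = [0.0] * d
    for xtx, xty in client_stats:
        for i in range(d):
            b[i] += xty[i]
            for j in range(d):
                A[i][j] += xtx[i][j]
    return solve(A, b)
```

Because the summed statistics are identical to those of the pooled data, the federated solution matches the centralized one, which is consistent with the equal-accuracy claim reported above.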
ii. Tree models
A single decision tree (DT) or multiple decision trees, for instance gradient boosting decision trees and random forests (RF), may be trained via federated learning. The Gradient Boosting Decision Tree (GBDT) method has attracted a lot of attention lately, owing to its excellent performance in a variety of classification and regression applications. Zhao et al. [57] were the first to use a privacy-preserving GBDT system in regression and binary classification tasks. To avoid leaking users' private data, the system securely combines regression trees learned by multiple data owners into an ensemble. Cheng et al. [58] presented the SecureBoost framework, which allows users to build an FL system by training a gradient boosting DT model on horizontally and vertically partitioned data.

iii. Neural network (NN) models
The NN model is currently a prominent ML method; it trains neural networks to perform complicated tasks. Deep neural network research is becoming increasingly prevalent in the federated setting. Drones may help with a wide range of tasks, including trajectory planning, target identification, and target localization. A UAV (Unmanned Aerial Vehicle) group typically trains its model through DL to provide more efficient services, but owing to the absence of a continuous connection between the UAV group and the ground base station, a centralized training technique cannot deliver real-time performance for the UAVs. Zeng et al. [59] were the first to apply a distributed FL approach to a UAV group, improving federated learning convergence speed and performing joint power allocation and scheduling. The lead UAV aggregates the local flight models trained by the other UAVs into a global flight model, which is then delivered to the other UAVs over the intra-group network. Bonawitz et al. [60] used TensorFlow to build a scalable FL system for mobile devices that can train models on a large quantity of distributed data. To support priority applications on enterprise data, Yang et al. [10] put forward a federated DL framework built on data partitioning. In addition to enterprise data applications, traffic flow data in government big data regularly contains a significant amount of private user information. Liu et al. [61] proposed a clustering FedGRU technique that integrates the optimal global model and captures the spatio-temporal correlation of traffic flow data more precisely by combining a GRU (Gated Recurrent Unit) NN for traffic flow forecasting with FL. Experiments on real data sets show that it significantly outperforms non-federated learning approaches.
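The pattern of training locally and aggregating centrally that runs through these systems can be sketched with federated averaging, one common aggregation rule. The Python example below is a toy under stated assumptions: it uses a one-weight linear model instead of a neural network so that it stays self-contained, and the function names, learning rate, and number of rounds are illustrative choices, not taken from any cited work.

```python
def local_update(weights, data, lr=0.05, epochs=1):
    """Client side: gradient descent on the mean squared error of y ~ w * x.
    `weights` is a one-element list and `data` is a list of (x, y) pairs."""
    w = weights[0]
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return [w]


def federated_average(client_weights, client_sizes):
    """Server side: average client models, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]


def run_rounds(global_weights, client_data, rounds=30):
    """Repeat: broadcast the global model, train locally, then aggregate."""
    for _ in range(rounds):
        local = [local_update(global_weights, d) for d in client_data]
        sizes = [len(d) for d in client_data]
        global_weights = federated_average(local, sizes)
    return global_weights
```

Only model weights travel between clients and server in this loop; the (x, y) pairs stay with the client that holds them.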

RQ2: What is the categorization of federated learning?
Here we discuss how to classify FL based on the distribution characteristics of the data. According to Yang et al. [10], FL may be divided into three categories: horizontal FL, vertical FL, and federated transfer learning (FTL). Data stored at separate nodes or institutions are generally in the form of a feature matrix. In most cases, the data comprise many instances, with each row of the matrix representing a client (sample) and each column representing one of the client's features. Depending on the data partition mode, FL can then be categorized as follows (Table 6).
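The difference between the partition modes can be made concrete with a small Python sketch that splits one feature matrix by rows (horizontal) or by columns (vertical). The record layout and the feature names used in the usage note are hypothetical and serve only as an illustration.

```python
def horizontal_split(rows, n_parties):
    """Horizontal partition: each party holds different samples (rows)
    described by the same full set of features."""
    return [rows[p::n_parties] for p in range(n_parties)]


def vertical_split(rows, feature_groups):
    """Vertical partition: each party holds the same samples (matched by
    "id") but only its own group of feature columns."""
    return [
        [{"id": r["id"], **{f: r[f] for f in feats}} for r in rows]
        for feats in feature_groups
    ]
```

For example, with records of the form `{"id": 1, "age": 30, "weight": 70, "steps": 5000}`, calling `horizontal_split(rows, 2)` gives two parties disjoint sets of patients with all features, while `vertical_split(rows, [["age", "weight"], ["steps"]])` gives two parties the same patients with different features.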

i. Horizontal FL
In horizontal FL, the data spread over multiple nodes overlap in feature space but differ considerably in sample space. At present, FL algorithms are largely intended for use in smart devices or Internet of Things (IoT) devices. Horizontal FL is the most common kind of FL in these settings, since the data may vary greatly in sample space while sharing a comparable feature space. Because the data have the same feature dimension (FD), the federated model solution for Android mobile phone updates proposed by Google [62] is a horizontal FL. In addition, Gao et al. [42] proposed a hierarchical heterogeneous horizontal FL framework to address the problem of limited labeled entities. The lack of labels may be handled by applying heterogeneous domain adaptation multiple times, each time using a different participant as the target domain. This helps to compensate for the absence of data annotation in EEG classification. In real-world applications such as medical care, data collection inevitably involves a great amount of effort. It is almost impossible for any single institution to create a data pool for sharing when it comes to cross-regional collaboration. To strengthen the joint model, FL could build a federated network for cross-regional hospitals with comparable healthcare information.
ii. Vertical FL
Vertical FL is appropriate for scenarios in which data are partitioned vertically by feature dimension. All parties hold homogeneous data, meaning that they have some sample ID overlap but differ in feature space. For instance, consider a healthcare facility that aims to forecast disorders such as diabetes mellitus. According to studies, people with high blood pressure (HBP) and obesity are more likely to develop type 2 diabetes [63]. The risk may therefore be assessed from certain general measurements, for instance the age and weight of the patients and their health history. A young man who has neither obesity nor HBP but consumes excess calories and does not engage in physical exercise is also at risk of diabetes; owing to a lack of information, however, his risk cannot be predicted or personalized. With the development of FL, the facility could collaborate with firms that hold data sets from smartphone applications such as step counters or dietary structure trackers, without requiring raw data transfer. Scholars often approach this problem by extracting the same entities with different features for joint training. Because of entity resolution, this is a more difficult task than horizontal FL. Unlike horizontal FL, pooling all the datasets on a shared server to learn a global model does not work for vertical FL, since communication among the various data owners remains a pressing issue. To preprocess vertically partitioned data, Nock et al. [64] developed an improved token-based entity resolution technique. To defend against honest-but-curious adversaries in vertical FL, Hardy et al. [65] proposed an end-to-end technique based on a linear classifier and applied additively homomorphic encryption. Existing applications for parties with a similar sample space, such as traffic violation evaluation and small business credit risk analysis, are reported to be built on FATE, which was developed by the WeBank team. Furthermore, Cheng et al. [58] developed SecureBoost, a secure framework for vertically partitioned data sets. The approaches outlined above, however, can only be used with basic ML methods such as logistic regression. As a result, vertical FL still has a lot of room for development when it comes to more complex machine learning methods.
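The entity alignment step that makes vertical FL harder than horizontal FL can be sketched as follows. This Python example matches records through salted hashes of a shared sample ID. It is a deliberate simplification and not the token-based method of [64]: salted hashing is not a secure private set intersection protocol, the shared `salt` and all names are assumptions, and both parties are simulated inside a single function for brevity.

```python
import hashlib


def blind(sample_id, salt):
    """Hide a raw sample ID behind a salted SHA-256 digest."""
    data = (salt + str(sample_id)).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def align(party_a, party_b, salt):
    """Join the feature dicts of two parties on their shared sample IDs.
    party_a and party_b map sample ID -> dict of that party's features.
    In a real deployment each party would compute its digests locally
    and exchange only the digests."""
    blinded_a = {blind(i, salt): i for i in party_a}
    blinded_b = {blind(i, salt): i for i in party_b}
    shared = sorted(set(blinded_a) & set(blinded_b))
    return [
        {**party_a[blinded_a[h]], **party_b[blinded_b[h]]} for h in shared
    ]
```

Records that appear at only one party are dropped, so the joint training set consists of the overlapping samples with the union of both feature spaces.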

iii. Federated Transfer Learning (FTL)
In contrast with the scenarios in horizontal and vertical FL, in most circumstances the data share neither sample space nor feature space. The key issues in this scenario are therefore a lack of data labels and poor data quality. Transfer learning (TL) transfers knowledge from one domain (the source domain) to another domain (the target domain) to improve learning outcomes, which is ideal in this case [66]. In this way, Liu et al. [9] devised FTL as a technique to generalize FL to parties whose data have only small intersections. This is the first FL framework based on transfer learning that includes training, evaluation, and cross-validation. Furthermore, the neural networks in this framework, combined with additively homomorphic encryption, not only avoid privacy leakage but also give accuracy equivalent to non-privacy-preserving methods. Nevertheless, communication efficiency continues to be a problem. Sharma et al. [67] therefore worked to improve FTL. Instead of using HE, they used secret sharing to cut overhead while maintaining accuracy. Their approach can also be extended to defend against malicious servers, whereas the earlier work assumed a semi-honest model. For a real-world application, Chen et al. [13] built the FedHealth system, which uses FL to aggregate data from many organizations and then uses transfer learning to provide personalized healthcare services. Using FTL, diagnostic and treatment information for certain illnesses from one hospital could be transferred to another hospital to aid in the diagnosis of other diseases. FTL research is still in its early stages, so there is a lot of room for improvement to make it more flexible with various data structures. Data islands and privacy concerns are two major difficulties that have arisen from the present large-scale industrialization of ML. FTL is a viable technique to safeguard both data security and user privacy while breaking down the boundaries between data islands.
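The secret sharing idea that [67] uses in place of homomorphic encryption can be illustrated with additive secret sharing over a prime field. The Python sketch below shows only the general technique and is not the protocol of [67]: the field size, the function names, and the use of a seeded `random.Random` (not cryptographically secure; a real system would use a secure source such as the `secrets` module) are assumptions made for illustration.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; all arithmetic is done modulo PRIME


def share(secret, n_parties, rng):
    """Split an integer into n additive shares that sum to it mod PRIME.
    Any n - 1 shares on their own look uniformly random."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recover the secret by adding all shares mod PRIME."""
    return sum(shares) % PRIME


def add_shared(shares_x, shares_y):
    """Each party adds its two local shares; the results are shares of the
    sum x + y, obtained without any party seeing x or y."""
    return [(a + b) % PRIME for a, b in zip(shares_x, shares_y)]
```

Addition on shares needs no ciphertext operations at all, which gives some intuition for why secret sharing can reduce overhead relative to HE.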

RQ3: What are the FL application areas?
By enabling a collaborative model free of legal concerns, FL has become a popular strategy. Despite the restrictions and considerable problems outlined above, early adopters saw significant prospects in FL and began a series of related research efforts to implement FL in real life. Numerous applications connected to industrial engineering or computer science are discussed in this section.

i. Application for mobile devices
Since Google originally proposed the notion of FL to predict users' input via Gboard on Android devices, academics have been paying close attention to it. Chen et al. [12], Leroy et al. [21], Hard et al. [23], and Yang et al. [24] have all made improvements to keyboard prediction. Emoji prediction is also a focus of study [25]. Another possible application is to apply FL to smart devices to predict human trajectories [28] or human behavior [29].
Although mobile device storage space and computational power are rapidly increasing, transmission capacity constraints make it challenging to meet the increasing quality demands of mobile users. To avoid network congestion, most service providers choose to provide a service environment at the edge of the cellular network, close to the client, rather than integrating cloud computing and cloud storage into the core network. This technology is called mobile edge computing (MEC); however, it comes with a higher risk of data leakage. The combination of FL and MEC is one potential approach. Wang et al. [26] developed an 'In-Edge AI' framework that combines FL based on deep reinforcement learning with an MEC system to further optimize resource allocation. Furthermore, Qian et al. [27] focused on the application of FL to MEC. They created a privacy-aware service placement technique that delivers high-quality service by caching the needed services on edge servers close to customers.
In this scenario, mobile devices are not limited to regular phones; they also include IoT devices. One of the most important IoT applications is the smart home. Devices in a smart home upload certain associated data to a cloud server to better understand customers' preferences, which might lead to a data breach. Aïvodji et al. [30] therefore describe a secure federated architecture that can be used to develop joint models. Yu et al. [31] created a federated multi-task learning framework for smart home IoT to automatically learn users' activity patterns and identify physical dangers. In addition, Liu et al. [32] proposed a data fusion strategy based on FL for robots' imitation learning in robot networking. This technology might be used to develop guidance models and predict different emergencies in self-driving automobiles. The research on FL applications in mobile devices discussed above is summarized in Table 7.

ii. Application in industrial engineering
As a result of FL's success in data confidentiality fortification, it's only natural for industrial engineering (IE) to follow suit with FL applications.Due to legal and regulatory restrictions, data in certain sectors is not readily accessible.However, we can only take advantage of these dispersed datasets to acquire limitless benefits if FL is applied to these locations.
To the best of our knowledge, FL might have widespread adoption and application possibilities in data-sensitive domains for IE as a result of its ascent and development.
In the context of environmental protection, Hu et al. [33] devised a new conservational monitoring framework based on federated region learning (FRL) to compensate for the difficult interchangeability of observing data.Thus, observing data scattered from many sensors might be used to improve the collaborative model's performance.FL is also used to do visual inspections [34].It could not solitarily assist us to overcome the issue of insufficient faulty illustrations for detecting flaws in production jobs, nonetheless, it could similarly provide manufacturers with privacy assurances.Liu et al. [32] use FL to collect diversiform illustrations from federated tasks for improved grounding applications in picture fields.FL has suited for malicious attack detection in communication systems constituted of Unmanned Aerial Vehicles (UAVs) in addition to picture recognition and representation [35].Since UAV characteristics such as imbalanced data distribution and poor communication situations are extremely similar to FL difficulties.With the increased popularity of electric cars, Saputra et al. [36] developed a federated energy demand forecast approach for diverse charging stations to avoid energy congestion in the communication procedure.Furthermore, Yang et al. [37] used FL to transactions held by multiple banks to easily identify credit card swindles, which is a major addition to the financial area.Wang et al. [38] use a federated architecture based on Latent Dirichlet Allocation to do text mining.It passed the spam filtering and sentiment analysis tests on actual data.
To conclude, FL allows data owners to broaden the scope of their data applications and improve model performance by iterating across multiple entities. FL technology will help more sectors become smarter in the future. The combination of FL and AI could create a federated ecosystem free of data privacy concerns. The research on FL applications in industrial engineering discussed above is summarized in Table 8.
iii. Application in Healthcare
FL has a bright future in healthcare as a disruptive technique for preserving data confidentiality. Although each medical facility may hold a large volume of patient data, this may not be sufficient to train its own prediction models [68]. Combining FL with disease prediction is one effective option for breaking down the barriers to analysis across hospitals.
EMRs (electronic medical records) provide a wealth of clinical information. Kim et al. (2017) employed tensor factorization models for phenotyping analysis to extract information from health records without revealing patient-level data. It might be considered the first FL application in the medical field. In a federated setup, Pfohl et al. [44] investigated differentially private learning for EMRs. They also showed that the results are comparable to training in a centralized environment. Huang et al. [45] utilize EMRs from several hospitals to estimate the mortality of heart disease patients. There is no data or parameter communication across hospital databases throughout the training phase. In addition, data collected from the various remote clients by the central server are encoded in advance, and the decoder is discarded after training. Brisimi et al. [39] also utilize EMRs to predict whether a patient with heart disease will be admitted to the hospital, using an FL method known as cluster Primal-Dual Splitting (cPDS). This prediction task may be carried out on health monitoring devices or in hospitals that store medical data, without leaking information. Lee et al. [47] proposed a federated patient hashing architecture based on health data to find similar patients in multiple institutions without exchanging patient-level information. This kind of patient matching could help physicians summarize the general characteristics of similar patients and guide treatment with greater experience. Huang et al. [45] applied the Loss-based adaptive boosting Federated Averaging method to medication consumption data retrieved from the MIMIC-III database to predict patient mortality. This study examined computational complexity, communication costs, and accuracy for each client, and found that the method outperformed the baselines.
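To make the aggregation step behind Federated Averaging concrete, the following is a minimal sketch of plain, sample-size-weighted averaging of client parameters. It is not the loss-based adaptive boosting variant used by Huang et al. [45]; the function name `federated_average`, the hospital parameter vectors, and the sample counts are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Return the sample-size-weighted mean of the client parameter vectors."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for j, w in enumerate(weights):
            # Each client contributes in proportion to its local sample count.
            global_weights[j] += (n / total) * w
    return global_weights


# Hypothetical example: three hospitals with locally trained parameters.
hospital_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
hospital_sizes = [100, 100, 200]
global_params = federated_average(hospital_params, hospital_sizes)
```

Only the parameter vectors and sample counts leave each hospital in this sketch; the patient-level records stay local, which is the property the studies above rely on.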
Studies have also shown that FL can be used to analyze real-world health record data in the realm of natural language processing (NLP). The need for processing the unstructured data in clinical notes is highlighted by Liu et al. [41]. It was the first time NLP was used in conjunction with FL. They used a two-stage federated training model consisting of pre-processing to predict a representation model for each patient and phenotyping training to study each kind of illness. FL has recently become popular in the field of biological image analysis. Silva et al. [40] proposed federated principal components analysis (fPCA) to extract features from magnetic resonance imaging (MRI) data from several medical facilities. In addition, Gao et al. [42] developed a hierarchical heterogeneous horizontal FL (HHHFL) framework for EEG classification to tackle the difficulty of limited labeled cases as well as the privacy limitation.
To the best of our knowledge, as FL continues to grow and mature, it may find broad adoption and application in data-sensitive industries beyond the aforementioned domains. In 2019, the use of FL increased by leaps and bounds. As a result, FL is expected to have considerable potential in the future. FL currently contributes mostly to horizontal collaborative training in deployed applications, meaning that the feature dimensions of each party's data are identical. Medical data at hospitals might be shared with other institutions in the future, such as insurance agents, to obtain more affordable pricing. Vertical FL is therefore a viable path to pursue. Furthermore, one issue is that current federated training is focused on a limited number of organizations and is unable to scale to collaborative training across a large number of devices or institutions. As a result, better analysis of mobile device data based on FL should be pursued to provide more useful data. The research on FL applications in healthcare discussed above is summarized in Table 9.
RQ4: What is the relationship between FL and BT concerning data sharing in distributed systems?
Blockchain (BC) is a relatively new technology that is rapidly gaining traction in many countries. In a nutshell, BC is a distributed ledger inspired by Bitcoin [69][70][71][72][73], characterized by decentralization, immutability, traceability, collective maintenance, openness, and transparency. Quality inspection of 3D-printed parts [74], consumption monitoring and privacy-preserving energy trading for smart grids [75], and emergency healthcare services for pre-hospital care are just a few of the BC-aided systems for industrial data sharing that have been proposed [76]. Existing BC research is mostly focused on developing innovative medical information sharing systems [77], but collaborative training to optimize data use has yet to be applied. According to recent work, BC can substantially alleviate several challenges in FL. FL and BC are complementary technologies. BC is a natural fit for FL since it is a distributed technology that is inherently secure. Since the BC architecture tolerates rogue nodes, it will continue to function correctly as long as malicious nodes do not make up a majority (51% or more) of total nodes.
Majeed and Hong [78] envisioned a robust FL chain that could validate local model updates by injecting blockchain technology (BCT) into the framework. Although BCT may ensure the security of a complete architecture, it has nothing to do with privacy. Individual node allusion does not pose a threat to privacy. If a malicious clinic or hospital participates in the collaborative training, it may go to great lengths to pry into the personal information of other participants. Hence, Ilias and Georgios [79] employed a BC smart contract to coordinate all clients and homomorphic encryption to provide additional privacy. Awan et al. [80] integrated a variant of the Paillier cryptosystem into their BC-based privacy-preserving FL architecture as a precautionary step to prevent privacy leakage. Furthermore, by using BC, each party's contribution to optimizing the global model can be tracked, allowing an incentive system to be implemented. The BC-based FL frameworks described above did not provide a specific incentive for clients to participate in training. A dynamic weighting mechanism was presented to increase the performance of FL [81]. To encourage high-quality clients to take part in the training, it used learning accuracy and participation frequency as training weights. In addition, Kim et al. [46] introduced Block-FL, which rewards clients that hold a large number of samples and thereby shortens convergence time. To summarize, combining BC with FL is advantageous since blockchain is a decentralized technology that eliminates the need for a central server to predict global models. As a result, it may be able to overcome FL's bandwidth limitations. In addition, it can not only exchange updates while verifying their correctness to improve security, but also use some kind of incentive mechanism to improve the FL service. When it comes to sharing learning models, however, incorporating blockchain may introduce extra delay. A BC-built FL with low latency would be preferable.
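The idea of using learning accuracy and participation frequency as training weights can be illustrated with a small sketch. The exact formula of [81] is not given here, so the code below assumes, purely for illustration, that a client's score is the product of its accuracy and its participation rate, normalized so that the weights sum to one. The function name `contribution_weights` and all numbers are hypothetical.

```python
from typing import List, Sequence


def contribution_weights(accuracies: Sequence[float],
                         participation_counts: Sequence[int],
                         total_rounds: int) -> List[float]:
    """Normalized client weights from accuracy and participation rate.

    Assumption (not taken from the cited work): score = accuracy * rate.
    """
    if len(accuracies) != len(participation_counts):
        raise ValueError("need one participation count per client")
    if total_rounds <= 0:
        raise ValueError("total_rounds must be positive")
    scores = [acc * (count / total_rounds)
              for acc, count in zip(accuracies, participation_counts)]
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one client needs a positive score")
    return [s / total for s in scores]


# Hypothetical clients: accurate and active, less accurate, accurate but rarely active.
weights = contribution_weights(
    accuracies=[0.9, 0.6, 0.9],
    participation_counts=[10, 10, 5],
    total_rounds=10,
)
```

Under this assumed scoring rule, a client that is both accurate and frequently present receives the largest share, which matches the stated goal of encouraging high-quality clients to participate.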
Blockchain is a networked, peer-to-peer, distributed, open-source, immutable public digital ledger. A blockchain is a ledger made up of a chain of blocks with consensus methods and encryption. This ledger keeps track of all transactions and interactions between users of the decentralized and distributed BC scheme [48]. This network setup is resistant to malicious attacks since it only contains blocks agreed upon between users. Consensus techniques, for instance proof-of-stake (PoS) and proof-of-work (PoW), are used to reach agreement in a distributed setting. Financial services, smart contracts, IoT, and security services are all possible applications of blockchain technology.
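The notions of a chain of blocks and proof-of-work can be sketched in a few lines. The example below is a toy illustration only: the block fields, the two-hex-digit difficulty prefix, and the helper names `block_hash`, `mine_block`, and `is_valid_chain` are assumptions made for this sketch and do not correspond to any particular blockchain platform.

```python
import hashlib
import json
from typing import List

DIFFICULTY_PREFIX = "00"  # toy difficulty: hash must start with two hex zeros


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def mine_block(index: int, prev_hash: str, payload: str) -> dict:
    """Search for a nonce whose block hash meets the difficulty (proof-of-work)."""
    nonce = 0
    while True:
        block = {"index": index, "prev_hash": prev_hash,
                 "payload": payload, "nonce": nonce}
        if block_hash(block).startswith(DIFFICULTY_PREFIX):
            return block
        nonce += 1


def is_valid_chain(chain: List[dict]) -> bool:
    """Check each block's proof-of-work and its link to the previous block."""
    for i, block in enumerate(chain):
        if not block_hash(block).startswith(DIFFICULTY_PREFIX):
            return False
        if i > 0 and block["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


genesis = mine_block(0, "0" * 64, "genesis")
second = mine_block(1, block_hash(genesis), "model update from client A")
chain = [genesis, second]
```

Because every block stores the hash of its predecessor, altering an earlier block changes its hash and breaks the link to the next block, which is what makes the ledger hard to modify after the fact.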
Blockchain may be used to attract clients for businesses that demand a high level of reliability and integrity. It is also distributed, which eliminates the risk of a single point of failure [49]. The combination of AI and BCT has paved the way for several robust systems that enable the collaboration of numerous devices while maintaining confidentiality, authentication, and integrity [48]. The decentralized nature of BC [50] can potentially replace the central server in a BC-built FL (BCFL) system. Instead of a centralized server, smart contracts (SC) may perform the same operations and be triggered by blockchain transactions. In other words, FL is carried out by the participating nodes using BC, which keeps track of global models and local updates. BC storage, a committee consensus mechanism, and a model training component make up the approach. The BCFL training material is stored on a BC scheme to which only authorized devices have access. In the committee consensus method, a limited number of trusted nodes form a committee that verifies updates and assigns them a score. Only the most up-to-date changes will be stored on the blockchain. A new committee is constituted at the beginning of each cycle. Nodes outside the committee perform local model training in every cycle. Lu et al. [51] present a technique for sharing data across multiple distributed clients in IIoT applications that integrates FL into a permissioned blockchain. Kang et al. [52] propose a distributed vehicle strategy to alleviate the communication burden and address provider privacy concerns. The integration of FL with BCT, according to Rahman et al. [53], adds value to the healthcare industry. Table 10 summarizes the articles reviewed on the current state of data sharing in distributed systems in BC and FL.
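One round of the committee step described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the protocol of any cited system: each committee member is modeled as a scoring function, an update is accepted when the median committee score reaches a threshold, and the scoring rule (negative squared distance to a member's own reference parameters) is invented for the example. The names `make_distance_scorer` and `committee_round` are hypothetical.

```python
from statistics import median
from typing import Callable, Dict, List, Sequence

Scorer = Callable[[Sequence[float]], float]


def make_distance_scorer(reference: Sequence[float]) -> Scorer:
    """Hypothetical committee member: scores an update by closeness to its reference."""
    def score(update: Sequence[float]) -> float:
        return -sum((u - r) ** 2 for u, r in zip(update, reference))
    return score


def committee_round(updates: Dict[str, List[float]],
                    committee: Sequence[Scorer],
                    threshold: float) -> Dict[str, List[float]]:
    """Keep only the updates whose median committee score meets the threshold."""
    accepted = {}
    for client_id, update in updates.items():
        scores = [member(update) for member in committee]
        if median(scores) >= threshold:
            accepted[client_id] = update  # would be written to the ledger
    return accepted


# Hypothetical committee of three trusted nodes and two submitted updates.
committee = [make_distance_scorer(ref)
             for ref in ([1.0, 1.0], [1.1, 0.9], [0.9, 1.1])]
updates = {"a": [1.0, 1.0], "b": [5.0, -3.0]}
accepted = committee_round(updates, committee, threshold=-0.5)
```

Using the median rather than the mean is a design choice of this sketch: a single committee member that scores dishonestly cannot by itself push an update over or under the threshold.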
RQ5: What are the ML algorithms implemented with FL?
Some studies have employed ML and deep learning (DL) with FL, and some of these are shown in Table 11. The methods, datasets, performance metrics, and limitations of each study are listed.

Summary of the Review
This review included 22 articles in the systematic review and examined them based on ML approaches, categorization, and application areas. A summary of our investigation is available in Figure 2. This investigation reviews several interesting and valuable articles on the state of the art in FL. This article is organized by ML approaches, categorization, and application areas. Figure 1 shows the PRISMA flow diagram of how the systematic review was conducted. Table 1 summarizes the related literature reviewed, and Table 2 shows the databases and keywords used for the study search.

Search Strategy Yield
Figure 2 shows a full summary of the search strategy yield. The database search yielded sixty-one published publications, with an additional twenty-three items discovered using a focused Google Scholar search and the reference lists of pertinent articles. Two reviewers (ROO and SM) screened the publications and applied the inclusion criteria to identify the relevant ones. Figure 2 depicts the procedure. Despite satisfying all of the inclusion requirements, one hundred and twenty-four articles were rejected for being duplicates, one hundred and twelve for being irrelevant, ten for being book chapters, and twenty-six for being beyond the topic of the article. At this point, sixty-one papers were selected, and these sixty-one articles were entered into the ISI Web of Science database for a forward citation search. This search resulted in the discovery of eight new articles. A total of eighty-four publications were found to be suitable for inclusion in this systematic review.

Comparative Analysis
The study was compared with existing related works (literature reviews), and, as seen in Table 12, our study delves deeper into the relationship between blockchain, ML, and federated learning. Ali, Karimipour and Tariq [15] discussed the present progress and upcoming challenges in blockchain and federated learning for IoT. Nguyen et al. [20] presented the opportunities and challenges of FL meeting BT in edge computing. Li et al. [19] discussed the application areas of federated learning alone. Passerat-Palmbach et al. [93] presented a study on blockchain-orchestrated ML for privacy-preserving FL on electronic medical data. Zeng et al. [94] presented a comprehensive review of incentive mechanisms for FL. Hou et al. [95] systematically reviewed architectures, applications, and issues encountered in blockchain-built FL. Preuveneers et al. [96] examined an intrusion detection case study of a chained anomaly detection model for FL. Lee and Kim [17] discussed trends in BT and FL for data sharing in distributed platforms. The closest studies to the present review are the surveys by Nguyen et al. [13], Li et al. [19], Zeng et al. [94], Hou et al. [95], and Lee and Kim [17]; the difference is that they did not use the PRISMA systematic review method, and their studies are limited to only one aspect of blockchain and FL. ML was also not included in their studies. Hence, we contribute to knowledge by conducting a systematic review, using the PRISMA method, of federated learning and machine learning methods, categorization, application areas, and blockchain technology.

Limitation
Because only studies published in English were included, and because of the chosen search keywords and database constraints, some relevant publications may be missing despite the exhaustive search across databases. Important data may also be found in non-peer-reviewed research, as well as in unpublished theses and dissertations.

Conclusions and Future Work
FL is a collaborative, decentralized, privacy-preserving approach that addresses data silos and data sensitivity issues. We examined existing machine learning models for FL in this work. For FL modeling, it was argued that hybrid ML and DL models can typically outperform classic ML. However, several hurdles and issues with these ML approaches have yet to be overcome. This research makes two contributions. First, we have provided a comprehensive overview of several machine learning (ML) methodologies that may be used in FL applications. Second, we discussed some future research possibilities. Federated learning is expected to offer safe and shared security services for more applications soon, promoting the steady growth of artificial intelligence. Based on existing studies, the study acknowledges certain unsolved difficulties, including extreme communication schemes, communication reduction and the Pareto frontier, heterogeneity diagnostics, granular privacy constraints, going beyond supervised learning, productionizing FL, and benchmarks.
A future study might advance the understanding of FL by providing (i) findings on hybrid deep learning classification methods for FL and (ii) findings on using larger datasets for FL implementations. Similarly, the unsolved difficulties can also be considered for solutions in the future. It is also suggested that future surveys examine which FL solutions exist for different data partition cases and for which application domains.

Figure 1.PRISMA flow diagram of paper selection employed in this review study.
(Federated learning AND Distributed environment AND Machine learning Model OR Blockchain; 'Federated learning AND Distributed environment AND Machine learning Model OR Blockchain' within Computer Science, Article, 2017-2021; [All: federated] AND [All: learning] AND [All: distributed] AND [All: environment] AND [All: machine] AND [All: learning] AND [[All: model] OR [All: blockchain]] AND [Publication Date: (01/01/2017 TO 12/31/2021)]; [All federated] AND [All learning] AND [All distributed] AND [All environment] AND [All machine] AND [All learning] AND [[All model] OR [All blockchain]] within 2017-2021 OR Federated learning AND Distributed environment AND Machine learning Model OR Blockchain) The keywords used in the database searching are shown in Table

Figure 2. Distribution per publication source types.

Figure 3. Number of publications per year.

Figure 4. Articles that are pertinent to the study which was reviewed.

Figure 5. Categorization of primary articles based on a present review on FL.

Table 1. Summary of the state-of-the-art related works reviews.

Table 2. Databases and Keywords used for Study Search.

Table 3. Keywords and search string.

Table 4. Summary of some selected studies and number of articles reviewed.

Table 5. Summary of Applicable Machine learning Methods.

Table 6. Summary of FL categorization.

Table 7. Summary of FL application in mobile devices.

Table 8. Summary of FL application in industrial engineering.

Table 10. Summary of research on the current state of data sharing in distributed systems in BC and FL.

Table 11. Summary of ML algorithms implemented with FL.

Table 12. Summary of some related review.