1. Introduction
The rapid adoption of intelligent healthcare systems in recent years has significantly contributed to the exponential growth in the volume of medical data generated. Concurrently, the evolution of the Internet of Things (IoT) has given rise to the Internet of Medical Things (IoMT), a specialized ecosystem comprising interconnected medical devices and equipment. IoMT is playing an increasingly vital role in modern healthcare, influencing not only clinical practices but also having a broad impact across various sectors of the economy. IoT-enabled sensors are now embedded in wearable technologies, medical devices, and healthcare infrastructure in both hospital and home settings. Furthermore, the integration of medical robotics has introduced capabilities such as performing surgical procedures, assisting in patient care, and enabling remote patient monitoring. Therefore, IoMT signifies a pivotal advancement at the intersection of healthcare and digital technology, integrating medical devices and applications with internet connectivity to enable real-time collection, analysis, and transmission of health-related data. The IoMT ecosystem includes a broad spectrum of technologies, including wearable sensors, implantable devices, and remote patient-monitoring systems, all designed to optimize clinical decision-making and enhance the efficiency of healthcare operations [
1]. By facilitating continuous monitoring and data collection, IoMT has the potential to revolutionize healthcare operations, particularly in the proactive management of chronic diseases and the enhancement of patient outcomes.
The growing ecosystem of connected medical technologies not only facilitates continuous monitoring and data acquisition but also lays the foundation for advanced data-driven healthcare solutions. In this context, Machine Learning (ML) and Deep Learning (DL) techniques are transforming healthcare by enabling predictive analytics, personalized medicine, and advanced decision support systems. These methods can process large volumes of heterogeneous health data, identify complex patterns, forecast disease progression, and recommend optimized treatment strategies [2,3]. The synergy between IoMT and ML/DL fosters the development of intelligent healthcare systems capable of real-time data analysis and responsive clinical interventions, ultimately enhancing the quality, efficiency, and personalization of care delivery.
The vast amount of medical big data generated by IoMT devices presents significant opportunities for applying ML and DL techniques to mine, store, analyze, and utilize this data to improve the quality of care and treatment delivery [
4]. However, the inherently complex and fragmented nature of IoMT systems introduces substantial privacy and security challenges. The many stakeholders, including hospitals, laboratories, clinicians, and researchers, often require access to patient data that is distributed across various platforms. Given the sensitive nature of personal health information, strict privacy regulations and secure data access protocols are essential [
5,
6]. Consequently, despite the promising advancements driven by the integration of IoMT and ML/DL in smart healthcare systems, data privacy and security concerns continue to pose a significant barrier to their widespread adoption and implementation. Moreover, the sensitive nature of health data necessitates stringent measures to protect patient confidentiality and comply with regulatory frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR). These regulations impose strict requirements on the collection, storage, and sharing of medical data, often limiting access to comprehensive datasets necessary for centralized training of ML/DL models [
7]. Furthermore, IoMT-generated medical data is vulnerable to security breaches, unauthorized access, and misuse, necessitating robust privacy-preserving mechanisms that are also computationally efficient. These constraints present significant challenges for conventional ML/DL approaches, which typically require large and centralized datasets for training [
8]. To address these limitations, Federated Learning (FL) has emerged as a promising privacy-preserving paradigm that enables collaborative model training across decentralized devices or institutions without the need to share raw data. In the FL framework, individual clients (e.g., hospitals or edge devices) locally train models using their private data and only transmit model updates, rather than the data itself, to a central server. The server then aggregates these updates to construct a global model that benefits from diverse data distributions while maintaining data locality and confidentiality [
9,
10]. This decentralized approach aligns with privacy regulations by ensuring that sensitive patient data remains on local devices, thereby reducing the risk of data breaches and unauthorized access. In practice, FL clients download an initial model from a central server, perform local training on their private data, and only send model updates back to the server. The central server integrates updates from all clients to improve the global model iteratively across multiple training rounds [
11].
Enhancing privacy through an FL-based approach has the potential to significantly improve health-monitoring practices by enabling the collection and analysis of sensitive patient data while keeping privacy and security a consistent priority [12]. The approach also allows healthcare providers and researchers to draw on medical data without compromising patient privacy. Thus, applying FL within the IoMT domain has the potential to enhance patient health outcomes while mitigating the privacy and security risks associated with sensitive medical data [13].
The aim of this study is to address the critical challenge of ensuring data privacy and security in the application of FL technique to healthcare data collected through IoMT devices. To achieve this objective, the study explores the integration of FL into IoMT-based healthcare analytics as a means to overcome the inherent privacy and security limitations of traditional centralized ML approaches. Therefore, a novel FL approach called FED-EHR is proposed to enable privacy-preserving analytics to be performed on Electronic Health Records (EHRs). FED-EHR is designed following the Privacy-by-Design principle, ensuring that sensitive medical data remains strictly local to the originating device or institution throughout the training process. This decentralized architecture enables collaborative learning across multiple data sources while preserving data locality, minimizing privacy risks, and ensuring compliance with major healthcare data protection regulations such as HIPAA and GDPR. By facilitating secure predictive modeling without compromising patient confidentiality, FED-EHR provides a robust and practical solution for privacy-sensitive healthcare analytics. To evaluate the framework, it is tested on two publicly available healthcare datasets related to diabetes and breast cancer. Further, FED-EHR is benchmarked against traditional Centralized Machine Learning (CML) approaches. In contrast to existing methods that often rely on data anonymization, encryption, or differential privacy, this study proposes a novel solution that offers a decentralized approach to EHR analysis, enabling an efficient and scalable FL approach for privacy-preserving healthcare analytics. This study is particularly significant in advancing secure and scalable healthcare analytics, especially in settings where data sharing is constrained due to ethical, legal, or infrastructural limitations. 
Hence, this study contributes a practical, regulation-aware, and deployable solution to the ongoing challenges of smart healthcare systems by demonstrating that FL can deliver performance comparable to centralized models while protecting patient data.
In summary, the recent advances in IoMT and ML have introduced new opportunities for the development of intelligent healthcare systems. However, the integration of these technologies raises significant concerns regarding data privacy, regulatory compliance, and model scalability. This study addresses these issues by leveraging FL to enable secure and decentralized analytics of EHR in IoMT environments. The primary objective of this study is to develop and evaluate a privacy-preserving and regulation-compliant FL framework for EHR analytics that operates effectively across distributed IoMT devices without compromising its predictive performance.
Contributions
This study aims to address the growing challenge of balancing predictive performance with stringent data privacy requirements in healthcare ML applications. As EHRs become increasingly distributed across IoMT-enabled healthcare infrastructures, conventional CML approaches raise substantial privacy and regulatory concerns. To overcome these limitations, this research proposes FED-EHR, a FL framework that enables collaborative model training across decentralized EHR data sources without requiring raw data exchange. Within this scope, the study performs a series of empirical evaluations using two publicly available medical datasets, the UCI Breast Cancer and PIMA Indians Diabetes datasets, to benchmark the predictive performance of FED-EHR against traditional CML models. The experiments utilize Logistic Regression (LR) and Multi-Layer Perceptron (MLP) models under both centralized and federated configurations with varying client counts. The study further compares FED-EHR with existing FL and CML-based healthcare solutions, providing a comprehensive performance analysis in terms of ROC-AUC metrics, scalability, and data privacy preservation. The main contributions of this study are summarized below.
Development of a privacy-enhanced FL-based framework (FED-EHR): This study introduces FED-EHR, a novel FL framework designed for decentralized analysis of EHRs in IoMT environments. The framework inherently preserves data privacy by keeping sensitive health data local to the source devices, eliminating the need for additional privacy-preserving mechanisms such as differential privacy or homomorphic encryption.
Empirical performance validation on medical datasets: The effectiveness of FED-EHR is demonstrated through extensive experiments on two publicly available healthcare datasets, the UCI Breast Cancer and PIMA Indians Diabetes datasets. The results show that FL achieves ROC-AUC scores of up to 0.98 (Breast Cancer) and 0.81 (Diabetes), closely matching the performance of CML models while ensuring privacy through data locality.
Regulatory compliance for privacy-sensitive applications: FED-EHR adopts a Privacy-by-Design architecture that ensures that sensitive health data remains within local IoMT or clinical environments throughout the ML process. This decentralized approach minimizes data exposure and aligns with the foundational principles of major healthcare data protection regulations such as HIPAA and GDPR. As a result, FED-EHR is well-suited for deployment in privacy-sensitive clinical settings where regulatory compliance and data confidentiality are critical concerns.
Scalability and applicability in real-world IoMT settings: FED-EHR is evaluated in simulated federated environments involving multiple clients (e.g., hospitals or edge devices). The results confirm that FED-EHR is both scalable and robust across varying client configurations, validating its suitability for real-world smart healthcare infrastructures characterized by data heterogeneity and wide geographic distribution.
The remainder of the paper is structured as follows:
Section 2 provides a review of the relevant literature and introduces the FL approach in the context of IoMT;
Section 3 explains the proposed FED-EHR framework and describes the implementation methodology;
Section 4 presents the evaluation;
Section 5 discusses the experimental results; and
Section 6 concludes the paper and provides future directions.
2. Related Works
The increasing interactions between patients and healthcare providers have resulted in large volumes of EHRs, creating opportunities for developing predictive models for diverse clinical tasks. Leveraging data from heterogeneous sources enhances generalizability and performance, particularly as IoMT devices continue to expand EHR generation through continuous monitoring and diagnosis support. Chronic conditions such as diabetes and breast cancer are of particular significance due to their high prevalence and the critical role of early diagnosis in improving outcomes. Despite this importance, limited utilization of data-driven approaches has restricted the full potential of predictive analytics in these domains. ML offers a powerful means to analyze both real-time and historical data, improving predictive accuracy, reducing costs, and transforming the doctor–patient dynamic by alleviating diagnostic workloads through employing algorithms such as Random Forest (RF), Support Vector Machines (SVM), and neural networks, which have demonstrated considerable promise in achieving high classification performance. For example, the study presented in [
14] evaluated multiple standalone and ensemble techniques on a breast cancer dataset, including SVM, K-Nearest Neighbors (K-NN), Decision Trees (DT), and RF classifiers. While numerous studies have focused on breast cancer prediction, the study presented in [
15] advanced the field by identifying early-stage predictors and recommending optimal ML models for sustainable breast cancer classification. Similarly, reference [
16] applied a centralized MLP model to the breast cancer dataset and explored the impact of adversarial attacks on model performance. Early diagnosis of diabetes is equally critical, as it can slow disease progression and reduce severe complications such as nephropathy and cardiovascular disorders. The study in [
17] proposed an ML-based system that integrates multiple algorithms to predict diabetes risk, supporting clinicians in early diagnosis and timely intervention. Additional studies have also explored various ML approaches for the early detection of both breast cancer and diabetes, further underscoring the relevance of ML in predictive healthcare analytics [
18,
19].
Traditional healthcare analytics have primarily relied on centralized learning paradigms, where EHRs from multiple institutions are aggregated into a central repository for model training. However, this approach raises significant privacy and governance concerns, as legal and ethical constraints often prohibit the external transfer of sensitive health data. Such restrictions make centralized solutions impractical in many real-world scenarios and expose patient data to heightened security risks. Within IoMT ecosystems, protecting medical data is paramount due to the large attack surface created by numerous interconnected devices. Therefore, it is essential to implement security and privacy-preserving mechanisms that minimize computational overhead and remain suitable for deployment in resource-constrained environments. FL has emerged as a promising decentralized paradigm that addresses these challenges by enabling collaborative model training without transferring raw data, thereby preserving privacy and ensuring compliance with regulations. By mitigating the privacy and security risks inherent in centralized systems, FL offers a practical solution for deploying ML in healthcare environments with restricted data sharing, motivating numerous IoMT-based healthcare approaches [
20,
21].
Recent advancements in Artificial Intelligence (AI) and the IoMT have facilitated personalized healthcare through remote diagnostics and continuous monitoring. However, conventional ML and DL approaches require centralized data aggregation, which poses significant privacy and security challenges under regulations such as GDPR and HIPAA. Moreover, centralized learning has become increasingly impractical with the exponential growth of IoMT devices and the distributed nature of sensitive health data. FL addresses these challenges by enabling model training directly on decentralized devices without transferring raw data to a central server [
11,
22] and allows for large-scale healthcare analytics while preserving patient privacy [
23,
24]. Thus, this paradigm has demonstrated applicability in IoMT networks [
25] and diverse domains, including cardiac activity monitoring, stress detection [
26], chest X-ray analysis for COVID-19 [
27], EHR-based predictive modeling [
28], Adverse Drug Reaction (ADR) prediction [
29], and multi-site health studies with strong privacy guarantees [
30]. Additional applications include breast cancer classification using Federated and Split Learning [
31], heart sound classification [
32], blood pressure estimation via optical sensors [
33], and thoracic disease diagnosis from medical images [
34]. Privacy-preserving frameworks have also been proposed for COVID-19 diagnosis using international datasets [
35]. Collectively, these studies underscore the growing relevance of FL for privacy-sensitive, distributed healthcare environments and its potential for enabling secure, large-scale predictive analytics in IoMT ecosystems.
Despite the growing interest in applying FL for healthcare, most existing studies either target specific diseases or rely on additional privacy-enhancing techniques, such as differential privacy or encryption, which can introduce computational overhead and reduce model performance [
36,
37]. Furthermore, to the best of our knowledge, previous studies lack a comprehensive evaluation framework that compares FL with CML approaches using diverse and publicly available healthcare datasets. To address these gaps, FED-EHR, a practical FL framework for EHR analytics within IoMT environments, is proposed. In contrast to the prior studies, FED-EHR inherently preserves privacy, eliminating the need for additional privacy mechanisms. Further, a comprehensive performance evaluation is conducted to assess the effectiveness of FED-EHR compared to CML methods. The results demonstrate that FL can achieve comparable accuracy while ensuring that sensitive patient data remains local. Therefore, this approach ensures privacy preservation and enhances data security. A comparative summary of FL-based healthcare studies in IoMT environments is presented in
Table 1.
By enabling distributed model training while ensuring data locality, FED-EHR addresses both scalability and compliance with data protection regulations. The study thereby demonstrates the practical viability and generalizability of FL in real-world smart healthcare environments.
3. Materials and Methods
FL provides a decentralized ML paradigm that enables robust predictive model development without aggregating sensitive user data on a central server. This architecture is particularly advantageous in healthcare applications where data privacy, security, and regulatory compliance are paramount. In the context of EHRs collected through IoMT devices, FL offers a scalable and privacy-preserving solution for data-driven analytics.
In the proposed FED-EHR framework, a central server distributes a base ML model to participating clients, such as hospitals, medical data centers, or IoMT devices. Each client trains the model locally on its private dataset and only transmits updated model parameters (e.g., weight gradients) back to the central server. Raw data never leaves the local environment. The server aggregates these updates using techniques such as Federated Averaging (FedAvg) to generate an improved global model, which is then redistributed to the clients for subsequent training rounds. This iterative process continues until the global model reaches the desired performance threshold, preserving data privacy while minimizing risks associated with centralized storage and transfer.
FED-EHR is specifically tailored to address the challenges of IoMT-based health systems, including data heterogeneity, limited computational resources at the edge, and communication constraints. As shown in
Figure 1, the architecture comprises two core components: the clients and the aggregation server. Clients, ranging from wearable and non-wearable IoMT devices to institutional medical data sources, generate and retain sensitive health data locally. In this setup, N clients collaboratively train local models and periodically share parameter updates with a centralized aggregation server located at a Service Access Point (SAP). The server aggregates the received model parameters to generate an improved global model and redistributes it to the clients for further local training. This hierarchical configuration enhances scalability and efficiency by enabling real-time coordination of model updates between IoMT sensors and the aggregation server. To maintain consistency, both local clients and the central server adopt a unified neural network architecture, facilitating seamless integration of updates and robust performance across heterogeneous environments. The entire FL process is executed iteratively across communication rounds, following the coordinated steps below.
Initialization: A global model is initialized on the central server, and client configurations are established.
Model Distribution: The initialized global model is dispatched to all participating local devices or clients.
Local Training: Each client trains the received model on its locally stored data, updating the model parameters based on their local context and distribution.
Update Computation: The clients compute the model updates, which are typically the difference in model weights between the locally trained and the original global model.
Aggregation: All local updates are transmitted back to the central server, where they are integrated into a new global model using aggregation methods such as weighted averaging.
Model Redistribution: Once aggregation is complete, the updated global model is redistributed to all participating entities, including healthcare institutions, clinicians, or IoMT devices. These updated models can then be deployed for a variety of real-time healthcare applications, such as disease detection, clinical decision support, treatment planning, and continuous health monitoring.
The architecture shown in
Figure 1 comprises a central FL server and multiple FL clients (e.g., hospitals, clinics, or IoMT devices) that collaboratively train a global ML model without exchanging raw data. The process proceeds iteratively through four main steps: (1) Local Training, where each client independently trains the model using its local dataset, ensuring that sensitive patient data remain within the device or clinical site; (2) Model Upload, where the locally trained model parameters (e.g., weights and gradients) are transmitted to the central server instead of raw data; (3) Aggregation, where the server aggregates the received parameters using an algorithm such as the FedAvg algorithm to construct an updated global model; and (4) Model Download, where the updated global model is redistributed to all participating clients for further local training or inference. This iterative training process is repeated until the global model converges to the desired accuracy. Thus, the process enhances performance while ensuring strict privacy and compliance with regulatory standards.
3.1. Federated Aggregation Strategy and the FedAvg Algorithm
The FED-EHR architecture relies on a federated aggregation mechanism to combine locally trained model parameters into a unified global model. Aggregation involves consolidating updates from multiple decentralized clients. A widely adopted approach is the Federated Averaging (FedAvg) algorithm. FedAvg computes a weighted average of locally updated models, where the weights correspond to the number of data samples at each client site. The mathematical formulation of FedAvg is expressed as follows:
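$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k}$$
where $w_{t+1}$ is the global model at round $t+1$, $w_{t+1}^{k}$ is the local model from client $k$, $n_k$ is the number of local data samples at client $k$, and $n = \sum_{k=1}^{K} n_k$ is the total number of samples across all clients.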
Aggregation strategies may vary based on factors such as data heterogeneity, communication constraints, and privacy requirements. Advanced methods include secure aggregation, differentially private aggregation, adaptive weighting, and gradient sparsification. Each method provides a different balance between privacy, accuracy, and efficiency [
38]. This flexibility enables the FED-EHR framework to support a wide range of IoMT and healthcare deployment scenarios.
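The weighted average described in this subsection can be illustrated with a minimal NumPy sketch. The function name `fedavg_aggregate` and the three toy clients are hypothetical and are not taken from the FED-EHR implementation; the sketch only shows how sample counts turn into aggregation weights.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Return the sample-weighted average of client parameter vectors."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack([np.asarray(w, dtype=float) for w in client_weights])
    # Each client's share n_k / n of the total sample count.
    shares = sizes / sizes.sum()
    # Weighted sum over the client axis.
    return np.tensordot(shares, stacked, axes=1)

# Hypothetical example: three clients holding 100, 300 and 600 samples.
local_models = [np.array([1.0, 0.0]), np.array([2.0, 1.0]), np.array([3.0, 2.0])]
global_model = fedavg_aggregate(local_models, [100, 300, 600])
print(global_model)  # [2.5 1.5]
```

Clients with more samples pull the global model toward their local parameters, which is why the third client dominates the result here.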
3.2. Federated Optimization
Advances in DL have largely been driven by variants of Stochastic Gradient Descent (SGD). In decentralized environments, optimization performance can be improved not only through algorithmic refinements but also by adapting model structures and loss functions to better align with gradient-based methods.
In FL, the FedAvg algorithm is the most widely used optimization approach. FedAvg enables privacy-preserving collaborative training across distributed clients with heterogeneous data. Each client performs local SGD-based training on its own dataset and only transmits model updates (weights or gradients) to a central server, without sharing raw data. The server computes a weighted average of these updates based on the number of samples per client to produce a new global model. This model is then sent back to clients for the next training round. The process repeats until convergence.
FedAvg’s strength lies in its simplicity, communication efficiency, and ability to accommodate non-IID (non-independent and identically distributed) data, which are commonly encountered in IoMT and healthcare applications. In the FED-EHR framework, FedAvg enables scalable, privacy-preserving training on decentralized EHRs while ensuring that patient data remains entirely local and confidential [39]. The formal procedure is presented in Algorithm 1, and the communication workflow is illustrated in Figure 2.
Algorithm 1. Federated Averaging Algorithm
Notation: k indexes the K clients; E is the number of local training epochs; B is the local mini-batch size; η is the learning rate; t is the communication round index; C is the fraction of clients selected per round; w_t denotes the global model weights at round t; w_t^k denotes the local model weights of client k at round t; n_k is the number of samples on client k; and P_k is the local dataset of client k.
Server executes:
    Initialize the global model weights w_0
    for each communication round t do
        Compute m, the number of clients to participate in the round: m ← max(C · K, 1), where C is the fraction of clients selected
        Randomly select m clients, forming the set S_t
        for each client k ∈ S_t in parallel do
            Update the model: w_{t+1}^k ← ClientUpdate(k, w_t)
        Aggregate updates: compute n ← Σ_{k ∈ S_t} n_k
        Update the global model: w_{t+1} ← Σ_{k ∈ S_t} (n_k / n) · w_{t+1}^k
ClientUpdate(k, w):
    Split the client’s dataset P_k into batches of size B
    for each local epoch from 1 to E do
        for each batch b do
            Update the model weights: w ← w − η∇ℓ(w; b), where η is the learning rate and ∇ℓ(w; b) is the gradient of the loss function on batch b
    return the updated model weights w to the server
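As an illustration of Algorithm 1, the following sketch runs FedAvg with a logistic regression model in plain NumPy. It is not the authors' TensorFlow Federated code: the synthetic data, the helper names (`client_update`, `fedavg`), and the hyperparameter values are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def client_update(w, X, y, epochs, batch_size, lr, rng):
    """ClientUpdate(k, w): E epochs of mini-batch SGD on the logistic loss."""
    w = w.copy()
    n = len(y)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the local data every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            grad = X[idx].T @ (sigmoid(X[idx] @ w) - y[idx]) / len(idx)
            w -= lr * grad  # w <- w - eta * grad of loss on batch b
    return w

def fedavg(clients, dim, rounds, frac, epochs, batch_size, lr, seed=0):
    """Server loop: select clients, collect local models, average by sample count."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)  # w_0
    K = len(clients)
    for _ in range(rounds):
        m = max(int(frac * K), 1)
        selected = rng.choice(K, size=m, replace=False)
        updates, sizes = [], []
        for k in selected:
            X, y = clients[k]
            updates.append(client_update(w, X, y, epochs, batch_size, lr, rng))
            sizes.append(len(y))
        sizes = np.asarray(sizes, dtype=float)
        w = np.tensordot(sizes / sizes.sum(), np.stack(updates), axes=1)
    return w

# Synthetic stand-in for four clients (assumption: no real EHR data is used here).
data_rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0, 0.5])
clients = []
for n_k in (60, 80, 100, 120):
    X = data_rng.normal(size=(n_k, 3))
    y = (sigmoid(X @ true_w) > data_rng.random(n_k)).astype(float)
    clients.append((X, y))

w_global = fedavg(clients, dim=3, rounds=20, frac=1.0,
                  epochs=5, batch_size=16, lr=0.1)
X_all = np.vstack([X for X, _ in clients])
y_all = np.concatenate([y for _, y in clients])
accuracy = float(np.mean((sigmoid(X_all @ w_global) > 0.5) == y_all))
print(w_global, accuracy)
```

Only the weight vectors returned by `client_update` reach the server loop; the per-client arrays `X` and `y` are never pooled during training, which mirrors the data-locality property the framework relies on.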
3.3. Datasets
In this study, two publicly accessible datasets from the UCI Machine Learning Repository were selected for the experimental validation of the proposed FED-EHR framework. The key characteristics of these datasets are detailed below and summarized in Table 2.
Breast Cancer Wisconsin (Diagnostic) Dataset: The Breast Cancer Wisconsin dataset [
40] provides diagnostic information on breast cancer cases derived from digitized images of Fine Needle Aspirate (FNA) of breast masses. It consists of 569 instances, each labeled as either benign or malignant, forming a binary classification task. Each record includes 30 numerical features, such as Mean Radius, Texture, Perimeter, Area, and Smoothness. These features represent various characteristics of cell nuclei present in the image. Prior to training, the dataset was standardized through normalization, ensuring that all input features lie within the same scale to enhance model convergence and performance.
Pima Indians Diabetes Dataset: The Pima Indians Diabetes dataset [
41] contains clinical diagnostic information aimed at predicting the onset of diabetes based on several physiological parameters. It comprises 768 instances, each associated with 8 numerical features, including Plasma Glucose Concentration, Diastolic Blood Pressure, Triceps Skinfold Thickness, Serum Insulin Level, and Body Mass Index (BMI). The output variable denotes the presence or absence of diabetes. Similar to the Breast Cancer dataset, normalization was applied to the features to maintain consistency across the learning process and prevent any one feature from dominating due to scale differences.
The UCI Breast Cancer Wisconsin (Diagnostic) dataset and the Pima Indians Diabetes dataset were selected based on their clinical relevance to healthcare predictive analytics and their suitability for FL experiments. Both datasets contain clinically significant features that are commonly used in diagnostic decision-making and patient monitoring. Additionally, both datasets are publicly available and have been widely used in prior studies involving both centralized and FL models, which ensures reproducibility and facilitates comparative benchmarking with prior work. Their moderate size and diverse attributes also make them computationally efficient for simulating FL across multiple clients without excessive overhead. Therefore, by leveraging these two clinically meaningful and computationally efficient datasets, this study provides a realistic evaluation of FL’s potential to support scalable and privacy-preserving solutions in healthcare.
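Both dataset descriptions mention normalization before training but do not state the exact scheme. The sketch below assumes z-score standardization (zero mean and unit variance per feature); the function names and the toy feature values are hypothetical and do not come from either dataset.

```python
import numpy as np

def fit_standardizer(X):
    """Compute per-feature mean and standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # leave constant features unscaled to avoid division by zero
    return mean, std

def standardize(X, mean, std):
    """Rescale every feature column to zero mean and unit variance."""
    return (X - mean) / std

# Made-up rows with two features on very different scales.
X = np.array([[100.0, 20.0],
              [120.0, 25.0],
              [140.0, 30.0],
              [160.0, 35.0]])
mean, std = fit_standardizer(X)
X_scaled = standardize(X, mean, std)
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

In a federated run, the statistics would have to be computed per client or agreed on without pooling raw records; the source does not specify which option was used.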
3.4. Machine Learning Models and Evaluation Metrics
In this study, LR and MLP were employed to evaluate and compare the performance of centralized and FL frameworks on healthcare-related datasets. LR is a widely utilized supervised learning algorithm that is particularly well-suited for binary classification tasks. It models the probability of a binary outcome based on one or more predictor variables using the logistic sigmoid function. For this study, binary logistic regression was adopted as it effectively handles the dichotomous target labels present in both the Breast Cancer and Diabetes datasets. The algorithm offers computational efficiency and interpretability, making it an ideal baseline for comparing the performance of different learning paradigms. MLP, a type of feedforward artificial neural network, was also utilized to evaluate more complex nonlinear relationships within the datasets. MLPs consist of an input layer, one or more hidden layers with non-linear activation functions such as ReLU or sigmoid, and an output layer. Trained using the backpropagation algorithm, MLPs are capable of capturing intricate patterns in high-dimensional data, which is particularly relevant in medical diagnostics. The inclusion of MLP allows for performance comparisons between lightweight linear models and more expressive Deep Learning models under both centralized and FL conditions. To further evaluate the discriminative performance of the LR and MLP models, the Receiver Operating Characteristic (ROC) curve and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) were employed as key evaluation metrics.
ROC Curve: The ROC curve illustrates the trade-off between the true positive rate (sensitivity) and the false positive rate (1 − specificity) across varying classification thresholds. By plotting these rates, the ROC curve provides a comprehensive view of a model’s diagnostic ability, independent of class distribution or decision threshold. A model that perfectly distinguishes between the positive and negative classes will have a curve that approaches the top-left corner of the plot.
AUC-ROC: The Area Under the ROC Curve serves as a scalar summary of the model’s performance. An AUC value of 0.5 suggests no discriminative power (equivalent to random guessing), while an AUC value closer to 1.0 indicates excellent class separation. In clinical applications, high AUC scores are particularly valuable as they reflect a model’s robustness in distinguishing between diseased and non-diseased states, irrespective of specific threshold selections.
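The two metrics defined above can be illustrated with a minimal NumPy sketch. It is not the evaluation code used in this study; the function names `roc_curve_points` and `auc_trapezoid` and the toy labels and scores are assumptions made only for illustration. The sketch sweeps the decision threshold over the sorted scores to obtain (FPR, TPR) pairs and then integrates the curve with the trapezoidal rule.

```python
import numpy as np

def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) arrays obtained by sweeping the decision threshold.

    Assumes binary labels in {0, 1} and that both classes are present.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")  # highest score first
    y_sorted = y_true[order]
    s_sorted = scores[order]
    tp = np.cumsum(y_sorted == 1)  # true positives if we cut after each sample
    fp = np.cumsum(y_sorted == 0)  # false positives if we cut after each sample
    # Keep the last index of every distinct score so tied scores share one threshold.
    distinct = np.r_[np.where(np.diff(s_sorted) != 0)[0], len(s_sorted) - 1]
    tpr = np.r_[0.0, tp[distinct] / tp[-1]]
    fpr = np.r_[0.0, fp[distinct] / fp[-1]]
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example (hypothetical labels and predicted probabilities).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_curve_points(y_true, scores)
auc = auc_trapezoid(fpr, tpr)
print(f"AUC = {auc:.2f}")
```

In this toy case one negative sample is ranked above one positive sample, so the curve falls short of the top-left corner and the area is below 1.0.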
3.5. Centralized Machine Learning and Federated Learning Methodology
In the centralized configuration, the entire dataset was aggregated on a single node for training and evaluation. Both the LR and MLP models were trained for 50 epochs on the full dataset. Model performance was evaluated using the ROC-AUC score, which served as the baseline for comparison with the federated approaches.
In the FL setting, the dataset was partitioned across four simulated clients to emulate a decentralized healthcare environment. Two experimental scenarios were designed to examine the impact of client participation on performance: the first case used FL with two active clients, and the second case used FL with four active clients. The FL configuration details are described below.
Framework: TensorFlow Federated (TFF) is used to implement the FL simulation.
Local Training: Each client trains its local copy of the global model (LR or MLP) for five local epochs on its private data.
Aggregation Mechanism: The global model is updated using the FedAvg algorithm.
Communication Rounds: A total of 20 global communication rounds are conducted to iteratively update and refine the global model.
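The configuration listed above can be sketched in plain NumPy for the LR case. This is a simplified illustration under stated assumptions, not the TFF implementation used in this study: the data are synthetic, the helper names (`local_train`, `fedavg`, `logistic_loss`) are hypothetical, and local training uses full-batch gradient descent. Only the number of clients, local epochs, rounds, and the learning rate are taken from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(w, X, y):
    """Mean binary cross-entropy of a logistic regression model."""
    p = sigmoid(X @ w)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def local_train(w, X, y, epochs=5, lr=0.01):
    """Client update: full-batch gradient descent starting from the global weights."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def fedavg(client_data, rounds=20, epochs=5, lr=0.01):
    """Server loop: broadcast, train locally, average weighted by client size."""
    n_features = client_data[0][0].shape[1]
    w_global = np.zeros(n_features)
    sizes = np.array([len(y) for _, y in client_data], dtype=float)
    for _ in range(rounds):
        local_ws = [local_train(w_global, X, y, epochs, lr) for X, y in client_data]
        w_global = np.average(local_ws, axis=0, weights=sizes)
    return w_global

# Synthetic stand-in for a tabular health dataset (assumption, not real EHR data).
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(400, 3)), np.ones((400, 1))])  # last column = bias
true_w = np.array([2.0, -1.0, 0.5, 0.0])
y = (rng.uniform(size=400) < sigmoid(X @ true_w)).astype(float)

# Partition the data across four simulated clients.
clients = list(zip(np.array_split(X, 4), np.array_split(y, 4)))
w_fed4 = fedavg(clients)      # four active clients
w_fed2 = fedavg(clients[:2])  # two active clients
print(logistic_loss(w_fed4, X, y), logistic_loss(w_fed2, X, y))
```

Weighting the average by local sample count follows the usual FedAvg formulation; with equally sized partitions, as here, it reduces to a simple mean of the client weights.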
ROC-AUC scores were computed under both centralized and federated settings to measure the effectiveness of each learning strategy. The key observations included the following:
The centralized learning results provided the performance upper bound by leveraging access to the entire dataset.
The FL results under two-client and four-client configurations were analyzed to assess model convergence and generalization in decentralized environments.
Table 3 presents a comprehensive summary of the methodology employed in this study, detailing the datasets, models, training configurations, and evaluation criteria used to compare the centralized and FL approaches.
The experimental environment used in this study consisted of a Windows 10 operating system running on a Ryzen 5 CPU with 8 GB of RAM. The implementation was carried out using Python 3.10, with the key libraries including PyTorch 2.3.1 and TFF for the ML and FL simulations, as well as NumPy 2.2 and Pandas 2.2.1 for data manipulation and analysis. The FL setup involved four clients, with each client performing five local training epochs per global round. A total of 20 global rounds were conducted using the FedAvg algorithm. Both LR and MLP models were evaluated, with the learning rate set at 0.01.
4. Results
This study evaluated the performance of CML and FL approaches using EHRs from publicly available datasets related to diabetes and breast cancer. The experiments were implemented using PyTorch and TFF, focusing on LR and MLP models under both centralized and federated settings. The FL simulations involved two scenarios: (1) training with two clients and (2) training with four clients, where each client represents a simulated healthcare institution with local data. Each client trains the received global model on its private dataset for five epochs per round, and then sends the updated model weights to a central server. The server aggregates these updates using the FedAvg algorithm and distributes the refined global model back to all clients. This communication cycle is repeated for 20 global rounds. The overall architecture is illustrated in Figure 3.
Figure 4 and Figure 5 present the ROC-AUC performance of the LR and MLP models under the CML and FL paradigms with the Diabetes and Breast Cancer datasets. The detailed results corresponding to these figures are summarized in Table 4. In both datasets, the centralized models achieved AUC scores equal to or slightly higher than those of their federated counterparts, which was expected given that CML leverages the entire dataset during training. Nevertheless, the performance gap between the centralized and federated approaches remained modest, particularly in the Breast Cancer dataset, where FL with four clients achieved AUC values (0.92 for LR in Figure 4b and 0.98 for MLP in Figure 5b) closely approaching the centralized results (0.93 and 1.00, respectively). Notably, increasing the number of federated clients from two to four consistently improved model performance, as observed in the Diabetes dataset where the LR model’s AUC increased from 0.74 to 0.78 in Figure 4a and the MLP’s AUC increased from 0.79 to 0.81 in Figure 5a. This improvement suggests that greater data diversity and distribution across clients contribute to better global model generalization in federated settings.
The model complexity and dataset characteristics also influenced the results. The MLP model demonstrated higher resilience to the federated setting, achieving AUC scores closer to those of the centralized performance compared to the LR model, especially on the Diabetes dataset. Additionally, the Breast Cancer dataset yielded overall higher AUC values, likely due to its more separable feature space and reduced class imbalance compared to the Diabetes dataset. These findings underscore the importance of considering dataset complexity when deploying federated approaches in healthcare. Importantly, despite a marginal decrease in AUC compared to the centralized models, FL provides substantial advantages in maintaining data locality and complying with privacy regulations. The results confirm that FL can achieve near-centralized predictive performance while preserving patient privacy, making it a viable and scalable approach for IoMT-driven healthcare analytics.
The model convergence and training dynamics of the proposed FED-EHR framework were evaluated against those of CML by analyzing the training accuracy and loss of both learning paradigms. Figure 6, Figure 7, Figure 8 and Figure 9 illustrate the convergence behavior of CML, FL with two clients (FED-2), and FL with four clients (FED-4) applied to the Breast Cancer and Diabetes datasets, utilizing LR and MLP models. These figures depict the training accuracy and loss trends over the epochs and communication rounds, with the convergence patterns influenced by the dataset characteristics and the chosen learning algorithm. To make the centralized and federated curves directly comparable, the CML models were trained for 100 epochs with their performance metrics logged every 5 epochs, so that each logged point corresponds to one of the 20 global rounds in the federated configurations.
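The alignment between centralized epochs and federated rounds described above amounts to simple arithmetic, sketched below. The variable and function names are hypothetical; only the values (five local epochs, 20 rounds, 100 centralized epochs, logging every 5 epochs) come from the text.

```python
LOCAL_EPOCHS = 5    # local epochs per client per round
GLOBAL_ROUNDS = 20  # federated communication rounds

# Total local epochs seen by a federated client over the whole run.
cml_epochs = LOCAL_EPOCHS * GLOBAL_ROUNDS

def cml_logging_epochs(total_epochs, log_every):
    """Epochs at which the centralized model's metrics are recorded."""
    return list(range(log_every, total_epochs + 1, log_every))

checkpoints = cml_logging_epochs(cml_epochs, LOCAL_EPOCHS)

# Map each federated round to the centralized epoch plotted at the same x-position.
round_to_epoch = {r: e for r, e in enumerate(checkpoints, start=1)}
print(cml_epochs, len(checkpoints), round_to_epoch[1], round_to_epoch[20])
```

Under this mapping, round r of the federated run is compared with centralized epoch 5r, so both curves share an axis measured in equal amounts of local computation.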
Across both datasets, the MLP model consistently achieved faster and more stable convergence than LR. Notably, on the Breast Cancer dataset, the MLP model attained a high accuracy early in training, likely due to effective preprocessing of the data and the presence of informative features.
Among the federated configurations, the FED-4 setup demonstrated convergence curves closely aligned with those of the centralized baseline, which was attributed to a more balanced and diverse distribution of training data across clients. Conversely, the FED-2 configuration exhibited slower convergence and a lower final accuracy, reflecting the limited local data available per client. LR showed a more gradual convergence trend than MLP, and this effect was most pronounced on the Diabetes dataset in the federated settings, where the combination of challenging feature distributions and the model’s linear decision boundary resulted in a larger accuracy decline than for the MLP model. Nevertheless, the FED-4 setup employing LR still achieved convergence patterns that approximated those of centralized training, with only minor variations due to data fragmentation. These findings indicate that the FED-EHR framework achieves a convergence behavior comparable to that of centralized learning, especially in settings with sufficient client participation. The minor convergence delay observed in federated settings represents an inherent trade-off of decentralized learning and is expected to diminish in real-world scenarios, where federated systems leverage increased data diversity and interaction while maintaining data privacy.
These findings confirm that the proposed FED-EHR framework maintains competitive classification performance compared to CML while offering the added advantage of preserving data privacy. The performance gap between FL and CML was marginal, particularly when more clients were included, highlighting the scalability of the framework. FL’s decentralized architecture ensures that sensitive medical data remains on local devices, significantly reducing the risk of data breaches. Furthermore, the flexibility and scalability of FL make it particularly suitable for real-world smart healthcare systems, especially in scenarios involving geographically distributed data sources or frequent data mobility. As healthcare data privacy becomes increasingly critical, the ability to maintain strong model performance without compromising confidentiality marks FL as a practical solution.
The effectiveness of FED-EHR was further evaluated through a comparative analysis with prior studies utilizing both centralized and FL approaches in healthcare. As shown in Table 5, previous studies employing centralized models often achieved high performance, but with limited privacy assurance. For example, Ref. [15] reports 98% accuracy using SVM on breast cancer data under a centralized setup. Meanwhile, Ref. [29] reports an AUC of 0.77 using FL on MIMIC-III, nearly matching its centralized counterpart. In our study, FED-EHR achieved a Breast Cancer AUC of 0.98 (MLP, four clients) and a Diabetes AUC of 0.81 (MLP, four clients), both closely matching the AUCs of the centralized results.
These results underscore the practical viability of FL in real-world IoMT environments, where data is inherently distributed across institutions or regions. The decentralized nature of FL inherently reduces the risk of data leakage while complying with stringent privacy regulations such as HIPAA and GDPR. Furthermore, the study emphasizes the importance of model selection (e.g., LR vs. MLP) and hyperparameter tuning for different datasets. Overall, FED-EHR offers a robust, privacy-preserving, and high-performing alternative to centralized analytics, which is particularly suitable for deployment in smart healthcare systems with geographically distributed data sources.
5. Discussion
The experimental evaluation of the proposed FED-EHR framework demonstrates its strong potential for secure and effective predictive modeling in healthcare applications involving sensitive data. The results from both the Diabetes and Breast Cancer datasets reveal that the predictive performance of FL closely approaches that of CML models while preserving data privacy by maintaining data locality.
As seen in the Results section, using LR, the centralized model achieved an AUC of 0.83 on the Diabetes dataset, while FL achieved an AUC of 0.74 with two clients and improved to 0.78 with four clients. On the Breast Cancer dataset, centralized LR yielded an AUC of 0.93, with FL obtaining AUCs of 0.88 and 0.92 with two and four clients, respectively. Similarly, with the MLP model, the centralized model achieved an AUC of 0.81 on the Diabetes dataset, and FL attained AUCs of 0.79 (two clients) and 0.81 (four clients). On the Breast Cancer dataset, FL achieved AUCs of 0.96 (two clients) and 0.98 (four clients), closely approximating the centralized result of 1.00. These findings emphasize that FED-EHR retains high predictive performance even under decentralized settings, supporting its suitability for real-world IoMT environments where data heterogeneity and wide geographic distributions are prominent.
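The size of the centralized-to-federated gap can be made explicit with a short calculation over the AUC values quoted in the preceding paragraph. The dictionary layout and the labels `CML`, `FED-2`, and `FED-4` are conventions chosen here for illustration; no values beyond those stated in the text are used.

```python
# AUC values as reported in the text (dataset, model) -> setting -> AUC.
auc = {
    ("Diabetes", "LR"): {"CML": 0.83, "FED-2": 0.74, "FED-4": 0.78},
    ("Diabetes", "MLP"): {"CML": 0.81, "FED-2": 0.79, "FED-4": 0.81},
    ("Breast Cancer", "LR"): {"CML": 0.93, "FED-2": 0.88, "FED-4": 0.92},
    ("Breast Cancer", "MLP"): {"CML": 1.00, "FED-2": 0.96, "FED-4": 0.98},
}

# Gap = centralized AUC minus federated AUC, rounded to the reported precision.
gaps = {
    key: {s: round(v["CML"] - v[s], 2) for s in ("FED-2", "FED-4")}
    for key, v in auc.items()
}

for (dataset, model), g in gaps.items():
    print(f"{dataset:13s} {model:3s}  FED-2 gap={g['FED-2']:.2f}  FED-4 gap={g['FED-4']:.2f}")

largest_fed4_gap = max(g["FED-4"] for g in gaps.values())
```

With these figures, the gap narrows in every dataset and model combination when moving from two to four clients.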
A comparative analysis with prior studies further substantiates the framework’s effectiveness. For example, Sadilek et al. [30] reported AUC scores of 0.78 (CML) and 0.77 (FL) using LR on the MIMIC-III dataset. Qiu et al. [32] achieved accuracies of 57.9% (CML) and 52.6% (FL) on heart sound classification using MLP. Brohi and Mastoi [16] applied a centralized MLP model to the Breast Cancer dataset, achieving 98% accuracy, and examined the impact of adversarial attacks on model performance. While their study reported a high classification accuracy, its primary focus was on model robustness under adversarial conditions rather than on developing a privacy-preserving learning framework. Karnati and Baiju [31] compared FL and Split Learning on the same dataset, reporting accuracies of 98% and 99%, respectively. However, the specific model architectures employed are not clearly detailed. Additionally, Split Learning typically requires more complex coordination between client and server layers, which may limit its feasibility in heterogeneous, resource-constrained IoMT environments. In contrast, the proposed FED-EHR framework achieved high AUC scores, such as 0.98 on the Breast Cancer dataset with MLP under FL, which are comparable to or surpass the centralized results from the literature. These results underscore the framework’s robustness and the impact of effective model and parameter tuning. Moreover, the scalability of FED-EHR across varying client settings and its privacy-preserving design, achieved without significant degradation in model performance, demonstrate its practical viability. Unlike traditional CML, which aggregates data at a central server and poses privacy risks, FED-EHR ensures that the data remains local, aligning with regulations such as HIPAA and GDPR.
Further, the robustness and learning dynamics of the proposed FED-EHR framework were validated by analyzing its convergence behavior through training accuracy and loss across learning epochs (CML) and communication rounds (FL). The results demonstrate that the FED-EHR framework can achieve performance and convergence patterns closely aligned with those of centralized learning, particularly in settings with balanced data distribution across clients. While minor performance gaps were observed in configurations with fewer clients, these are consistent with the known trade-offs of decentralized learning. The observed results highlight the framework’s ability to maintain accuracy and stability while preserving data privacy, indicating strong potential for scalability and generalization in real-world healthcare environments where client diversity and participation are greater.
While the FED-EHR framework does not perform a formal privacy audit or regulatory compliance assessment, its architectural design aligns with the Privacy-by-Design principles endorsed by regulations such as HIPAA and GDPR. By ensuring that the raw patient data remains exclusively on local IoMT or healthcare devices during training, the framework minimizes data exposure, preserves data control, and mitigates the associated risks, all of which are fundamental objectives of these regulations. Consequently, FED-EHR facilitates compliance efforts by enforcing data locality and leveraging distributed intelligence, thereby providing a practical and privacy-aware solution for real-world healthcare applications.
Overall, the proposed FED-EHR framework offers a balance between privacy, scalability, and model performance. Its capability to support both LR and MLP architectures, perform effectively under federated settings, and outperform or rival prior FL-based studies positions it as a promising paradigm for privacy-preserving analytics in smart healthcare systems. Moreover, the practical value of the FED-EHR framework lies in its ability to enable collaborative model development in healthcare ecosystems where data sharing is restricted by legal, ethical, or operational constraints. By ensuring that sensitive patient data remains on local devices or within institutional boundaries, the framework supports compliance with privacy regulations while facilitating large-scale and multi-institutional learning. This capability is particularly relevant for hospital networks and wearable IoMT deployments, where it can enable privacy-aware health monitoring, early diagnosis, and adaptive care delivery without compromising data security.
6. Conclusions and Future Works
The growing connectivity of the IoMT has significantly expanded the attack surface through which healthcare data is exposed to cyber threats. With numerous access points, ranging from patient wearables and sensor-equipped medical devices to cloud infrastructures, the risk of unauthorized access and data breaches is substantial. Therefore, ensuring the confidentiality, integrity, and availability of EHRs is critical.
This study examined the privacy challenges associated with CML approaches for EHR analytics and evaluated the effectiveness of FL as a privacy-preserving alternative. To this end, FED-EHR, a federated learning framework that ensures data locality by design, was proposed. The experimental results on the Pima Indians Diabetes dataset and the UCI Breast Cancer dataset using both LR and MLP models showed that FED-EHR achieved a predictive performance closely matching that of centralized models, with ROC-AUC scores reaching 0.83 and 0.98, respectively. These findings underscore the potential of FL as a promising and effective alternative to CML, enabling collaborative model development while preserving patient data privacy and supporting compliance with healthcare data protection regulations. The FED-EHR framework is particularly well-suited for deployment in geographically distributed and privacy-sensitive healthcare environments, allowing both large hospitals and smaller clinics to benefit from intelligent decision-support systems.
Future work will focus on enhancing the robustness and privacy of the FED-EHR framework by integrating differential privacy and secure multi-party computation to further mitigate risks such as model inversion and data leakage. The FED-EHR framework will also be extended to support multimodal clinical data, including medical imaging, free-text clinical notes, and physiological time-series from IoMT devices. Additionally, future work will focus on evaluating the resilience of the FED-EHR framework against adversarial attacks and designing robust defense mechanisms to protect model integrity in the presence of malicious clients. Another important direction will involve exploring personalized federated learning strategies that adapt models to specific client contexts while preserving overall global learning objectives.
By expanding both its technical capabilities and application domains, FED-EHR has the potential to serve as a foundational framework for secure and scalable AI deployment in next-generation healthcare infrastructures.