Adaptive Trust-Aware Encrypted Federated Artificial Intelligence with Blockchain Auditability for Multicenter Biomedical Signal and Medical Image Analysis

Hussein, Ahmed F.; Al-Neami, Auns Q.

doi:10.3390/informatics13060088


by Ahmed F. Hussein ¹,*,† and Auns Q. Al-Neami ²,†

¹ Department of Artificial Intelligence and Robotics Engineering, College of Engineering, Al-Nahrain University, Al-Jadriyah, Karradah, Baghdad 64040, Iraq

² Department of Biomedical Engineering, College of Engineering, Al-Nahrain University, Al-Jadriyah, Karradah, Baghdad 64040, Iraq

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Informatics 2026, 13(6), 88; https://doi.org/10.3390/informatics13060088 (registering DOI)

Submission received: 19 April 2026 / Revised: 8 June 2026 / Accepted: 9 June 2026 / Published: 15 June 2026

(This article belongs to the Special Issue Health Data Management in the Age of AI)


Abstract

Although data sharing is central to multicenter biomedical AI, direct sharing is hindered by privacy laws, institutional data silos, and limited trust and cooperation between institutions. Federated learning enables collaborative model training without centralizing patient data, but many current methods apply the same fixed level of privacy protection to every client, model layer, round, and modality, resulting in suboptimal privacy–utility–latency trade-offs. In this study, we introduce Adaptive Trust-Aware Encrypted Federated Artificial Intelligence with Blockchain Auditability (ATEB-AI) for biomedical signal and medical image analysis. ATEB-AI combines adaptive CKKS encryption, trust-aware aggregation, and permissioned blockchain-based audit logging. The proposed framework was tested on four public benchmarks, namely, MIT-BIH, CHB-MIT, BraTS, and NIH ChestXray. ATEB-AI achieved the highest overall performance among the compared federated methods and remained close to the centralized training benchmark, reaching up to 99.0% of the centralized reference performance. It reduced membership-inference success from 0.71 to 0.24 (−66.2%), inversion leakage from 0.64 to 0.27 (−57.8%), and poisoning-related utility loss from 0.18 to 0.07 (−61.1%). Round latency was 1.90× FedAvg, compared with 2.85× for HE-FL (−33.3%) and 3.50× for BC-FL (−45.7%). The key contribution of this study is a single biomedical federated learning framework in which privacy, client trust, reliability, and auditability are unified rather than treated as disjointed components. The results demonstrate the feasibility of co-optimizing confidentiality, robustness, efficiency, and governance in a single deployable multicenter medical AI pipeline.

Keywords:

federated learning; privacy-preserving medical AI; homomorphic encryption; blockchain auditability; trust-aware aggregation; biomedical signal analysis; medical image analysis; multicenter healthcare AI

1. Introduction

AI has demonstrated significant potential in biomedical signal processing and medical imaging, especially in disease investigation, risk assessment, and computer-aided diagnosis. However, the clinical utility of these models is dependent on adequate, large, and diverse datasets. In reality, this kind of data is rare within the framework of one institution [1]. Hospitals and research facilities may have complementary but separate datasets, and collaboration is limited due to privacy policies, institutional policies, and the sensitivity of patient records. Consequently, the main issue at hand is not just constructing correct models but, additionally, facilitating learning through collaboration without jeopardizing confidentiality and governance needs [2].

Federated learning is a viable solution to this problem, as it enables several institutions to learn a common model while each retains its raw data. This paradigm is particularly desirable in healthcare, where legal and ethical regulations often limit data sharing [3]. Nevertheless, traditional federated learning cannot fully address privacy and trust issues. The gradients and model updates that are exchanged can still expose sensitive information, and differences between sites can undermine model stability and performance. In multicenter biomedical studies, institutions often differ in terms of sample size, data quality, disease prevalence, acquisition procedures, and computational resources [4]. Learning therefore takes place in a highly heterogeneous environment, where uniform training and security policies tend to be inefficient or insufficient [5].

A second drawback is that the majority of collaborative AI pipelines are built with a focus on predictive performance, with less emphasis on the practicalities of safe deployment [6,7]. In real-life healthcare networks, a strong federated framework must meet multiple requirements. When handling sensitive model exchanges, such a framework should resist noisy or unreliable members, enable transparent oversight of training operations, and avoid prohibitive computational or communication costs. A design that meets only one of these requirements may not prove helpful in actual multicenter deployments. The central challenge is therefore to balance the trade-offs between privacy, utility, robustness, and operational efficiency [8].

Motivated by this issue, the current study presents the Adaptive Trust-Aware Encrypted Federated AI (ATEB-AI) framework for multicenter biomedical signal and medical image analysis. ATEB-AI improves upon previous trust-aware FL and hybrid FL + blockchain designs in three respects: (i) Instead of using a fixed CKKS configuration, the encryption policy is decided per layer, per client, and per round, based on the sensitivity score $s_{i,l}^{t}$. (ii) It couples privacy and reliability, so that a low-trust client is routed to a stronger encryption policy in round t + 1. (iii) The blockchain layer plays a “governance-only”/“metadata-only” role (in comparison, some blockchain designs rely on on-chain update events or on-chain substitution of the aggregator).

The developed methodology is intended to fill a significant gap in privacy-preserving AI for healthcare. In this study, encryption, robustness, and auditability are treated as jointly designed features of one federated system, because a collection of isolated functions is not a clinically realistic system. We therefore examine whether adaptive encryption can improve privacy–utility–latency trade-offs over rigid protection schemes, whether trust-aware aggregation can support more robust learning under heterogeneous participation, and whether a narrowly scoped blockchain layer can provide meaningful auditability without redundant system overhead. Integrating these components into a single framework helps to turn abstract secure-learning claims into more deployable models of multicenter medical AI.

The key contributions of this study are as follows. First, it introduces a federated model for biomedical signals and medical images that integrates adaptive encryption and trust-sensitive learning. Second, it proposes a governance-focused blockchain architecture that supports audit and provenance while remaining architecturally restrained. Third, it evaluates the framework against a wide range of criteria, namely, predictive performance, privacy and security strength, heterogeneity, and computational cost. More generally, the study supports the thesis that the quality of a privacy-aware medical AI system cannot be measured by a single property, such as accuracy or secrecy, but rather by how well it incorporates confidentiality, robustness, efficiency, and accountability into a coherent multicenter learning system.

2. Related Work

2.1. Federated Learning in Healthcare

Federated learning is currently one of the most intensively researched routes towards privacy-preserving AI in healthcare. It enables collaborative model development without directly sharing raw patient information. Over the past few years, the field has gradually shifted from proof-of-concept architectures toward deployment-focused questions such as scalability, interoperability, governance, and clinical applicability [9,10,11]. Comprehensive analyses of decentralized learning in the medical sector have shown that federated learning is increasingly viewed as a viable approach to medical collaboration, particularly where regulatory restraints, institutional policies, and structural fragmentation prohibit centralized data sharing [12,13,14]. However, baseline federated learning provides no direct protection against information leakage or unreliable clients, and most deployment-oriented studies rely on auxiliary mechanisms (notably, data-sharing agreements, IRB controls, and secure enclaves). These cannot be deployed at scale in large, multi-institutional consortia, and the trust they provide is implied rather than auditable.

2.2. Privacy-Preserving Techniques in FL

At the same time, the literature shows that federated learning alone is insufficient for sensitive biomedical applications. One of the most significant problems is that even local model updates can disclose confidential data, especially in medical imaging and other highly complex clinical settings [15,16,17]. This issue has spawned a body of research on privacy-preserving techniques, including differential privacy, encrypted training, and privacy-inference analysis in healthcare AI [18,19,20]. In general, this research suggests that federated learning can preserve privacy but exhibits a trade-off in which stronger privacy comes with higher computational and communication costs and lower predictive accuracy [19,21,22,23]. For comparison, differential-privacy FL typically loses 5–15% utility at a comparable number of rounds, as usable privacy budgets are in the range of ε ∈ [1, 8] and, empirically, membership-inference success remains above 0.40 in benchmarks; in contrast, homomorphic-encryption FL using CKKS or BFV provides cryptographic confidentiality at the cost of 3×–6× communication expansion and 2.5×–4× round-latency overhead under standard security parameters (N ≥ 8192, λ = 128 bits). A common limitation of both families is that the protection budget is identical for all clients and rounds, regardless of client reliability or layer sensitivity, which is suboptimal when cryptographic resources are constrained.

2.3. Trust and Robustness in Heterogeneous Settings

Another critical research direction concerns the dependability of federated learning in heterogeneous medical settings. In biomedical datasets, disease prevalence, acquisition conditions, data quality, and local sample volumes vary between sites. Recent efforts have therefore considered fairness-sensitive learning, uncertainty-aware assessment, and practical implementation frameworks that account for site variability and varied computing resources [24,25,26]. This is especially important in healthcare, where model quality must be assessed not only by average accuracy but also by robustness, calibration, and stability across participating institutions [27,28,29,30]. Reputation-weighted and Byzantine-resilient aggregators have been shown to reduce the utility loss due to poisoned updates, typically by 30–50% relative to FedAvg. However, they have two drawbacks: they do not tie the aggregation weight to the level of privacy protection, meaning that a low-trust client receives the same cryptographic protection for its updates as a trusted client, and they generally act as black-box modifications of the aggregator, thus lacking a tamper-evident audit trail, which is a practical requirement in a regulated healthcare consortium.

2.4. Blockchain Integration in FL

Related research has attempted to enhance federated learning by coordinating it with blockchain. The literature describes various uses of blockchain, such as improving traceability, model provenance, incentive control, and decentralized verification of collaborative training [31,32,33,34]. Several studies have applied this idea to healthcare use cases, such as respiratory diseases, smart surveillance, and federated medical diagnosis, claiming that blockchain can increase trust among collaborating stakeholders and preserve privacy when sharing federated models [35,36,37]. Permissioned blockchains have also been explored for Byzantine robustness, auditability, and participant control [26,38,39]. Quantitatively, both fully on-chain update designs and hybrid or off-chain designs that record only incentives or contribution receipts on the chain have throughput below 50 transactions per second (TPS), with round latency exceeding 3.5× FedAvg under permissioned consensus. The former offer full provenance auditability, whereas the latter only partially support verifying provenance between the chain and the model updates. Moreover, most FL + blockchain works presented to date are not operationally complete: they do not specify the blockchain stack (framework, consensus protocol, block parameters, measured throughput, and latency), which reduces reproducibility.

Despite these advances, the literature is still methodologically fragmented. Privacy enhancement, trust and contribution management, and blockchain coordination have each been studied [9,15,31]; however, they are only loosely coupled in existing designs, although in multicenter healthcare environments they are not independent of one another. A privacy-preservation strategy that assumes homogeneous updates may be vulnerable to poor-quality or malicious updates. Furthermore, blockchain-based systems whose protection is not adaptive may incur high overhead and still fail to provide an adequate privacy–utility trade-off [18,24,27]. This suggests that privacy protection must be more intrinsically embedded in designs, such that client participation is trust-sensitive and auditability is achieved in a technically constrained and operationally meaningful manner [19,28,39].

In conclusion, the available literature provides three significant directions on which the present research builds. The first is the continued development of federated learning into privacy-preserving AI in healthcare [10,12]. The second is the growth of privacy-enhancing techniques, especially for medical images and other sensitive biomedical information [16,19,21]. The third uses blockchain and trust-based systems for better traceability, governance, and reliability of collaboration in decentralized learning [29,33,37]. However, there is still a gap in the literature: no robust system jointly addresses adaptive privacy control, trust-sensitive aggregation, and auditable multicenter coordination in biomedical signal and medical image analysis. This gap was the main motivation for the approach outlined in the present research [26,32]. Specifically, ATEB-AI takes mutually conditioned (not independent) design decisions to address the three limitations outlined in Section 2.2, Section 2.3 and Section 2.4, as follows: (i) To address the uniform-protection limitation outlined in Section 2.2, the strength of encryption is selected per layer, per client, and per round, based on the sensitivity score $s_{i,l}^{t}$ (Equation (4)); the constrained policy optimization driven by this score results in a round-latency overhead of only 1.90× FedAvg versus 2.85× for HE-FL and 3.50× for BC-FL, while preserving 96–99% of the centralized reference utility. (ii) To address the trust–privacy disconnect outlined in Section 2.3, the encryption policy is conditioned on the trust score $\tau_{i}^{t-1}$, such that a low-trust client in round t is automatically routed to a stronger encryption policy in round t + 1. The novelty of ATEB-AI is thus not in the use of homomorphic encryption, trust-aware aggregation, or blockchain per se, but in their simultaneous and mutually dependent integration into the same multicenter biomedical FL pipeline.

3. Materials and Methods

This section explains the methodological framework employed to assess the proposed Adaptive Trust-Aware Encrypted Federated AI (ATEB-AI) system for multicenter biomedical signal and medical image analysis. This research aimed to test the hypothesis that privacy-preserving collaborative learning under heterogeneous institutional participation can be enhanced by combining adaptive encryption, trust-aware aggregation, and permissioned blockchain auditability. To keep the methodology clear, this section covers only the study design, data plan, preprocessing pipeline, model development, federated design, baselines, attack settings, and evaluation plan, in that order. The quantitative outcomes of these procedures are reported later in the Results section. Mathematical formulations are presented, where needed, to formalize the optimization, encryption, aggregation, and evaluation of the framework components.

3.1. Study Design and Benchmark Datasets

This study is a privacy-preservation analysis of a multicenter, cross-modal federated learning system for biomedical signals and medical images. It is not just a comparison of centralized and federated training; rather, it examines whether privacy protection, resilience to heterogeneous participation, and auditability can be integrated into a single coherent architecture. Figure 1 depicts the overall workflow of the proposed scheme, while Table 1 summarizes the datasets used in this study.

To evaluate the proposed framework on clinically and methodologically distinct biomedical signals and medical images, four publicly available datasets were employed as primary benchmarks: MIT-BIH Arrhythmia [40] and CHB-MIT EEG [41] for the signal modality, and BraTS MRI [42] and NIH ChestXray [43] for the imaging modality. MIT-BIH and BraTS were selected as the principal datasets for each modality because they reflect canonical low-dimensional temporal and high-dimensional spatial tasks, respectively, so that the framework could be tested on two fundamentally different biomedical modalities, one temporal and low-dimensional and the other spatially intricate and computationally expensive. CHB-MIT and NIH ChestXray were retained as additional primary benchmarks to broaden cross-task generalization rather than as supplementary validation sets. A common experimental configuration was adopted for the proposed federated system; the representation selection, optimization dynamics, communication times, privacy-sensitive comparators, encryption architecture, and scope of blockchain coverage are summarized in Table 2.

The study was structured at the consortium level as a cross-silo federation of 5 to 20 participating clients, each representing an institution, hospital unit, or diagnostic center. Each client kept its local copy of the data and sent only protected model updates. This arrangement is representative of practical collaborative medical AI, in which institutions wish to improve shared models without giving up ownership or control of sensitive patient data. The design fits within the broader scope of healthcare FL studies, which focus on secure learning across institutions and restricted data mobility.

MIT-BIH and BraTS are the canonical reference datasets for arrhythmia classification (low-dimensional, temporal, beat level) and brain tumor analysis (high-dimensional, spatial, volumetric), respectively. These two datasets are complemented by CHB-MIT and NIH ChestXray so that the evaluation is not constrained to a single dataset per modality. Nevertheless, universal clinical generalizability is explicitly not claimed (see Section 5). The datasets are not used to achieve clinical-level absolute accuracy; instead, they are used for a controlled, comparative evaluation with identical partitions, seeds, and backbones across five federated configurations. The interpretable evidence is therefore the consistent ordering of the methods and the relative gaps between them, together with the between-site F1 variance (36.1–46.6%), the common fixed split (seed = 2025), the reported calibration errors, the N = 5–20 sensitivity analysis, and the centralized upper bound. EfficientNet was selected for its good accuracy/parameter ratio, especially at moderate input resolution; Swin Transformer because it is a strong general-purpose vision backbone; and U-Net because it is the de facto segmentation network for medical images. As federated aggregation is applied to real-valued vectors, CKKS, which supports approximate arithmetic on real numbers, was chosen over BFV/BGV, which operate on integers and would require quantization overhead, in keeping with the standard approach in the recent encrypted-FL literature. Regarding threats, membership inference, the most commonly studied privacy attack, is considered along with model inversion/reconstruction; the latter can be particularly harmful in medical imaging because reconstructed slices may resemble identifiable patient anatomy. We also considered poisoning/Byzantine manipulation, the primary robustness concern for multicenter consortia with heterogeneous participants.

3.2. Federated Client Construction and Data Preprocessing

Each benchmark dataset was divided among N = 5 clients (with N varied from 5 to 20 in the sensitivity analysis) using a non-IID, label-skewed partitioning scheme that emulates realistic heterogeneous multicenter data. Per-client class proportions were sampled from a symmetric Dirichlet distribution, Dir(α), with a moderate setting (α = 0.3) to emulate substantial inter-institutional variability in disease prevalence; in the limit α → ∞ every client approaches a uniform class distribution, whereas α → 0 yields a highly concentrated, single-class distribution. Client sample sizes were drawn from a log-normal distribution with μ = 0 and σ = 0.4 and subsequently normalized so that the overall sample sizes matched the split reported in Table 1 (small to large institutional cohorts). To introduce additional client-level heterogeneity into the signal datasets (MIT-BIH, CHB-MIT), amplitude rescaling (with a client-specific ratio between 0.9 and 1.1) and controlled additive Gaussian noise (σ ∈ {0.0, 0.02, 0.05, 0.08, 0.10}) were applied to the acquisitions. For the imaging datasets, heterogeneity was added at the subtype/lesion-prevalence level (tumor subtype mixtures for BraTS; finding-prevalence mixtures for NIH ChestXray), together with per-site intensity-distribution shifts. All methods (FedAvg, DP-FL, HE-FL, BC-FL, ATEB-AI) use the same partitions, with a fixed random seed (seed = 2025), to ensure that differences in performance are due to the federated algorithms and not the data partitioning process.
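The partitioning described above can be sketched in standard-library Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `dirichlet`, `partition`, and `perturb_signal` are hypothetical, client sizes only approximately follow the log-normal shares (the normalization to the Table 1 split is not reproduced), and the imaging-side heterogeneity is omitted.

```python
import random

def dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet Dir(alpha) of dimension k."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def partition(labels, n_clients=5, alpha=0.3, sigma=0.4, seed=2025):
    """Split sample indices into label-skewed, size-skewed client shards."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # Relative client sizes: log-normal(mu=0, sigma), normalized to sum to 1.
    raw = [rng.lognormvariate(0.0, sigma) for _ in range(n_clients)]
    size_share = [r / sum(raw) for r in raw]
    # Per-client class mixture drawn from Dir(alpha).
    mix = [dirichlet(alpha, len(classes), rng) for _ in range(n_clients)]
    shards = [[] for _ in range(n_clients)]
    for c_idx, c in enumerate(classes):
        idx = [j for j, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        # Client i receives class c in proportion to size_share[i] * mix[i][c].
        w = [size_share[i] * mix[i][c_idx] for i in range(n_clients)]
        total_w = sum(w)
        start = 0
        for i in range(n_clients):
            if i == n_clients - 1:
                end = len(idx)  # last client takes the remainder
            else:
                end = min(len(idx), start + round(len(idx) * w[i] / total_w))
            shards[i].extend(idx[start:end])
            start = end
    return shards

def perturb_signal(signal, scale, noise_sigma, rng):
    """Apply client-specific amplitude rescaling and additive Gaussian noise."""
    return [scale * v + rng.gauss(0.0, noise_sigma) for v in signal]
```

Because the generator is seeded, calling `partition` twice with the same labels returns the same shards, which mirrors the role of the fixed seed in keeping partitions identical across methods.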

Preprocessing was performed locally by each client to maintain the decentralized nature of the study. For biomedical signals, preprocessing encompassed noise removal, amplitude normalization, segmentation into fixed-length windows or beats, and task-specific temporal formatting. For MRI images, preprocessing entailed intensity normalization, spatial resizing, and standardized tensor construction compatible with the chosen backbone model. Although the preprocessing protocol was harmonized at the study level, all operations were carried out independently on-site; thus, centralization of the raw clinical data was avoided. This approach is consistent with the practicalities of real-world healthcare FL systems, in which data and their preprocessing remain under local control [3,44].
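As a rough illustration of the signal-side steps, the following sketch shows amplitude normalization and fixed-length windowing as they might run on a single client. The text does not specify the normalization method or the window length, so z-score normalization, the non-overlapping default step, and the function names `zscore` and `fixed_windows` are all assumptions made for the example.

```python
import statistics

def zscore(signal):
    """Amplitude normalization (assumed here to be z-scoring) of one recording."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    # A constant signal has zero spread; map it to zeros instead of dividing by 0.
    return [(v - mean) / std if std > 0 else 0.0 for v in signal]

def fixed_windows(signal, length, step=None):
    """Cut a recording into fixed-length windows; incomplete tails are dropped."""
    step = step or length  # non-overlapping windows unless a step is given
    return [signal[s:s + length] for s in range(0, len(signal) - length + 1, step)]
```

Both functions operate only on data already held by the client, so nothing in this step requires raw recordings to leave the site.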

The local dataset of client $i$ is represented as $D_{i} = \{(x_{ij}, y_{ij})\}_{j=1}^{n_{i}}$, where $n_{i}$ is the number of local samples. The distributed training population is thus defined as

$$D = \bigcup_{i=1}^{N} D_{i} \tag{1}$$

where $N$ is the number of participating clients. Equation (1) formalizes the decentralized data setting on which all subsequent training and aggregation procedures are based.

3.3. Modality-Specific Model Development

Two layers of model specification need to be separated. ATEB-AI’s federated, encryption, trust-aware, and blockchain components are all architecture-agnostic and were designed to work with any of the common biomedical-signal or medical-imaging backbone paradigms (1D CNN, CNN-BiLSTM, or lightweight temporal transformer for signals; EfficientNet, Swin Transformer, or U-Net family for images). To keep the results numerically accurate and directly comparable across all five configurations (FedAvg, DP-FL, HE-FL, BC-FL, and ATEB-AI), the same MLP backbone (three fully connected layers, hidden sizes [128, 64, 32], ReLU activations, dropout 0.3) was employed on per-sample engineered or summarized numeric features extracted locally by each client (beat-level morphological/temporal features for MIT-BIH and CHB-MIT; slice-level intensity/edge/Haralick features for BraTS and NIH). This configuration matches the “quick-mode target package” listed in Table 2 and allows the effects of the federated/encryption/trust/blockchain components to be separated from architecture-specific effects. The federated stage, which runs over a fixed budget of T = 2 communication rounds, is warm-started: each client begins from a locally pre-converged MLP backbone fitted on the engineered/summarized features. Convergence was checked on a longer schedule (rounds R1–R6), in which the global validation F1 (server-side shard) fluctuates within about ±0.4% and stabilizes in R2–R3. This horizon is long enough to reach a stable aggregate while still exercising the full encryption–trust–blockchain pipeline, because the federated component is a low-dimensional head operating on pre-extracted features.

To avoid justifying the two-round budget by appeal to the warm start alone, the warm start is treated as an independent ablation factor rather than as a fixed assumption. We compare a cold-start arm, in which the federated stage is initialized from random local weights, with a warm-start arm, in which it is initialized from the locally pre-converged MLP backbone; the server-side validation shard, partitions, seed (2025), backbone, and full encryption–trust–blockchain pipeline are identical for both arms. T = 2 is fixed for the benchmark package because, under warm start, the global F1 is already stable to within ±0.4% at R2–R3, whereas under cold start the model has not stabilized at T = 2 and needs a longer schedule to reach the same F1. The two-round budget is therefore reported as the point at which the warm-started configuration has measurably converged; the cold-start arm serves as the control, is not assumed to have converged, and demonstrates the effect of the warm start. Replication of the same protocol with deeper modality-specific backbones is part of the planned follow-on work set out in Section 5.

To conduct a fair comparison across baselines, the same backbone family was kept when switching between conventional federated learning and the privacy-preserving or trust-aware versions. This choice prevents architectural differences from confounding the interpretation of encryption, aggregation, and governance effects. In communication round $t$, each client receives the global model parameters $W^{t-1}$ [45], carries out local optimization, and generates an update as follows:

$$\Delta_{i}^{t} = W_{i}^{t} - W^{t-1} \tag{2}$$

where $W_{i}^{t}$ denotes the parameters obtained through local training by client $i$. The update defined in Equation (2) is subsequently assessed for sensitivity, encrypted, and aggregated. The local training objective is as follows [46]:

$$\min_{W} L_{i}(W) = \frac{1}{n_{i}} \sum_{j=1}^{n_{i}} l\left(f_{W}(x_{ij}), y_{ij}\right) \tag{3}$$

where $l(\cdot)$ is the task-specific loss function and $f_{W}(\cdot)$ denotes the local prediction model. Weighted cross-entropy was applied to mitigate the impact of class imbalance in arrhythmia classification. For image analysis, the loss was chosen according to the task type (i.e., cross-entropy for classification; Dice-based composite loss for segmentation). These choices are informed by privacy considerations in healthcare model development.
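Equations (2) and (3) can be made concrete with a short standard-library sketch. It assumes flat parameter lists and already-computed class probabilities; the function names and the choice to average by $n_{i}$ (as in Equation (3)) rather than by the sum of class weights are assumptions for illustration, since the text does not state the normalization used.

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Equation (3) with a class-weighted loss: mean of -w[y] * log p[y]."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp the probability to avoid log(0) on confident mistakes.
        total += -class_weights[y] * math.log(max(p[y], 1e-12))
    return total / len(labels)

def local_update(w_local, w_global):
    """Equation (2): elementwise difference between the locally trained
    parameters and the global parameters received for this round."""
    return [wl - wg for wl, wg in zip(w_local, w_global)]
```

In this sketch, giving a rare class a larger entry in `class_weights` increases the penalty for misclassifying it, which is the intended effect of weighting under class imbalance.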

3.4. Adaptive Trust-Aware Encrypted Federated Framework

The ATEB-AI framework combines adaptive encryption, trust-aware aggregation, and permissioned blockchain auditability. Integrating these in a single federated pipeline is the main methodological contribution of this study. The overall design differs from traditional secure FL systems in one key respect: it does not assume that all model layers, clients, and communication rounds share the same privacy or reliability profile. Instead, both protection and aggregation adapt to client behavior, layer sensitivity, and modality. This view aligns with the growing number of recent healthcare FL studies on adaptive privacy and reputation-aware cooperation [18].

3.4.1. Sensitivity-Aware Adaptive Encryption

Before transmission, each client estimates the sensitivity of layer $l$ of the model during round $t$, denoted $s_{i,l}^{t}$. This score indicates the anticipated privacy exposure of the layer and is determined by the properties of the update, the modality type, and the past reliability of the client [47]. Formally,

$$s_{i,l}^{t} = S\left(\Delta_{i,l}^{t}, m_{i}, \tau_{i}^{t-1}\right) \tag{4}$$

where $S(\cdot)$ is the sensitivity operator, $m_{i}$ is the modality indicator, and $\tau_{i}^{t-1}$ is the trust score of client $i$ from the previous round. The proposed scheme differs from previous dynamic privacy-preserving schemes in three concrete ways. In contrast to adaptive-DP schemes, where the privacy budget ε declines with round number, the policy $p_{i,l}^{t}$ in this work is the solution to an explicit constrained optimization problem over the policy space $P$ [48], as follows:

$$p_{i,l}^{t} = \arg\min_{p \in P} \left[\lambda_{1} C_{\mathrm{enc}}(p) + \lambda_{2} C_{\mathrm{comm}}(p) + \lambda_{3} C_{\mathrm{conv}}(p)\right] \quad \text{s.t.} \quad \Pi(p) \geq \pi_{\min} \tag{5}$$

where $C_{enc}$ is the encryption cost, $C_{comm}$ is the communication cost, $C_{conv}$ is the convergence penalty, and $\Pi(p)$ is the privacy level of the policy. Unlike layer-wise HE schemes, ATEB-AI does not fix the per-layer policy at the beginning of training; instead, it recalculates $s_{i,l}^{t}$ every round, so the policy switches automatically if a layer becomes more sensitive. In contrast to selective-HE schemes, which only encrypt the last few layers, ATEB-AI conditions its policies on the modality indicator $m_{i}$ and the trust-score history $\tau_{i}^{t-1}$, so two clients submitting identical updates can still receive different encryption policies. The adaptation is thus conditional on the client, the client type, the layer, and the round, which makes it more expressive than the existing dynamic schemes. Accordingly, the protected model update is calculated as

$$\tilde{\Delta}_{i}^{t} = \mathrm{Enc}(\Delta_{i}^{t}; p_{i}^{t}) \tag{6}$$

where $\mathrm{Enc}(\cdot)$ denotes CKKS-based encryption under the selected policy.
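To make the constrained selection in Equation (5) concrete, the following is a minimal sketch that picks the cheapest policy from a small discrete policy space subject to the privacy floor. The three policies, their cost and privacy values, and the weights passed as `lambdas` are hypothetical placeholders rather than values from this study, and the text does not specify how $\pi_{min}$ is derived from the sensitivity score, so the caller passes it directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    enc_cost: float      # C_enc(p)
    comm_cost: float     # C_comm(p)
    conv_penalty: float  # C_conv(p)
    privacy: float       # Pi(p)


# Hypothetical three-element policy space; all numbers are illustrative only.
POLICY_SPACE = (
    Policy("plaintext", 0.0, 1.0, 0.00, 0.10),
    Policy("partial_ckks", 0.4, 2.0, 0.02, 0.70),
    Policy("full_ckks", 1.0, 5.0, 0.05, 0.95),
)


def policy_cost(policy, lambdas):
    """Weighted objective of Equation (5) for one candidate policy."""
    l1, l2, l3 = lambdas
    return (l1 * policy.enc_cost
            + l2 * policy.comm_cost
            + l3 * policy.conv_penalty)


def select_policy(pi_min, lambdas=(1.0, 0.2, 1.0), space=POLICY_SPACE):
    """Return the cheapest policy whose privacy level is at least pi_min."""
    feasible = [p for p in space if p.privacy >= pi_min]
    if not feasible:
        raise ValueError(f"no policy reaches privacy level {pi_min}")
    return min(feasible, key=lambda p: policy_cost(p, lambdas))


# A higher privacy floor forces a more expensive policy.
for pi_min in (0.05, 0.50, 0.90):
    print(pi_min, "->", select_policy(pi_min).name)
```

Because the policy space here is small and discrete, exhaustive search is sufficient; a larger or continuous space would need a proper solver.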

3.4.2. Trust-Aware Aggregation

Another methodological pillar of this framework is the trust-sensitive weighting of clients. In clinical federated learning, not all participants can be assumed to contribute equally credible updates; some may exhibit unstable optimization behavior, have low-quality data or poor compliance, or even act maliciously. The trust score of client $i$ in round $t$ is updated according to [49]:

$$\tau_{i}^{t} = \phi(\tau_{i}^{t-1}, g_{i}^{t}, a_{i}^{t}, c_{i}^{t}) \tag{7}$$

where $g_{i}^{t}$ is the validation gain, $a_{i}^{t}$ is the anomaly evidence, and $c_{i}^{t}$ denotes the compliance indicators. The components $g_{i}^{t}$, $a_{i}^{t}$, and $c_{i}^{t}$ are defined as follows, each bounded in $[0, 1]$:

$$g_{i}^{t} = \sigma\left(\frac{V(W^{t}) - V(W^{t-1})}{|V(W^{t-1})| + \varepsilon}\right),$$

$$a_{i}^{t} = 1 - \frac{1}{3}\left[d_{cos}(\Delta_{i}^{t}, \tilde{\Delta}^{t}) + d_{M}(\|\Delta_{i}^{t}\|; H_{i}) + (1 - s_{sign}^{(i,t)})\right],$$

$$c_{i}^{t} = \frac{1}{|K|} \sum_{k \in K} \left[\mathrm{check}_{k}(i, t) = \mathrm{pass}\right],$$

where $V(\cdot)$ is the task-specific validation metric on the server-side shard $\mathcal{V}$; $\sigma(\cdot)$ is the logistic sigmoid; $\varepsilon = 10^{-6}$; $d_{cos}$ and $d_{M}$ are the normalized cosine and Mahalanobis distances of $\Delta_{i}^{t}$ against the round median $\tilde{\Delta}^{t}$ and the client history $H_{i}$, respectively; $s_{sign}^{(i,t)}$ is the layer-wise sign-consistency ratio; and $K$ is the set of four protocol checks logged on the blockchain (hash correctness, metadata completeness, timely submission, and policy conformity).
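The three trust components defined above can be sketched as small functions. This is an illustration under stated assumptions, not the authors' implementation: the distances $d_{cos}$ and $d_{M}$ and the sign-consistency ratio are taken as inputs already normalized to $[0, 1]$, because the normalization itself is not given in the text, and the check names in the example dictionary are hypothetical labels for the four protocol checks.

```python
import math

EPS = 1e-6  # epsilon from the definition of g


def sigmoid(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def validation_gain(v_curr, v_prev, eps=EPS):
    """g: sigmoid of the relative change in the validation metric."""
    return sigmoid((v_curr - v_prev) / (abs(v_prev) + eps))


def anomaly_evidence(d_cos, d_mahal, sign_consistency):
    """a: one minus the mean of the three anomaly terms.

    All three inputs are assumed to be pre-normalized to [0, 1].
    """
    for value in (d_cos, d_mahal, sign_consistency):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    return 1.0 - (d_cos + d_mahal + (1.0 - sign_consistency)) / 3.0


def compliance(checks):
    """c: fraction of protocol checks that passed."""
    if not checks:
        raise ValueError("at least one check is required")
    return sum(bool(passed) for passed in checks.values()) / len(checks)


# Hypothetical check names standing in for the four logged protocol checks.
checks = {
    "hash_ok": True,
    "metadata_complete": True,
    "on_time": False,
    "policy_conformant": True,
}
print(validation_gain(0.90, 0.90))       # no change in the validation metric
print(anomaly_evidence(0.1, 0.2, 0.9))   # mildly unusual update
print(compliance(checks))                # three of four checks passed
```

An unchanged validation metric maps to a gain of 0.5 under the sigmoid, so 0.5 rather than 0 is the neutral value of $g_{i}^{t}$.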

Therefore, Equation (7) is implemented as

$$\tau_{i}^{t} = \phi \tau_{i}^{t-1} + (1 - \phi)(w_{g} g_{i}^{t} + w_{a} a_{i}^{t} + w_{c} c_{i}^{t}),$$

where $\phi \in [0, 1]$, $w_{g} + w_{a} + w_{c} = 1$, and $w_{g}, w_{a}, w_{c} \geq 0$.

In all experiments, we set $\phi = 0.6$ and $(w_{g}, w_{a}, w_{c}) = (0.5, 0.3, 0.2)$, with $\tau_{i}^{0} = 1$. These values were selected on the validation shard only; a sensitivity analysis over $\phi \in \{0.4, \dots, 0.8\}$ and reasonable perturbations of $(w_{g}, w_{a}, w_{c})$ confirmed that the ranking of ATEB-AI is stable, with global F1 variations below ±0.4%.
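The update rule is an exponential moving average of the weighted evidence, which a short sketch makes easy to check by hand. The defaults below are the values reported in the text ($\phi = 0.6$, weights $(0.5, 0.3, 0.2)$, $\tau_{i}^{0} = 1$); the evidence values fed in are hypothetical.

```python
def update_trust(tau_prev, g, a, c, phi=0.6, weights=(0.5, 0.3, 0.2)):
    """Trust update: phi * tau_prev + (1 - phi) * weighted evidence."""
    w_g, w_a, w_c = weights
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    evidence = w_g * g + w_a * a + w_c * c
    return phi * tau_prev + (1.0 - phi) * evidence


tau0 = 1.0  # initial trust score
# Round 1 (hypothetical evidence): neutral gain, low anomaly, full compliance.
tau1 = update_trust(tau0, g=0.5, a=0.8, c=1.0)
# Round 2 (hypothetical evidence): strongly anomalous, half the checks failed.
tau2 = update_trust(tau1, g=0.5, a=0.2, c=0.5)
print(round(tau1, 4), round(tau2, 4))
```

With $\phi = 0.6$, a single poor round lowers the score but does not erase the history, so trust decays gradually under sustained bad evidence.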

Next, the aggregation weights are computed as in [50]:

$$\alpha_{i}^{t} = \frac{\tau_{i}^{t} n_{i}}{\sum_{k \in A_{t}} \tau_{k}^{t} n_{k}} \tag{8}$$

where $A_{t}$ is the active client set at round $t$.

The global model is then updated in each round as

$$W^{t} = W^{t-1} + \sum_{i \in A_{t}} \alpha_{i}^{t} \, \mathrm{Dec}(\tilde{\Delta}_{i}^{t}) \tag{9}$$

where $\mathrm{Dec}(\cdot)$ is the decryption function used prior to secure fusion. Unlike standard FedAvg, Equations (8) and (9) reduce the influence of clients with diminished reliability. This is especially relevant under heterogeneous or partially adversarial participation and is supported by previous research on trust scoring, reputation-sensitive FL, and robust decentralized learning in healthcare [10,11,12].
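A minimal sketch of Equations (8) and (9) on already-decrypted updates shows how trust reshapes the aggregate. The updates are plain Python lists standing in for decrypted tensors, and the three clients, their trust scores, and their sample counts are hypothetical.

```python
def aggregation_weights(trust, n_samples):
    """Equation (8): weights proportional to trust score times sample count."""
    scores = [t * n for t, n in zip(trust, n_samples)]
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one client needs positive trust and data")
    return [s / total for s in scores]


def aggregate(w_prev, updates, alphas):
    """Equation (9): add the weighted sum of the client updates to w_prev."""
    w_new = list(w_prev)
    for alpha, delta in zip(alphas, updates):
        for j, d in enumerate(delta):
            w_new[j] += alpha * d
    return w_new


w_prev = [0.0, 0.0]
updates = [[1.0, 1.0], [1.0, 1.0], [-10.0, -10.0]]  # third update is an outlier
n_samples = [100, 100, 100]

trusted = aggregation_weights([1.0, 0.5, 0.1], n_samples)
uniform = aggregation_weights([1.0, 1.0, 1.0], n_samples)  # FedAvg-like

print(trusted, aggregate(w_prev, updates, trusted))
print(uniform, aggregate(w_prev, updates, uniform))
```

With equal sample counts, uniform trust reduces to plain averaging and the outlier dominates the result, whereas the low trust score of the third client keeps the aggregate close to the two consistent updates.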

3.4.3. Permissioned Blockchain Auditability

The permissioned blockchain was implemented using Hyperledger Fabric v2.5, configured as a five-organization channel that mirrors the five-client topology. Transaction ordering is achieved through the Raft protocol (etcdraft, 5 orderer nodes), and commitment requires endorsement by ≥ 3/5 peers, with no Byzantine resistance at the application level. The following block formation parameters are used:

BatchTimeout = 2 s, MaxMessageCount = 100, AbsoluteMaxBytes = 2 MB, PreferredMaxBytes = 512 KB.

Two chaincodes implement the audit logic. The first, ATEB-Audit, stores the metadata tuple shown in Equation (10) [51]:

$$m_{i}^{t} = \{h_{i}^{t}, r^{t}, \tau_{i}^{t}, p_{i}^{t}, \gamma_{i}^{t}\} \tag{10}$$

where $h_{i}^{t}$ is the update hash, $r^{t}$ is the round identifier, $\tau_{i}^{t}$ is the trust score, $p_{i}^{t}$ is the encryption policy, and $\gamma_{i}^{t}$ is the anomaly or compliance flag set. The second chaincode, ATEB-Governance, handles client registration/revocation and policy updates. No raw tensors, gradients, keys, or patient data are committed on-chain, and each transaction payload is bounded to ~1 KB.
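The metadata tuple of Equation (10) can be illustrated as a small serializable record. This sketch assumes SHA-256 for the update hash and JSON for serialization, neither of which is specified in the text; the field names and the hard 1024-byte limit (a strict reading of the "~1 KB" bound) are likewise illustrative choices, and no Hyperledger Fabric API is called.

```python
import hashlib
import json

MAX_PAYLOAD_BYTES = 1024  # strict reading of the "~1 KB" bound (assumption)


def build_audit_record(update_bytes, round_id, trust, policy, flags):
    """Build the on-chain metadata record; only the hash of the update is kept."""
    record = {
        "h": hashlib.sha256(update_bytes).hexdigest(),  # update hash (assumed SHA-256)
        "r": round_id,                                  # round identifier
        "tau": round(trust, 6),                         # trust score
        "p": policy,                                    # encryption policy label
        "gamma": sorted(flags),                         # anomaly/compliance flags
    }
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    if len(payload) > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {len(payload)} bytes, above the bound")
    return record, payload


# A 4 KB dummy update: the record size does not depend on the update size.
record, payload = build_audit_record(
    update_bytes=b"\x00" * 4096,
    round_id=1,
    trust=0.876,
    policy="partial_ckks",
    flags=["late_submission"],
)
print(len(payload), record["h"][:16])
```

Because only a fixed-length digest of the update is stored, the record stays small regardless of model size, which is what keeps the audit payload within the stated bound.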

Throughput was evaluated using Hyperledger Caliper v0.5 with a saturating synthetic workload of ATEB-Audit-sized transactions (≈1 KB) on commodity hardware (Intel i7, 32 GB RAM, 10 Gbps virtual network). The measured values were TPS_avg ≈ 110, L_commit^median ≈ 1.4 s, and L_commit^p95 < 3 s. This throughput figure is the saturated commit ceiling (headroom) of the configured 5-organization, 5-orderer Raft channel with the ≥ 3/5 endorsement policy and the block-formation parameters above. It characterizes the topology, endorsement policy, and client concurrency of the Hyperledger Fabric deployment; it is neither a measure of the load generated by the federated-learning run nor a limit on in-scenario performance. In practice, the audit workload of a benchmark run is light: N × T = 10 ATEB-Audit transactions (one 1 KB record per client per round, as described in Equation (10)) plus a one-time set of ~6 ATEB-Governance transactions for registration and policy bootstrap, for a total of ~16 committed transactions, submitted serially and well below saturation. For this workload, the binding operational metric is the per-transaction commit latency (≈1.4 s median; <3 s p95), not throughput. The N = 20 case is reported only to indicate the scaling headroom available should the consortium grow over a much longer period: even the hundreds of ~1 KB audit transactions it would generate remain far below the measured ceiling of 110 TPS, so the audit layer is not a throughput bottleneck in that setting either.
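The workload arithmetic above is simple enough to verify directly. The sketch below uses the five-client, two-round setting and the reported latency and throughput figures; the serial-time estimate additionally assumes that each transaction is submitted only after the previous one has committed, which is an interpretation of "serially submitted" rather than a reported measurement.

```python
def audit_workload(n_clients, n_rounds, governance_tx=6):
    """Return (audit transactions, total committed transactions) for one run."""
    audit_tx = n_clients * n_rounds
    return audit_tx, audit_tx + governance_tx


MEDIAN_COMMIT_S = 1.4   # reported median commit latency
CEILING_TPS = 110.0     # reported saturated throughput

audit_tx, total_tx = audit_workload(n_clients=5, n_rounds=2)

# Assumption: strictly serial submission, one commit at a time.
serial_median_s = total_tx * MEDIAN_COMMIT_S
# Time the same number of transactions would occupy at the saturated ceiling.
saturated_s = total_tx / CEILING_TPS

print(audit_tx, total_tx)
print(round(serial_median_s, 1), round(saturated_s, 3))
```

Under these assumptions the whole audit trail of a run costs on the order of tens of seconds of commit latency, while occupying a small fraction of a second of the channel's saturated capacity.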

To show how these components interact within one communication round, Algorithm 1 summarizes the workflow of the framework. The algorithm covers local training, sensitivity estimation, adaptive encryption selection, metadata logging, trust updating, and secure aggregation, offering an operational rather than a purely conceptual description of the ATEB-AI pipeline.

Algorithm 1. Proposed adaptive federated workflow integrating sensitivity-aware encryption, trust-based aggregation, and blockchain-supported auditability

Input: client set $C = \{1, \dots, N\}$; local datasets $D_{i}$; initial global model $W^{0}$; total communication rounds $T$; initial trust scores $\tau_{i}^{0}$; sensitivity estimator $S(\cdot)$; blockchain record $B$; encryption strategy space $P$; anomaly monitor $A(\cdot)$
Output: final global model $W^{T}$; blockchain audit trail $B$; trust and privacy logs
for $t = 1$ to $T$ do
    Broadcast $W^{t-1}$ and policy constraints to all active clients
    for each active client $i$ in parallel do
        Preprocess local data $D_{i}$
        Train the local model for $E$ epochs and compute the update $\Delta_{i}^{t}$
        Estimate layer sensitivity $s_{i,l}^{t} = S(\Delta_{i,l}^{t}, m_{i}, \tau_{i}^{t-1})$
        Select adaptive encryption policy $p_{i,l}^{t}$ subject to privacy constraints
        Encrypt selected parameter blocks: $\tilde{\Delta}_{i}^{t} = \mathrm{Enc}(\Delta_{i}^{t}; p_{i}^{t})$
        Construct metadata record $m_{i}^{t} = \{h_{i}^{t}, r^{t}, \tau_{i}^{t}, p_{i}^{t}, \gamma_{i}^{t}\}$
        Submit encrypted update to the aggregator and append metadata to blockchain $B$
    end for
    Verify update provenance and smart-contract eligibility from blockchain records
    Update client trust scores using validation gain, anomaly signals, and compliance history
    Filter or down-weight suspicious clients
    Aggregate securely: $W^{t} = W^{t-1} + \sum_{i \in A_{t}} \alpha_{i}^{t} \, \mathrm{Dec}(\tilde{\Delta}_{i}^{t})$
    Evaluate the updated global model and log privacy, utility, and latency metrics
end for
return $W^{T}$, $B$, $\{\tau_{i}^{t}\}$, and the recorded study metrics
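The control flow of Algorithm 1 can be sketched as a single function that takes every domain-specific step as a caller-supplied callable. This is a structural illustration only: the function and parameter names are hypothetical, the ledger is an in-memory list rather than a Fabric channel, SHA-256 is an assumed hash, the record stores the trust score available before the round's update, and the encrypt and decrypt callables in the usage example are identity stand-ins, not CKKS.

```python
import hashlib
import json


def run_round(round_id, w_prev, clients, trust, ledger,
              local_update, choose_policy, encrypt, decrypt, score_client,
              tau_min=0.0):
    """One communication round following the order of steps in Algorithm 1."""
    submissions = {}
    for cid, n_i in clients.items():                    # "in parallel" in the paper
        delta = local_update(cid, w_prev)               # local training -> update
        policy = choose_policy(cid, delta, trust[cid])  # sensitivity + policy choice
        cipher = encrypt(delta, policy)                 # protected update
        digest = hashlib.sha256(json.dumps(delta).encode("utf-8")).hexdigest()
        ledger.append({"h": digest, "r": round_id, "tau": trust[cid],
                       "p": policy, "gamma": []})       # metadata only, no tensors
        submissions[cid] = (cipher, policy, n_i)

    # Server side: update trust, filter low-trust clients, then aggregate.
    new_trust = {cid: score_client(cid, trust[cid]) for cid in submissions}
    active = [cid for cid in submissions if new_trust[cid] > tau_min]
    total = sum(new_trust[cid] * submissions[cid][2] for cid in active)
    if total <= 0:
        raise ValueError("no eligible client in this round")
    w_new = list(w_prev)
    for cid in active:
        cipher, policy, n_i = submissions[cid]
        alpha = new_trust[cid] * n_i / total            # trust-weighted share
        for j, d in enumerate(decrypt(cipher, policy)):
            w_new[j] += alpha * d
    return w_new, new_trust


# Toy usage with two clients and identity "encryption" (NOT CKKS).
ledger = []
w1, trust1 = run_round(
    1, [0.0, 0.0], {"A": 100, "B": 100}, {"A": 1.0, "B": 1.0}, ledger,
    local_update=lambda cid, w: [1.0, 1.0] if cid == "A" else [-1.0, 3.0],
    choose_policy=lambda cid, delta, tau: "light",
    encrypt=lambda delta, policy: list(delta),
    decrypt=lambda cipher, policy: list(cipher),
    score_client=lambda cid, tau: 1.0 if cid == "A" else 0.5,
)
print(w1, trust1, len(ledger))
```

Passing the steps in as callables keeps the orchestration separate from the cryptography and the trust model, so each can be swapped without touching the round logic.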

3.5. Comparative Methods and Attack Settings

The proposed framework was evaluated against a structured set of baselines designed to isolate the contribution of each methodological component. The comparison set included:

Centralized non-private training;
Conventional federated learning (FedAvg);
Differential-privacy federated learning (DP-FL);
Homomorphic-encryption federated learning with fixed protection (HE-FL);
Blockchain-assisted federated learning without adaptive encryption (BC-FL);
The complete ATEB-AI framework.

This benchmark design makes it possible to distinguish the effect of adaptive encryption from that of encryption alone and the effect of trust-aware governance from blockchain participation in its generic form.

The threat model included the following three classes of attacks: membership inference, model inversion or reconstruction-oriented leakage, and poisoning/Byzantine manipulation. The non-blockchain setting assumed an honest-but-curious server, whereas the broader consortium was assumed to contain a minority of potentially malicious participants capable of submitting harmful or misleading updates. These attack categories were chosen because they represent the most relevant privacy and reliability concerns in secure healthcare federated learning.

Privacy leakage under attack setting $a$ was represented by [52]

$$R_{priv}^{(a)} = \frac{1}{|\Omega_{a}|} \sum_{\omega \in \Omega_{a}} \mathbb{1}(\hat{z}_{\omega} = z_{\omega}) \tag{11}$$

where $\Omega_{a}$ is the evaluation set for attack $a$, $z_{\omega}$ is the true private status, $\hat{z}_{\omega}$ is the attacker’s prediction, and $\mathbb{1}(\cdot)$ is the indicator function. Equation (11) provides a generalized expression for attack success in membership and related inference settings.
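Equation (11) is the fraction of evaluation items for which the attacker's guess matches the true private status. A minimal sketch, with a hypothetical four-item evaluation set:

```python
def attack_success_rate(predicted, actual):
    """Equation (11): share of items where the attacker's guess is correct."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for z_hat, z in zip(predicted, actual) if z_hat == z)
    return hits / len(actual)


# Hypothetical membership labels (1 = member, 0 = non-member).
true_status = [1, 0, 0, 1]
attacker_guess = [1, 0, 1, 1]
print(attack_success_rate(attacker_guess, true_status))
```

For a balanced membership-inference evaluation set, a value near 0.5 corresponds to random guessing, which is the natural reference point when reading the reported success rates.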

In addition, a structured ablation design was defined to test the necessity of each framework component. The principal ablations were removal of the blockchain layer, replacement of adaptive encryption with fixed encryption, removal of trust-aware weighting, modality-specific evaluation, removal of sensitivity-aware layer selection, and variation in smart-contract strictness. This ablation structure was included to ensure that the final interpretation of the system would rest on component-level evidence rather than on the performance of the combined model alone.

3.6. Evaluation Protocol and Statistical Analysis

The evaluation strategy was intentionally multidimensional. Predictive performance was assessed using accuracy, F1-score, AUROC, and AUPRC for classification tasks, and the Dice coefficient, IoU, and Hausdorff distance where segmentation was involved. Calibration was analyzed in terms of the Brier score and expected calibration error, and heterogeneity and fairness were measured in terms of worst-site performance, between-site variance, and macro–micro disparity. These metrics were chosen as healthcare FL systems must be evaluated in terms of robustness and consistency between sites, not just through pooled average accuracy [53].

System assessment involved the encryption time, decryption time, communication volume, per-round latency, blockchain confirmation latency, and metadata storage growth. This was necessary because the study treats the privacy–utility–latency trade-off as a first-order design issue rather than a secondary implementation detail. To summarize this balance analytically, the composite utility score was defined as in [51]:

$$J = \beta_{1} U_{pred} - \beta_{2} R_{priv} - \beta_{3} T_{round} \tag{12}$$

where $U_{pred}$ is the predictive utility, $R_{priv}$ is the privacy risk, $T_{round}$ is the end-to-end round latency, and $\beta_{1}$, $\beta_{2}$, and $\beta_{3}$ are weighting coefficients.
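A short sketch shows how Equation (12) trades the three terms against each other. The coefficients passed as `betas` are hypothetical, since their values are not stated here, and the predictive utility of 0.90 is a placeholder held equal for both methods so that only privacy risk and latency differ; the privacy-risk and latency inputs are the membership-inference success rates and relative round latencies reported for ATEB-AI and FedAvg.

```python
def composite_score(u_pred, r_priv, t_round, betas=(1.0, 1.0, 0.1)):
    """Equation (12): reward utility, penalize privacy risk and round latency."""
    b1, b2, b3 = betas
    return b1 * u_pred - b2 * r_priv - b3 * t_round


# u_pred = 0.90 is a placeholder shared by both methods (assumption).
j_ateb = composite_score(u_pred=0.90, r_priv=0.24, t_round=1.90)
j_fedavg = composite_score(u_pred=0.90, r_priv=0.71, t_round=1.00)
print(round(j_ateb, 2), round(j_fedavg, 2))
```

The ranking depends on the chosen coefficients: a sufficiently large latency weight would favor FedAvg, so the weights need to be reported alongside any value of $J$.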

4. Results

The findings are interpreted as a comparative benchmark package under the experimental setup described in Section 3, not as a clinical or large-scale validation. Even within this limited scope, the results are internally consistent and allow a direct comparison of the considered techniques in terms of predictive utility, the robustness of privacy guarantees, computational costs, and cross-site stability.

4.1. Comparative Predictive Performance

Across the four benchmarks, ATEB-AI achieved the strongest overall performance among the federated methods, remaining consistently close to the centralized reference. This pattern is shown in Figure 2 and summarized in Table 3.

Because the evaluation metrics differ by task, Value 1 in Table 3 denotes the score of Main Metric 1, whereas Value 2 denotes the score of Main Metric 2. Accordingly, these correspond to accuracy and F1-score for MIT-BIH, AUROC and F1-score for CHB-MIT, Dice and IoU for BraTS, and accuracy and F1-score for NIH.

For the signal datasets, ATEB-AI outperformed all federated baselines, reaching an F1-score of 0.937 on MIT-BIH and 0.952 on CHB-MIT. A similar trend was observed in the imaging tasks, where the framework achieved an IoU of 0.783 on BraTS and an F1-score of 0.821 on NIH, again ranking first among the federated configurations. These findings indicate that the proposed framework reduced the performance loss commonly associated with privacy-preserving federated learning. The relative F1 improvement of ATEB-AI on MIT-BIH, CHB-MIT, BraTS, and NIH against each baseline was as follows: ΔF1 (vs. FedAvg) = (+0.6, +0.7, +1.6, +1.1)%, ΔF1 (vs. DP-FL) = (+2.4, +2.6, +3.4, +2.7)%, ΔF1 (vs. HE-FL) = (+0.7, +0.8, +1.7, +1.2)%, ΔF1 (vs. BC-FL) = (+0.97, +1.17, +2.35, +1.74)%.

A second observation concerns calibration. On the classification datasets, ATEB-AI yielded lower calibration errors than the competing privacy-aware baselines and also improved upon FedAvg, with the calibration error decreasing from 0.034 to 0.028 on MIT-BIH, from 0.029 to 0.024 on CHB-MIT, and from 0.039 to 0.031 on NIH. Although these gains were moderate, they indicate that improved discrimination was not achieved at the expense of predictive reliability.

Taken together, Figure 2 and Table 3 show that ATEB-AI delivered the most favorable overall predictive profile among the federated methods across both biomedical signal and medical image tasks.

4.2. Privacy and Attack Resilience

The privacy and robustness results show a clearer separation between the proposed method and the conventional baselines. As summarized in Table 4, ATEB-AI yielded the lowest membership inference success rate (0.24), the lowest inversion leakage score (0.27), and the smallest poisoning-related utility drop (0.07) while also achieving the highest Byzantine resilience score (0.81). These values indicate that the joint use of adaptive encryption, trust-aware aggregation, and auditable coordination was more effective than applying privacy protection in a fixed or isolated form. In comparison, FedAvg was the weakest configuration, with a membership inference success rate of 0.71 and a poisoning drop of 0.18, while DP-FL improved privacy relative to FedAvg but was outperformed by the stronger encrypted and blockchain-assisted baselines.

The same pattern is reinforced by Figure 3, which plots retained utility against increasing attack intensity. Every method degraded under growing adversarial pressure, but the size of the decline differed markedly. FedAvg showed the sharpest decline, with retained utility falling from about 0.91 to around 0.67. DP-FL declined less sharply but still dropped significantly. HE-FL and BC-FL retained considerably more utility, but ATEB-AI had the most stable curve over the entire attack range, declining only from about 0.93 to 0.85. This shows that the proposed approach not only offers stronger privacy protection but also remains more stable under increasingly adverse conditions.

The adaptive nature of the privacy mechanism is demonstrated in Figure 4, which depicts the selected encryption strength across clients and rounds. The heatmap indicates that stronger encryption was concentrated on a small number of clients and on earlier communication rounds, whereas weaker protection proved adequate in later, lower-risk training states. This behavior matches the rationale of the framework: privacy protection must respond to sensitivity and trust conditions and should not be fixed for the entire training path. The six-round axis (R1–R6) of Figure 4 corresponds to the policy-scheduling and convergence schedule, whereas the predictive, privacy, and efficiency results reported in Tables 3–5 and Figures 2, 3, 5, and 6 are those of the warm-started benchmark package run over the fixed two-round budget (Section 3.3), evaluated on the held-out test split of Table 1, with the membership-inference, model-inversion, and poisoning/Byzantine attacks applied to the trained global model. Attack-success values for ATEB-AI and each baseline are as follows, with the relative reduction achieved by ATEB-AI against that baseline given in parentheses:

MIA_succ: ATEB-AI = 0.24; FedAvg = 0.71 (−66.2%), DP-FL = 0.45 (−46.7%), HE-FL = 0.32 (−25.0%), BC-FL = 0.39 (−38.5%).
Inv_leak: ATEB-AI = 0.27; FedAvg = 0.64 (−57.8%), DP-FL = 0.50 (−46.0%), HE-FL = 0.35 (−22.9%), BC-FL = 0.41 (−34.1%).
Poison_drop: ATEB-AI = 0.07; FedAvg = 0.18 (−61.1%), DP-FL = 0.15 (−53.3%), HE-FL = 0.16 (−56.3%), BC-FL = 0.13 (−46.2%).
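The relative reductions quoted above follow from the reported attack-success values as (ATEB-AI − baseline) / baseline. The sketch below recomputes them from those values; the dictionary layout and names are illustrative.

```python
def relative_change_pct(new, baseline):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Values as reported in the text: (ATEB-AI score, {baseline: score}).
ATTACKS = {
    "MIA success": (0.24, {"FedAvg": 0.71, "DP-FL": 0.45, "HE-FL": 0.32, "BC-FL": 0.39}),
    "Inversion leakage": (0.27, {"FedAvg": 0.64, "DP-FL": 0.50, "HE-FL": 0.35, "BC-FL": 0.41}),
    "Poisoning drop": (0.07, {"FedAvg": 0.18, "DP-FL": 0.15, "HE-FL": 0.16, "BC-FL": 0.13}),
}

for attack, (ateb, baselines) in ATTACKS.items():
    changes = ", ".join(
        f"{name}: {relative_change_pct(ateb, score):+.1f}%"
        for name, score in baselines.items()
    )
    print(f"{attack}: {changes}")
```

All of these are relative changes, so a reduction from 0.71 to 0.24 is a 66.2% drop in attack success, not a drop of 66.2 percentage points.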

4.3. Computational Overhead and Deployment Efficiency

The cost analysis confirms that the gains in privacy and robustness were not obtained without overhead, but the magnitude of the added burden remained more favorable for ATEB-AI than for the fixed cryptographic alternatives. As reported in Table 5, ATEB-AI required 4.0 s of encryption time and 1.4 s of decryption time per round, with a communication expansion of 3.3× and a round latency of 1.90× relative to FedAvg. These values were higher than those of FedAvg and DP-FL but substantially lower than those of HE-FL and BC-FL. In particular, HE-FL reached 2.85× round latency and BC-FL 3.50×, whereas ATEB-AI remained below both while still maintaining high audit completeness.

This trade-off is visualized more clearly in Figure 5, where normalized predictive utility is plotted against round latency. Among the protected federated methods, ATEB-AI occupies the most favorable region of the utility–latency space, combining the highest predictive utility with substantially lower latency than HE-FL and BC-FL. DP-FL remained computationally lighter, but its lower predictive utility placed it in an inferior trade-off region. In practical terms, the framework improved the efficiency of secure federated learning not by eliminating the privacy cost but by reducing it. Round latency (as a multiple of FedAvg) and communication expansion across all methods are as follows: L_round/L_FedAvg = 1.00 (FedAvg), 1.30 (DP-FL), 2.85 (HE-FL), 3.50 (BC-FL), and 1.90 (ATEB-AI); C_comm/C_FedAvg = 1.0 (FedAvg, DP-FL), 5.8 (HE-FL), 4.2 (BC-FL), and 3.3 (ATEB-AI). Therefore, against the privacy-aware baselines, ATEB-AI reduces round latency by 33.3% vs. HE-FL and 45.7% vs. BC-FL and reduces communication expansion by 43.1% vs. HE-FL and 21.4% vs. BC-FL. DP-FL is faster (1.30×) but offers materially weaker privacy than ATEB-AI on every attack class above, so the relevant trade-off is privacy strength at acceptable latency rather than latency alone. The worst-site F1 improvement of ATEB-AI over FedAvg was +3.05% (MIT-BIH), +2.43% (CHB-MIT), +2.68% (BraTS), and +2.72% (NIH), and the between-site F1 variance was reduced by 36.1–46.6% across the four datasets.
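As a check on the percentages quoted above, the relative reductions can be worked out directly from the reported ratios (round latency 1.90 vs. 2.85 and 3.50; communication expansion 3.3 vs. 5.8 and 4.2):

```latex
\begin{align*}
\text{Latency vs. HE-FL:} \quad & \frac{1.90 - 2.85}{2.85} = -\frac{0.95}{2.85} \approx -33.3\% \\
\text{Latency vs. BC-FL:} \quad & \frac{1.90 - 3.50}{3.50} = -\frac{1.60}{3.50} \approx -45.7\% \\
\text{Communication vs. HE-FL:} \quad & \frac{3.3 - 5.8}{5.8} = -\frac{2.5}{5.8} \approx -43.1\% \\
\text{Communication vs. BC-FL:} \quad & \frac{3.3 - 4.2}{4.2} = -\frac{0.9}{4.2} \approx -21.4\%
\end{align*}
```

Each figure is a relative change with respect to the named baseline, so the latency and communication reductions are not directly additive.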

Further detail is provided by the overhead composition in Figure 6. For ATEB-AI, training and communication made the largest contributions, while encryption and decryption added a visible but moderate overhead. In comparison, the cryptographic overhead in the stacked bars for HE-FL and BC-FL is notably larger, particularly in the encryption component. This breakdown supports the argument that adaptive protection is preferable to fixed heavy encryption when secure collaboration must be maintained without latency dominating the training procedure.

4.4. Ablation and Cross-Site Heterogeneity Findings

The ablation analysis shows that the complete ATEB-AI model produced the most balanced profile across the key evaluation dimensions. As presented in Figure 7 and Table 6, the full model had high utility (0.91), privacy (0.90), and fairness (0.88), together with reasonable efficiency. Removing the blockchain layer did not change utility or privacy significantly but decreased governance considerably, suggesting that the blockchain contributes to traceability rather than to predictive gain. Replacing adaptive encryption with fixed encryption largely preserved privacy but caused the largest latency degradation among the ablations. The clearest impact on fairness came from removing trust weighting, which decreased the fairness value from 0.88 to 0.74, supporting the claim that the trust-sensitive aggregation mechanism was the key factor in limiting site disparity. Removing the warm start (cold-start arm) at the fixed two-round budget reduced the global validation F1 relative to the warm-started configuration and left it short of its plateau, confirming that the two-round budget is a property of the warm-started regime and not a budget that is independently sufficient. With the warm start in place, F1 returned to within ±0.4% of its plateau by R2–R3. The short round budget therefore cannot be read independently of the warm start: the choice of T = 2 is tied to it.

The same conclusion holds for cross-site performance. Figure 8 and Table 7 show that, in all datasets, ATEB-AI increased the worst-site score in comparison with FedAvg. On MIT-BIH, worst-site performance increased from 0.884 to 0.911; on CHB-MIT, from 0.904 to 0.926; on BraTS, from 0.822 to 0.844; and on NIH, from 0.770 to 0.792. Meanwhile, site variance was reduced in all datasets, indicating narrower performance distributions across institutions.

This finding is important because, in multicenter medical AI, average performance may mask clinically significant weaknesses at underrepresented sites. The findings indicate that the trust-based framework not only improved the average federated outcome but also reduced the variability of the outcome across the participating clients.

Finally, Table 8 shows that the residual errors were clinically interpretable rather than random. In MIT-BIH, rare atrial and supraventricular beat patterns continued to be confused. In CHB-MIT, partially overlapping seizure windows remained hard to categorize. In BraTS, the main difficulty was separating tumor-absent slices from tiny-lesion slices, while in NIH, subtle abnormalities lacking strong localized contrast remained problematic. These failure modes indicate where future gains are most likely to come from: more local representations, better treatment of rare patterns, and more uncertainty-aware decision logic, rather than further increases in encryption strength.

5. Discussion

The findings demonstrate that the proposed ATEB-AI framework offered the most balanced overall performance among the compared federated methods. In both biomedical signal and medical image tasks, it remained close to the centralized reference while achieving higher predictive accuracy, attack resilience, and efficiency than the competing privacy-preserving baselines. This pattern, supported by the results presented in Figure 2, Table 3 and Table 4, implies that the combination of adaptive encryption, trust-aware aggregation, and auditable coordination is a more effective design than the application of fixed privacy mechanisms alone.

One of the core findings of this study is that privacy protection was strengthened without incurring the full overhead typical of strictly encrypted federated learning. The adaptive behavior depicted in Figure 4 reveals that the encryption strength is not imposed uniformly but varies with the conditions of clients and rounds. This contributes to the more favorable utility–latency curve in Figure 5 and the reduced cryptographic load compared with the fixed secure baselines in Figure 6. Methodologically, the findings are consistent with the premise that privacy in healthcare federated learning is a dynamic constraint rather than a pre-defined configuration.

The results also emphasize the need for trust-aware aggregation under heterogeneous participation. Together with the cross-site improvements in Table 7 and Figure 8, the more robust profile in Figure 3 indicates that weighting clients by a reliability factor led not only to more secure outcomes but also to greater fairness and stability across institutions. This is especially relevant when medical AI is deployed in multiple centers, where significant disparities between sites may be masked by average performance.

Another observation is related to the blockchain layer. The ablation results in Table 6 and Figure 7 show that this layer did not contribute significant predictive gain but did enhance auditability, provenance, and governance. This supports a modest and technically defensible role for blockchain in healthcare federated learning: rather than being viewed as a general performance solution, it should be regarded as a verifiable coordination infrastructure.

This study has several limitations. (i) Scalability: The framework was validated for N ∈ {5, 20} clients; federation sizes beyond N ≈ 100 would strain both the CKKS aggregator (the cost of the aggregation circuit grows almost linearly with the number of ciphertexts involved) and the Hyperledger Fabric orderer layer (Raft-based ordering is not recommended for consortia larger than a few dozen clients). Hierarchical aggregation and BFT-style ordering are needed for larger consortia. (ii) Clinical integration: The experiments were based on publicly released, de-identified healthcare datasets, so the framework was not examined under actual hospital data governance rules (DICOM/HL7 pipelines, IRB protocols, retention policies, GDPR/HIPAA data-access requirements, etc.). In this regard, the framework must remain compatible with the right to be forgotten, even though blockchain immutability conflicts with retroactive deletion. (iii) Real-time deployment: A latency of ≈1.9× that of FedAvg is tolerable for periodic updates of clinical models (at daily or weekly intervals), but the framework is not yet ready for real-time federated inference at the bedside. (iv) Dataset scale: The modest per-task cohorts (BraTS, NIH ChestXray, CHB-MIT) were used to provide a controlled comparison of the federated configurations, not to establish clinical-grade absolute accuracy; the results are qualified per task and cannot be generalized to the population level. The first item on the future work agenda is therefore to re-establish absolute performance on prospectively collected multi-hospital cohorts. (v) Warm-start scope: The warm-started scheme is appropriate here because the federated component is a low-dimensional head operating on locally pre-extracted features; for deeper, modality-specific end-to-end backbones, the cold-start behavior and the required round budget would change, which is left for follow-on work.

In the future, ATEB-AI will be extended in four directions. First, the framework needs to be tested in actual clinical, multi-hospital settings with prospectively collected ECG and MRI cohorts, with formal ethical approval and ongoing review by clinical teams. Second, the ATEB-AI mechanisms need to be integrated with hospital information systems (via standard FHIR/HL7 interfaces) to bring federated training into existing clinical workflows while respecting each patient’s consent status. Third, lightweight encryption settings and split-learning variants for encrypted computation should be considered for clients that are likely to run on hospital edge devices. Finally, further research is needed to improve regulatory readiness, including practical approaches to participant withdrawal, the removal of withdrawn contributions, the auditability of outcomes, and the governance of adaptive medical AI system updates.

6. Conclusions

This study introduced ATEB-AI, an adaptive, trust-aware encrypted federated learning framework with blockchain-based auditability, designed for multicenter biomedical signal and medical image analysis. The framework addresses a practical issue in privacy-preserving healthcare AI: how to enable collaborative learning between heterogeneous institutions without compromising confidentiality, robustness, efficiency, or governance. By combining a sensitivity-aware encryption mechanism, trust-based aggregation, and a permissioned blockchain (confined to provenance and audit services), the proposed approach offers a more balanced design than traditional privacy-preserving federated designs.

The findings showed that ATEB-AI had the best performance among the compared federated strategies on the benchmark tasks. It remained consistently close to the centralized reference, reduced inference-attack success and poisoning effects, improved cross-site stability, and incurred less overhead than the secure baselines. These findings suggest that adaptive protection and trust-aware coordination can improve the practical feasibility of secure federated learning in healthcare contexts.

The core value of this research is that privacy-aware medical AI cannot and must not be assessed solely based on metrics of accuracy or privacy; rather, it should be measured in terms of its balance with efficiency, equitability, and auditability. In this regard, ATEB-AI provides an operationally plausible and methodologically consistent model of multicenter biomedical AI. In the future, this design should be expanded to facilitate actual hospital implementation, larger cross-institutional partnerships, and more possible combinations of modalities to further test its clinical and technical generalizability.

Author Contributions

A.F.H. and A.Q.A.-N. contributed equally to the conception, methodology, analysis, writing, and revision of this manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was retrospective, and the data was sourced from public datasets. More information can be found in Section 3.

Informed Consent Statement

Patient consent was waived because the study was retrospective.

Data Availability Statement

The datasets used in this study are publicly available online from the original repositories referenced in the manuscript, including the MIT-BIH Arrhythmia Database, the CHB-MIT Scalp EEG Database, BraTS, and the NIH/Chest X-ray dataset.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
ATEB-AI	Adaptive Trust-Aware Encrypted Federated Artificial Intelligence
AUROC	Area Under the Receiver Operating Characteristic Curve
AUPRC	Area Under the Precision–Recall Curve
BC-FL	Blockchain-Assisted Federated Learning
BraTS	Brain Tumor Segmentation
CHB-MIT	Children’s Hospital Boston–Massachusetts Institute of Technology
CKKS	Cheon–Kim–Kim–Song
CNN	Convolutional Neural Network
CNN-BiLSTM	Convolutional Neural Network–Bidirectional Long Short-Term Memory
CT	Computed Tomography
DP	Differential Privacy
DP-FL	Differential Privacy Federated Learning
ECG	Electrocardiogram
EEG	Electroencephalogram
EHR	Electronic Health Record
FL	Federated Learning
F1	F1-Score
HE	Homomorphic Encryption
HE-FL	Homomorphic Encryption Federated Learning
IoMT	Internet of Medical Things
IoT	Internet of Things
IoU	Intersection over Union
MIA	Membership Inference Attack
MIT-BIH	Massachusetts Institute of Technology–Beth Israel Hospital
MLP	Multilayer Perceptron
MRI	Magnetic Resonance Imaging
NIH	National Institutes of Health
PFL	Personalized Federated Learning
U-Net	U-Shaped Network


Figure 1. General structure of the proposed ATEB-AI model. Local clinical nodes train a modality-specific model on decentralized biomedical signal and medical image data. Adaptive encryption protects sensitive parameter blocks, which are combined through trust-aware aggregation. A permissioned blockchain logs metadata, policy decisions, and provenance; no actual medical data is stored on it.
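The metadata-only logging described in the Figure 1 caption can be illustrated with a minimal sketch. The field names (`client_id`, `round`, `policy`, `trust_score`), the functions `make_audit_record` and `serialize`, and the JSON encoding are assumptions made for illustration and are not taken from the authors' implementation; only the idea of committing a SHA-256 digest plus metadata, rather than the update itself or any patient data, comes from the paper.

```python
import hashlib
import json


def make_audit_record(client_id: str, round_idx: int, policy: str,
                      trust_score: float, update_bytes: bytes) -> dict:
    """Build a metadata-only audit record for one client update.

    Only a SHA-256 digest of the (already encrypted) update is recorded;
    the update itself and any medical data stay off-chain.
    All field names are hypothetical.
    """
    return {
        "client_id": client_id,
        "round": round_idx,
        "policy": policy,  # label of the selected encryption policy
        "trust_score": round(trust_score, 4),
        "update_sha256": hashlib.sha256(update_bytes).hexdigest(),
    }


def serialize(record: dict) -> bytes:
    """Deterministic JSON encoding so identical records give identical bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    ciphertext = b"\x00" * 4096  # stand-in for an encrypted parameter block
    record = make_audit_record("site-3", 1, "strong", 0.87, ciphertext)
    payload = serialize(record)
    print(len(payload), "bytes in the audit payload")
```

In this sketch the serialized record is far smaller than the 4 KB stand-in ciphertext, which is consistent with the roughly 1 KB per-transaction payload listed in Table 2.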

Figure 2. Comparative predictive performance of centralized, standard federated, and privacy-preserving federated methods across MIT-BIH, CHB-MIT, BraTS, and NIH benchmarks. The proposed ATEB-AI framework achieves the best federated performance on all four datasets and remains consistently close to the centralized reference while also showing improved calibration on the classification tasks.

Figure 3. Remaining predictive utility under stronger attacks. ATEB-AI shows the least degradation under adversarial pressure, indicating greater resistance than the other federated learning methods and fixed-privacy baselines.

Figure 4. Policy-scheduling/convergence schedule over rounds R1–R6 for ATEB-AI under client-round adaptive encryption, shown over a six-round horizon rather than the two-round benchmark package in Table 2. Each cell encodes the selected encryption strength (policy $p_{i}^{t}$) for a given client and round; it does not represent predictive accuracy. The six-round horizon is used so that the shift from strong to weak encryption, as sensitivity and trust conditions become favorable, is visible.


Figure 5. Utility–latency trade-offs for the compared federated methods. ATEB-AI occupies the most favorable region among the approaches: it has the highest normalized predictive utility and significantly lower round latency than fixed HE-FL and blockchain-aided BC-FL.

Figure 6. Relative per-round time breakdown of the competing methods. Stacked bars separate training/communication, encryption, and decryption, showing that ATEB-AI incurs much less cryptographic overhead than HE-FL and BC-FL while maintaining secure coordination.

Figure 7. Ablation profiles of the ATEB-AI framework in terms of utility, privacy, fairness, and efficiency. The full model gives the most balanced overall profile, whereas removing individual components causes specific degradation: fairness drops when trust weighting is removed, and efficiency drops when adaptive encryption is replaced with fixed protection.

Figure 8. Comparison of the cross-site performance between FedAvg and ATEB-AI. Each dataset displays best-site, median-site, and worst-site results, indicating that worst-site results are uniformly better and inter-site disparity is lower with ATEB-AI.

Table 1. Overview of the data, task descriptions, client setup, and training–validation–test split employed in the multicenter federated trials.

Dataset	Task	Rows	Clients	Train	Val	Test	Classes	Representation
MIT-BIH	ECG arrhythmia classification	17,367	5	12,156	2605	2606	4	Beat-level engineered features
CHB-MIT	EEG seizure detection	640	5	448	96	96	2	Window-level seizure overlap features
BraTS	MRI slice tumor-present classification	364	5	254	54	56	2	Slice-level multimodal MRI features
NIH	Chest X-ray abnormality classification	1760	5	1232	264	264	2	Image-level intensity/edge features

Table 2. Preliminary configuration and test environment used for federated training, including backbone representation, optimization, privacy baselines, encryption design, and blockchain configuration.

Component	Value
Signal/image backbone	MLP using engineered features; MRI/X-ray are summarized as numeric features for target package
Optimizer	Adam with β1 = 0.9, β2 = 0.999, ε = 1 × 10⁻⁸, η = 1 × 10⁻³, B = 64, weight decay = 1 × 10⁻⁴
Learning rate	1 × 10⁻³
Rounds	2 (quick-mode target package)
Local epochs	1 (quick-mode target package)
Privacy baselines	FedAvg, DP-FL, HE-FL, BC-FL, ATEB-AI
Encryption settings	Fixed HE vs. adaptive selective HE
Blockchain scope	Metadata-only provenance and audit logs
Federated schedule	Number of rounds: T = 2, local epochs: E = 1, client participation rate = 100%, deterministic selection
Backbone	Three-layer MLP with hidden sizes [128, 64, 32], ReLU, dropout 0.3
Blockchain framework	Hyperledger Fabric v2.5; 5-organization channel; 1 peer per organization
Consensus protocol	Raft (etcdraft, 5 orderer nodes); endorsement ≥ 3/5 peers
Block size/batch	MaxMessageCount = 100; AbsoluteMaxBytes = 2 MB; PreferredMaxBytes = 512 KB
Block interval	BatchTimeout = 2 s
Transaction payload	~1 KB per transaction (SHA-256 hash + metadata only)
Throughput/latency	TPS_avg ≈ 110 (saturated-network ceiling from a Caliper v0.5 stress test, not the in-run load); L_commit^median ≈ 1.4 s; L_commit^p95 < 3 s. The actual run workload was ≈ 16 committed transactions of ≈ 1 KB each, so the run was latency-bound, not throughput-bound.
Adaptive encryption	N = 8192, coeff_modulus = [60, 40, 40, 60] bits, Δ = 2⁴⁰, λ = 128 bits
Measurement environment	Intel i7, 32 GB RAM, 10 Gbps virtual network
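
As a quick sanity check on the adaptive-encryption row of Table 2, the sketch below sums the coefficient-modulus chain and compares it with the bound on total modulus bits commonly quoted for 128-bit classical security. The bound table comes from the HomomorphicEncryption.org security standard and is brought in here as an assumption; it is not stated in the paper. Reading the chain in the SEAL style (first prime, intermediate rescaling primes, final special prime) is likewise an assumption, and the function name `check_ckks_params` is illustrative.

```python
from typing import Dict, Sequence

# Parameters as listed in Table 2.
POLY_MODULUS_DEGREE = 8192
COEFF_MODULUS_BITS = [60, 40, 40, 60]

# Assumption (not from the paper): maximum total coefficient-modulus bits
# for 128-bit classical security, per the HomomorphicEncryption.org standard.
MAX_LOGQ_128: Dict[int, int] = {
    1024: 27, 2048: 54, 4096: 109, 8192: 218, 16384: 438, 32768: 881,
}


def check_ckks_params(n: int, chain_bits: Sequence[int]) -> dict:
    """Summarize a CKKS parameter set given ring degree and modulus chain."""
    total_bits = sum(chain_bits)
    return {
        "total_bits": total_bits,
        "within_128_bit_bound": total_bits <= MAX_LOGQ_128[n],
        "slots": n // 2,  # CKKS packs N/2 complex values per ciphertext
        # SEAL-style reading: primes between the first and last are
        # consumed by rescaling, one per multiplication level.
        "rescale_levels": max(len(chain_bits) - 2, 0),
    }


if __name__ == "__main__":
    print(check_ckks_params(POLY_MODULUS_DEGREE, COEFF_MODULUS_BITS))
```

Under these assumptions the listed chain totals 200 bits, which sits below the 218-bit bound for N = 8192 and leaves two rescaling levels.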

Table 3. Main benchmark results across four datasets, reporting the principal discrimination metrics and calibration error for centralized, standard federated, and privacy-preserving federated methods.

Dataset	Method	Main Metric 1	Value 1	Main Metric 2	Value 2	Calibration Error
MIT-BIH	Centralized	Accuracy	0.965	F1	0.952	0.021
	FedAvg	Accuracy	0.946	F1	0.931	0.034
	DP-FL	Accuracy	0.938	F1	0.921	0.041
	HE-FL	Accuracy	0.941	F1	0.925	0.036
	BC-FL	Accuracy	0.944	F1	0.928	0.033
	ATEB-AI	Accuracy	0.951	F1	0.937	0.028
CHB-MIT	Centralized	AUROC	0.987	F1	0.964	0.018
	FedAvg	AUROC	0.975	F1	0.945	0.029
	DP-FL	AUROC	0.968	F1	0.936	0.036
	HE-FL	AUROC	0.971	F1	0.939	0.032
	BC-FL	AUROC	0.973	F1	0.941	0.030
	ATEB-AI	AUROC	0.980	F1	0.952	0.024
BraTS	Centralized	Dice	0.891	F1	0.812
	FedAvg	Dice	0.864	F1	0.771
	DP-FL	Dice	0.851	F1	0.754
	HE-FL	Dice	0.857	F1	0.762
	BC-FL	Dice	0.859	F1	0.765
	ATEB-AI	Dice	0.872	F1	0.783
NIH	Centralized	AUROC	0.901	F1	0.842	0.026
	FedAvg	AUROC	0.878	F1	0.812	0.039
	DP-FL	AUROC	0.866	F1	0.798	0.046
	HE-FL	AUROC	0.872	F1	0.804	0.042
	BC-FL	AUROC	0.874	F1	0.807	0.040
	ATEB-AI	AUROC	0.885	F1	0.821	0.031

Table 4. Privacy, inference-attack, and robustness outcomes for the competing methods, including membership inference success, inversion leakage, poisoning-induced utility loss, and Byzantine resilience.

Method	MIA Success	Inversion Leakage	Poisoning Drop	Byzantine Resilience
FedAvg	0.71	0.64	0.18	0.58
DP-FL	0.47	0.49	0.14	0.64
HE-FL	0.31	0.34	0.12	0.69
BC-FL	0.39	0.42	0.11	0.73
ATEB-AI	0.24	0.27	0.07	0.81

Table 5. Efficiency and deployment-cost comparison across federated methods, showing per-round encryption and decryption time, communication expansion, relative round latency, and audit completeness.

Method	Enc Time/Round (s)	Dec Time/Round (s)	Comm Expansion	Round Latency (× FedAvg)	Audit Completeness
FedAvg	0	0	1.0×	1.00×	Low
DP-FL	0	0	1.1×	1.18×	Low
HE-FL	7.4	2.1	5.8×	2.85×	Moderate
BC-FL	7.2	2.0	5.7×	3.50×	High
ATEB-AI	4.0	1.3	3.3×	1.90×	High
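
The headline percentage reductions reported in the abstract for privacy, robustness, and latency follow directly from the entries in Tables 4 and 5. The short calculation below reproduces them; the helper name `relative_reduction` and the dictionary layout are illustrative.

```python
def relative_reduction(baseline: float, value: float) -> float:
    """Percentage reduction of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline


# (baseline, ATEB-AI) pairs copied from Tables 4 and 5.
COMPARISONS = {
    "MIA success vs. FedAvg": (0.71, 0.24),
    "Inversion leakage vs. FedAvg": (0.64, 0.27),
    "Poisoning drop vs. FedAvg": (0.18, 0.07),
    "Round latency vs. HE-FL": (2.85, 1.90),
    "Round latency vs. BC-FL": (3.50, 1.90),
}

RESULTS = {
    name: round(relative_reduction(base, ours), 1)
    for name, (base, ours) in COMPARISONS.items()
}

if __name__ == "__main__":
    for name, pct in RESULTS.items():
        print(f"{name}: -{pct}%")
    # MIA success vs. FedAvg: -66.2%
    # Inversion leakage vs. FedAvg: -57.8%
    # Poisoning drop vs. FedAvg: -61.1%
    # Round latency vs. HE-FL: -33.3%
    # Round latency vs. BC-FL: -45.7%
```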

Table 6. Ablation study of ATEB-AI components across the utility, privacy, fairness, latency, and governance dimensions.

Variant	Utility	Privacy	Fairness	Latency	Governance
Full ATEB-AI	0.91	0.90	0.88	0.79	0.94
No blockchain	0.91	0.90	0.88	0.84	0.52
Fixed encryption	0.89	0.87	0.86	0.63	0.94
No trust weighting	0.88	0.90	0.74	0.80	0.94
No adaptive selection	0.89	0.86	0.84	0.68	0.94

Table 7. Cross-site fairness statistics for FedAvg and ATEB-AI, reporting best-site, median-site, and worst-site performance and the variance between sites.

Dataset	Method	Best-Site	Median-Site	Worst-Site	Site Variance
MIT-BIH	FedAvg	0.952	0.931	0.884	0.0061
MIT-BIH	ATEB-AI	0.956	0.937	0.911	0.0039
CHB-MIT	FedAvg	0.964	0.945	0.904	0.0058
CHB-MIT	ATEB-AI	0.972	0.952	0.926	0.0031
BraTS	FedAvg	0.879	0.864	0.822	0.0047
BraTS	ATEB-AI	0.886	0.872	0.844	0.0028
NIH	FedAvg	0.831	0.812	0.771	0.0052
NIH	ATEB-AI	0.844	0.821	0.792	0.0034
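
To make the cross-site comparison in Table 7 concrete, the sketch below derives, for each dataset, the worst-site gain of ATEB-AI over FedAvg, the best-to-worst gap of each method, and the relative change in site variance. The input values are copied from Table 7; the derived quantities and the function name `summarize` are illustrative and are not reported in the paper.

```python
# (best, median, worst, variance) per method, copied from Table 7.
TABLE7 = {
    "MIT-BIH": {"FedAvg": (0.952, 0.931, 0.884, 0.0061),
                "ATEB-AI": (0.956, 0.937, 0.911, 0.0039)},
    "CHB-MIT": {"FedAvg": (0.964, 0.945, 0.904, 0.0058),
                "ATEB-AI": (0.972, 0.952, 0.926, 0.0031)},
    "BraTS": {"FedAvg": (0.879, 0.864, 0.822, 0.0047),
              "ATEB-AI": (0.886, 0.872, 0.844, 0.0028)},
    "NIH": {"FedAvg": (0.831, 0.812, 0.771, 0.0052),
            "ATEB-AI": (0.844, 0.821, 0.792, 0.0034)},
}


def summarize(dataset: str) -> dict:
    """Derive simple cross-site disparity figures for one dataset."""
    f_best, _, f_worst, f_var = TABLE7[dataset]["FedAvg"]
    a_best, _, a_worst, a_var = TABLE7[dataset]["ATEB-AI"]
    return {
        "worst_site_gain": round(a_worst - f_worst, 3),
        "gap_fedavg": round(f_best - f_worst, 3),
        "gap_ateb_ai": round(a_best - a_worst, 3),
        "variance_reduction_pct": round(100.0 * (f_var - a_var) / f_var, 1),
    }


if __name__ == "__main__":
    for name in TABLE7:
        print(name, summarize(name))
```

For MIT-BIH, for example, this gives a worst-site gain of 0.027 and a best-to-worst gap that narrows from 0.068 to 0.045.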

Table 8. Dominant failure modes and error analysis for the benchmark datasets, identifying the residual signal/imaging patterns most commonly associated with misclassification or reduced model confidence.

Dataset	Failure Mode	Interpretation
MIT-BIH	Minority supraventricular class confusion	Rare atrial/supraventricular beats remain hardest under non-IID client splits.
CHB-MIT	Short pre-ictal windows near boundary	Windows with partial seizure overlap are less stable than full ictal windows.
BraTS	Small lesion/low-contrast slices	Tumor-absent vs. tiny-lesion slices remain the main failure mode.
NIH	Diffuse opacity vs. no-finding ambiguity	Subtle abnormal cases without strong localized evidence remain difficult.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hussein, A.F.; Al-Neami, A.Q. Adaptive Trust-Aware Encrypted Federated Artificial Intelligence with Blockchain Auditability for Multicenter Biomedical Signal and Medical Image Analysis. Informatics 2026, 13, 88. https://doi.org/10.3390/informatics13060088

