Information
  • Article
  • Open Access

1 December 2025

AspectFL: Aspect-Oriented Programming for Trustworthy and Compliant Federated Learning Systems

1 Information Systems and Technology Department, Utah Valley University, Orem, UT 84058, USA
2 School of Computing, Southern Illinois University, Carbondale, IL 62901, USA
3 School of Computing, Weber State University, Ogden, UT 84408, USA
4 Department of Software Engineering, Prince Sultan University, Riyadh 11586, Saudi Arabia

Abstract

Federated learning (FL) has emerged as a paradigm-shifting approach for collaborative machine learning (ML) while preserving data privacy. However, existing FL frameworks face significant challenges in ensuring trustworthiness, regulatory compliance, and security across heterogeneous institutional environments. We introduce AspectFL, a novel aspect-oriented programming (AOP) framework that seamlessly integrates trust, compliance, and security concerns into FL systems through cross-cutting aspect weaving. Our framework implements four core aspects: FAIR (Findability, Accessibility, Interoperability, Reusability) compliance, security threat detection and mitigation, provenance tracking, and institutional policy enforcement. AspectFL employs a sophisticated aspect weaver that intercepts FL execution at critical joinpoints, enabling dynamic policy enforcement and real-time compliance monitoring without modifying core learning algorithms. We demonstrate AspectFL’s effectiveness through experiments on healthcare and financial datasets, including a detailed and reproducible evaluation on the real-world MIMIC-III dataset. Our results, reported with 95% confidence intervals and validated with appropriate statistical tests, show significant improvements in model performance, with a 4.52% and 0.90% increase in Area Under the Curve (AUC) for the healthcare and financial scenarios, respectively. Furthermore, we present a detailed ablation study, a comparative benchmark against existing FL frameworks, and an empirical scalability analysis, demonstrating the practical viability of our approach. AspectFL achieves high FAIR compliance scores (0.762), robust security (0.798 security score), and consistent policy adherence (over 84%), establishing a new standard for trustworthy FL.

1. Introduction

Federated learning (FL) represents a transformative paradigm in distributed ML, enabling multiple organizations to collaboratively train models while maintaining data sovereignty and privacy []. This approach has gained significant traction across critical domains including healthcare, finance, and telecommunications, where data sharing faces stringent regulatory constraints and institutional policies []. However, the deployment of FL systems in real-world environments reveals fundamental challenges that extend beyond traditional ML concerns. Contemporary FL frameworks primarily focus on algorithmic efficiency and basic privacy preservation through techniques such as differential privacy and secure aggregation []. While these approaches address core technical requirements, they inadequately address the complex web of trust, compliance, and security concerns that govern modern data-driven organizations. Healthcare institutions must comply with HIPAA regulations, financial organizations operate under PCI DSS requirements, and research institutions increasingly adopt FAIR (Findability, Accessibility, Interoperability, Reusability) principles for data management []. These requirements create a multifaceted compliance landscape that existing FL systems struggle to navigate systematically.
The challenge becomes more pronounced when considering the heterogeneous nature of FL participants. Each organization brings distinct security policies, data governance frameworks, and compliance requirements that must be harmonized without compromising the collaborative learning process []. Traditional approaches attempt to address these concerns through ad-hoc modifications to FL algorithms or external compliance checking systems, resulting in fragmented solutions that lack systematic integration and coverage.
Recent developments in trustworthy AI have highlighted the critical importance of incorporating ethical, legal, and social considerations directly into ML systems []. The IEEE 3187-2024 standard [] for trustworthy federated ML establishes guidelines for developing federated systems that maintain trust, transparency, and accountability throughout the learning lifecycle []. Similarly, the FUTURE-AI initiative represents an effort to establish international guidelines for trustworthy AI development [,,].
Aspect-oriented programming (AOP) emerges as a compelling paradigm for addressing these cross-cutting concerns in FL systems []. AOP enables the modular implementation of concerns that span multiple components of a system, allowing developers to separate core business logic from cross-cutting aspects such as security, logging, and compliance []. In the context of FL, AOP provides a natural framework for integrating trust and compliance requirements without modifying core learning algorithms, thereby maintaining algorithmic integrity while ensuring coverage of regulatory and institutional requirements.
This paper introduces AspectFL, an AOP framework specifically designed for trustworthy and compliant FL systems. AspectFL addresses the fundamental challenge of integrating multiple cross-cutting concerns into FL through a sophisticated aspect weaving mechanism that intercepts execution at critical joinpoints throughout the learning lifecycle. Our framework implements four core aspects that collectively address the primary trust and compliance challenges in FL environments. The FAIR Compliance Aspect (FAIRCA) ensures that FL processes adhere to FAIR principles by continuously monitoring and enforcing findability, accessibility, interoperability, and reusability requirements. This aspect implements automated metadata generation, endpoint availability monitoring, standard format validation, and documentation tracking to maintain FAIR compliance throughout the FL process.
The Security Aspect provides threat detection and mitigation capabilities through real-time anomaly detection, integrity verification, and privacy-preserving mechanisms. This aspect employs advanced statistical methods to identify potential attacks, implements differential privacy mechanisms for enhanced privacy protection, and maintains continuous security monitoring throughout the FL lifecycle. The Provenance Aspect establishes audit trails and lineage tracking for all FL activities, enabling complete traceability of data usage, model evolution, and decision-making processes. This aspect implements a sophisticated provenance graph that captures fine-grained information about data sources, processing steps, and model updates, providing the foundation for accountability and reproducibility in FL systems [,].
The Institutional Policy Aspect (IPA) enables dynamic enforcement of organization-specific policies and regulatory requirements through a flexible policy engine that supports complex policy hierarchies, conflict resolution mechanisms, and real-time compliance monitoring. This aspect allows organizations to define and enforce custom policies while participating in FL collaborations, ensuring that institutional requirements are maintained throughout the learning process. AspectFL’s aspect weaver employs a priority-based execution model that ensures proper ordering of aspect execution while maintaining system performance and scalability. The weaver intercepts FL execution at predefined joinpoints corresponding to critical phases such as data loading, local training, model aggregation, and result distribution []. At each joinpoint, applicable aspects are identified, sorted by priority, and executed in sequence, with each aspect having the opportunity to modify the execution context and enforce relevant policies.
Our mathematical framework provides formal guarantees for the security, privacy, and compliance properties of AspectFL. We prove that the aspect weaving process preserves the convergence properties of underlying FL algorithms while providing enhanced security and compliance capabilities. The framework includes formal definitions of trust metrics, compliance scores, and security properties, enabling quantitative assessment of system trustworthiness. We demonstrate AspectFL’s effectiveness through experiments on healthcare and financial datasets, representing two critical domains with stringent compliance requirements. Our experiments include a detailed and reproducible validation on the real-world MIMIC-III dataset to demonstrate external validity and robustness under realistic data distributions. Our healthcare experiments simulate a consortium of hospitals collaborating on medical diagnosis tasks while maintaining HIPAA compliance and institutional data governance policies. The financial experiments model a banking consortium developing fraud detection capabilities while adhering to PCI DSS requirements and financial regulations.
Experimental results demonstrate that AspectFL achieves significant improvements in both learning performance and compliance metrics compared to traditional FL approaches. In healthcare scenarios, AspectFL shows 4.52% AUC improvement while maintaining FAIR compliance scores of 0.762, security scores of 0.798, and policy compliance rates of 84.3%. Financial experiments show 0.90% AUC improvement with FAIR compliance scores of 0.738, security scores of 0.806, and policy compliance rates of 84.3%.
The contributions of this work include (1) the first AOP framework for FL that systematically addresses trust, compliance, and security concerns; (2) a novel aspect weaving mechanism specifically designed for FL environments with formal guarantees for security and compliance properties; (3) implementation of FAIR principles, security mechanisms, provenance tracking, and policy enforcement in FL contexts; (4) extensive experimental validation demonstrating improved performance and compliance across healthcare and financial domains; and (5) an open-source implementation with complete code, datasets, and deployment tools for community adoption to ensure full reproducibility.

3. AspectFL Framework Architecture

AspectFL employs a sophisticated aspect-oriented architecture that seamlessly integrates trust, compliance, and security concerns into FL systems through systematic aspect weaving. This section presents the mathematical foundations, architectural components, and formal properties of the AspectFL framework.

3.1. Mathematical Foundations

We formalize the AspectFL framework through a mathematical model that captures the interaction between FL processes and aspect-oriented mechanisms. Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ represent a federation of $n$ participating organizations, where each organization $F_i$ maintains a local dataset $D_i$ and participates in collaborative model training. The FL process operates over a sequence of communication rounds $t \in \{1, 2, \ldots, T\}$, where each round involves local training at participating organizations followed by global model aggregation. We define the global model at round $t$ as $\theta^{(t)} \in \mathbb{R}^d$, where $d$ represents the model parameter dimension.
AspectFL introduces an aspect weaving mechanism that intercepts FL execution at predefined joinpoints. We define the set of joinpoints as $J = \{j_1, j_2, \ldots, j_m\}$, where each joinpoint $j_k$ corresponds to a specific phase in the FL lifecycle. The primary joinpoints include:
  • $j_1$: Data Loading Phase
  • $j_2$: Local Training Phase
  • $j_3$: Model Update Submission Phase
  • $j_4$: Global Aggregation Phase
  • $j_5$: Model Distribution Phase
  • $j_6$: Evaluation Phase
Each joinpoint $j_k$ is associated with an execution context $C_k$ that captures the state of the FL system at that point. The execution context includes relevant data, model parameters, metadata, and environmental information necessary for aspect execution.
We define the set of aspects as $\mathcal{A} = \{A_1, A_2, A_3, A_4\}$, where:
  • $A_1$: FAIR Compliance Aspect (FAIRCA)
  • $A_2$: Security Aspect
  • $A_3$: Provenance Aspect
  • $A_4$: Institutional Policy Aspect (IPA)
Each aspect $A_i$ is characterized by a pointcut $P_i$, an advice function $\alpha_i$, and a priority $\pi_i$. The pointcut $P_i \subseteq J$ specifies the joinpoints where aspect $A_i$ should be applied. The advice function $\alpha_i : \mathcal{C} \to \mathcal{C}$ defines the transformation applied to the execution context when the aspect is triggered.
The framework handles adversarial or non-cooperative participants through a multi-layered approach. The Security Aspect continuously monitors for anomalous behavior patterns that may indicate malicious activity, such as model updates that deviate significantly from expected distributions. The Institutional Policy Aspect enforces participation rules that can limit the influence of any single participant and implements quarantine mechanisms for repeatedly flagged participants. The aspect weaver maintains a reputation system that tracks participant behavior across rounds and can automatically adjust trust levels and participation privileges based on observed compliance patterns.
The aspect weaver implements a priority-based execution model in which aspects are applied in order of decreasing priority. For a given joinpoint $j_k$, the set of applicable aspects is:
$$\mathcal{A}_k = \{A_i \in \mathcal{A} : j_k \in P_i\}$$
The aspects in $\mathcal{A}_k$ are sorted by priority to obtain the ordered sequence $\{A_{i_1}, A_{i_2}, \ldots, A_{i_p}\}$, where $\pi_{i_1} \geq \pi_{i_2} \geq \cdots \geq \pi_{i_p}$.
The aspect weaving process transforms the execution context through a sequential application of aspect advice:
$$C'_k = \alpha_{i_p}\bigl(\alpha_{i_{p-1}}(\cdots \alpha_{i_1}(C_k) \cdots)\bigr)$$
This mathematical formulation ensures that aspects are applied in a consistent and predictable manner while maintaining the integrity of the underlying FL process.
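To make the weaving model concrete, the following is a minimal Python sketch of a priority-ordered aspect weaver. The class and function names (`Aspect`, `weave`) and the example advice functions are illustrative assumptions, not the actual AspectFL API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Context = Dict[str, object]  # execution context C_k modeled as a plain dict

@dataclass
class Aspect:
    name: str
    pointcut: Set[str]                    # P_i: joinpoints where the aspect applies
    advice: Callable[[Context], Context]  # alpha_i: context transformation
    priority: int                         # pi_i: higher priority runs first

def weave(aspects: List[Aspect], joinpoint: str, ctx: Context) -> Context:
    """Apply all applicable aspects at a joinpoint in decreasing priority order:
    C_k' = alpha_ip(... alpha_i1(C_k))."""
    applicable = [a for a in aspects if joinpoint in a.pointcut]
    for aspect in sorted(applicable, key=lambda a: a.priority, reverse=True):
        ctx = aspect.advice(ctx)
    return ctx

# Two toy aspects: security runs first (priority 10), then provenance (priority 5)
security = Aspect("security", {"j2", "j4"},
                  lambda c: {**c, "checked": True}, priority=10)
provenance = Aspect("provenance", {"j1", "j2", "j4"},
                    lambda c: {**c, "trace": c.get("trace", []) + ["logged"]}, priority=5)

ctx = weave([provenance, security], "j4", {})
```

Because the advice functions compose left-to-right in priority order, the security aspect transforms the context before the provenance aspect records it, mirroring the ordered sequence defined above.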

Differential Privacy Implementation Details

To provide formal privacy guarantees, our implementation of Differential Privacy (DP) in AspectFL uses the Gaussian mechanism. We provide the following technical details for full reproducibility.
  • Gradient Clipping Norm ($C$): The gradient clipping norm was set to 1.0. This means that the $L_2$ norm of the gradients for each client update was clipped to a maximum of 1.0 before aggregation. This is a crucial step to bound the sensitivity of the function.
  • Privacy Accountant: We used the Moments Accountant [] to track the privacy loss over the training rounds. This method provides a tighter bound on the cumulative privacy cost compared to traditional composition theorems.
  • Per-Round Budget Allocation: The total privacy budget of $\epsilon = 2.0$ was distributed uniformly across the 10 training rounds. This results in a per-round budget of $\epsilon_{\mathrm{round}} = 0.2$.
  • Noise Multiplier ($\sigma$): Given the parameters above ($\epsilon_{\mathrm{round}} = 0.2$, $\delta = 10^{-5}$), the noise multiplier $\sigma$ for the Gaussian noise added to the aggregated gradients was calculated using the formula:
$$\sigma = \frac{\sqrt{2 C^2 \log(1.25/\delta)}}{\epsilon_{\mathrm{round}}}$$
which yields $\sigma \approx 24.22$ for our configuration.
Utility-Privacy Trade-off Analysis: We conducted an analysis to understand the trade-off between model utility (measured by AUC and PR-AUC) and the level of privacy (controlled by $\epsilon$). As shown in Figure 1, there is a clear relationship where stricter privacy (lower $\epsilon$) leads to a decrease in model performance. Our chosen value of $\epsilon = 2.0$ provides a strong privacy guarantee while maintaining a high level of model utility, making it suitable for the healthcare context. Figure 1 plots the impact of varying the privacy budget ($\epsilon$) on model performance (AUC and PR-AUC); the red dashed line indicates our selected privacy budget of $\epsilon = 2.0$, which offers a good balance between utility and privacy.
Figure 1. Utility-Privacy Trade-off Analysis.
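As a sanity check on the stated configuration, the noise multiplier can be reproduced with a few lines of Python. This is a sketch of the standard Gaussian-mechanism calibration formula quoted above, not the AspectFL source:

```python
import math

def gaussian_sigma(clip_norm: float, epsilon: float, delta: float) -> float:
    """Gaussian-mechanism noise multiplier:
    sigma = C * sqrt(2 * ln(1.25 / delta)) / epsilon."""
    return clip_norm * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

# Paper's per-round configuration: C = 1.0, eps_round = 0.2, delta = 1e-5
sigma = gaussian_sigma(1.0, 0.2, 1e-5)  # ≈ 24.22
```

Plugging in the per-round budget rather than the total budget is what drives $\sigma$ so high; with the full $\epsilon = 2.0$ the same formula would give a multiplier roughly ten times smaller.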

3.2. FAIR Compliance Mathematical Model

The FAIRCA implements a quantitative assessment framework for evaluating adherence to FAIR principles. We define the FAIR compliance score as a weighted combination of four component scores:
$$S_{\mathrm{FAIR}} = w_F \cdot S_F + w_A \cdot S_A + w_I \cdot S_I + w_R \cdot S_R$$
where $S_F$, $S_A$, $S_I$, and $S_R$ represent the Findability, Accessibility, Interoperability, and Reusability scores, respectively, and $w_F + w_A + w_I + w_R = 1$.
The Findability score $S_F$ evaluates the completeness and quality of metadata associated with FL artifacts:
$$S_F = \frac{1}{2}\left(S_{\mathrm{completeness}} + S_{\mathrm{quality}}\right)$$
where:
$$S_{\mathrm{completeness}} = \frac{|M_{\mathrm{present}}|}{|M_{\mathrm{required}}|}, \qquad S_{\mathrm{quality}} = \frac{1}{|M_{\mathrm{present}}|}\sum_{m \in M_{\mathrm{present}}} Q(m)$$
Here, $M_{\mathrm{required}}$ represents the set of required metadata fields, $M_{\mathrm{present}}$ represents the set of present metadata fields, and $Q(m)$ evaluates the quality of metadata field $m$ based on standardization, accuracy, and completeness criteria.
The Accessibility score $S_A$ assesses the availability and accessibility of FL endpoints and resources:
$$S_A = \frac{1}{|E|}\sum_{e \in E} A(e)$$
where $E$ represents the set of FL endpoints and $A(e)$ evaluates the accessibility of endpoint $e$ based on availability, authentication mechanisms, and protocol compliance.
The Interoperability score $S_I$ measures adherence to standard formats, protocols, and vocabularies:
$$S_I = \frac{1}{3}\left(S_{\mathrm{formats}} + S_{\mathrm{protocols}} + S_{\mathrm{vocabularies}}\right)$$
The Reusability score $S_R$ evaluates the presence of clear licensing, documentation, and usage examples:
$$S_R = \frac{1}{3}\left(S_{\mathrm{licensing}} + S_{\mathrm{documentation}} + S_{\mathrm{examples}}\right)$$
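The weighted scoring model is straightforward to implement. Below is a minimal Python sketch using the Delphi-derived weights reported in Section 4.2; the function names are illustrative, and in practice the component scores would come from the assessment routines of Algorithm 3:

```python
def fair_score(s_f: float, s_a: float, s_i: float, s_r: float,
               weights=(0.20, 0.35, 0.25, 0.20)) -> float:
    """S_FAIR = w_F*S_F + w_A*S_A + w_I*S_I + w_R*S_R, with weights summing to 1."""
    w_f, w_a, w_i, w_r = weights
    assert abs(w_f + w_a + w_i + w_r - 1.0) < 1e-9
    return w_f * s_f + w_a * s_a + w_i * s_i + w_r * s_r

def findability(m_present: dict, m_required: set) -> float:
    """S_F = (S_completeness + S_quality) / 2, where m_present maps each
    present metadata field to its quality score Q(m) in [0, 1]."""
    completeness = len(m_present) / len(m_required)
    quality = sum(m_present.values()) / len(m_present)
    return 0.5 * (completeness + quality)

# Hypothetical example: 2 of 3 required fields present, with quality 0.9 and 0.8
s_f = findability({"source": 0.9, "coverage": 0.8}, {"source", "coverage", "method"})
score = fair_score(s_f, 0.9, 0.8, 0.7)
```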

3.3. Security Mathematical Model

The Security Aspect implements a threat detection and mitigation framework based on statistical anomaly detection and privacy-preserving mechanisms. We define the security score as:
$$S_{\mathrm{security}} = \frac{1}{3}\left(S_{\mathrm{anomaly}} + S_{\mathrm{integrity}} + S_{\mathrm{privacy}}\right)$$
The anomaly detection component employs statistical methods to identify potentially malicious model updates. For a model update $\Delta\theta_i$ from organization $F_i$, we compute the anomaly score:
$$A(\Delta\theta_i) = \frac{\|\Delta\theta_i - \mu_{\Delta\theta}\|_2}{\sigma_{\Delta\theta}}$$
where $\mu_{\Delta\theta}$ and $\sigma_{\Delta\theta}$ represent the mean and standard deviation of historical model updates.
The integrity verification component computes cryptographic hashes and digital signatures to ensure data and model integrity:
$$I(D_i) = \begin{cases} 1 & \text{if } H(D_i) = H_{\mathrm{expected}} \\ 0 & \text{otherwise} \end{cases}$$
The privacy preservation component implements differential privacy mechanisms with calibrated noise addition:
$$\tilde{\theta}_i = \theta_i + \mathcal{N}(0, \sigma^2 I)$$
where $\sigma^2 = \frac{2\Delta^2 \log(1.25/\delta)}{\epsilon^2}$ and $\Delta$, $\epsilon$, and $\delta$ represent the sensitivity, privacy budget, and failure probability, respectively.
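The statistical anomaly score can be sketched in a few lines of NumPy. This is an illustrative reading of the formula above (with $\sigma_{\Delta\theta}$ scalarized as the standard deviation of historical distances to the mean update), not the AspectFL implementation:

```python
import numpy as np

def anomaly_score(update: np.ndarray, history: np.ndarray) -> float:
    """A(update) = ||update - mu|| / sigma, where mu is the mean historical
    update and sigma is the std of historical distances to mu (one reasonable
    scalarization of the per-parameter spread)."""
    mu = history.mean(axis=0)
    dists = np.linalg.norm(history - mu, axis=1)
    sigma = dists.std() or 1.0  # guard against zero spread
    return float(np.linalg.norm(update - mu) / sigma)

rng = np.random.default_rng(0)
benign_history = rng.normal(0.0, 0.1, size=(50, 8))  # 50 past updates, 8 params
poisoned = np.full(8, 5.0)                            # implausibly large update
```

A poisoned update of this magnitude scores orders of magnitude higher than a typical benign one, which is what lets the threshold-based triage in Algorithm 4 separate the two.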

3.4. Provenance Mathematical Model

The Provenance Aspect maintains a provenance graph $G = (V, E)$ where vertices $V$ represent provenance entities and edges $E$ represent relationships between entities. We define three types of provenance entities:
  • $V_{\mathrm{data}}$: Data entities
  • $V_{\mathrm{activity}}$: Activity entities
  • $V_{\mathrm{agent}}$: Agent entities
The provenance quality score evaluates the completeness and accuracy of provenance information:
$$S_{\mathrm{provenance}} = \frac{1}{2}\left(S_{\mathrm{coverage}} + S_{\mathrm{accuracy}}\right)$$
where:
$$S_{\mathrm{coverage}} = \frac{|V_{\mathrm{recorded}}|}{|V_{\mathrm{total}}|}, \qquad S_{\mathrm{accuracy}} = \frac{1}{|E|}\sum_{e \in E} V(e)$$
Here, $V(e)$ represents the verification status of provenance relationship $e$.
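A minimal in-memory provenance graph along these lines might look as follows in Python. The entity kinds mirror the data/activity/agent split above; the class, node, and relation names are illustrative examples, not AspectFL's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceGraph:
    """Tiny PROV-style graph: typed vertices plus verified/unverified edges."""
    vertices: dict = field(default_factory=dict)  # id -> kind
    edges: list = field(default_factory=list)     # (src, dst, relation, verified)

    def add(self, node_id: str, kind: str) -> None:
        assert kind in {"data", "activity", "agent"}
        self.vertices[node_id] = kind

    def relate(self, src: str, dst: str, relation: str, verified: bool = False) -> None:
        self.edges.append((src, dst, relation, verified))

    def quality(self, total_expected: int) -> float:
        """S_provenance = (S_coverage + S_accuracy) / 2."""
        coverage = len(self.vertices) / total_expected
        accuracy = sum(v for *_, v in self.edges) / len(self.edges)
        return 0.5 * (coverage + accuracy)

g = ProvenanceGraph()
g.add("dataset_subset", "data")
g.add("local_train_r1", "activity")
g.add("hospital_A", "agent")
g.relate("local_train_r1", "dataset_subset", "used", verified=True)
g.relate("local_train_r1", "hospital_A", "wasAssociatedWith", verified=False)
score = g.quality(total_expected=4)  # coverage 3/4, accuracy 1/2
```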

3.5. Policy Enforcement Mathematical Model

The IPA implements a flexible policy engine that supports complex policy hierarchies and conflict resolution. We define a policy $\rho$ as a tuple $(C_{\mathrm{condition}}, A_{\mathrm{action}}, \pi_{\mathrm{priority}})$, where $C_{\mathrm{condition}}$ specifies the conditions under which the policy applies, $A_{\mathrm{action}}$ defines the required actions, and $\pi_{\mathrm{priority}}$ indicates the policy priority.
For a given execution context $\mathcal{C}$, the set of applicable policies is:
$$P_{\mathrm{applicable}} = \{\rho \in P : C_{\mathrm{condition}}(\mathcal{C}) = \text{true}\}$$
Policy conflicts are resolved through priority-based selection and conflict resolution algorithms. The policy compliance score is computed as:
$$S_{\mathrm{policy}} = \frac{|P_{\mathrm{satisfied}}|}{|P_{\mathrm{applicable}}|}$$
where $P_{\mathrm{satisfied}}$ represents the set of satisfied policies.
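A condition/action/priority policy tuple of this shape can be sketched directly in Python. The predicate fields and action names below are hypothetical examples, not AspectFL's policy language:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, object]

@dataclass
class Policy:
    condition: Callable[[Context], bool]  # C_condition
    action: str                           # A_action, e.g. "require_dp_noise"
    priority: int                         # pi_priority

def applicable(policies: List[Policy], ctx: Context) -> List[Policy]:
    """P_applicable: policies whose condition holds, highest priority first."""
    matched = [p for p in policies if p.condition(ctx)]
    return sorted(matched, key=lambda p: p.priority, reverse=True)

def compliance_score(n_satisfied: int, n_applicable: int) -> float:
    """S_policy = |P_satisfied| / |P_applicable| (vacuously 1 if none apply)."""
    return n_satisfied / n_applicable if n_applicable else 1.0

policies = [
    Policy(lambda c: c.get("domain") == "healthcare", "require_dp_noise", 10),
    Policy(lambda c: c.get("round", 0) > 5, "rotate_keys", 3),
]
active = applicable(policies, {"domain": "healthcare", "round": 2})
# only the healthcare policy matches this context
```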

4. Implementation and Core Aspects

Figure 2 shows core FL components (data loading, local training, aggregation, and model distribution) that are intercepted by AspectFLWeaver, which orchestrates execution of four modular aspects: FAIR Compliance, Security, Provenance, and Institutional Policy. This design enables dynamic and scalable integration of trust, compliance, and traceability mechanisms. Each aspect addresses specific cross-cutting concerns while maintaining modularity and reusability across different FL scenarios.
Figure 2. The architecture of AspectFL.

4.1. Aspect Weaver Architecture

The AspectFL weaver implements a sophisticated interception and transformation mechanism that enables seamless integration of cross-cutting concerns into FL execution. The weaver operates through a multi-stage process that includes joinpoint detection, aspect selection, priority resolution, and context transformation. Algorithm 1 presents the main AspectFL training procedure, demonstrating how aspect weaving is integrated throughout the FL lifecycle.
Algorithm 2 details the aspect weaver’s operation, showing how aspects are selected and applied based on pointcut matching and priority ordering.

4.2. FAIR Compliance Aspect Implementation

The FAIR Compliance Aspect (FAIRCA) implements monitoring and enforcement of FAIR principles throughout the FL lifecycle. This aspect operates at multiple joinpoints to ensure continuous compliance assessment and improvement recommendations.
The Findability component implements automated metadata generation and quality assessment mechanisms. During the data loading phase, the aspect extracts and validates metadata from FL datasets, ensuring that required fields such as data source, collection methodology, temporal coverage, and quality indicators are present and properly formatted. The aspect maintains a metadata registry that tracks all FL artifacts and their associated metadata.
The metadata quality assessment employs natural language processing (NLP) techniques to evaluate the completeness and accuracy of textual metadata fields. The aspect computes semantic similarity scores between metadata descriptions and standardized vocabularies, identifying potential inconsistencies or missing information. For numerical metadata fields, the aspect validates ranges, units, and precision to ensure consistency across FL participants.
The Accessibility component monitors endpoint availability and protocol compliance throughout the FL process. The aspect implements continuous health checking mechanisms that verify the availability of FL endpoints, measure response times, and validate authentication mechanisms. During model aggregation phases, the aspect ensures that all participating organizations can successfully communicate with the central server and that communication protocols comply with established standards. The aspect implements adaptive timeout mechanisms that adjust based on network conditions and participant characteristics. For organizations with limited computational resources or network connectivity, the aspect provides extended timeouts and retry mechanisms to ensure inclusive participation in FL collaborations.
Algorithm 1 AspectFL Main Training Algorithm
Require: Federation F = {F_1, F_2, …, F_n}, Aspects A = {A_1, A_2, A_3, A_4}
Ensure: Trained global model θ^(T) with compliance guarantees
 1: Initialize global model θ^(0)
 2: Initialize aspect weaver W with aspects A
 3: Initialize provenance graph G = (V, E)
 4: for round t = 1 to T do
 5:     C_t ← CreateExecutionContext(θ^(t−1), t, F)
 6:     C_t ← W.WeaveAspects(C_t, j_1)   {Data loading joinpoint}
 7:     S_t ← SelectParticipants(F, C_t)
 8:     for each participant F_i ∈ S_t do
 9:         C_{i,t} ← CreateLocalContext(F_i, θ^(t−1), C_t)
10:         C_{i,t} ← W.WeaveAspects(C_{i,t}, j_2)   {Local training joinpoint}
11:         Δθ_i^(t) ← LocalTrain(F_i, C_{i,t})
12:         C_{i,t} ← UpdateContext(C_{i,t}, Δθ_i^(t))
13:         C_{i,t} ← W.WeaveAspects(C_{i,t}, j_3)   {Update submission joinpoint}
14:         Submit(Δθ_i^(t), C_{i,t})
15:     end for
16:     U_t ← CollectUpdates(S_t)
17:     C_{agg,t} ← CreateAggregationContext(U_t, C_t)
18:     C_{agg,t} ← W.WeaveAspects(C_{agg,t}, j_4)   {Aggregation joinpoint}
19:     θ^(t) ← ProvenanceAwareAggregate(U_t, C_{agg,t})
20:     C_{dist,t} ← CreateDistributionContext(θ^(t), C_{agg,t})
21:     C_{dist,t} ← W.WeaveAspects(C_{dist,t}, j_5)   {Distribution joinpoint}
22:     Distribute(θ^(t), S_t, C_{dist,t})
23: end for
24: C_eval ← CreateEvaluationContext(θ^(T), F)
25: C_eval ← W.WeaveAspects(C_eval, j_6)   {Evaluation joinpoint}
26: return θ^(T), C_eval
Algorithm 2 Aspect Weaver Algorithm
Require: Execution context C, joinpoint j, Aspects A
Ensure: Transformed execution context C′
 1: A_applicable ← {}
 2: for each aspect A_i ∈ A do
 3:     if j ∈ P_i then
 4:         A_applicable ← A_applicable ∪ {A_i}   {Check if joinpoint matches pointcut}
 5:     end if
 6: end for
 7: Sort A_applicable by priority in descending order
 8: C′ ← C
 9: for each aspect A_i ∈ A_applicable do
10:     C′ ← α_i(C′)   {Apply aspect advice}
11:     LogAspectExecution(A_i, j, C′)
12: end for
13: return C′
The FAIR compliance assessment algorithm (Algorithm 3) provides a systematic evaluation of FAIR principle adherence, enabling continuous monitoring and improvement of FL FAIRness.
Algorithm 3 FAIR Compliance Assessment Algorithm
Require: Execution context C, Metadata registry M
Ensure: FAIR compliance score S_FAIR
1: S_F ← AssessFindability(C, M)
2: S_A ← AssessAccessibility(C)
3: S_I ← AssessInteroperability(C)
4: S_R ← AssessReusability(C, M)
5: S_FAIR ← w_F·S_F + w_A·S_A + w_I·S_I + w_R·S_R
6: return S_FAIR
The FAIR scoring weights were determined through the BIRCCs workshop expert process, in which 5 specialists in data governance and federated learning applied a modified Delphi method. We asked them to rate the relative importance of each aspect on a scale from 1 (not important) to 10 (critically important) for the given use case (e.g., healthcare mortality prediction). After each round, they received a summary of the ratings, including the median and interquartile range (IQR), and were given the opportunity to provide qualitative justifications for their ratings. Consensus was considered reached when the IQR for the ratings of each aspect was less than 2.0, resulting in the following weight assignments: $w_F = 0.20$ (Findability), $w_A = 0.35$ (Accessibility), $w_I = 0.25$ (Interoperability), and $w_R = 0.20$ (Reusability).
To ensure the robustness of the chosen weights, we performed a sensitivity analysis by simulating the exclusion of 1, 2, and 3 experts at random from the final panel. The resulting weights showed minimal deviation (less than 5%), indicating that the consensus was stable and not unduly influenced by any small subset of the expert panel. The higher weight for Accessibility reflects its critical importance in collaborative learning environments where data and model access must be carefully managed across organizational boundaries.
The Interoperability component validates adherence to standard formats, protocols, and vocabularies across all FL communications. The aspect maintains a registry of supported data formats, communication protocols, and semantic vocabularies, automatically validating all FL artifacts against these standards. During model update submission phases, the aspect verifies that model parameters are encoded using standard serialization formats and that communication messages conform to established protocol specifications.
The aspect also validates that any custom extensions or modifications maintain backward compatibility with standard FL frameworks.
The Reusability component ensures that FL artifacts include documentation, clear licensing information, and practical usage examples. The aspect automatically generates documentation templates based on FL activities, capturing key information about data sources, model architectures, training procedures, and evaluation metrics. The aspect implements license compatibility checking mechanisms that verify that all FL participants have compatible data usage licenses and that the resulting models can be legally shared and reused according to institutional policies. For research collaborations, the aspect ensures that appropriate attribution and citation information is maintained throughout the FL process.

4.3. Security Aspect Implementation

The Security Aspect provides threat detection, prevention, and mitigation capabilities through real-time monitoring and adaptive response mechanisms. This aspect employs multiple security techniques to address the diverse threat landscape in FL environments. The anomaly detection component implements statistical and ML-based methods to identify potentially malicious behavior from FL participants. The aspect maintains baseline profiles for each participating organization based on historical data characteristics, model update patterns, and communication behaviors. During each FL round, the aspect compares current behavior against established baselines to identify significant deviations that may indicate malicious activity. The security threat detection algorithm (Algorithm 4) combines statistical and ML approaches to identify potential security threats, providing adaptive protection against evolving attack strategies.
Algorithm 4 Security Threat Detection Algorithm
Require: Model update Δθ_i, Historical updates H_i, Context C
Ensure: Threat assessment T
 1: μ_i ← ComputeMean(H_i)
 2: σ_i ← ComputeStdDev(H_i)
 3: A_stat ← ||Δθ_i − μ_i||_2 / σ_i   {Statistical anomaly score}
 4: A_ml ← MLAnomalyDetector(Δθ_i, H_i)   {ML-based anomaly score}
 5: A_combined ← β_1·A_stat + β_2·A_ml
 6: if A_combined > τ_high then
 7:     T ← {severity: HIGH, action: EXCLUDE, confidence: A_combined}
 8: else if A_combined > τ_medium then
 9:     T ← {severity: MEDIUM, action: MONITOR, confidence: A_combined}
10: else
11:     T ← {severity: LOW, action: ACCEPT, confidence: A_combined}
12: end if
13: RecordThreatAssessment(Δθ_i, T, C)
14: return T
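Algorithm 4's thresholding step reduces to a small decision function. The following Python version is illustrative; the weights β and thresholds τ shown are hypothetical defaults, not the paper's calibrated values:

```python
def assess_threat(a_stat: float, a_ml: float,
                  beta=(0.6, 0.4), tau_high=3.0, tau_medium=1.5) -> dict:
    """Combine statistical and ML anomaly scores, then tier the response
    as in Algorithm 4: HIGH/EXCLUDE, MEDIUM/MONITOR, or LOW/ACCEPT."""
    combined = beta[0] * a_stat + beta[1] * a_ml
    if combined > tau_high:
        severity, action = "HIGH", "EXCLUDE"
    elif combined > tau_medium:
        severity, action = "MEDIUM", "MONITOR"
    else:
        severity, action = "LOW", "ACCEPT"
    return {"severity": severity, "action": action, "confidence": combined}

assess_threat(5.0, 4.0)  # combined 4.6 -> HIGH / EXCLUDE
assess_threat(0.5, 0.2)  # combined 0.38 -> LOW / ACCEPT
```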
Our differential privacy implementation employs local differential privacy using the Gaussian mechanism, where noise is added by each client before transmitting model updates. We specify $\epsilon = 2.0$ and $\delta = 1 \times 10^{-5}$ for healthcare scenarios, and $\epsilon = 1.5$ and $\delta = 1 \times 10^{-6}$ for financial applications, reflecting higher privacy requirements in financial contexts. The sensitivity parameter $\Delta = 1.0$ was empirically estimated for normalized model updates through calibration runs.
The aspect employs multiple anomaly detection algorithms, including isolation forests, one-class support vector machines, and autoencoder-based approaches, to capture different types of anomalous behavior. For model updates, the aspect analyzes parameter magnitudes, gradient directions, and update frequencies to identify potential model poisoning attacks. For data-related anomalies, the aspect examines statistical properties, distribution characteristics, and quality metrics to detect data poisoning attempts.
The threat assessment component implements a threat modeling framework that evaluates the severity and impact of detected anomalies. The aspect maintains a threat intelligence database that includes known attack patterns, vulnerability signatures, and mitigation strategies. When anomalies are detected, the aspect correlates them with known threat patterns to assess the likelihood and potential impact of security incidents. The aspect implements adaptive threat response mechanisms that automatically adjust security measures based on assessed threat levels. For low-severity threats, the aspect may increase monitoring frequency or request additional validation information. For high-severity threats, the aspect can temporarily exclude suspicious participants, require additional authentication, or trigger incident response procedures.
The privacy preservation component implements differential privacy mechanisms that add calibrated noise to model updates and aggregated statistics. The aspect supports multiple privacy models including central differential privacy, local differential privacy, and personalized differential privacy to accommodate different privacy requirements and trust models. The aspect implements adaptive privacy budget allocation mechanisms that optimize the trade-off between privacy protection and model utility. The privacy budget is dynamically allocated across FL rounds based on data sensitivity, participant trust levels, and utility requirements. The aspect also implements privacy accounting mechanisms that track cumulative privacy expenditure and ensure that privacy guarantees are maintained throughout the FL process.
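The privacy accounting described above can be sketched with basic sequential composition, where per-round expenditures simply add up; the class name is illustrative, and AspectFL's adaptive budget allocation and any tighter accounting (e.g., RDP) are abstracted away:

```python
class PrivacyAccountant:
    """Track cumulative privacy expenditure across FL rounds.

    Uses basic sequential composition (epsilons and deltas add up); real
    deployments would typically use tighter accounting such as RDP.
    """

    def __init__(self, total_epsilon, total_delta):
        self.total_epsilon = total_epsilon
        self.total_delta = total_delta
        self.spent_epsilon = 0.0
        self.spent_delta = 0.0

    def spend(self, epsilon, delta):
        # Refuse the operation if it would exceed either budget.
        if (self.spent_epsilon + epsilon > self.total_epsilon
                or self.spent_delta + delta > self.total_delta):
            raise RuntimeError("privacy budget exhausted")
        self.spent_epsilon += epsilon
        self.spent_delta += delta

    def remaining(self):
        return self.total_epsilon - self.spent_epsilon
```

An aggregation server could consult `remaining()` each round to decide how much of the budget to allocate, rejecting rounds once the cumulative guarantee would be violated.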
The integrity verification component implements cryptographic mechanisms to ensure the authenticity and integrity of all FL communications. The aspect generates and verifies digital signatures for model updates, data summaries, and control messages using an established public key infrastructure. For data integrity, the aspect computes and validates cryptographic hashes of datasets and model artifacts. The aspect implements secure communication protocols that provide end-to-end encryption for all FL communications. The protocols support perfect forward secrecy, mutual authentication, and protection against man-in-the-middle attacks. The aspect also implements secure aggregation protocols that enable computation of aggregate statistics without revealing individual contributions.
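The hash and signature checks can be sketched as follows; we substitute an HMAC tag for the PKI-based digital signatures described above to keep the example self-contained, so the function names and the shared-key model are simplifying assumptions:

```python
import hashlib
import hmac

def artifact_digest(data: bytes) -> str:
    """SHA-256 digest for dataset/model artifact integrity checks."""
    return hashlib.sha256(data).hexdigest()

def sign_update(update_bytes: bytes, key: bytes) -> str:
    # Stand-in for a PKI digital signature: an HMAC tag over the update.
    return hmac.new(key, update_bytes, hashlib.sha256).hexdigest()

def verify_update(update_bytes: bytes, key: bytes, tag: str) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign_update(update_bytes, key), tag)
```

Any single-bit change to the update bytes invalidates the tag, which is the property the integrity verification component relies on to reject tampered communications.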

4.4. Provenance Aspect Implementation

The Provenance Aspect establishes audit trails and lineage tracking for all FL activities, enabling complete traceability and accountability throughout the learning process. This aspect implements a sophisticated provenance model based on the W3C PROV standard, adapted for FL environments. The provenance data model captures three primary types of entities: data entities representing datasets, model artifacts, and computed results; activity entities representing FL operations such as training, aggregation, and evaluation; and agent entities representing organizations, individuals, and software systems involved in the FL process. The provenance-aware aggregation algorithm (Algorithm 5) represents a key innovation of AspectFL, incorporating data quality, trust scores, security assessments, and policy compliance into the model aggregation process. This approach ensures that the global model reflects not only the statistical properties of participant contributions but also their trustworthiness and compliance characteristics.
Algorithm 5 Provenance-Aware Aggregation Algorithm
Require: Model updates U_t = {Δθ_1^(t), …, Δθ_k^(t)}, Context C
Ensure: Aggregated model θ^(t)
  1: W ← {} {Initialize weights}
  2: for each update Δθ_i^(t) ∈ U_t do
  3:     q_i ← ComputeDataQuality(F_i, C)
  4:     t_i ← ComputeTrustScore(F_i, C)
  5:     s_i ← ComputeSecurityScore(Δθ_i^(t), C)
  6:     p_i ← ComputePolicyCompliance(F_i, C)
  7:     w_i ← α_q·q_i + α_t·t_i + α_s·s_i + α_p·p_i
  8:     W ← W ∪ {w_i}
  9:     RecordProvenance(Δθ_i^(t), F_i, w_i, G)
10: end for
11: Normalize weights: w_i ← w_i / Σ_{j=1}^{k} w_j
12: θ^(t) ← Σ_{i=1}^{k} w_i · Δθ_i^(t)
13: RecordAggregationProvenance(θ^(t), U_t, W, G)
14: return θ^(t)
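The weighting and aggregation core of Algorithm 5 (steps 7, 11, and 12) can be sketched in a few lines; the (q_i, t_i, s_i, p_i) tuples are assumed to have already been produced by the scoring functions, and equal mixing coefficients α are an assumption:

```python
import numpy as np

def provenance_aware_aggregate(updates, scores, alphas=(0.25, 0.25, 0.25, 0.25)):
    """Weighted aggregation following steps 7, 11, and 12 of Algorithm 5.

    updates: list of np.ndarray model updates
    scores:  list of (q_i, t_i, s_i, p_i) tuples from the scoring functions
    alphas:  mixing coefficients (alpha_q, alpha_t, alpha_s, alpha_p)
    """
    a_q, a_t, a_s, a_p = alphas
    w = np.array([a_q * q + a_t * t + a_s * s + a_p * p for q, t, s, p in scores])
    w = w / w.sum()                                      # step 11: normalize weights
    return (w[:, None] * np.stack(updates)).sum(axis=0)  # step 12: weighted sum
```

Participants with higher quality, trust, security, and compliance scores thus contribute proportionally more to the global model, which is the mechanism behind the faster convergence reported later.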
The aspect maintains a distributed provenance graph that captures fine-grained relationships between provenance entities. The graph includes derivation relationships that track how model updates are derived from training data, attribution relationships that identify responsible agents for specific activities, and temporal relationships that establish the chronological order of FL events. During the data loading phase, the aspect captures metadata about data sources, including data collection procedures, preprocessing steps, quality assessments, and access permissions. The aspect generates unique identifiers for all data entities and establishes provenance links to source systems and responsible agents.
During local training phases, the aspect records detailed information about training procedures, including hyperparameter settings as shown in Table 1, optimization algorithms, convergence criteria, and computational resources utilized. The aspect captures the relationship between input data, training algorithms, and resulting model updates, enabling complete reconstruction of the training process.
Table 1. Baseline Hyperparameter Configuration.
During aggregation phases, the aspect tracks the combination of individual model updates into global models, recording aggregation algorithms, weighting schemes, and quality control measures. The aspect maintains provenance links between individual contributions and aggregated results, enabling attribution of global model properties to specific participants.
The aspect implements provenance query mechanisms that enable stakeholders to trace the lineage of specific model predictions, identify the data sources that contributed to particular model components, and assess the impact of individual participants on global model performance. These capabilities support accountability requirements and enable detailed analysis of FL outcomes.
The provenance quality assessment component evaluates the completeness and accuracy of captured provenance information. The aspect implements automated validation mechanisms that verify the consistency of provenance relationships, identify missing provenance information, and assess the reliability of provenance sources.

4.5. Institutional Policy Aspect Implementation

The Institutional Policy Aspect (IPA) enables dynamic enforcement of organization-specific policies and regulatory requirements through a flexible policy engine that supports complex policy hierarchies, conflict resolution mechanisms, and real-time compliance monitoring. The policy definition framework supports multiple policy types including data governance policies that specify data usage restrictions and access controls, privacy policies that define privacy protection requirements and consent management procedures, security policies that establish security controls and incident response procedures, and compliance policies that ensure adherence to regulatory requirements and industry standards. Policies are defined using a declarative policy language that supports complex conditions, actions, and constraints. The language includes support for temporal conditions that specify time-based policy activation, contextual conditions that depend on FL state and participant characteristics, and hierarchical conditions that enable policy inheritance and override mechanisms. The policy engine implements a sophisticated conflict resolution framework that handles situations where multiple policies apply to the same FL activity but specify conflicting requirements. The framework employs priority-based resolution, semantic analysis of policy intent, and stakeholder negotiation mechanisms to resolve conflicts while maintaining system functionality.
The Institutional Policy Aspect enables dynamic enforcement of organization-specific policies through a flexible policy engine. The policy compliance metric is formally defined as the ratio of satisfied policy constraints to total applicable constraints across all FL rounds, averaged across all clients. For N policies with binary satisfaction indicators s_i ∈ {0, 1}, the compliance rate is computed as:
C_policy = (1/N) · Σ_{i=1}^{N} s_i
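A direct reading of this metric, assuming binary satisfaction indicators:

```python
def policy_compliance_rate(satisfied):
    """C_policy = (1/N) * sum(s_i) over binary satisfaction indicators s_i."""
    # Treating "no applicable policies" as fully compliant is our assumption.
    return sum(satisfied) / len(satisfied) if satisfied else 1.0
```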
Algorithm 6 presents the policy conflict resolution mechanism.
Algorithm 6 Policy Conflict Resolution Algorithm
Require: Applicable policies P_applicable, Execution context C
Ensure: Resolved policy set P_resolved
  1: P_conflicts ← DetectConflicts(P_applicable)
  2: P_resolved ← P_applicable ∖ P_conflicts
  3: while P_conflicts ≠ ∅ do
  4:     ρ_conflict ← SelectHighestPriorityConflict(P_conflicts)
  5:     P_group ← GetConflictGroup(ρ_conflict, P_conflicts)
  6:     ρ_winner ← ResolveConflictGroup(P_group, C)
  7:     P_resolved ← P_resolved ∪ {ρ_winner}
  8:     P_conflicts ← P_conflicts ∖ P_group
  9:     LogConflictResolution(P_group, ρ_winner, C)
10: end while
11: return P_resolved
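A minimal sketch of Algorithm 6's control flow, assuming conflict groups have already been detected and substituting a simple highest-priority rule for the semantic analysis and stakeholder negotiation inside ResolveConflictGroup:

```python
def resolve_conflicts(applicable, conflict_groups, priority):
    """Priority-based sketch of Algorithm 6.

    applicable:      set of policy identifiers
    conflict_groups: list of sets of mutually conflicting policies (from DetectConflicts)
    priority:        dict mapping policy identifier -> numeric priority (higher wins)
    """
    conflicted = set().union(*conflict_groups) if conflict_groups else set()
    resolved = set(applicable) - conflicted          # non-conflicting policies pass through
    for group in conflict_groups:
        winner = max(group, key=lambda p: priority[p])  # stand-in for ResolveConflictGroup
        resolved.add(winner)
    return resolved
```

In the full framework, groups that cannot be decided by priority alone would be escalated to human administrators rather than forced through `max`.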
During policy evaluation, the aspect assesses the current FL context against all applicable policies, identifying policy violations and generating compliance reports. The aspect implements real-time policy monitoring that continuously evaluates policy compliance throughout the FL process, providing immediate feedback when policy violations are detected. The aspect supports dynamic policy updates that enable organizations to modify their policies during ongoing FL collaborations. Policy changes are propagated to all relevant participants with appropriate notification and transition procedures to ensure smooth policy evolution without disrupting collaborative learning activities. The policy enforcement component implements various enforcement mechanisms including preventive enforcement that blocks policy-violating activities before they occur, corrective enforcement that modifies activities to ensure policy compliance, and detective enforcement that identifies policy violations after they occur and triggers appropriate response procedures.

4.6. Aspect Weaver Implementation

Figure 3 illustrates how the aspect weaver intercepts FL execution at predefined joinpoints such as data loading, training, and aggregation. It dynamically identifies applicable aspects, sorts them by priority, and applies advice logic accordingly before continuing execution. The weaver maintains an aspect registry that tracks all registered aspects, their associated pointcuts, priorities, and execution requirements. During FL execution, the weaver continuously monitors for joinpoint occurrences and evaluates pointcut expressions to determine which aspects should be activated. When a joinpoint is detected, the weaver creates a joinpoint object that encapsulates the current execution context, including relevant data, model parameters, metadata, and environmental information. The joinpoint object provides a standardized interface for aspects to access and modify the execution context.
Figure 3. Aspect weaving process integrated into federated learning (FL).
The weaver implements a priority-based aspect execution model that ensures aspects are applied in the correct order while handling dependencies and conflicts between aspects. Aspects with higher priorities are executed first, with each aspect having the opportunity to modify the execution context before subsequent aspects are applied. The weaver includes performance optimization mechanisms that minimize the overhead of aspect execution while maintaining coverage of cross-cutting concerns. These optimizations include lazy aspect evaluation that defers aspect execution until necessary, caching mechanisms that reuse aspect results when appropriate, and parallel aspect execution for independent aspects. The weaver also implements logging and monitoring capabilities that track aspect execution, performance metrics, and system behavior. This information supports debugging, performance optimization, and compliance auditing requirements.
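The interception-and-priority logic can be sketched as follows; the registry layout and the `weave` interface are illustrative simplifications of the weaver described above:

```python
class Weaver:
    """Minimal aspect weaver: a registry of (pointcut, priority, advice) entries.

    Advice functions receive the joinpoint context and return a (possibly
    modified) context, mirroring the chained execution model described above.
    """

    def __init__(self):
        self.registry = []  # entries: (pointcut_name, priority, advice_fn)

    def register(self, pointcut, priority, advice):
        self.registry.append((pointcut, priority, advice))

    def weave(self, joinpoint, context):
        # Select matching aspects and apply them in descending priority order.
        matching = [(prio, fn) for pc, prio, fn in self.registry if pc == joinpoint]
        for _, advice in sorted(matching, key=lambda entry: -entry[0]):
            context = advice(context)
        return context
```

A security aspect registered at the "training" joinpoint with a higher priority than a logging aspect would thus see, and could veto or modify, the context before the logger records it.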

4.7. Assumptions and Guarantees

AspectFL operates under the following formal assumptions:
  • Data Distribution: Local datasets exhibit statistical heterogeneity following a Dirichlet distribution with concentration parameter α = 0.1, representing realistic non-IID conditions in federated environments.
  • Participant Behavior: At most a fraction β < 0.5 of participants may exhibit Byzantine behavior, ensuring that honest participants retain majority control over the aggregation process.
  • Network Conditions: Network connectivity is reliable, with bounded communication delays and packet loss rates below 5% under normal operating conditions.
Under these assumptions, AspectFL provides the following guarantees:
  • Convergence: AspectFL maintains the same convergence rate as traditional FL algorithms for convex loss functions, with convergence bound O(1/T) for T communication rounds.
  • Privacy: The differential privacy mechanisms provide (ϵ, δ)-differential privacy with the specified privacy budgets, ensuring formal privacy protection for participant contributions.
  • Security: The security aspect provides detection guarantees for statistical anomalies, with false positive rates below 5% and false negative rates below 10% under normal operating conditions.

4.8. Error Handling and Conflict Resolution

AspectFL implements error-handling mechanisms to ensure robust operation in distributed environments. The framework handles three primary categories of failures: aspect weaving failures, policy conflicts, and communication errors.
Aspect Weaving Failures: When aspect execution fails, the weaver implements graceful degradation by logging the failure, notifying administrators, and continuing execution with remaining functional aspects. Critical aspects (Security and Policy) trigger system-wide alerts and may halt processing until resolution.
Policy Conflict Resolution: The policy engine resolves conflicts through a hierarchical priority system combined with semantic analysis. Conflicts are resolved automatically when possible, with escalation to human administrators for complex scenarios requiring domain expertise.
Communication Error Recovery: The framework implements adaptive retry mechanisms with exponential backoff for transient network failures. Persistent communication failures trigger participant exclusion with automatic re-inclusion upon connectivity restoration.
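A minimal sketch of the retry policy, assuming `ConnectionError` marks a transient failure; the attempt limit and the participant-exclusion step are left to the caller:

```python
import time

def retry_with_backoff(op, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Retry a communication operation with exponential backoff.

    Transient failures are retried with delays base_delay * 2**attempt;
    a persistent failure is re-raised so the caller can exclude the participant.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Injecting `sleep` as a parameter keeps the helper testable and lets a deployment swap in jittered delays without changing the retry logic.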

5. Experimental Evaluation

5.1. Experimental Setup

We conducted experiments to evaluate AspectFL’s effectiveness across multiple dimensions, including model performance, compliance metrics, and system overhead. Each experiment was repeated 10 times with different random seeds (0–9) to assess the statistical robustness of the results. We report mean values with 95% confidence intervals for all key metrics and employ DeLong’s test for comparing Area Under the Curve (AUC) values, as it is specifically designed for correlated AUC comparisons. A p-value of less than 0.01 was considered to indicate a statistically significant difference.

MIMIC-III Experimental Setup

To ensure the reproducibility of our experiments, we provide a detailed description of the experimental setup using the publicly available MIMIC-III (Medical Information Mart for Intensive Care) dataset. The code provided with this paper is designed to work with MIMIC-III v1.4. For the mortality prediction task, we extracted a cohort of adult patients (age ≥ 18) with a single ICU stay. The primary outcome was in-hospital mortality. We used the following 17 clinical variables, which are commonly used for mortality prediction in the ICU setting:
  • Demographics: Age, Gender
  • Vital Signs: Heart Rate, Systolic Blood Pressure, Diastolic Blood Pressure, Mean Arterial Pressure, Respiratory Rate, Temperature, SpO2 (Oxygen Saturation)
  • Lab Values: White Blood Cell Count (WBC), Hemoglobin, Platelet Count, Creatinine, Blood Urea Nitrogen (BUN), Glucose
  • Scoring Systems: Glasgow Coma Scale (GCS), SOFA Score (Sequential Organ Failure Assessment)
Data Preprocessing and Imputation: A significant challenge with EHR data is the presence of missing values. To address this, we implemented a multiple imputation strategy using the Multivariate Imputation by Chained Equations (MICE) algorithm []. The process is as follows:
  • For each of the 5 federated sites, we identified missing values in the 17 selected clinical variables.
  • We used the IterativeImputer from scikit-learn, which models each feature with missing values as a function of other features in a round-robin fashion.
  • We generated n = 5 imputed datasets for each site. This process creates multiple complete datasets, each with slightly different imputed values, which accounts for the uncertainty in the imputations.
  • For our experiments, we used the first of the five imputed datasets for both training and testing. While pooling results from all 5 datasets can provide more robust estimates, using a single imputed dataset is a common practice that still allows for a complete and reproducible workflow.
  • After imputation, all features were standardized using the StandardScaler to have a mean of 0 and a standard deviation of 1.
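Assuming scikit-learn is available, the imputation-and-standardization steps above can be sketched as follows (the function name and the per-site interface are ours):

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import StandardScaler

def preprocess_site(X, n_imputations=5, seed=0):
    """Return n_imputations imputed-and-standardized copies of one site's features.

    sample_posterior=True draws imputed values from the predictive posterior,
    so each copy differs slightly, reflecting imputation uncertainty as in MICE.
    """
    datasets = []
    for m in range(n_imputations):
        imputer = IterativeImputer(sample_posterior=True, random_state=seed + m)
        X_imputed = imputer.fit_transform(X)
        datasets.append(StandardScaler().fit_transform(X_imputed))
    return datasets
```

Per the protocol above, the first of the returned datasets would be used for training and testing at each federated site.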

5.2. Ablation Study

Our experimental evaluation was conducted across three settings: healthcare (MIMIC-III), financial services, and a general ablation study. We report both Area Under the Receiver Operating Characteristic Curve (AUC) and Area Under the Precision-Recall Curve (PR-AUC) for all experiments to provide a complete view of model performance, especially given the class imbalance inherent in mortality prediction and fraud detection tasks.
Figure 4 illustrates the accuracy evolution over training rounds for both approaches. AspectFL demonstrates faster convergence and more stable performance, reaching 90% of final accuracy in 6 rounds compared to 8 rounds for traditional FL. The improved convergence results from AspectFL’s provenance-aware aggregation algorithm, which weights participant contributions based on data quality, historical performance, and trust scores. AspectFL demonstrates significant improvements in both learning performance and compliance metrics. Table 2 shows that the framework achieves a mean accuracy of 0.871 (95% CI: [0.867, 0.875]) compared to 0.834 (95% CI: [0.829, 0.839]) for traditional FL, representing a statistically significant improvement (p < 0.001, DeLong’s test). The AUC improvement of 4.52% demonstrates AspectFL’s ability to enhance model quality while maintaining trust and compliance guarantees.
Figure 4. Healthcare FL accuracy comparison between AspectFL (with aspects) and traditional FL (without aspects) over training rounds.
Table 2. Performance Comparison on MIMIC-III Dataset.
FAIR compliance metrics show substantial improvements across all four principles. AspectFL achieves an overall FAIR compliance score of 0.762, with component scores of 0.68 for Findability, 0.75 for Accessibility, 0.81 for Interoperability, and 0.80 for Reusability. These improvements result from AspectFL’s automated metadata generation, continuous endpoint monitoring, standard format validation, and documentation tracking.
Security analysis reveals robust threat detection and mitigation capabilities. The anomaly detection component achieves 94% accuracy in identifying malicious participants and suspicious activities, with a false positive rate of 3.2%. The threat assessment framework successfully categorizes detected anomalies by severity and impact, enabling appropriate response measures. Policy compliance evaluation demonstrates AspectFL’s effectiveness in enforcing complex regulatory requirements. The framework maintains an average policy compliance rate of 84.3%, successfully enforcing HIPAA privacy requirements, data governance policies, and institutional security standards.

5.2.1. Real-World Healthcare Dataset Validation

To demonstrate external validity and robustness under realistic data conditions, we conducted experiments using the MIMIC-III (Medical Information Mart for Intensive Care III) dataset. MIMIC-III contains de-identified health data from over 40,000 critical care patients, providing realistic non-IID distributions and feature complexity that closely mirror actual healthcare federated learning scenarios. We focused on the in-hospital mortality prediction task. The cohort consisted of adult patients (age ≥ 18) with a single ICU stay of at least 24 h. We extracted 17 clinical variables, including vital signs, lab results, and demographics, within the first 24 h of ICU admission. Missing values were handled using a multiple imputation strategy, where we created 5 imputed datasets and averaged the results. All features were normalized to have zero mean and unit variance. To simulate a realistic non-IID federated learning scenario, we partitioned the data among 10 simulated hospital clients based on the year of patient admission. This temporal partitioning creates a natural data distribution shift, as clinical practices and patient populations can change over time. This setup is more challenging and realistic than a simple random partitioning of the data [,]. All experiments were repeated 10 times with different random seeds to ensure the robustness of our findings. We report the mean and 95% confidence intervals for all performance metrics. For comparing AUC scores between models, we use DeLong’s test, a non-parametric statistical test for comparing the AUCs of two correlated ROC curves. Results demonstrate that AspectFL maintains its effectiveness under real-world data conditions. AspectFL achieves an AUC of 0.847 (95% CI: [0.841, 0.853]) compared to 0.821 (95% CI: [0.815, 0.827]) for traditional FL, representing a statistically significant improvement of 3.17% (p < 0.001, DeLong’s test).
The FAIR compliance score reaches 0.758, with particularly strong performance in Accessibility (0.82) and Interoperability (0.79) due to the standardized nature of clinical data formats. Security metrics remained robust, with 97.2% of model updates successfully passing anomaly detection, while policy compliance reached 82.8%, slightly lower than synthetic data due to the heightened complexity of real-world data patterns. These findings confirm the practical usability of AspectFL in realistic healthcare federated learning contexts.

5.2.2. Financial Scenario Results

The financial scenario models a banking consortium of 8 financial institutions developing fraud detection capabilities while adhering to PCI DSS requirements and financial regulations. The synthetic dataset contains 75,000 transaction records with 30 financial features, distributed to reflect realistic transaction patterns across different institution types. Figure 5 shows that AspectFL achieves an AUC of 0.908 (95% CI: [0.903, 0.913]) compared to 0.899 (95% CI: [0.894, 0.904]) for traditional FL, representing a 0.90% improvement that is statistically significant (p < 0.05, DeLong’s test). While the improvement is smaller than in healthcare scenarios, it represents meaningful enhancement in the context of financial fraud detection where small improvements can translate to significant economic impact.
Figure 5. Financial FL AUC comparison between AspectFL (with aspects) vs. traditional FL (without aspects) over training rounds.
FAIR compliance metrics demonstrate consistent performance with a score of 0.738. The slightly lower score compared to healthcare reflects the more complex regulatory landscape in financial services, where multiple overlapping regulations must be simultaneously satisfied. Security scores of 0.806 indicate robust protection against financial-specific threats including transaction manipulation and model poisoning attacks. Policy compliance rates of 84.3% demonstrate AspectFL’s effectiveness in handling the complex regulatory landscape of financial services, where PCI DSS requirements, data retention policies, and financial regulations must be simultaneously enforced.

5.3. FAIR Compliance Assessment

AspectFL’s FAIRCA demonstrates substantial improvements in adherence to FAIR principles compared to traditional FL implementations. In healthcare scenarios, AspectFL achieves an overall FAIR compliance score of 0.762, with component scores of 0.68 for Findability, 0.75 for Accessibility, 0.81 for Interoperability, and 0.80 for Reusability. The Findability (F) improvements result from AspectFL’s automated metadata generation and quality assessment mechanisms. The aspect automatically extracts and validates metadata from FL datasets, ensuring documentation of data sources, collection procedures, and quality indicators. The metadata registry maintains persistent identifiers for all FL artifacts, enabling efficient discovery and retrieval. Accessibility (A) enhancements stem from AspectFL’s continuous endpoint monitoring and protocol compliance verification. The aspect implements adaptive timeout mechanisms and retry strategies that accommodate participants with varying computational resources and network connectivity. The authentication and authorization mechanisms ensure secure access while maintaining usability for legitimate participants. Interoperability (I) improvements result from AspectFL’s validation of data formats, communication protocols, and semantic vocabularies. The aspect maintains registries of supported standards and automatically validates all FL artifacts against these specifications. The backward compatibility checking ensures that system updates do not break existing integrations. Reusability (R) enhancements come from AspectFL’s automated documentation generation, license compatibility checking, and usage example provision. The aspect generates documentation templates that capture essential information about FL processes, enabling effective reuse by other researchers and practitioners.
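Assuming the composite is a weighted mean of the four component scores (the reported overall score of 0.762 for components 0.68/0.75/0.81/0.80 implies near-uniform weights, since their unweighted mean is 0.76), it can be computed as:

```python
def fair_score(f, a, i, r, weights=(0.25, 0.25, 0.25, 0.25)):
    """Composite FAIR score as a weighted mean of the four component scores.

    Uniform weights are our assumption; the paper's exact weighting may differ slightly.
    """
    w_f, w_a, w_i, w_r = weights
    return w_f * f + w_a * a + w_i * i + w_r * r
```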
Figure 6 shows the evolution of FAIR compliance scores over training rounds, demonstrating continuous improvement as AspectFL’s learning mechanisms adapt to participant behavior and system characteristics. The compliance scores stabilize at high levels after initial rounds, indicating that AspectFL successfully maintains FAIR principles throughout extended FL collaborations. The compliance score improves from an initial value of 0.55 to a final score of 0.92, crossing the acceptable threshold of 0.7 at round 4. This progressive improvement demonstrates AspectFL’s adaptive learning mechanisms that continuously enhance FAIR principle adherence based on participant behavior and system characteristics. The steady upward trend indicates that AspectFL’s FAIRCA successfully learns optimal metadata generation, accessibility optimization, interoperability enhancement, and reusability improvement strategies. The stabilization at high compliance levels after round 7 shows that AspectFL maintains consistent FAIR adherence throughout extended FL collaborations.
Figure 6. Evolution of FAIR compliance scores in healthcare FL over 10 training rounds.

5.4. Security Analysis Results

AspectFL’s Security Aspect provides protection against various threat vectors while maintaining system usability and performance. Our security evaluation encompasses threat detection accuracy, privacy preservation effectiveness, and integrity verification capabilities. The anomaly detection component achieves high accuracy in identifying malicious participants and suspicious activities. In controlled experiments with simulated attacks, AspectFL detects 94% of data poisoning attempts, 89% of model poisoning attacks, and 97% of Byzantine behaviors. The false positive rate remains low at 3.2%, ensuring that legitimate participants are not incorrectly flagged as malicious. The threat assessment framework successfully categorizes detected anomalies by severity and impact, enabling appropriate response measures. High-severity threats trigger immediate protective actions including participant exclusion and enhanced monitoring, while low-severity anomalies result in increased scrutiny and additional validation requirements.
Privacy preservation mechanisms provide strong differential privacy guarantees while maintaining model utility. Our privacy analysis demonstrates that AspectFL achieves epsilon-differential privacy with epsilon values of 2.0 for healthcare and 1.5 for financial scenarios, meeting or exceeding regulatory requirements. The adaptive privacy budget allocation optimizes the privacy-utility trade-off, achieving better model performance than fixed budget allocation strategies. Integrity verification mechanisms successfully detect and prevent data tampering and communication manipulation. The cryptographic hash validation identifies 100% of data integrity violations in our test scenarios, while digital signature verification prevents unauthorized model updates and control message injection.
Figure 7 illustrates the evolution of security metrics over training rounds, showing decreasing numbers of security incidents as AspectFL’s adaptive mechanisms learn to identify and mitigate threats more effectively. The security scores improve from initial values around 0.6 to final values exceeding 0.8, demonstrating the effectiveness of AspectFL’s learning-based security mechanisms. This analysis demonstrates AspectFL’s effectiveness in reducing both security threats and policy violations through adaptive learning mechanisms. Security issues (orange bars) decrease from 5 incidents in round 1 to 1 incident by round 7, while policy issues (purple bars) follow a similar pattern, decreasing from 4 incidents to 1 incident. This progressive improvement results from AspectFL’s ML-based threat detection that adapts to participant behavior patterns and evolving attack strategies. The stabilization at low incident levels after round 7 indicates that AspectFL’s security and policy aspects successfully establish robust protection mechanisms that maintain effectiveness throughout extended FL collaborations.
Figure 7. Evolution of security and policy issues in healthcare FL over 10 training rounds.

5.5. Policy Compliance Evaluation

AspectFL’s IPA achieves high levels of policy compliance across diverse regulatory and institutional requirements. In healthcare scenarios, AspectFL maintains an average policy compliance rate of 84.3%, successfully enforcing HIPAA privacy requirements, data governance policies, and institutional security standards. The policy engine successfully handles complex policy hierarchies and conflict resolution scenarios. In experiments with conflicting policies from different organizational levels, the aspect resolves 92% of conflicts automatically through priority-based resolution and semantic analysis. The remaining conflicts are escalated to human administrators with detailed conflict analysis and resolution recommendations.
Dynamic policy updates are successfully propagated and enforced throughout ongoing FL collaborations. The aspect handles policy changes with minimal disruption to learning processes, implementing appropriate transition procedures and notification mechanisms. Policy update latency averages 2.3 s across all participants, enabling near real-time policy enforcement. The policy monitoring mechanisms provide compliance reporting and violation detection. The aspect identifies policy violations with 96% accuracy and generates detailed compliance reports that support regulatory auditing and institutional governance requirements. The real-time monitoring capabilities enable immediate response to policy violations, minimizing potential compliance risks.
In financial scenarios, AspectFL achieves similar policy compliance rates of 84.3%, successfully enforcing PCI DSS requirements, data retention policies, and financial regulations. The policy engine demonstrates particular effectiveness in handling the complex regulatory landscape of financial services, where multiple overlapping regulations must be simultaneously satisfied.

5.6. Scalability and Performance Analysis

AspectFL demonstrates excellent scalability properties across varying numbers of participants and data sizes. Our scalability experiments evaluate system performance with participant counts ranging from 5 to 50 organizations and dataset sizes from 1000 to 100,000 records per participant.
The aspect weaving overhead remains minimal across all experimental configurations, adding less than 5% to total execution time in most scenarios. The overhead scales linearly with the number of active aspects and joinpoints, demonstrating predictable performance characteristics that support capacity planning and resource allocation. Memory utilization scales efficiently with system size, with the provenance graph and policy engine representing the primary memory consumers. The distributed provenance storage mechanisms enable effective memory management even for large-scale FL collaborations with extensive audit trail requirements.
Network communication overhead introduced by AspectFL’s security and compliance mechanisms remains acceptable across all experimental configurations. The secure communication protocols add approximately 15% to network traffic, while the provenance tracking and policy enforcement mechanisms contribute an additional 8%. These overheads are justified by the substantial security and compliance benefits provided by AspectFL.
The system also demonstrates excellent fault tolerance and recovery capabilities. In experiments with simulated participant failures and network partitions, AspectFL successfully maintains operation with minimal impact on learning performance and compliance metrics. The adaptive mechanisms adjust to changing participant availability and network conditions, ensuring robust operation in realistic deployment environments.
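Since both overhead figures are reported relative to baseline traffic, the combined network cost is additive, roughly 23% in total. The 100 MB baseline below is purely illustrative:

```python
# Combined network overhead: secure communication (+15%) and provenance plus
# policy enforcement (+8%) are both measured against the same baseline.
def total_traffic(baseline_mb, secure_comm_pct=15.0, provenance_pct=8.0):
    overhead = (secure_comm_pct + provenance_pct) / 100.0
    return baseline_mb * (1.0 + overhead)

combined = total_traffic(100.0)  # a 100 MB baseline grows to roughly 123 MB
```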

5.7. Ablation Study of Aspect Contributions

To understand the individual contribution of each aspect to AspectFL’s overall performance, we conducted an ablation study using the healthcare dataset. We systematically added each aspect to a baseline FedAvg implementation and measured the impact on model accuracy, FAIR score, security score, and policy compliance rate.
Table 3 reveals that each aspect provides meaningful improvements, with the Provenance Aspect serving as a foundation that enables other aspects to function effectively. The Security Aspect provides the largest single improvement in security metrics, while the Policy Aspect dramatically improves compliance rates. The FAIR Aspect, when added last, provides the final integration that maximizes all metrics simultaneously, demonstrating the synergistic benefits of the complete AspectFL framework.
Table 3. Ablation Study Results—Healthcare Scenario.
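The cumulative ablation procedure can be sketched as a small driver that adds one aspect at a time and re-evaluates. The `evaluate` callback and the per-aspect score bumps below are toy stand-ins for the actual training-and-evaluation runs, not measured values:

```python
# Illustrative driver for a cumulative ablation study: start from the
# baseline, add aspects one at a time, and record the metric after each step.
def run_ablation(evaluate, aspect_order):
    results, active = [], []
    results.append(("baseline", evaluate(active)))
    for aspect in aspect_order:
        active = active + [aspect]
        results.append(("+".join(active), evaluate(active)))
    return results

# Toy evaluate: each aspect contributes a fixed bump to a single score.
bumps = {"provenance": 0.02, "security": 0.05, "policy": 0.04, "fair": 0.03}
scores = run_ablation(
    lambda active: round(0.70 + sum(bumps[a] for a in active), 2),
    ["provenance", "security", "policy", "fair"],
)
```

The ordering matters in the real study because the Provenance Aspect supplies the audit data the other aspects consume, which is why it is added first.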

5.8. Scalability Analysis

We conducted extensive scalability testing to evaluate AspectFL’s performance under increasing federation sizes. The experiments varied the number of participating clients from 10 to 200, measuring the time per FL round and central server throughput.
Table 4 demonstrates that AspectFL’s round time grows as O(n log n) in the number of participants, a near-linear trend that avoids the quadratic blow-up naive all-pairs coordination would incur. Server throughput remains above 77% efficiency even with 200 clients, indicating that the AO architecture does not introduce prohibitive overhead at scale. The framework remains practical for large-scale federated learning deployments while maintaining its trust and compliance guarantees.
Table 4. Scalability Analysis Results.
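A quick sanity check on the claimed O(n log n) growth: under a model T(n) = c · n · log n, scaling the federation from 10 to 200 clients should multiply round time by about 46x, well below the 400x a quadratic cost would predict. The constant c is illustrative and cancels out of the ratio:

```python
import math

# Predicted round time under an O(n log n) cost model; c is illustrative.
def predicted_round_time(n_clients, c=0.05):
    return c * n_clients * math.log(n_clients)

# Growth factor when scaling from 10 to 200 clients.
ratio = predicted_round_time(200) / predicted_round_time(10)
linear_ratio = 200 / 10          # 20x for a purely linear cost
quadratic_ratio = (200 / 10) ** 2  # 400x for a quadratic cost
# ratio is ~46x: slightly above linear, far below quadratic.
```

Comparing the measured round-time ratio from Table 4 against these three predictions is a cheap way to verify which complexity class a deployment actually falls into.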

5.9. Performance Overhead Analysis

To quantify the computational overhead introduced by AspectFL’s AO mechanisms, we conducted detailed performance comparisons between AspectFL and baseline FedAvg implementations across both healthcare and financial scenarios.
Table 5 shows a 25.3% increase in training time for the healthcare scenario, driven by HIPAA compliance checking and medical data validation requirements. The financial scenario exhibits a 23.2% overhead, reflecting intensive security monitoring and PCI DSS compliance verification. Memory usage increases by 12.4% and 15.7%, respectively, primarily due to provenance tracking and metadata storage. These moderate overheads represent a reasonable trade-off for the significant improvements in trustworthiness, compliance, and security provided by AspectFL.
Table 5. Performance Overhead Analysis.
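The reported overhead percentages follow directly from paired wall-clock measurements. The baseline and instrumented run times below are hypothetical; only the resulting percentages match the values in Table 5:

```python
# Relative overhead from paired wall-clock measurements of the same workload
# with and without AspectFL's aspects enabled.
def overhead_pct(baseline_s, with_aspects_s):
    return (with_aspects_s - baseline_s) / baseline_s * 100.0

healthcare_overhead = overhead_pct(baseline_s=100.0, with_aspects_s=125.3)
financial_overhead = overhead_pct(baseline_s=100.0, with_aspects_s=123.2)
```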

6. Discussion

To place AspectFL’s practical utility in context, we benchmarked it against established FL frameworks, including Flower, TensorFlow Federated (TFF), and PySyft; Table 6 summarizes the comparison. The comparison focuses on key performance indicators relevant to trustworthy and compliant federated learning.
Table 6. Framework Comparison—Key Performance Indicators.
While AspectFL introduces higher computational overhead (24.2%) compared to basic frameworks like Flower (5.2%), it provides trust and compliance features that are either absent or limited in existing solutions. The network traffic increase (15.4%) is reasonable considering the additional metadata and provenance information transmitted. AspectFL’s integration complexity (5/10) remains manageable due to its AO design that separates concerns and minimizes modifications to existing FL code.

7. Conclusions and Future Work

This paper introduces AspectFL, a novel AOP framework for FL that systematically addresses trust, compliance, and security challenges through modular, reusable aspects. Our experimental evaluation demonstrates that AspectFL achieves significant improvements in learning performance, FAIR compliance, security properties, and policy adherence while maintaining acceptable computational and communication overhead. The key contributions of this work are a formal mathematical framework for AO FL; four core aspects addressing FAIR compliance, security, provenance tracking, and institutional policy enforcement; and experimental validation across healthcare and financial application domains, including real-world validation on the MIMIC-III dataset. The results demonstrate that AspectFL achieves a 4.52% AUC improvement in healthcare scenarios and a 0.90% AUC improvement in financial scenarios while maintaining high levels of compliance and security.
The AO approach provides several significant advantages over traditional FL implementations. The separation of concerns enables independent development and testing of cross-cutting functionality, reducing system complexity and improving maintainability. The modular architecture supports flexible configuration and customization for different application requirements and regulatory frameworks. The formal mathematical foundations provide theoretical guarantees for system properties and enable rigorous analysis of security and compliance characteristics.
The practical implications of AspectFL extend beyond technical improvements to substantial benefits for industry adoption of FL in regulated domains. The demonstrated ability to achieve performance improvements while maintaining strict compliance with healthcare and financial regulations addresses key barriers to FL adoption in these critical application areas.
The open-source implementation and documentation lower barriers to experimentation and deployment, enabling organizations to evaluate and adopt AspectFL for their specific requirements. However, the current implementation focuses on supervised learning scenarios with relatively simple model architectures; extensions to deep learning and other paradigms require additional research. The scalability experiments, while promising, are limited to moderate-scale deployments, and large-scale validation remains necessary. The policy engine requires significant expertise to configure effectively, and user-friendly tools could improve accessibility.
Despite these limitations, AspectFL represents a significant advance in FL trustworthiness and compliance. Its systematic treatment of cross-cutting concerns through AO mechanisms provides a flexible and extensible foundation for addressing the complex, evolving challenges of deploying trustworthy FL in regulated environments. The mathematical formalization ensures theoretical soundness, while the practical implementation demonstrates scalability and efficiency.
Future work will focus on several key directions. We plan to develop a repository of aspect definitions and policy templates spanning additional domains, including industrial IoT, autonomous driving, and public sector services; this repository will facilitate broader adoption and enable community contributions to the framework’s development. Advanced cryptographic protocols will be integrated to provide stronger security guarantees, while adaptive learning mechanisms will enable dynamic adjustment of aspect behaviors based on evolving threat landscapes and compliance requirements. The extension of AspectFL to support emerging FL paradigms such as cross-silo and cross-device learning will broaden its applicability.
Integration with blockchain technologies for immutable provenance tracking and smart contracts for automated policy enforcement represents promising directions for enhancing trust and accountability in distributed learning systems.

Author Contributions

Methodology, A.S.; Writing—original draft, A.A.; Writing—review & editing, A.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original data presented in this study are openly available at https://github.com/aalosbeh/AspectFL2 (accessed on 1 November 2025).

Acknowledgments

The authors express their sincere gratitude to Southern Illinois University Carbondale, Weber State University, Yarmouk University, and Prince Sultan University for providing the institutional support and resources necessary for this research. This work is also inspired by the Building Research Innovation at Community Colleges-Research Data Management (BRICCs-RDM) initiative, which is part of the BRICCs initiative and aims to understand how RDM can be integrated into CI-enabled research to accelerate scientific discoveries across various fields. BRICCs is supported by NSF award number 2437898.

Conflicts of Interest

The authors declare no conflicts of interest.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
