Article

Framework for Addressing Imbalanced Data in Aviation with Federated Learning

Engineering Faculty, Transport and Telecommunication Institute, Lauvas 2, LV-1019 Riga, Latvia
Information 2025, 16(2), 147; https://doi.org/10.3390/info16020147
Submission received: 23 January 2025 / Revised: 9 February 2025 / Accepted: 14 February 2025 / Published: 16 February 2025

Abstract

The aviation industry generates vast amounts of data across multiple stakeholders, but critical faults and anomalies occur rarely, creating inherently imbalanced datasets that complicate machine learning applications. Traditional centralized approaches are further constrained by privacy concerns and regulatory requirements that limit data sharing among stakeholders. This paper presents a novel framework for addressing imbalanced data challenges in aviation through federated learning, focusing on fault detection, predictive maintenance, and safety management. The proposed framework combines specialized techniques for handling imbalanced data with privacy-preserving federated learning to enable effective collaboration while maintaining data security. The framework incorporates local resampling methods, cost-sensitive learning, and weighted aggregation mechanisms to improve minority class detection performance. The framework is validated through extensive experiments involving multiple aviation stakeholders, demonstrating a 23% improvement in fault detection accuracy and a 17% reduction in remaining useful life prediction error compared to conventional models. Results show the enhanced detection of rare but critical faults, improved maintenance scheduling accuracy, and effective risk assessment across distributed aviation datasets. The proposed framework provides a scalable and practical solution for using distributed aviation data while addressing both class imbalance and privacy concerns, contributing to improved safety and operational efficiency in the aviation industry.

1. Introduction

1.1. Background and Motivation

The aviation industry operates under stringent safety requirements, where even minor anomalies can have catastrophic consequences. To maintain and enhance operational safety, the industry relies on extensive data collection from various sources such as flight telemetry, maintenance logs, and sensor systems [1]. However, aviation datasets are inherently imbalanced, as rare events like system failures or safety incidents occur far less frequently than normal operations. This imbalance poses significant challenges to traditional machine learning methods, which tend to favor majority classes, leading to poor detection rates for critical minority events.
At the same time, aviation data are distributed across multiple stakeholders, including airlines, manufacturers, maintenance organizations, and regulatory bodies. The sharing of these data is often restricted due to privacy, security, and regulatory constraints [2]. These barriers hinder the development of centralized machine learning models that require access to large, diverse datasets to achieve high performance.
This study aims to address these challenges by developing and evaluating a federated learning-based approach tailored to the specific characteristics of imbalanced aviation data.

1.2. Related Works

1.2.1. Federated Learning for Imbalanced Data

The integration of federated learning (FL) to address imbalanced data challenges in aviation has garnered significant attention in recent years.
The paper [3] introduced an estimation scheme to infer class distributions without accessing raw data, proposing a multi-arm bandit-based algorithm to select client sets with minimal class imbalance. This approach enhances the global model’s performance by mitigating skewed data distributions across clients. The study [4] addressed data imbalance by proposing a clustered federated learning framework with weighted model aggregation. This method groups clients based on data similarity and assigns weights during aggregation, effectively improving learning efficiency in imbalanced scenarios.
Hou et al. [5] developed an asynchronous FL framework tailored for imbalanced data. By considering temporal inconsistencies and measuring informative differences in imbalanced datasets, this framework supports FL in heterogeneous environments, demonstrating superior performance in accuracy and communication efficiency. Paper [6] designed an FL method that optimizes feature extractors and classifiers to tackle heterogeneous and locally imbalanced data. This approach enhances the adaptability of FL models to diverse client data distributions, ensuring robust performance across varied datasets.
A framework capable of learning from both common and rare classes in long-tailed datasets is proposed in [7]. By addressing global and local data imbalance simultaneously, this framework achieves robust performance in real-world FL applications. The paper [8] introduced the global-local joint learning method, which embeds local and global factors into each client’s loss function to harmonize class imbalance issues. This approach effectively balances the influence of diverse client data distributions on the global model.
Wang et al. [9] proposed an adaptive clustering-based model aggregation method to tackle data imbalance in FL. By dynamically clustering clients and adjusting aggregation strategies, this method improves model performance in imbalanced data settings. The study [10] presented a federated fuzzy learning algorithm to construct fuzzy classification models in distributed settings with imbalanced data. This approach enhances the interpretability and accuracy of FL models dealing with skewed data distributions.
Mrad et al. [11] investigated the deployment of unmanned aerial vehicle swarms using FL to handle class imbalance and power constraints. Their framework improves classification accuracy and energy efficiency, demonstrating the applicability of FL in aviation contexts. The paper [12] introduced a self-balancing approach to FL that dynamically adjusts the contribution of each client based on the global data distribution. This method mitigates the adverse effects of data imbalance across clients, enhancing the overall model performance.
The dynamic margin federated learning strategy, which adjusts decision boundaries during training to account for class imbalance, was proposed in [13]. This approach improves the model’s ability to learn from skewed data distributions without requiring data resampling. The class rebalancing imbalanced federated semi-supervised learning framework, which incorporates class rebalancing techniques into federated semi-supervised learning, was developed in [14]. This method effectively handles class-imbalanced settings, expanding the applicability of federated semi-supervised learning.
The paper [15] conducted a comprehensive study on the effects of skewed data distributions in FL. Their research provides insights into the challenges posed by imbalanced data and lays the groundwork for developing data rebalancing strategies in FL. The study [16] addressed the challenges of class-imbalanced heterogeneous data in FL by proposing a novel aggregation method that considers the data distribution of each client. This approach enhances the robustness of the global model in diverse data environments.
The study [17] presents a privacy protection scheme integrating blockchain and federated learning to enhance data privacy and reduce blockchain burdens in industrial environments. The approach includes global and local adjustments to adapt to local datasets, addressing challenges related to imbalanced data distributions.
The survey [18] identifies the significance of multimodal federated learning and conducts a state-of-the-art review. It categorizes multimodal FL into congruent and incongruent types based on client modal combinations and explores feasible application tasks and related benchmarks. The paper [19] proposes a novel framework using the generative pre-trained transformer preview model for fault detection and diagnosis in complex systems. It addresses limitations of traditional approaches by integrating synthetic datasets generated via large language models to enhance accuracy in imbalanced scenarios.

1.2.2. Aviation-Specific FL Applications

The research [20] proposes a method based on timeline modeling to acquire and process unmanned aerial vehicle (UAV) fault flight data in a simulation environment. It constructs UAV flight missions and realizes the original collection of flight logs, addressing challenges related to imbalanced data in fault detection.
The paper [21] presents a comprehensive literature review of big data applications in the aerospace sector. The review highlights the utilization of various data sources and machine learning models to derive valuable insights from large, heterogeneous datasets.
Research paper [22] provides a comprehensive analysis of FL, a system where multiple users work together on machine learning tasks while maintaining data privacy and distribution. The review systematically categorizes FL, investigates its applications, and studies how it relates to blockchain technology. Research [23] presents a new FL clustering approach that uses self-attention mechanisms for detecting bearing faults in aerospace equipment. This method addresses the limitations of conventional fault detection systems that struggle with limited data and privacy concerns. The approach develops a neural network that processes both local and global data patterns for feature extraction and groups similar equipment data, effectively using distributed information while preserving privacy.
Study [24] explores how condition monitoring improves industrial efficiency through smart fault detection systems, which enable accurate condition-based maintenance scheduling. It suggests FL as an answer to challenges involving data privacy, security threats, and competitive concerns that arise when data are spread across different locations or organizations, allowing collaborative model training without compromising data security. Research [25] describes a privacy-protecting intrusion-detection system based on FL to protect against cyber threats in controller–pilot communication systems. The system was tested using a specially created dataset from air–ground communications near Sweden’s Arlanda airport. The FL-trained detection model performed better than both centralized and local models in terms of accuracy and precision, proving its effectiveness for protecting sensitive aviation communications. Research [26] analyzes distributed learning approaches, particularly FL, for predictive maintenance in aircraft health management systems. FL addresses the limitations of traditional centralized degradation assessment methods by enabling private, decentralized machine learning at network edges. Testing revealed that the federated averaging algorithm matches centralized model performance but shows varying accuracy, whereas the federated proximal term algorithm achieves more consistent error reduction with specific settings, indicating its potential for more reliable aviation maintenance predictions.
Paper [27] examines how machine learning can enhance UAV network intelligence, addressing traditional cloud-based ML challenges like privacy, delays, and resource constraints in both civilian and military uses. It proposes a decentralized FL model enabling UAVs to jointly train machine learning (ML) models without centralizing data, reducing single-point failures and adapting to unreliable network conditions.

1.2.3. Risk Assessment and Predictive Maintenance

Artificial intelligence (AI) finds extensive practical application in maintenance operations. Research [28] investigates combining convolutional neural networks with autonomous drones for automated visual inspection during aircraft maintenance. It builds on previous work by introducing methods to improve defect detection, particularly for identifying dents through specific image enhancement and pre-classification techniques.
Study [29] introduces a new machine learning and Internet of Things (IoT) method for predicting aircraft wing anti-icing system thermal performance. Using artificial neural networks, this approach proves more efficient than traditional fluid dynamics calculations, showing promise for aviation applications. Research [30] reviews various statistical and machine learning techniques for analyzing aircraft environmental impacts more efficiently, including fuel consumption, emissions, and noise. It outlines major research themes and potential areas for further integrating these methods to improve aviation sustainability.
Paper [31] suggests using deep neural networks and transfer learning to automatically detect corrosion in aircraft lap joints through image analysis. The method achieves accuracy akin to human experts, supporting maintenance staff and enabling more automated condition-based maintenance. Research [32] focused on helicopters, investigating active vibration control using individual blade control to decrease hub vibration. By combining various models including fuzzy neural networks, the study shows that this approach effectively reduces vibration loads, providing valuable insights for helicopter vibration control design.
Paper [33] examines four data-driven methods for predicting aeroengine exhaust gas temperature baselines, crucial for engine health monitoring and flight safety. Testing with actual engine data showed that the generalized regression neural network model achieved the best accuracy and efficiency for airline applications. Study [34] comprehensively reviews machine learning applications in lithium-ion battery research, particularly focusing on materials research, health monitoring, and fault detection in aviation batteries and green aviation technology. It analyzes various ML techniques’ strengths and weaknesses and discusses future development possibilities.
Research [35] investigates using machine learning, specifically multilayer perceptron neural networks, to model aero engine transient performance, focusing on heat transfer during transitional operations. The model, trained on simulation data and refined with actual engine measurements, accurately predicts thermal transitions for aviation applications. Paper [36] presents a data-driven approach for predicting base pressure in suddenly expanded flows, which affects base drag in aerodynamic vehicles. Machine learning models trained on response equation data accurately predict base pressure, helping optimize base drag for rockets and missiles.
A recent study [37] presents an advanced integration of foundation models (FMs) and federated learning within AIoT-based aircraft health monitoring systems (AHMS). This work highlights how combining centralized learning approaches with the privacy-preserving nature of FL enhances predictive maintenance and risk assessment capabilities in aviation. The framework proposed in [37] demonstrates significant improvements in anomaly detection, convergence speed, and model efficiency compared to standalone FL or FM implementations, offering a robust digital twin ecosystem for real-time aircraft monitoring and predictive maintenance.
This study is closely related to our work, as both approaches emphasize the benefits of federated learning in aviation while addressing key challenges such as data privacy, scalability, and real-time decision-making. Our research builds on these findings by further enhancing risk assessment efficiency through adaptive weighting mechanisms, cost-sensitive learning, and minority-class event detection in federated risk models. Additionally, while study [37] explores an AIoT-driven ecosystem, our approach introduces an optimized FL framework for handling highly imbalanced aviation safety datasets, ensuring improved predictive accuracy in safety-critical applications.
The reviewed studies demonstrate diverse approaches to handling imbalanced data and federated learning in aviation, with notable trends and trade-offs emerging across different methodologies. Data-centric approaches (studies [3,4,8,12]) focus on modifying data distributions through resampling and weighting, achieving improved minority class detection but potentially increasing computational overhead. Architecture-based solutions ([5,7,13]) modify model structures and learning algorithms, offering better scalability but requiring more complex implementations. Hybrid approaches combining multiple techniques ([6,9,14]) generally show superior performance but face integration challenges in real-world deployments. In the aviation-specific context, studies [20,21,22,23,24,25,26,27] reveal a progression from traditional centralized learning toward federated approaches, with varying emphasis on fault detection (accuracy ranging 82–89%), predictive maintenance (RUL prediction errors 5–15%), and risk assessment capabilities. The application of FL in aviation maintenance ([25,26]) demonstrates promise, with reported improvements of 10–25% in fault detection accuracy compared to centralized approaches, though at the cost of increased system complexity. Recent work integrating foundation models with FL ([37]) represents an emerging direction, showing potential for enhanced predictive capabilities while maintaining privacy. This review reveals a clear trend toward hybrid architectures that balance privacy preservation with performance optimization, though challenges remain in scaling these solutions across diverse aviation stakeholders.
The framework proposed in this paper builds upon these existing approaches by introducing an adaptive weighted aggregation mechanism that prioritizes minority-class instances based on aviation risk assessment metrics. Unlike previous methods that primarily address class imbalance from a statistical perspective, our approach integrates domain-specific considerations such as predictive maintenance, safety-critical event detection, and real-time decision-making. This ensures improved detection of rare but critical aviation faults while maintaining compliance with privacy regulations.

1.3. Research Gap, Contributions, and Paper Structure

While existing research has made significant progress in addressing imbalanced data challenges through federated learning, several critical gaps remain in the aviation context.
First, while many studies propose methods for handling imbalanced data in federated learning, few specifically address the unique characteristics of aviation data, where the consequences of misclassifying rare events can be catastrophic. Most existing approaches focus on general classification tasks rather than the specific challenges of fault detection and predictive maintenance in aviation systems.
Second, current federated learning frameworks often treat data imbalance as a purely statistical problem, without considering the operational context and varying criticality of different fault types in aviation. There is limited work on integrating domain-specific knowledge and risk assessment into the federated learning process.
Third, existing studies typically evaluate their approaches using standard benchmark datasets, which may not reflect the complex, multi-stakeholder nature of the aviation ecosystem and its stringent privacy requirements.
To address these challenges, this paper proposes a federated learning framework specifically designed to handle imbalanced aviation datasets while ensuring data privacy and security. The key contributions of this study are as follows:
  • Development of a federated learning-based approach that addresses data imbalance issues in aviation safety applications, including fault detection, predictive maintenance, and risk assessment.
  • Integration of aviation-specific risk assessment metrics into the federated learning process to enhance the detection of rare but critical faults while maintaining high performance in routine operations.
  • Implementation of an adaptive weighted aggregation mechanism that considers both data quality and operational significance, ensuring more effective collaboration among aviation stakeholders.
  • Validation through extensive experiments and case studies, demonstrating the framework’s ability to improve minority-class detection, optimize maintenance scheduling, and enhance risk assessment accuracy across distributed aviation datasets.
By combining advanced imbalance-handling techniques with privacy-preserving FL, this study provides a scalable and practical solution for improving safety and operational efficiency in aviation. The proposed framework contributes to the growing body of research on collaborative machine learning in safety-critical industries and offers a foundation for future advancements in data-driven aviation management.
This paper is organized as follows. Section 2 presents the materials and methods, including detailed descriptions of strategies for addressing imbalanced aviation data and the proposed FL framework. Section 3 presents the results of computational experiments and case studies demonstrating the framework’s effectiveness. Section 4 discusses the findings, challenges, limitations, and future research directions. Finally, Section 5 concludes the paper with a summary of key findings and implications for the aviation industry.

2. Materials and Methods

2.1. Strategies for Addressing Imbalanced Aviation Data

To mitigate the challenges of imbalanced aviation datasets distributed across multiple stakeholders, this study employs a combination of resampling techniques, cost-sensitive learning, and generative modeling. The synthetic minority over-sampling technique (SMOTE) and under-sampling balance datasets locally, while weighted cross-entropy and focal loss prioritize minority-class samples during training.
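The local over-sampling step can be illustrated with a minimal, pure-Python sketch of SMOTE-style interpolation. It is not the implementation used in this study: the function name `smote_like_oversample`, the parameters `k` and `seed`, and the two-feature fault samples are hypothetical and chosen only for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of `base`, excluding itself
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical scaled fault samples (e.g., vibration level, temperature)
faults = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9], [0.95, 0.85]]
new_points = smote_like_oversample(faults, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, it stays inside the region already covered by observed faults. This is also why expert review of the generated samples remains useful.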
Generative adversarial networks and variational autoencoders are utilized to generate synthetic minority-class samples, enhancing rare-event detection. The validity of synthetic data is confirmed through expert review and comparison with real-world datasets from aviation stakeholders. Additionally, autoencoders and isolation forests aid in anomaly detection without reliance on extensive labeled data.
To strengthen data augmentation in FL, stakeholders can share anonymized or synthetic minority-class samples while preserving privacy. Model aggregation is adjusted to emphasize contributions from nodes with higher-quality minority-class data. Personalized FL approaches, including multi-task learning and meta-learning, improve adaptation to heterogeneous datasets.
Evaluation metrics such as precision, recall, and F1-score focus on minority-class performance. The framework is validated through simulations and real-world pilot studies, ensuring scalability and applicability to aviation safety monitoring.

2.2. Imbalanced Data in Aviation for Critical Fault Detection and Predictive Maintenance

The aviation ecosystem is a complex and interconnected network comprising various stakeholders, including aircraft manufacturers, airlines, maintenance repair organizations (MROs), air traffic control (ATC), regulatory bodies, and passengers (Figure 1).
Aviation stakeholders generate and process vast amounts of real-time telemetry, predictive maintenance alerts, and incident reports, all contributing to safety, efficiency, and reliability. While airlines and MROs use rare fault data to optimize maintenance schedules and system reliability, regulatory bodies refine safety standards based on incident investigations. Despite the large volume of aviation data, critical faults occur infrequently, leading to highly imbalanced datasets. This imbalance poses a challenge for predictive maintenance models, which tend to favor majority-class data, often struggling to detect rare but critical anomalies.
Modern aircraft, equipped with integrated aircraft health monitoring systems (AHMSs), generate real-time condition monitoring data across engines, flight controls, landing gear, avionics, and structural components. Abnormal engine vibrations, hydraulic leaks, and structural strain are key indicators of potential failures that require immediate attention. Built-in diagnostic systems, such as engine indicating and crew alerting systems, log rare fault events, serving as vital inputs for maintenance and operational decision-making.
Figure 2 illustrates the flow of imbalanced data within the aviation ecosystem, highlighting its generation, processing, and utilization.
MROs play a vital role by generating data during inspections and repairs. This includes rare findings from visual inspections, non-destructive testing, and predictive maintenance systems. Alerts generated by condition monitoring systems flag early warning signs of potential failures, such as turbine blade fatigue or decreasing hydraulic pressure. Airlines contribute operational data, including flight telemetry, maintenance logs, and incident reports, which often contain sparse instances of anomalies or faults. For example, flight data recorders capture high-frequency telemetry that may include rare deviations from expected performance, while maintenance logs document faults detected during routine servicing.
Aircraft and component manufacturers also produce imbalanced datasets during prototype testing, production, and simulation. Rare failure modes identified in stress testing or structural analysis provide critical insights for refining designs. Digital twins simulate edge-case scenarios, generating synthetic data for faults that rarely occur in real-world operations. Regulatory bodies further contribute imbalanced data through incident investigations and safety audits. These reports provide detailed findings from rare but significant events, offering valuable information for improving safety standards.
Figure 3 presents a breakdown of the imbalanced data generated by each stakeholder and how it flows through the ecosystem.
Imbalanced aviation data are characterized by sparsity, high dimensionality, noise, and complex interdependencies.
Despite these challenges, imbalanced data are crucial for predictive maintenance, where the goal is to forecast failures and intervene before they occur. Condition-based monitoring analyzes sensor trends to predict when components will require servicing, while failure pattern analysis identifies precursors to rare faults. Remaining useful life (RUL) prediction estimates the lifespan of critical components, supporting proactive maintenance strategies.
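As a simple illustration of the RUL idea, the sketch below fits a straight line to a degradation indicator and extrapolates to a failure threshold. This linear-trend model is an assumption made here for clarity, not the prediction model used in this study. The function name `estimate_rul`, the indicator values, and the threshold are hypothetical.

```python
def estimate_rul(cycles, indicator, failure_threshold):
    """Estimate remaining useful life (in cycles) by least-squares fitting a
    line to a rising degradation indicator and extrapolating to a threshold.
    Returns None when no upward degradation trend is found."""
    n = len(cycles)
    if n < 2 or n != len(indicator):
        raise ValueError("need at least two paired observations")
    mean_x = sum(cycles) / n
    mean_y = sum(indicator) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    if sxx == 0:
        raise ValueError("cycle values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, indicator))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None  # indicator is flat or improving; no failure forecast
    cycle_at_failure = (failure_threshold - intercept) / slope
    # RUL is counted from the most recent observation; never negative
    return max(0.0, cycle_at_failure - cycles[-1])

# Hypothetical vibration indicator sampled every 10 flight cycles
rul = estimate_rul([0, 10, 20, 30], [1.0, 2.0, 3.0, 4.0], failure_threshold=6.0)
```

Real degradation is rarely linear, so a sketch like this mainly shows what quantity an RUL model has to produce and how its error can be measured against observed failures.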

2.3. Aviation Technical Support as a Service as a Tool for Imbalanced Data Integration

The concept of aviation technical support as a service (ATSaaS) [38] offers a transformative approach to addressing imbalance in aviation data by using an integrated platform powered by Artificial Intelligence of Things (AIoT). This platform provides real-time insights and predictive analytics, enabling seamless integration and utilization of all data flows across the aviation ecosystem.
Figure 4 provides an overview of how the ATSaaS platform integrates data across the aviation ecosystem using AIoT.
At the IoT level, real-time telemetry and sensor data from AHMS are transmitted to the ATSaaS platform, complemented by stakeholder inputs. ATSaaS integrates data streams from the ground to space domains, supporting predictive maintenance, real-time fault detection, and cross-stakeholder collaboration. By consolidating diverse data sources, it enables a unified, intelligent, and efficient approach to managing aviation operations.
ATSaaS is a cloud-based, data-driven platform that centralizes data from aircraft sensors, maintenance logs, operational records, and regulatory reports. By offering aviation technical support as a service, it provides advanced analytics and decision-making tools to airlines, MROs, and manufacturers while maintaining data privacy.
A key strength of ATSaaS is its ability to address imbalanced data. The platform integrates ML algorithms and anomaly detection techniques to identify critical faults hidden within large datasets dominated by normal operational events. This enables predictive maintenance, real-time fault detection, and proactive safety management, transforming imbalanced data into actionable insights while ensuring scalability, interoperability, and security.
AIoT integration enhances ATSaaS by combining AI-driven analytics with IoT connectivity, enabling real-time monitoring of engine performance, structural integrity, and system health. IoT devices transmit these data to ATSaaS, where AI models analyze patterns for potential faults. Edge computing further improves efficiency by processing critical issues directly on the aircraft, enabling immediate fault detection.
Predictive maintenance optimization is achieved through AIoT-driven condition monitoring, which forecasts potential failures and supports RUL calculations. IoT connectivity automates maintenance alerts to MROs and airlines, ensuring timely interventions and minimal downtime. The platform also integrates diverse imbalanced datasets from flight recorders, maintenance logs, and regulatory audits, allowing AI models to detect rare but critical faults.
Federated learning within ATSaaS enables secure, collaborative AI model training across distributed datasets, preserving privacy while enhancing aviation safety and operational efficiency. These AIoT-powered features create a robust, real-time decision-making framework, improving data-driven collaboration between airlines, MROs, manufacturers, and regulators.
The ATSaaS platform, augmented with AIoT, provides unique opportunities to transform how the aviation industry handles imbalanced data:
  • ATSaaS uses AIoT to process and analyze data from aircraft systems, identifying rare anomalies that traditional systems might overlook. This capability ensures early detection of critical issues, such as engine overheating or structural fatigue, enabling airlines and MROs to perform targeted maintenance and reduce the risk of in-flight failures.
  • By integrating imbalanced data flows from multiple stakeholders, ATSaaS creates a unified ecosystem where airlines, MROs, manufacturers, and regulatory bodies can collaborate effectively. AIoT ensures seamless communication and real-time updates, while federated learning enables stakeholders to develop shared models without compromising data privacy.
  • AIoT-enabled edge computing allows ATSaaS to process data directly on the aircraft, providing immediate alerts for anomalies or faults. This real-time capability is critical for safety and operational efficiency, as it allows airlines to address issues proactively during flight or upon landing.
  • The modular architecture of ATSaaS makes it scalable across fleets of different sizes and adaptable to various operational contexts. AIoT ensures that the platform can handle the high volume and velocity of aviation data while remaining cost-effective for stakeholders.
  • ATSaaS supports regulatory compliance by integrating safety audit data, incident reports, and operational anomalies. AI algorithms analyze these data to identify systemic risks and recommend corrective actions, helping stakeholders meet stringent safety standards and reduce the likelihood of incidents.

2.4. Framework for Addressing Imbalanced Data in Aviation Using FL

The framework for addressing imbalanced data in aviation using federated learning integrates six key components, as illustrated in Figure 5.
The proposed framework integrates FL with advanced techniques for handling imbalanced aviation data, addressing critical challenges such as rare fault detection, data privacy, and stakeholder collaboration. This framework operates through several interconnected components to enhance the effectiveness of ML models while maintaining compliance with stringent privacy and operational requirements.
The first component involves data preprocessing and balancing. Each participating stakeholder preprocesses their local data by applying resampling techniques. Additionally, generative models are used to create synthetic data that enhance the representation of rare but critical fault scenarios.
At the core of the framework is the FL architecture, which facilitates decentralized collaboration among stakeholders. A shared global model is distributed across stakeholders, allowing local training on their datasets without exposing sensitive raw data. A key feature of the framework is weighted aggregation, where model updates are combined at the central server based on factors such as data quality, class distribution, and local performance on minority classes. This ensures fair representation and prevents biases in the global model.
To further address class imbalance, imbalance-specific techniques are incorporated. Cost-sensitive loss functions, such as weighted cross-entropy and focal loss, prioritize minority-class samples during training. For stakeholders with unique data distributions, personalized federated learning approaches enable the customization of local models to align with their specific needs.
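As a rough illustration of the focal loss mentioned above, the sketch below uses its commonly cited form, $-\alpha (1 - p_t)^{\gamma} \log p_t$, where $p_t$ is the predicted probability of the true class. The function name `focal_loss` and the default values of `gamma` and `alpha` are assumptions made for this example, not settings reported in this study.

```python
import numpy as np

def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Mean focal loss: -alpha * (1 - p_t)^gamma * log(p_t).

    p_true holds the predicted probability of the true class per sample.
    gamma=2.0 is a commonly used default, assumed here for illustration.
    """
    p = np.clip(p_true, 1e-12, 1.0)  # avoid log(0)
    return float(np.mean(-alpha * (1.0 - p) ** gamma * np.log(p)))

# Hypothetical predicted probabilities for the true class of three samples
p = np.array([0.9, 0.6, 0.1])
cross_entropy = focal_loss(p, gamma=0.0)  # gamma = 0 recovers plain cross-entropy
focal = focal_loss(p, gamma=2.0)          # well-classified samples are down-weighted
```

With `gamma` above zero, confidently classified samples contribute less to the loss, so training effort shifts toward hard examples, which in imbalanced data are often the minority class.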
Privacy and security are integral to the framework, ensuring compliance with aviation regulations and stakeholder trust. Stakeholders can share anonymized or synthetic minority-class data to collectively enhance the model’s performance without risking privacy breaches.
Evaluation and validation focus on metrics tailored to imbalanced datasets, including precision, recall, and F1-score for minority classes. These metrics emphasize the accurate detection of rare events, ensuring the model’s reliability in safety-critical applications. The framework is validated through simulations using synthetic aviation data and pilot studies involving real-world stakeholders, demonstrating scalability and practical applicability.
By combining FL with specialized techniques for addressing class imbalance, this framework provides a scalable, privacy-preserving solution for rare event detection in aviation, enhancing operational safety and efficiency across the industry.
The proposed study follows a structured six-stage workflow to develop and deploy a federated learning-based predictive maintenance system for aviation safety applications:
  • Data Collection—Gathering data from Airlines, MROs, and ATSaaS platforms;
  • Data Preprocessing—Handling imbalanced data, generating synthetic samples, and applying data augmentation;
  • Model Training—Implementing Federated Learning (FL), anomaly detection, and predictive models;
  • Evaluation and Validation—Measuring model performance, scalability, and accuracy;
  • Deployment and Integration—Applying trained models for real-time predictive maintenance;
  • Decision-Making and Optimization—Using insights for stakeholder collaboration, safety improvements, and operational decisions.
This structured workflow provides a clear representation of the study’s methodology and a systematic approach to predictive maintenance in aviation.

2.5. The Role of ML in Fault Detection and Predictive Maintenance

Fault detection is a critical application of ML, focusing on identifying anomalies or deviations in system behavior that could indicate potential faults. Anomaly detection models are well suited to detecting rare deviations from normal behavior, even in the presence of class imbalance. Furthermore, unsupervised learning approaches model normal operations and flag outliers as potential issues, enabling early detection of faults that might otherwise go unnoticed. ML-based early warning systems further enhance safety by identifying anomalies before they escalate into critical failures, allowing timely intervention.
Predictive maintenance uses ML to forecast the likelihood of component failures, facilitating maintenance scheduling before faults occur. ML models analyze time-series data from sensors to monitor trends indicative of wear or degradation, such as gradual increases in engine temperature or vibration. ML also identifies recurring failure patterns from historical data, addressing root causes and improving system reliability. By transitioning from time-based to condition-based maintenance strategies, ML-driven optimization ensures that interventions occur only when necessary, maximizing asset utilization and cost efficiency.
Proactive safety management involves identifying and mitigating risks before they impact flight safety. ML enables the analysis of complex, multi-dimensional datasets to uncover hidden risk factors and recommend preventive measures. Risk assessment models process diverse inputs such as flight telemetry, maintenance logs, and incident reports to calculate risk scores for specific flights, components, or systems. These scores inform decision-making and prioritize interventions in high-risk scenarios. ML models trained on historical data also predict conditions that could lead to incidents, providing early insights into potentially hazardous situations. ML enhances real-time decision support for pilots, ground crews, and air traffic control by providing actionable insights, such as recommending alternate routes or landing strategies during emergencies.
By using the predictive and analytical capabilities of ML, the industry can continue to evolve toward more resilient and intelligent operations.

2.6. Federated Learning as a Decentralized Approach to Machine Learning

FL represents an approach to machine learning where models are trained across multiple decentralized devices or servers while keeping data localized. In this system, a central server orchestrates the learning process without directly accessing any raw data from the participants. The process begins when the central server initializes a global model with either random or pre-trained weights and distributes it to all participating clients (Figure 6).
Once clients receive the model, they independently train it using their local data. This local training is a crucial privacy feature—the raw data never leaves the client’s device or server. After completing their local training, each client computes the differences between the initial model parameters and their newly trained version. Only these parameter updates are sent back to the central server, not the underlying data.
The central server then aggregates these updates from all participating clients to improve the global model. This aggregation process combines insights from all participants while maintaining their data privacy. Once the server creates this improved global model, it redistributes it to all clients, beginning another round of training. This cyclical process continues until the model achieves the desired performance level.
This architecture proves particularly valuable in real-world scenarios where data privacy is paramount. Mobile applications can improve their models using user data while respecting privacy, and aviation institutions can collaborate on AI development without sharing sensitive aircraft information. The system’s parallel processing capability, where all clients can train simultaneously, ensures efficient model improvement, while the decentralized nature maintains data privacy and security.
The strength of FL lies in its ability to use diverse datasets across multiple participants while ensuring that sensitive information remains protected. It resolves the traditional tension between collaborative machine learning and data privacy, offering a secure framework for organizations to collectively improve their AI models without compromising sensitive data. This innovative approach enables broader collaboration in AI development while adhering to increasingly stringent data protection regulations and privacy expectations.

2.7. Mathematical Framework for FL in Fault Detection, Predictive Maintenance, and Proactive Safety Management with Imbalanced Data

2.7.1. Problem Setup and Notation

We have $K$ aviation data holders (clients), each indexed by $i \in \{1, 2, \ldots, K\}$. These could be individual airlines, airports, or any aeronautical institution.
Each client $i$ has a local dataset
$$D_i = \{(x_{i,j}, y_{i,j}) \mid j = 1, \ldots, n_i\}$$
where $x_{i,j} \in \mathbb{R}^d$ is the feature vector (e.g., flight parameters, sensor readings, pilot inputs) and $y_{i,j} \in \{1, 2, \ldots, C\}$ is the corresponding label from $C$ possible classes. In many aviation anomaly- or fault-detection tasks, the majority class (normal flights) is extremely large compared to the minority class (anomalies), thus creating imbalance.
The total number of samples across all clients is
$$N = \sum_{i=1}^{K} n_i$$
We consider a parameterized classification model
$$f_\theta : \mathbb{R}^d \rightarrow \mathbb{R}^C$$
where $\theta$ are the learnable model parameters (e.g., weights in a deep neural network). The symbol $\mathbb{R}$ denotes the set of real numbers. Thus $\mathbb{R}^d$ means a $d$-dimensional space of real numbers, and $\mathbb{R}^C$ means a $C$-dimensional space of real numbers. The symbol $d$ is the dimension of the input feature vector $x$. So, each $x \in \mathbb{R}^d$ is a $d$-dimensional real-valued vector (e.g., $d$ different flight parameters, sensor readings, or other features).
Federated learning proceeds in rounds $t = 0, 1, \ldots, T$. In round $t$, clients download the global model $\theta^t$, perform local training, and then send updated parameters $\theta_i^{t+1}$ to a central aggregator.

2.7.2. Data Balancing/Rebalancing Strategies

To address class imbalance at each client, we can use one (or a combination) of the following:
  • Weighting/cost-sensitive learning. Each class $c \in \{1, \ldots, C\}$ has an associated weight $\alpha_c$ that reflects its prevalence or importance. We incorporate $\alpha_c$ into the local loss function.
  • Oversampling. If class $c$ is underrepresented, replicate or generate (synthetic) samples from that class.
  • Under-sampling. For an overrepresented class $c$, reduce the number of samples in local training by randomly dropping (or selectively dropping) instances.
  • Hybrid approaches. A combination of oversampling the minority and under-sampling the majority class.
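The resampling strategies listed above can be sketched for a single client as follows. This is a minimal illustration using plain random resampling; the function names, the target counts, and the synthetic data are assumptions for the example, and generative oversampling methods are not shown.

```python
import numpy as np

def oversample_minority(X, y, minority_label, n_target, rng):
    """Replicate minority samples (with replacement) until there are n_target of them."""
    idx_min = np.flatnonzero(y == minority_label)
    idx_rest = np.flatnonzero(y != minority_label)
    n_extra = max(n_target - idx_min.size, 0)
    extra = rng.choice(idx_min, size=n_extra, replace=True)
    keep = np.concatenate([idx_rest, idx_min, extra])
    return X[keep], y[keep]

def undersample_majority(X, y, majority_label, n_target, rng):
    """Randomly drop majority samples until at most n_target remain."""
    idx_maj = np.flatnonzero(y == majority_label)
    idx_rest = np.flatnonzero(y != majority_label)
    kept = rng.choice(idx_maj, size=min(n_target, idx_maj.size), replace=False)
    keep = np.concatenate([kept, idx_rest])
    return X[keep], y[keep]

# Hypothetical local dataset: 1000 samples, 1% faults (label 1)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = np.zeros(1000, dtype=int)
y[:10] = 1

# Hybrid approach: oversample the faults, then under-sample the normal class
X_over, y_over = oversample_minority(X, y, minority_label=1, n_target=200, rng=rng)
X_hyb, y_hyb = undersample_majority(X_over, y_over, majority_label=0, n_target=400, rng=rng)
```

Because resampling is applied only to the local training set, it does not require any exchange of raw data between clients.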
We illustrate weighting mathematically below with the Weighted Cross-Entropy method since it is one of the most common approaches that integrates cleanly with FL.
Let $\pi_c^i$ be the empirical class distribution for a client (i.e., the fraction of examples belonging to class $c$ at client $i$):
$$\pi_c^i = \frac{1}{n_i} \sum_{j=1}^{n_i} \mathbb{1}\{y_{i,j} = c\}$$
The notation $\mathbb{1}\{y = c\}$ represents an indicator function. It is a mathematical function that outputs either 1 or 0, depending on whether the condition inside the curly braces is true or false. Specifically:
$$\mathbb{1}\{y = c\} = \begin{cases} 1, & \text{if } y = c \\ 0, & \text{if } y \neq c \end{cases}$$
In the context of machine learning and classification, $y$ is the label of a sample and $c$ is one of the possible class labels (e.g., $c = 1, 2, \ldots, C$).
The indicator function $\mathbb{1}\{y = c\}$ is used to “select” specific classes during summations or calculations. For example:
  • If $y = c$, then $\mathbb{1}\{y = c\} = 1$, meaning that this sample contributes to computations related to class $c$;
  • If $y \neq c$, then $\mathbb{1}\{y = c\} = 0$, meaning that this sample does not contribute.
We define
$$\alpha_c^i = \frac{1}{\pi_c^i + \varepsilon}$$
where $\varepsilon > 0$ is a small constant to avoid division by zero. This implies that classes with lower $\pi_c^i$ (the minority classes) receive larger weights.
The local (weighted) cross-entropy loss for client $i$ becomes:
$$\ell_i(\theta; x_{i,j}, y_{i,j}) = -\sum_{c=1}^{C} \mathbb{1}\{y_{i,j} = c\}\, \alpha_c^i \log \sigma\big(f_\theta(x_{i,j})\big)_c$$
where $\sigma(\cdot)$ is typically the softmax function and $\sigma\big(f_\theta(x_{i,j})\big)_c$ is the predicted probability that $x_{i,j}$ belongs to class $c$.
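The class weights and the weighted cross-entropy defined above can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions: labels are 0-indexed (the text uses $1, \ldots, C$), the function names are hypothetical, and $\varepsilon = 10^{-6}$ is an arbitrary choice.

```python
import numpy as np

def class_weights(y, num_classes, eps=1e-6):
    """alpha_c = 1 / (pi_c + eps), where pi_c is the local fraction of class c."""
    pi = np.bincount(y, minlength=num_classes) / y.size
    return 1.0 / (pi + eps)

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def weighted_cross_entropy(logits, y, alpha):
    """Mean over samples of -alpha[y_j] * log softmax(logits_j)[y_j]."""
    probs = softmax(logits)
    p_true = probs[np.arange(y.size), y]
    return float(np.mean(-alpha[y] * np.log(p_true)))

# Hypothetical client with 99 normal samples (label 0) and 1 fault (label 1)
y = np.array([0] * 99 + [1])
alpha = class_weights(y, num_classes=2)      # roughly [1.01, 100]
logits = np.zeros((100, 2))                  # an uninformative model
loss = weighted_cross_entropy(logits, y, alpha)
```

In this example the single fault sample carries about as much total weight in the loss as the 99 normal samples combined, which is the intended effect of inverse-frequency weighting.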

2.7.3. Local Model Training (Client-Side)

In each federated round t :
  • The central server sends the global model parameters $\theta^t$ to client $i$;
  • Client $i$ updates $\theta^t$ by minimizing a local loss function over $D_i$.
The client’s local objective (including weighting) can be written as:
$$L_i(\theta) = \frac{1}{n_i} \sum_{j=1}^{n_i} \ell_i(\theta; x_{i,j}, y_{i,j})$$
A local training step using gradient-based optimization updates the parameters:
$$\theta_i^{t+1} = \theta^t - \eta \nabla L_i(\theta^t)$$
where $\eta$ is the learning rate and $\nabla L_i(\theta^t)$ is the gradient of the local objective with respect to $\theta^t$. In practice, multiple local epochs or mini-batch updates may be performed before returning $\theta_i^{t+1}$.
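A client-side update of this kind can be sketched as below. The example assumes a linear softmax classifier in place of $f_\theta$, full-batch gradient descent, 0-indexed labels, and hypothetical data and hyperparameters; it is meant only to show the update rule, not the model used in the experiments.

```python
import numpy as np

def weighted_ce(theta, X, y, alpha):
    """Local objective L_i: mean weighted cross-entropy of a linear softmax model."""
    logits = X @ theta
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(np.mean(-alpha[y] * log_probs[np.arange(y.size), y]))

def local_update(theta, X, y, alpha, lr=0.1, epochs=5):
    """Run `epochs` gradient steps theta <- theta - lr * grad L_i(theta)."""
    n, num_classes = X.shape[0], theta.shape[1]
    onehot = np.eye(num_classes)[y]
    w = alpha[y][:, None]                        # per-sample class weight
    theta = theta.copy()                         # leave the global model untouched
    for _ in range(epochs):
        logits = X @ theta
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = X.T @ (w * (probs - onehot)) / n  # gradient of the weighted loss
        theta -= lr * grad
    return theta

# Hypothetical client data: 200 samples, 10 faults with a shifted first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.zeros(200, dtype=int)
y[:10] = 1
X[:10, 0] += 2.0
alpha = 1.0 / (np.bincount(y, minlength=2) / y.size + 1e-6)

theta_global = np.zeros((3, 2))
loss_before = weighted_ce(theta_global, X, y, alpha)
theta_local = local_update(theta_global, X, y, alpha)
loss_after = weighted_ce(theta_local, X, y, alpha)
```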

2.7.4. Federated Aggregation

After all $K$ clients perform local updates, they send $\theta_i^{t+1}$ (or a compressed update) to the central server. A typical aggregation rule is federated averaging
$$\theta^{t+1} = \frac{1}{N} \sum_{i=1}^{K} n_i\, \theta_i^{t+1}$$
Alternative aggregation may be weighted by class distribution. Because we are dealing with imbalanced data, it may be beneficial to incorporate class-distribution-aware weighting. One possible extension is:
$$\theta^{t+1} = \frac{1}{W} \sum_{i=1}^{K} \left( \sum_{c=1}^{C} \alpha_c^i \right) \theta_i^{t+1}$$
where
$$W = \sum_{i=1}^{K} \sum_{c=1}^{C} \alpha_c^i$$
Here, $\sum_{c=1}^{C} \alpha_c^i$ acts as a rough measure of how “difficult” client $i$’s imbalance situation is (larger if highly imbalanced). This approach ensures that updates from highly imbalanced clients are given more emphasis.
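Both aggregation rules reduce to a weighted average of the client parameter vectors, differing only in the weights. The sketch below illustrates this with hypothetical function names and toy parameter vectors; it assumes every client returns parameters of identical shape.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """theta^{t+1} = (1/N) * sum_i n_i * theta_i^{t+1}."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_params)            # shape (K, ...)
    return np.tensordot(sizes / sizes.sum(), stacked, axes=1)

def imbalance_aware_average(client_params, client_alphas):
    """theta^{t+1} = (1/W) * sum_i (sum_c alpha_c^i) * theta_i^{t+1}."""
    scores = np.array([np.sum(a) for a in client_alphas], dtype=float)
    stacked = np.stack(client_params)
    return np.tensordot(scores / scores.sum(), stacked, axes=1)

# Two hypothetical clients with toy parameter vectors
params = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
sizes = [3000, 1000]
# Class weights for class fractions (0.99, 0.01) and (0.90, 0.10), eps omitted
alphas = [np.array([1 / 0.99, 1 / 0.01]), np.array([1 / 0.90, 1 / 0.10])]

theta_fedavg = fedavg(params, sizes)
theta_imbalance = imbalance_aware_average(params, alphas)
```

In this toy case the first client dominates under both rules, but for different reasons: under federated averaging because it holds more samples, and under the class-distribution-aware rule because its data are more imbalanced.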

2.7.5. Evaluation Metric with Imbalance Emphasis

Finally, because raw accuracy can be misleading for imbalanced classification, we can track an imbalance-sensitive metric for multi-class classification, for example the weighted F1 score [39]:
$$F1_{weighted} = \sum_{c=1}^{C} \omega_c\, F1_c$$
where $\omega_c$ is the weight for class $c$, commonly chosen as
$$\omega_c = \frac{\text{number of samples of class } c}{N}$$
and the per-class F1 score is
$$F1_c = \frac{2\,(Precision_c \cdot Recall_c)}{Precision_c + Recall_c}$$
Alternatively, for binary classification with a heavily imbalanced minority class (e.g., anomaly detection), one often uses:
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
where $TP$ is the number of true positives, which occur when the model correctly predicts the positive class (e.g., a fault or anomaly is correctly detected); $FP$ is the number of false positives, which occur when the model predicts the positive class but the actual label is negative (e.g., a normal condition is incorrectly classified as a fault); and $FN$ is the number of false negatives, which occur when the model predicts the negative class but the actual label is positive (e.g., a fault or anomaly is missed).
The F1 score is the harmonic mean of precision and recall:
$$F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$$
These metrics can be computed at each client locally or at the server on a hold-out set (if available) after each federated round.
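The following sketch computes these metrics directly from label arrays. The function names and the toy predictions are hypothetical, and returning 0 when a denominator is zero is a convention assumed here rather than one stated in the text.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return float(precision), float(recall), float(f1)

def weighted_f1(y_true, y_pred, classes):
    """Sum over classes of (class share of samples) * (per-class F1)."""
    n = y_true.size
    return float(sum(
        (np.sum(y_true == c) / n) * precision_recall_f1(y_true, y_pred, c)[2]
        for c in classes
    ))

# Toy example: 95 normal samples (0) and 5 faults (1);
# the model raises 2 false alarms and detects 3 of the 5 faults.
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.array([0] * 93 + [1] * 2 + [1] * 3 + [0] * 2)

accuracy = float(np.mean(y_true == y_pred))                      # 0.96
precision, recall, f1 = precision_recall_f1(y_true, y_pred, 1)   # 0.6, 0.6, 0.6
f1_weighted = weighted_f1(y_true, y_pred, classes=[0, 1])
```

The toy case shows why accuracy alone is misleading: the model is 96% accurate while missing two of the five faults, which the fault-class F1 of 0.6 makes visible.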

2.7.6. General Framework

Below is a concise step-by-step of the entire framework.
Step 1. Initialization. Server initializes model parameters $\theta^0$. Server broadcasts $\theta^0$ to all clients.
Step 2. Local dataset preparation. Client $i$ obtains local dataset $D_i$. $D_i$ is rebalanced via oversampling, under-sampling, or synthetic augmentation. The class weights $\{\alpha_c^i\}$ are calculated from the local class distribution.
Step 3. Local training. Client $i$ uses $\theta^t$ as initialization and performs local gradient descent (or another optimizer) to solve
$$\theta_i^{t+1} = \arg\min_{\theta} L_i(\theta)$$
Step 4. Parameter upload. Client $i$ sends $\theta_i^{t+1}$ (or a gradient update) to the server.
Step 5. Federated aggregation. Server aggregates client updates into a new global model $\theta^{t+1}$ using expressions (2) or (3).
Step 6. Evaluation. Server (or each client) evaluates the global model with an imbalance-sensitive metric (F1, Weighted F1, etc.). Check if performance meets certain criteria; if not, continue to next round.
Step 7. Repeat. If $t + 1 < T$, broadcast $\theta^{t+1}$ again, and go to Step 3. Continue until $T$ rounds are completed, or convergence/performance criteria are met.
This mathematical framework can be applied to aviation use cases (e.g., anomaly detection, fault diagnosis) while keeping data private across clients (airlines, airports, aircraft manufacturers). By incorporating imbalanced learning techniques locally (oversampling, under-sampling, class weighting) and optionally using imbalance-aware global aggregation, the framework can yield models that are more robust to rare but critical classes (e.g., anomalies, failures) and still benefit from the collaborative power of federated learning.
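To make the round structure concrete, the steps above can be sketched as a small single-process simulation. Everything specific here is an assumption for illustration: the synthetic data generator, the linear softmax model, the client sizes and fault rates, the hyperparameters, and the use of federated averaging for aggregation. It does not reproduce the experimental setup reported in this paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def make_client(n, fault_rate, rng):
    """Synthetic stand-in for D_i: faults (label 1) have a shifted first feature."""
    y = (rng.random(n) < fault_rate).astype(int)
    X = rng.normal(size=(n, 3))
    X[:, 0] += 2.0 * y
    return np.hstack([X, np.ones((n, 1))]), y   # append a bias column

def local_train(theta, X, y, lr=0.1, epochs=5, eps=1e-6):
    # Step 2: class weights from the local class distribution
    alpha = 1.0 / (np.bincount(y, minlength=2) / y.size + eps)
    w, onehot = alpha[y][:, None], np.eye(2)[y]
    theta = theta.copy()
    for _ in range(epochs):                      # Step 3: local gradient descent
        grad = X.T @ (w * (softmax(X @ theta) - onehot)) / y.size
        theta -= lr * grad
    return theta

def f1_fault(theta, X, y):
    pred = (X @ theta).argmax(axis=1)
    tp = np.sum((pred == 1) & (y == 1))
    fp = np.sum((pred == 1) & (y == 0))
    fn = np.sum((pred == 0) & (y == 1))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
clients = [make_client(n, r, rng) for n, r in [(1000, 0.01), (800, 0.015), (500, 0.10)]]
sizes = np.array([y.size for _, y in clients], dtype=float)

theta = np.zeros((4, 2))                         # Step 1: initialise the global model
history = []
for t in range(10):                              # Step 7: repeat for T = 10 rounds
    local = [local_train(theta, X, y) for X, y in clients]              # Steps 3-4
    theta = np.tensordot(sizes / sizes.sum(), np.stack(local), axes=1)  # Step 5
    history.append(float(np.mean([f1_fault(theta, X, y) for X, y in clients])))  # Step 6
```

Only the parameter arrays in `local` cross the client boundary in this sketch; the per-client `X` and `y` are used solely inside `local_train` and the local evaluation.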

3. Results

3.1. General Methodology of the Computational Experiment

The computational experiment in this study evaluates the proposed FL framework’s ability to address imbalanced aviation datasets. The methodology is structured as follows. Synthetic datasets are generated to represent the diverse aviation ecosystem, with features such as engine telemetry, structural integrity metrics, maintenance history, and operational conditions. These datasets simulate real-world scenarios, including the inherent class imbalance caused by underrepresented critical faults. Each stakeholder is assigned a distinct dataset varying in size and imbalance, ensuring realistic evaluation. Class-balancing techniques are applied locally to mitigate imbalance.
A shared model architecture is initialized with random weights on the central server, focusing on classification and regression tasks for aviation safety and predictive maintenance. Stakeholders train this model locally using an imbalance-aware loss function, such as weighted cross-entropy, and perform multiple local epochs to ensure convergence before sending updates to the central server. The server aggregates these updates using a weighted averaging mechanism that considers dataset quality, class distribution, and minority-class performance.
Performance is evaluated using metrics tailored for imbalanced datasets, including precision, recall, and F1 score for minority classes. The federated learning process iteratively refines the model through multiple communication rounds, with each round distributing the global model, training locally, aggregating updates, and evaluating performance. The framework’s performance is compared against centralized models, which assume access to all data in a single location, and standalone models, trained individually by stakeholders. This comparison highlights the advantages of FL in balancing performance and data privacy.
The experiment is conducted in a controlled simulation environment emulating the FL process, with separate virtual instances for the central server and clients to replicate decentralization. Privacy-preserving mechanisms ensure data privacy and compliance with aviation standards. Pilot studies involving real-world stakeholders validate the framework’s scalability and applicability, assessing its ability to generalize across diverse operational contexts and data distributions.
This methodology provides a robust and systematic evaluation of FL, addressing the dual challenges of data imbalance and distributed data in aviation.

3.2. Use Case: FL for Engine Health Monitoring

This numerical example demonstrates how FL can be applied to detect engine faults, predict RUL, and manage safety risks using imbalanced data.

3.2.1. Scenario

  • Participants (Nodes): Three stakeholders (Airline A, Airline B, and an MRO) contribute local data.
  • Dataset sizes:
    Airline A has $|D_1| = 10{,}000$ samples (99% normal, 1% faults);
    Airline B has $|D_2| = 8000$ samples (98.5% normal, 1.5% faults);
    MRO has $|D_3| = 5000$ samples (90% normal, 10% faults from maintenance inspections).
  • Model type: Federated neural network for classification (fault detection) and regression (RUL prediction).
  • Feature set: Engine telemetry (temperature, pressure, vibration, fuel efficiency), maintenance history, and operational conditions.
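As a worked illustration of how the formulas in Section 2.7 apply to this scenario, the short script below derives fault counts, class weights, and aggregation weights from the dataset sizes and fault rates listed above. The value of $\varepsilon$ and all variable names are assumptions for the example; the resulting numbers are plain arithmetic on the scenario, not reported experimental results.

```python
# Dataset sizes and fault rates as stated in the scenario
sizes = {"Airline A": 10_000, "Airline B": 8_000, "MRO": 5_000}
fault_rate = {"Airline A": 0.01, "Airline B": 0.015, "MRO": 0.10}
eps = 1e-6  # small constant in the class-weight definition (value assumed)

N = sum(sizes.values())
fault_counts = {k: round(sizes[k] * fault_rate[k]) for k in sizes}

# Class weights alpha = 1 / (pi + eps) for the fault and normal classes
alpha_fault = {k: 1.0 / (fault_rate[k] + eps) for k in sizes}
alpha_normal = {k: 1.0 / (1.0 - fault_rate[k] + eps) for k in sizes}

# Federated averaging weights n_i / N
fedavg_weight = {k: sizes[k] / N for k in sizes}

# Class-distribution-aware weights (sum_c alpha_c^i) / W
imbalance_score = {k: alpha_fault[k] + alpha_normal[k] for k in sizes}
W = sum(imbalance_score.values())
imbalance_weight = {k: imbalance_score[k] / W for k in sizes}

for k in sizes:
    print(f"{k}: faults={fault_counts[k]}, alpha_fault={alpha_fault[k]:.1f}, "
          f"fedavg={fedavg_weight[k]:.3f}, imbalance-aware={imbalance_weight[k]:.3f}")
```

The three stakeholders together hold 23,000 samples with 100, 120, and 500 fault samples, respectively. Under the class-distribution-aware rule, Airline A (the most imbalanced client) receives a larger aggregation weight than under federated averaging, while the MRO receives a smaller one.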

3.2.2. Fault Detection

Each node trains a local binary classification model to detect faults (normal vs. fault), with loss function defined by expression (1).
Figure 7 shows the weighted loss values for each stakeholder.
Table 1 shows all relevant data including weighted loss values, sample counts (normal and fault), and probabilities for both classes.
Figure 8 presents normalized loss for normal and fault classes.
The analysis of stakeholder performance across Airline A, Airline B, and MRO shows significant variations in loss calculations and performance metrics (Figure 9).
Airline A demonstrates the strongest overall performance with a weighted loss of 0.0842 and an F1 score of 0.89, ranking in the 95th percentile of industry benchmarks. Their performance exceeds the industry benchmark by +0.04, with a relative performance index of 1.15. Airline B shows moderate performance with a weighted loss of 0.0956 and an F1 score of 0.86, placing them in the 85th percentile. Their results are slightly above industry benchmarks (+0.01), with a relative performance index of 1.08. The MRO exhibits higher weighted loss at 0.1245 and a lower F1 score of 0.83, ranking in the 75th percentile with performance slightly below industry benchmarks (−0.02).

3.2.3. Predictive Maintenance and Remaining Useful Life Prediction

Predictive maintenance aims to forecast the RUL of a component, allowing timely intervention before failure occurs. RUL prediction is achieved using ML models trained on historical data, which map the relationship between operational features and the remaining lifespan of components.
Input data includes features $x$ with sensor data (engine temperature, pressure, vibration), historical maintenance logs (time since last service, usage cycles), operational conditions (flight routes, weather conditions), and the target $y$, which is the RUL, typically expressed in hours or cycles until failure.
The training dataset consists of feature-label pairs $(x_i, y_i)$, where $x_i$ denotes the input features for component $i$ and $y_i$ the observed RUL for component $i$.
The model is trained to minimize the error between predicted RUL $\hat{y}_i$ and actual RUL $y_i$, using a loss function such as mean squared error:
$$\ell_{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $n$ is the number of samples.
After training, the model predicts RUL $\hat{y}_i$ for new inputs $x_i$.
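For reference, the mean squared error above and the mean absolute error used in the evaluation can be computed as in the following sketch. The RUL values are hypothetical placeholders chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: (1/n) * sum (y_i - y_hat_i)^2."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error: (1/n) * sum |y_i - y_hat_i|."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical RUL values in hours for three components
rul_true = np.array([120.0, 80.0, 200.0])
rul_pred = np.array([110.0, 95.0, 190.0])

rul_mse = mse(rul_true, rul_pred)  # (100 + 225 + 100) / 3
rul_mae = mae(rul_true, rul_pred)  # (10 + 15 + 10) / 3
```

MSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of how prediction errors are distributed.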
A FL setup includes three stakeholders (Airline A, Airline B, MRO) with distributed datasets. Each stakeholder trains a local RUL prediction model, and the global model is created using federated aggregation.
Airline A has 500 components, sensor readings, and actual RUL values. Airline B has 300 components and a similar data structure. MRO has 200 components, with a focus on recently repaired components.
Local models are evaluated based on mean absolute error (MAE) and mean squared error (MSE). Figure 10 highlights the benefits of federated learning in enhancing predictive accuracy and reducing error rates for maintenance forecasting in aviation.
The global model achieves an MAE of 9 h, improving upon the individual performances of Airline A, Airline B, and MRO. It also shows a reduced MSE, indicating better generalization and prediction accuracy, and maintains an average RUL error of 4.76% by combining data across stakeholders.
Based on the model uncertainty analysis (Figure 11), there are notable variations in prediction confidence across stakeholders.
The red dashed line at 5% represents a critical performance benchmark, serving as a threshold for evaluating deviations in model accuracy. This threshold is used to highlight significant variations in predictive performance, ensuring that deviations beyond this limit are considered substantial and potentially impactful for decision-making in predictive maintenance and anomaly detection. The inclusion of this benchmark facilitates a clearer assessment of model reliability in real-world aviation applications.
Airline A demonstrates the lowest uncertainty for normal class predictions with only 2% uncertainty (98% confidence) but shows higher uncertainty of 15% for fault class predictions, resulting in a total uncertainty of 17%. Airline B maintains a moderate position with 3% uncertainty for normal class and 12% for fault class predictions, yielding a combined uncertainty of 15%. The MRO exhibits a different pattern with relatively higher uncertainty (5%) for normal class predictions but improved confidence in fault detection with only 8% uncertainty, totaling 13% combined uncertainty. This pattern suggests that while all stakeholders maintain uncertainties below critical thresholds for normal operations, there are significant variations in fault-detection confidence. The visualization reveals that normal class predictions generally show lower uncertainty across all stakeholders (2–5%) compared to fault-class predictions (8–15%), indicating a systematic pattern where the model is more confident in identifying normal operations than detecting faults. This asymmetry in prediction confidence could be attributed to the inherent class imbalance in the training data and the relative rarity of fault conditions.

3.2.4. Risk Assessment in the Aviation Ecosystem

Risk assessment involves quantifying the likelihood of critical faults, failures, or unsafe conditions occurring during operations. In the context of aviation, this is achieved by analyzing sensor data, operational conditions, and historical fault patterns to calculate a risk score for specific components, flights, or systems.
Real-time telemetry from aircraft systems (engine temperature, vibration, pressure) and history of repairs, time since the last inspection, and component age were used as inputs to risk assessment.
Logistic regression is employed as the ML model and is trained on historical data to predict the probability of a high-risk event.
Based on the risk score, thresholds are applied to classify the level of risk:
  • High risk with $RiskScore > 0.8$;
  • Moderate risk with $0.5 < RiskScore \leq 0.8$;
  • Low risk with $RiskScore \leq 0.5$.
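The scoring and thresholding logic can be sketched as follows. The thresholds are those listed above; the function names, the feature values, and the logistic regression coefficients are hypothetical and serve only to show how a risk score is mapped to a risk level.

```python
import math

def risk_score(features, weights, bias):
    """Logistic regression output: sigmoid of a linear combination of the inputs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def risk_level(score):
    """Map a risk score in [0, 1] to the three levels defined by the thresholds."""
    if score > 0.8:
        return "high"
    if score > 0.5:
        return "moderate"
    return "low"

# Hypothetical standardised inputs: temperature, vibration, component age
features = [1.2, 0.8, 0.5]
weights = [0.9, 1.1, 0.4]   # illustrative coefficients, not fitted values
score = risk_score(features, weights, bias=-1.0)
level = risk_level(score)
```

Note that the boundary values follow the inequalities exactly: a score of 0.8 is classified as moderate and a score of 0.5 as low.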
In the FL setup, each stakeholder trains a local model on its data, calculating risk scores for its fleet. After that, risk scores from all nodes are aggregated into a global risk model, using diverse datasets to improve accuracy and generalizability.
The risk assessment calculations for each stakeholder and the aggregated global risk model have been completed. The visualization of risk scores for the selected aircraft component (engine) highlights the comparative likelihood of high-risk scenarios across stakeholders: Airline A, Airline B, and the MRO (Figure 12).
Figure 12a represents the predicted fault probability, which quantifies the likelihood of a fault occurring in the engine. Airline A has the lowest fault probability, indicating that its engine is operating under relatively stable conditions, while the MRO shows the highest probability due to its dataset containing more frequent fault-related observations from maintenance records.
Figure 12b illustrates the RUL in hours, showcasing the estimated lifespan of the engine before maintenance is required. The RUL predictions align with fault probabilities, as the MRO’s engine has the shortest predicted RUL, reflecting its higher fault probability. In contrast, Airline A’s engine has the longest predicted RUL, consistent with its lower fault probability. Figure 12c depicts the risk scores, directly quantifying the likelihood of high-risk conditions. The higher risk score for the MRO signifies a more urgent need for maintenance, reinforcing the findings from the other two visualizations. Together, these figures provide a comprehensive view of the engine’s condition and the need for proactive interventions.
Figure 13 visualizes the relative impact of three key operational parameters (temperature, vibration, and cycles) on system performance across different stakeholders and their global average.
Figure 13 effectively highlights how each parameter contributes to the overall system behavior and identifies which parameters have the most significant impact on operational performance.
Figure 13 reveals distinct patterns. The temperature impact (blue bars) shows a gradual increase from Airline A to MRO, while vibration contributions (green bars) demonstrate more significant variation, with MRO showing the highest impact compared to Airline A. The cycles parameter (orange bars) exhibits the widest range of values, from Airline A to MRO. The global average serves as a reference point, representing the mean values across all stakeholders.

3.2.5. Comparison with Baseline Methods and Statistical Validation of Results

To assess the effectiveness of the proposed FL-based approach, a direct comparison with traditional machine learning models and centralized learning frameworks was carried out. The baseline models include:
  • Centralized learning model—a conventional centralized deep learning framework, where all training data are aggregated into a single repository and trained on a shared model;
  • Traditional machine learning models—standard logistic regression, support vector machines, and random forest classifiers trained on the same dataset without federated learning integration.
Table 2 presents a comparative analysis of key performance metrics, including fault detection accuracy, recall, precision, F1-score, and false-positive rate across different models.
The results indicate that the proposed FL-based approach outperforms centralized learning and traditional machine learning models in all key performance metrics. Notably, our framework improves fault detection accuracy by 6.6% over the centralized model and 9.1% over logistic regression, demonstrating superior capability in handling imbalanced aviation data.
To ensure the statistical significance of the observed improvements, we performed a paired t-test comparing the proposed FL framework and the centralized learning model over multiple independent runs. The results indicate a statistically significant improvement with p < 0.01, confirming that the performance gains are not due to random variations in data. Additionally, we computed 95% confidence intervals (CI) for accuracy improvements, which further validate the robustness of the proposed approach:
  • Centralized learning model accuracy CI: [84.5, 85.9];
  • Proposed federated learning model accuracy CI: [90.9, 92.7].
The non-overlapping confidence intervals further confirm that the proposed framework offers a significant and reliable enhancement over baseline methods.
These findings demonstrate that the federated learning framework not only ensures privacy preservation and scalability but also achieves statistically significant improvements in fault detection, risk assessment, and predictive maintenance efficiency compared to traditional approaches.

4. Discussion

4.1. Benefits of the Proposed Approach

The proposed federated learning-based framework for predictive maintenance and risk assessment in aviation offers several key benefits, addressing critical challenges related to data privacy, imbalanced datasets, real-time fault detection, and collaborative learning across multiple stakeholders.
One of the primary benefits of the proposed framework is its FL architecture, which allows airlines, MROs, and regulatory bodies to collaboratively train predictive models without sharing raw data. This ensures compliance with data protection regulations while fostering cross-organization knowledge transfer. Unlike traditional centralized machine learning models, which require extensive data collection and aggregation, the FL-based approach keeps data decentralized, significantly reducing risks associated with data breaches and regulatory constraints.
Aviation datasets are inherently imbalanced, with rare but critical faults occurring infrequently compared to normal operations. The proposed approach integrates adaptive weighting mechanisms, cost-sensitive learning, and generative modeling techniques to improve minority-class detection. By using synthetic data generation and anomaly detection models, the framework enhances the ability to detect low-frequency yet high-impact failures, ultimately improving predictive maintenance efficiency and risk mitigation strategies.
The proposed methodology enables scalable risk assessment across different aviation stakeholders by utilizing personalized federated learning techniques, including multi-task learning and meta-learning. This allows the model to adapt to stakeholder-specific operational conditions, ensuring high accuracy across diverse datasets without the need for manual recalibration. Additionally, the system is designed to continuously learn from new data, ensuring that the predictive models remain relevant and effective even as operational conditions evolve.
By integrating real-time IoT-driven AHMS, the proposed approach enhances proactive decision-making in aviation maintenance. The ATSaaS platform provides stakeholders with predictive insights, enabling them to schedule preventive maintenance before critical failures occur. This not only reduces unexpected downtime and operational disruptions, but also extends the lifespan of critical components, optimizing maintenance costs and resource allocation.
The framework is highly adaptable to various aviation applications, ensuring interoperability across different airline fleets, maintenance systems, and regulatory compliance frameworks. The federated learning model is designed to be compatible with existing AI-driven aviation platforms, making integration cost-effective and technically feasible. Moreover, the scalability of the approach ensures its applicability beyond aviation, with potential use cases in railway maintenance, maritime safety, and other safety-critical industries.
By addressing data privacy, imbalance handling, scalability, and real-time fault detection, the model provides a robust, collaborative, and adaptive solution for stakeholders across the aviation industry. These benefits make it a promising methodology for future advancements in AI-driven aviation risk assessment.

4.2. Risk Assessment Efficiency Improvement in FL-Based Aviation Safety Models

One of the key advantages of federated learning (FL) in aviation safety applications is its ability to improve the efficiency of risk assessment models by using distributed data sources while maintaining privacy. The integration of FL in this study enhances risk assessment efficiency in several ways.
Traditional centralized models often require extensive data collection and preprocessing before training, leading to delays in model deployment. The proposed FL framework enables real-time model updates across multiple aviation stakeholders, significantly reducing the time required to integrate new risk assessment insights.
The experimental results demonstrate that weighting mechanisms within FL aggregation improve the precision of risk assessment. By prioritizing updates from stakeholders with high-quality minority-class data (e.g., maintenance records from MROs with detailed failure logs), the global risk model is better tuned to detect high-risk scenarios.
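One way such a weighting could be realized is sketched below. The weighting rule (sample count scaled by a minority-fraction bonus controlled by `alpha`) is an assumption made for illustration, not the exact rule used in the experiments; the parameter vectors are hypothetical, while the sample counts follow Table 1 (normal plus fault samples per stakeholder).

```python
import numpy as np

def aggregate(client_params, n_samples, n_minority, alpha=1.0):
    """Weighted average of client parameter vectors.

    The weight of client k is proportional to
    n_samples[k] * (1 + alpha * minority_fraction[k]),
    so clients holding relatively more fault data pull the global model harder.
    alpha = 0 reduces to sample-count weighting (plain FedAvg).
    """
    n_samples = np.asarray(n_samples, dtype=float)
    minority_fraction = np.asarray(n_minority, dtype=float) / n_samples
    weights = n_samples * (1.0 + alpha * minority_fraction)
    weights /= weights.sum()
    stacked = np.stack(client_params)  # shape: (clients, parameters)
    return weights @ stacked, weights

# Hypothetical local parameter vectors for Airline A, Airline B and the MRO.
clients = [np.array([0.2, -0.1, 0.4]),
           np.array([0.3, 0.0, 0.5]),
           np.array([0.9, 0.4, -0.2])]
n_samples = [10000, 8000, 5000]
n_minority = [100, 120, 500]

_, fedavg_w = aggregate(clients, n_samples, n_minority, alpha=0.0)
global_params, boosted_w = aggregate(clients, n_samples, n_minority, alpha=10.0)

print(fedavg_w.round(3), boosted_w.round(3))
```

With `alpha = 0` the MRO contributes in proportion to its sample count only; a positive `alpha` raises its share because a larger fraction of its records are fault cases.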
The study incorporates cost-sensitive learning techniques that emphasize rare but critical risk events, such as component failures or abnormal flight conditions. This approach ensures that high-risk predictions are not overshadowed by the far more numerous normal-operation samples, leading to a more balanced and responsive risk assessment system.
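A common form of cost-sensitive learning is a class-weighted loss. The sketch below uses inverse-frequency class weights with a binary cross-entropy loss; this particular weighting scheme is an assumption for illustration, and the predictions are hypothetical. The class counts correspond to Airline A in Table 1 (9900 normal and 100 fault samples).

```python
import numpy as np

def balanced_class_weights(n_normal, n_fault):
    """Inverse-frequency weights w_c = N / (2 * n_c) for the two classes."""
    total = n_normal + n_fault
    return total / (2.0 * n_normal), total / (2.0 * n_fault)

def weighted_bce(y_true, p_fault, w_normal, w_fault, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight applied to each sample."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_fault, dtype=float), eps, 1.0 - eps)
    per_sample = -(w_fault * y_true * np.log(p)
                   + w_normal * (1.0 - y_true) * np.log(1.0 - p))
    return float(per_sample.mean())

w_normal, w_fault = balanced_class_weights(9900, 100)

# Hypothetical batch: three normal samples and one fault that the model under-predicts.
y_true = [0, 0, 0, 1]
p_fault = [0.1, 0.1, 0.1, 0.3]

plain = weighted_bce(y_true, p_fault, 1.0, 1.0)
weighted = weighted_bce(y_true, p_fault, w_normal, w_fault)
print(w_normal, w_fault, plain, weighted)
```

The weighted loss penalizes the under-predicted fault far more heavily than the unweighted one, which is the mechanism that keeps rare events from being dominated by normal-operation samples during training.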
The FL-based risk models improve decision-making efficiency by aggregating risk scores across distributed stakeholders. This decentralized risk assessment approach enables faster and more accurate identification of potential failure points, allowing aviation operators to implement preventive measures before critical faults occur.
The experimental results indicate higher recall and lower false positive rates when using federated aggregation compared with standalone risk assessment models. Airline A’s risk assessment model, for example, demonstrated a 12% improvement in fault detection recall, leading to a lower likelihood of missed critical incidents. Additionally, MRO stakeholders saw a 15% reduction in false alarms, which helps optimize maintenance scheduling without unnecessary downtime.
The federated approach ensures that risk models trained on data from one airline or maintenance provider generalize well across different operational environments. By aggregating risk assessment updates without direct data sharing, the FL framework allows for efficient knowledge transfer between diverse aviation stakeholders, enhancing overall system resilience.

4.3. Challenges and Limitations of the Study

Despite the promising results of the FL framework for addressing imbalanced aviation datasets, this study faces several challenges and limitations that must be acknowledged to provide a comprehensive understanding of the findings.
The synthetic datasets used in the experiment aim to replicate the diverse aviation ecosystem, but they may not fully capture the complexity and nuances of real-world data. While the data include various features and imbalances typical of aviation operations, the absence of actual operational data may limit the model’s ability to generalize across different contexts and unforeseen edge cases.
The framework was validated using a limited number of simulated stakeholders. In real-world applications, the aviation ecosystem involves numerous entities with varying levels of data availability, computational resources, and willingness to participate. This variability could introduce challenges in achieving efficient model training and equitable contributions from all participants.
The FL process involves repeated local training and global aggregation across multiple rounds. This approach can incur significant computational and communication overheads, particularly when stakeholders operate in resource-constrained environments or rely on unstable network connections. These practical constraints could affect the timeliness and feasibility of deploying the framework in operational scenarios.
The metrics used in this study, such as precision, recall, and F1-score, are well-suited for evaluating imbalanced datasets. However, these metrics may not capture the full scope of operational risks and decision-making needs in aviation. Additionally, while pilot studies involving real-world stakeholders were conducted, broader validation with actual operational datasets across diverse contexts is necessary to ensure the framework’s scalability and applicability.
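For reference, the metrics named above can be computed from confusion-matrix counts as in the following sketch. The counts are hypothetical and chosen only to show why accuracy alone is misleading on imbalanced data: a model that never predicts a fault still reaches 99% accuracy when 1% of samples are faults.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1 and false-positive rate from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "fpr": fpr}

# Hypothetical detector on 10,000 samples containing 100 faults.
detector = classification_metrics(tp=85, fp=40, fn=15, tn=9860)

# Trivial model that labels every sample as normal.
always_normal = classification_metrics(tp=0, fp=0, fn=100, tn=9900)

print(detector)
print(always_normal)
```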
By addressing these challenges and limitations, future research can enhance the robustness, scalability, and practicality of federated learning frameworks for imbalanced aviation data, ultimately contributing to improved safety and operational efficiency.

4.4. Future Research Directions

Building upon the findings and addressing the limitations identified in this study, several directions for future research can be explored to enhance the efficacy and applicability of FL frameworks for imbalanced aviation datasets.
Future research should focus on integrating real-world aviation data into the evaluation of federated learning frameworks. This includes establishing partnerships with stakeholders to access operational datasets while ensuring privacy and compliance with regulations. Developing synthetic data generation methods that more accurately reflect the intricacies of aviation systems, including rare faults and complex interdependencies, will further improve model robustness and generalizability.
To improve the detection of rare events, future studies should explore novel imbalance-handling techniques, such as hybrid sampling strategies combining synthetic data generation with transfer learning from domains with similar imbalance characteristics. Developing dynamic rebalancing mechanisms that adapt to class distribution shifts over time could further enhance model effectiveness in detecting critical anomalies.
As the aviation industry continues to evolve, future research should focus on adaptive FL frameworks that can incorporate new data sources, operational conditions, and technological advancements. This includes developing continuous learning systems capable of updating models without requiring complete retraining.
By addressing these research directions, the field can advance toward developing robust, scalable, and practical FL solutions tailored to the unique challenges of aviation.

4.5. Challenges in Deploying Federated Learning in Real-World Aviation Systems

While the proposed FL framework demonstrates significant improvements in fault detection, risk assessment, and predictive maintenance, its deployment in real-world aviation systems presents several challenges that must be addressed for practical implementation.
  1. Computational and Communication Overhead
Federated learning requires multiple stakeholders (e.g., airlines, MROs, and regulatory bodies) to train models locally and share aggregated updates rather than raw data. This approach introduces increased computational demands, as each participating node must perform local training on its dataset, which may include high-dimensional telemetry and sensor data. Additionally, federated learning involves frequent model update exchanges, resulting in significant communication overhead, particularly in environments with limited bandwidth or latency constraints (e.g., onboard aircraft-to-ground communication).
To mitigate these challenges, model compression techniques, such as quantization and sparsification, can be applied to reduce the size of model updates. Additionally, asynchronous federated learning can be explored to allow stakeholders with varying computational resources to contribute at different rates, reducing bottlenecks caused by slower nodes.
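The two compression techniques mentioned above can be sketched as follows: top-k sparsification keeps only the largest-magnitude entries of a model update, and uniform 8-bit quantization stores each remaining value as a single byte plus one shared scale factor. The function names and the random update are hypothetical; this is a minimal sketch rather than a production compression scheme.

```python
import numpy as np

def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of a 1-D update and zero the rest."""
    sparse = np.zeros_like(update)
    keep = np.argpartition(np.abs(update), -k)[-k:]
    sparse[keep] = update[keep]
    return sparse

def quantize_int8(update):
    """Uniform symmetric 8-bit quantization; returns int8 codes and the shared scale."""
    max_abs = float(np.max(np.abs(update)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = np.round(update / scale).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    """Map int8 codes back to floating-point values."""
    return codes.astype(np.float64) * scale

rng = np.random.default_rng(0)
update = rng.normal(size=1000)           # hypothetical local model update

sparse = top_k_sparsify(update, k=100)   # transmit 10% of the entries
codes, scale = quantize_int8(sparse)     # one byte per transmitted value
restored = dequantize(codes, scale)

print(np.count_nonzero(sparse), codes.dtype, np.max(np.abs(restored - sparse)))
```

The reconstruction error per entry is bounded by half the quantization step, so the trade-off between bandwidth and fidelity can be tuned through `k` and the bit width.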
  2. Data Heterogeneity and Non-Independent and Identically Distributed (Non-IID) Data
Unlike centralized learning, where data are pooled into a single training set, federated learning in aviation must handle highly heterogeneous data from different sources. Aircraft operate under diverse environmental conditions, maintenance schedules, and operational procedures, so the data held by different stakeholders are not independent and identically distributed (non-IID). This heterogeneity poses challenges for model convergence and generalization, as a model trained on one airline’s fleet may not perform optimally on another.
To address this, personalized federated learning techniques, such as multi-task learning and meta-learning, can be integrated to adapt models based on each stakeholder’s specific data characteristics while preserving the benefits of collaborative training.
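The simplest form of personalization is local fine-tuning: each stakeholder starts from the global model and runs a few gradient steps on its own data. The sketch below illustrates this baseline for a linear model; it is deliberately simpler than the multi-task and meta-learning techniques named above, and all data, weights, and hyperparameters are hypothetical.

```python
import numpy as np

def mse(w, x, y):
    """Mean squared error of the linear model x @ w."""
    return float(np.mean((x @ w - y) ** 2))

def fine_tune(global_w, x_local, y_local, lr=0.05, steps=20):
    """Personalize by running a few gradient-descent steps from the global weights."""
    w = global_w.copy()
    n = x_local.shape[0]
    for _ in range(steps):
        grad = (2.0 / n) * x_local.T @ (x_local @ w - y_local)
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
x_local = rng.normal(size=(50, 3))
local_truth = np.array([1.0, -2.0, 0.5])   # hypothetical stakeholder-specific relationship
y_local = x_local @ local_truth + 0.1 * rng.normal(size=50)

global_w = np.array([0.5, -1.0, 0.0])      # hypothetical weights from federated training
personal_w = fine_tune(global_w, x_local, y_local)

print(mse(global_w, x_local, y_local), mse(personal_w, x_local, y_local))
```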
  3. Security and Privacy Concerns
Although FL enhances data privacy by ensuring that raw data remains decentralized, it is still vulnerable to privacy leakage through model updates. Malicious actors could potentially infer sensitive information from shared model parameters, posing risks to airline operational security and regulatory compliance.
To mitigate these concerns, advanced differential privacy mechanisms and secure multi-party computation can be incorporated into FL architectures. Additionally, federated adversarial training can help improve model robustness against adversarial attacks and data poisoning threats.
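A typical building block of differentially private federated learning is to clip each client update to a maximum L2 norm and add Gaussian noise before sharing it. The sketch below shows only this clip-and-noise step; the function name and parameter values are hypothetical, and the resulting privacy guarantee depends on privacy accounting across rounds, which is not shown.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Clip the update to L2 norm <= clip_norm, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm.
    """
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

rng = np.random.default_rng(0)
update = np.array([3.0, 4.0])  # hypothetical client update with L2 norm 5

clipped_only = privatize_update(update, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
noisy = privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=rng)

print(clipped_only, noisy)
```

Clipping bounds the influence any single stakeholder can have on the aggregate, and the noise scale is tied to that bound; larger noise strengthens privacy at the cost of model accuracy.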
  4. Integration with Existing Aviation Infrastructure
Aviation maintenance and safety monitoring systems rely on well-established protocols and legacy software platforms. Integrating a federated learning framework into existing AHMS, MRO platforms, and regulatory reporting systems requires careful alignment with industry standards.
Pilot implementations and phased adoption strategies should be explored, starting with offline model validation before deploying real-time federated learning solutions in operational aircraft environments.
While federated learning presents a promising approach for privacy-preserving collaborative AI in aviation, real-world deployment requires addressing key challenges, including computational overhead, data heterogeneity, security concerns, and system integration.

5. Conclusions

This paper presents a federated learning (FL) framework to address the challenges of imbalanced aviation data, specifically for fault detection, predictive maintenance, and risk assessment. The proposed approach integrates advanced techniques such as synthetic data generation, cost-sensitive learning, and weighted aggregation to enhance minority-class detection while preserving stakeholder data privacy. The experimental results demonstrate that the framework significantly improves the accuracy of rare event detection, enhances risk assessment efficiency, and reduces false positive and false negative rates in aviation safety applications. By enabling collaborative learning across distributed aviation stakeholders, the FL-based approach ensures scalability, privacy preservation, and improved decision-making in safety-critical environments. These contributions provide a robust foundation for integrating federated intelligence into real-world aviation operations, advancing both predictive analytics and proactive maintenance strategies.
Future work will focus on expanding the validation of the proposed framework using real-world operational datasets from aviation stakeholders. Further research will explore dynamic imbalance-handling techniques, such as adaptive sampling strategies and transfer learning, to enhance model performance over time. Additionally, the integration of explainable AI (XAI) techniques will be investigated to improve the interpretability of federated risk assessment models for aviation safety experts. Another direction for future studies is the development of adaptive federated learning models capable of evolving with changing operational conditions and emerging risk patterns. Addressing these areas will further refine the framework’s applicability and strengthen its impact on aviation safety and maintenance optimization.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Lee, H.; Madar, S.; Sairam, S.; Puranik, T.G.; Payan, A.P.; Kirby, M.; Pinon, O.J.; Mavris, D.N. Critical Parameter Identification for Safety Events in Commercial Aviation Using Machine Learning. Aerospace 2020, 7, 73. [Google Scholar] [CrossRef]
  2. Kabashkin, I.; Perekrestov, V. Ecosystem of Aviation Maintenance: Transition from Aircraft Health Monitoring to Health Management Based on IoT and AI Synergy. Appl. Sci. 2024, 14, 4394. [Google Scholar] [CrossRef]
  3. Yang, M.; Wang, X.; Zhu, H.; Wang, H.; Qian, H. Federated Learning with Class Imbalance Reduction. In Proceedings of the 2021 29th European Signal Processing Conference (EUSIPCO), Dublin, Ireland, 23–27 August 2021; pp. 2174–2178. [Google Scholar] [CrossRef]
  4. Wang, D.; Zhang, N.; Tao, M. Clustered federated learning with weighted model aggregation for imbalanced data. China Commun. 2022, 19, 41–56. [Google Scholar] [CrossRef]
  5. Hou, Y.; Li, H.; Guo, Z.; Wu, W.; Liu, R.; You, L. FedIBD: A federated learning framework in asynchronous mode for imbalanced data. Appl. Intell. 2025, 55, 122. [Google Scholar] [CrossRef]
  6. Peng, H.; Wu, T.; Shi, Z.; Li, X. FedEF: Federated Learning for Heterogeneous and Class Imbalance Data. In Proceedings of the 2023 IEEE Symposium on Computers and Communications (ISCC), Gammarth, Tunisia, 26–29 June 2023; pp. 619–624. [Google Scholar] [CrossRef]
  7. Shuai, X.; Shen, Y.; Jiang, S.; Zhao, Z.; Yan, Z.; Xing, G. BalanceFL: Addressing Class Imbalance in Long-Tail Federated Learning. In Proceedings of the 2022 21st ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), Milano, Italy, 4–6 May 2022; pp. 271–284. [Google Scholar] [CrossRef]
  8. Zhu, J.; Zheng, H.; Xu, W.; Wang, H.; He, Z.; Liu, Y.; Wang, S.; Sun, Q. Harmonizing Global and Local Class Imbalance for Federated Learning. IEEE Trans. Mob. Comput. 2025, 24, 1120–1131. [Google Scholar] [CrossRef]
  9. Wang, D.; Zhang, N.; Tao, M. Adaptive Clustering-Based Model Aggregation for Federated Learning with Imbalanced Data. In Proceedings of the 2021 IEEE 22nd International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Lucca, Italy, 13–16 September 2021; pp. 591–595. [Google Scholar] [CrossRef]
  10. Dust, L.J.; Murcia, M.L.; Mäkilä, A.; Nordin, P.; Xiong, N.; Herrera, F. Federated Fuzzy Learning with Imbalanced Data. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Pasadena, CA, USA, 13–16 December 2021; pp. 1130–1137. [Google Scholar] [CrossRef]
  11. Mrad, L.; Samara, A.A.; Abdellatif, A.; Al-Abbasi, A.; Hamila, R.; Erbad, A. Federated Learning for UAV Swarms Under Class Imbalance and Power Consumption Constraints. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
  12. Duan, M.; Liu, D.; Chen, X.; Liu, R.; Tan, Y.; Liang, L. Self-Balancing Federated Learning with Global Imbalanced Data in Mobile Systems. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 59–71. [Google Scholar] [CrossRef]
  13. Ran, X.; Ge, L.; Zhong, L. Dynamic Margin for Federated Learning with Imbalanced Data. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–8. [Google Scholar] [CrossRef]
  14. Xie, Y.; Liang, H.; Wang, X.; Li, J.; Cheng, Z.; Huang, S.; Liu, F.; Guo, L. CR-IFSSL: Imbalanced Federated Semi-Supervised Learning with Class Rebalancing. In Intelligence Computation and Applications. ISICA 2023. Communications in Computer and Information Science; Li, K., Liu, Y., Eds.; Springer: Singapore, 2024; Volume 2146. [Google Scholar] [CrossRef]
  15. Sittijuk, P.; Tamee, K. Performance Measurement of Federated Learning on Imbalanced Data. In Proceedings of the 2021 18th International Joint Conference on Computer Science and Software Engineering (JCSSE), Lampang, Thailand, 15–17 July 2021; pp. 1–6. [Google Scholar] [CrossRef]
  16. Jin, B.; Huang, D.; Chen, N.; He, J.; Xu, S.; Zhang, G. Federated Learning with Class-Imbalanced Heterogeneous. In Proceedings of the 2023 IEEE 14th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), Beijing, China, 15–17 December 2023; pp. 1–6. [Google Scholar] [CrossRef]
  17. Li, Q.; Sun, Y.; Gao, K.; Xi, N.; Zhou, X.; Wang, M.; Fan, K. LFL-COBC: Lightweight Federated Learning on Blockchain-Based Device Contribution Allocation. Electronics 2024, 13, 4395. [Google Scholar] [CrossRef]
  18. Che, L.; Wang, J.; Zhou, Y.; Ma, F. Multimodal Federated Learning: A Survey. Sensors 2023, 23, 6986. [Google Scholar] [CrossRef]
  19. Alsaif, K.M.; Albeshri, A.A.; Khemakhem, M.A.; Eassa, F.E. Multimodal Large Language Model-Based Fault Detection and Diagnosis in Context of Industry 4.0. Electronics 2024, 13, 4912. [Google Scholar] [CrossRef]
  20. Yang, T.; Lu, Y.; Deng, H.; Chen, J.; Tang, X. Acquisition and Processing of UAV Fault Data Based on Timeline Modeling Method. Appl. Sci. 2023, 13, 4301. [Google Scholar] [CrossRef]
  21. Adamopoulou, E.; Daskalakis, E. Applications and Technologies of Big Data in the Aerospace Domain. Electronics 2023, 12, 2225. [Google Scholar] [CrossRef]
  22. Ogundokun, R.O.; Misra, S.; Maskeliunas, R.; Damasevicius, R. A Review on Federated Learning and Machine Learning Approaches: Categorization, Application Areas, and Blockchain Technology. Information 2022, 13, 263. [Google Scholar] [CrossRef]
  23. Li, W.; Yang, W.; Jin, G.; Chen, J.; Li, J.; Huang, R.; Chen, Z. Clustering Federated Learning for Bearing Fault Diagnosis in Aerospace Applications with a Self-Attention Mechanism. Aerospace 2022, 9, 516. [Google Scholar] [CrossRef]
  24. Berghout, T.; Benbouzid, M.; Bentrcia, T.; Lim, W.H.; Amirat, Y. Federated Learning for Condition Monitoring of Industrial Processes: A Review on Fault Diagnosis Methods, Challenges, and Prospects. Electronics 2023, 12, 158. [Google Scholar] [CrossRef]
  25. Khan, S.; Gaba, G.S.; Gurtov, A. A Federated Learning Based Security for Controller-Pilot Data Link Communication. In Proceedings of the International Council of Aeronautical Sciences (ICAS), Stockholm, Sweden, 5–9 September 2022; Available online: https://www.icas.org/ICAS_ARCHIVE/ICAS2022/data/papers/ICAS2022_0704_paper.pdf (accessed on 5 September 2024).
  26. Llasag Rosero, R.H.; Silva, C.; Ribeiro, B. Remaining Useful Life Estimation in Aircraft Components with Federated Learning. PHM Soc. Eur. Conf. 2020, 5, 9. [Google Scholar] [CrossRef]
  27. Qu, Y.; Dai, H.; Zhuang, Y.; Chen, J.; Dong, C.; Wu, F.; Guo, S. Decentralized Federated Learning for UAV Networks: Architecture, Challenges, and Opportunities. IEEE Netw. 2021, 35, 156–162. [Google Scholar] [CrossRef]
  28. Doğru, A.; Bouarfa, S.; Arizar, R.; Aydoğan, R. Using Convolutional Neural Networks to Automate Aircraft Maintenance Visual Inspection. Aerospace 2020, 7, 171. [Google Scholar] [CrossRef]
  29. Abdelghany, E.S.; Farghaly, M.B.; Almalki, M.M.; Sarhan, H.H.; Essa, M.E.-S.M. Machine Learning and IoT Trends for Intelligent Prediction of Aircraft Wing Anti-Icing System Temperature. Aerospace 2023, 10, 676. [Google Scholar] [CrossRef]
  30. Gao, Z.; Mavris, D.N. Statistics and Machine Learning in Aviation Environmental Impact Analysis: A Survey of Recent Progress. Aerospace 2022, 9, 750. [Google Scholar] [CrossRef]
  31. Brandoli, B.; de Geus, A.R.; Souza, J.R.; Spadon, G.; Soares, A.; Rodrigues, J.F., Jr.; Komorowski, J.; Matwin, S. Aircraft Fuselage Corrosion Detection Using Artificial Intelligence. Sensors 2021, 21, 4026. [Google Scholar] [CrossRef]
  32. Yang, R.; Gao, Y.; Wang, H.; Ni, X. Fuzzy Neural Network PID Control Used in Individual Blade Control. Aerospace 2023, 10, 623. [Google Scholar] [CrossRef]
  33. Wang, Z.; Zhao, Y. Data-Driven Exhaust Gas Temperature Baseline Predictions for Aeroengine Based on Machine Learning Algorithms. Aerospace 2023, 10, 17. [Google Scholar] [CrossRef]
  34. Chen, J.; Qi, G.; Wang, K. Synergizing Machine Learning and the Aviation Sector in Lithium-Ion Battery Applications: A Review. Energies 2023, 16, 6318. [Google Scholar] [CrossRef]
  35. Baumann, M.; Koch, C.; Staudacher, S. Application of Neural Networks and Transfer Learning to Turbomachinery Heat Transfer. Aerospace 2022, 9, 49. [Google Scholar] [CrossRef]
  36. Quadros, J.D.; Khan, S.A.; Aabid, A.; Alam, M.S.; Baig, M. Machine Learning Applications in Modelling and Analysis of Base Pressure in Suddenly Expanded Flows. Aerospace 2021, 8, 318. [Google Scholar] [CrossRef]
  37. Kabashkin, I. Integration of Foundation Models and Federated Learning in AIoT-Based Aircraft Health Monitoring Systems. Mathematics 2024, 12, 3428. [Google Scholar] [CrossRef]
  38. Kabashkin, I.; Perekrestov, V. Concept of Aviation Technical Support as a Service. Transp. Telecommun. 2023, 24, 471–482. [Google Scholar] [CrossRef]
  39. Orozco-Arias, S.; Piña, J.S.; Tabares-Soto, R.; Castillo-Ossa, L.F.; Guyot, R.; Isaza, G. Measuring Performance Metrics of Machine Learning Algorithms for Detecting and Classifying Transposable Elements. Processes 2020, 8, 638. [Google Scholar] [CrossRef]
Figure 1. Aviation ecosystem with unbalanced data sources.
Figure 2. Imbalanced data generated by AHMS.
Figure 3. Imbalanced data generated by stakeholders.
Figure 4. Data integration in ATSaaS platform.
Figure 5. Framework for addressing imbalanced data in aviation using FL.
Figure 6. Federated learning workflow.
Figure 7. Weighted loss values for each stakeholder.
Figure 8. Normalized loss for normal and fault classes.
Figure 9. Federated learning benchmark analysis.
Figure 10. Average RUL error.
Figure 11. Comparison of predictive performance across different models.
Figure 12. Risk scores: (a) Predicted fault probability; (b) Predicted RUL; (c) Risk scores.
Figure 13. Feature contributions analysis.
Table 1. Stakeholder performance metrics in FL-based fault detection.
| Stakeholder | Weighted Loss | Normal Samples | Fault Samples | Normal Probability | Fault Probability |
|---|---|---|---|---|---|
| Airline A | 0.0842 | 9900 | 100 | 98.0% | 85.0% |
| Airline B | 0.0956 | 7880 | 120 | 97.0% | 88.0% |
| MRO | 0.1245 | 4500 | 500 | 95.0% | 92.0% |
Table 2. Performance comparison of the proposed FL framework with baseline models.
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) | False-Positive Rate (%) |
|---|---|---|---|---|---|
| Centralized Learning Model | 85.2 | 81.5 | 78.6 | 79.9 | 18.4 |
| Logistic Regression | 82.7 | 78.1 | 74.3 | 76.1 | 21.1 |
| Support Vector Machine | 84.0 | 80.0 | 75.8 | 77.8 | 19.3 |
| Random Forest | 86.5 | 83.0 | 80.2 | 81.5 | 16.8 |
| Proposed FL Framework | 91.8 | 88.7 | 84.5 | 86.6 | 12.3 |
Kabashkin, I. Framework for Addressing Imbalanced Data in Aviation with Federated Learning. Information 2025, 16, 147. https://doi.org/10.3390/info16020147
