Intelligent Fault Detection and Self-Healing Mechanisms in Wireless Sensor Networks Using Machine Learning and Flying Fox Optimization

Alauthman, Almamoon; Al-Hyari, Abeer

doi:10.3390/computers14060233

by Almamoon Alauthman * and Abeer Al-Hyari

Electrical Engineering Department, Al-Balqa Applied University, As Salt 19117, Jordan

* Author to whom correspondence should be addressed.

Computers 2025, 14(6), 233; https://doi.org/10.3390/computers14060233

Submission received: 11 May 2025 / Revised: 7 June 2025 / Accepted: 10 June 2025 / Published: 13 June 2025

Abstract

Wireless sensor networks (WSNs) play a critical role in many applications that require network reliability, such as environmental monitoring, healthcare, and industrial automation. Fault detection and self-healing are two effective mechanisms for addressing the challenges of node failure, communication disruption, and energy constraints faced by WSNs. This paper presents an intelligent framework that integrates a Light Gradient Boosting Machine (LGBM) for fault detection with a Flying Fox Optimization Algorithm (FFOA) for dynamic self-healing. The LGBM model provides accurate and scalable fault identification, whereas FFOA optimizes the recovery strategies to minimize downtime and maximize network resilience. The system was evaluated extensively on a large dataset and compared with state-of-the-art heuristic-based traditional methods and machine learning models. The results showed that the proposed framework could achieve 94.6% fault detection accuracy, a minimum recovery time of 120 milliseconds, and network resilience of 98.5%. These results attest to the efficiency of the proposed approach in ensuring robust and adaptive WSN operations and enhanced reliability within dynamic and resource-constrained environments.

Keywords: Wireless Sensor Networks (WSNs); fault detection; self-healing; Light Gradient Boosting Machine (LGBM); Flying Fox Optimization Algorithm (FFOA); network resilience

1. Introduction

Over the years, wireless sensor networks (WSNs) have become enablers of applications in environmental monitoring, industrial automation, smart cities, and healthcare systems. Such a network consists of spatially distributed sensor nodes that collect data from an environment, process it, and transmit it to a central location under various resource constraints [1,2]. In WSNs, reliability is crucial because network failure has serious consequences, including a loss of safety in industrial settings and a loss of data in environmental monitoring [3,4].

Fault detection and self-healing are two indispensable components in maintaining the operational integrity of WSNs. Sensor nodes may fail due to hardware malfunction, energy depletion, communication problems, or environmental interference [5,6]. These faults become more frequent and complex in dynamic or harsh environments, such as industrial plants or remote ecological systems, which threatens network reliability [7,8]. Traditional fault management in WSNs, whether based on hard-coded rules or on human intervention, is typically cumbersome for most current deployments because of their scale and variability. Such approaches do not adapt to changing conditions, which increases downtime and drastically reduces network performance [9].

These limitations strongly indicate the need for intelligent fault management systems that can proactively detect faults and restore normal operations [10]. Machine learning offers powerful tools for analyzing complex patterns in network behavior and for high-accuracy fault prediction. At the same time, optimization algorithms allow adaptive and efficient network reconfiguration to mitigate faults [11,12]. Integrating these techniques into a single framework offers an opportunity to significantly improve the reliability and resilience of WSNs under varied operational conditions [13].

The present study proposes an intelligent framework for fault detection and self-healing in WSNs that combines machine learning with optimization to address these challenges [14]. For fault detection, it applies the Light Gradient Boosting Machine (LGBM), a state-of-the-art algorithm known for its accuracy, scalability, and efficiency on large datasets [15,16]. The framework also integrates a nature-inspired metaheuristic optimization technique, the Flying Fox Optimization Algorithm (FFOA), to dynamically optimize self-healing actions such as traffic rerouting or replacing faulty nodes [17,18].

The performance of the integrated system is assessed using a set of critical metrics: fault detection accuracy, recovery time, and network resilience. These metrics are evaluated on an extensive dataset, using a rigorous experimental protocol to determine the robustness and adaptability of the system. In accomplishing these objectives, this research aims to present a scalable and efficient solution for WSN fault management that overcomes some limitations of traditional approaches and contributes to improved network reliability and fault tolerance [19].

This research contributes to WSN reliability by presenting an integrated framework incorporating advanced machine learning and optimization techniques. The LGBM model used in this paper performs fault detection using its high-dimensional data processing capability and by capturing complex relationships between features. This ensures accurate and reliable fault predictions, outperforming traditional machine learning methods. This study also customizes the Flying Fox Optimization Algorithm to meet the needs of WSN self-healing, and the results demonstrate its competence in shortening recovery times while increasing network resilience under various fault conditions.

One of this study’s significant contributions is the creation of an exhaustive dataset representing all fault conditions of WSN. This dataset will not only allow rigorous testing of the proposed system but will also be a valuable resource for future studies related to fault detection and self-healing. The analytical framework of this study encompasses feature importance assessments with a sensitivity analysis that provides insight into factors affecting system performance and justifies design choices.

This research presents a robust fault management methodology in WSNs by integrating LGBM with FFOA into a system. The results pave the way for more reliable and resilient network operations, offering scalable adaptability to the growing challenges in modern WSN deployments.

2. Related Work

Fault detection algorithms are increasingly applied in WSNs, where they are needed to maintain network dependability. Padmasree and Chaithanya [20] discussed how deep learning techniques have been integrated with fault detection in single-hop and multi-hop WSNs and how detection accuracy over 90% has been achieved. Takale et al. [21] extended this survey to the different approaches to fault diagnosis, namely centralized, distributed, and hybrid, and underlined the need for energy-efficient and scalable solutions. Addressing the challenges in 5G-enabled WSNs, Reddy et al. [22] designed two novel recovery protocols for faulty nodes to address low maintenance efficiency and limited localization precision. Similarly, Sanjay et al. [23] proposed redundancy-based fault tolerance strategies, providing insights on scalability and higher detection accuracy to boost resiliency in WSNs. Key evaluation criteria include detection accuracy, response time, energy efficiency, and scalability. Redundancy-based methods, such as node and path redundancy, are explored as effective fault tolerance techniques [3].

Optimization algorithms have also become essential for network reconfiguration and self-healing problems. Zhang and Guo [24] presented the SCRO-GNN framework, which utilized graph neural networks to optimize the capacities in road networks and showed its adaptability in smart city applications. Gholizadeh and Musílek [25] applied deep reinforcement learning to distribution network reconfiguration. Mokhtari et al. [26] proposed a distributed ADMM approach to solving the power distribution network reconfiguration under radial topology constraints, focusing on scalability in various operational scenarios. Further refinements of optimization methodologies have been suggested, for example, by Jia et al. [27], who integrated low-carbon objectives, and Dehghany and Asghari [28], who applied metaheuristics in dynamic network environments.

Several studies integrate machine learning and metaheuristics to improve network reliability. Ramachandra and Surekha [29] addressed the intrusion detection problem using two machine learning models, which gave better precision and recall rates for WSNs. Martínez et al. [16] used machine learning methods for dynamic line rating prediction to improve the safety and efficiency of transmission lines. Another example is the work of Zhou et al. [30], in which an interpretable machine learning framework was adopted to improve predictive reliability within chemical engineering applications. Krithikaa [31] used a metaheuristic algorithm for computer communications that mimics the memory and exploration mechanisms of human problem-solving. Bencheikh [32] focused on the synergy between machine learning and metaheuristics and showed their potential to enhance the convergence of optimization procedures, making decision processes more effective.

Despite such progress, there are still several research gaps. Li et al. [33] remarked on the lack of detailed investigation into reliability assessment in high-renewable-energy-penetration scenarios. Brandeau et al. [34] suggested applying uncertainty-based methods when including epistemic uncertainties within the network reliability analysis. Zhukabayeva et al. [35] established the requirement for systematic methodologies to ensure WSN security against evolving cyber threats in challenging environments. Cheng and Petrides [36] also mentioned various barriers to obtaining prediction reliability in machine learning applications when relying on small samples. Singh et al. [37] have, in turn, highlighted a set of vulnerabilities in IoT-WSN integrations and encouraged appropriate security analyses to reduce potential risks.

Recent research in leading venues has further advanced fault-tolerant WSN design. Zhang et al. [38], for instance, presented a multi-layered framework based on machine learning to achieve dynamic fault localization in mesh-based WSNs and emphasized scalability in the context of mobility in IoT environments (IEEE TII, 2023). In a different IEEE Transactions study, Li and Yu [39] presented a combined deep learning model based on temporal dependencies to achieve real-time anomaly detection in large WSNs. For self-healing purposes, Bhatnagar et al. [40] (ACM SenSys, 2024) presented a decentralized reinforcement learning-based protocol for autonomous self-healing and showcased resilience in the context of intermittent links.

Furthermore, recent contributions have focused on localization, energy modeling, and routing enhancements for WSNs using metaheuristic approaches. For instance, Dong et al. [41,42,43,44] proposed improved DV-hop algorithms for Sybil attack resistance, energy consumption models, and optimal routing strategies, highlighting the relevance of biologically inspired and time-control-based techniques in WSN environments.

This review emphasizes the vital importance of an integrative approach that combines advanced fault detection techniques, optimization algorithms, and machine learning methodologies to enhance the reliability and resilience of WSNs. In this regard, the paper presents a unified framework that could substantially improve the performance of WSNs across a wide range of dynamic operational contexts.

3. Methodology

3.1. Dataset Description

The dataset used in this study was synthetically generated using the NS-3 simulation platform, which allowed precise control over node behavior, fault injection, and environmental conditions. NS-3 is widely used for simulating wireless communication systems and offers repeatability and scalability that are often difficult to achieve in real-world datasets. While synthetic, the dataset parameters were modeled to closely resemble real-world WSN deployments, incorporating variability in battery consumption, node density, environmental interference, and communication faults.

Potential biases arising from simulation include limited unpredictability in fault propagation and idealized communication assumptions. To mitigate these, we introduced stochastic noise and randomized fault types, ensuring that the data distribution reflects realistic and heterogeneous network conditions. Nevertheless, future work will incorporate hybrid datasets combining real-world sensor logs and synthetic augmentation for increased generalizability.

The dataset used for this research was designed carefully to capture the details involved in studying WSNs and their fault detection and self-healing mechanisms. It consists of 98,462 records, each characterizing the state and performance of a node and the network. Each record is described by 30 features. These include the battery level, signal strength, packet loss rate, energy consumed, fault occurrence flag at a node, average packet latency, packet delivery ratio, number of cluster heads, and network lifetime, in addition to temporal and environmental parameters such as ambient temperature, humidity, and interference level. Additionally, self-healing-specific attributes such as recovery actions, action effectiveness, and reconfiguration time describe post-fault recovery scenarios.

The selected features were chosen because they may contribute to diagnosing and resolving faults in WSNs. For example, the battery level is important because most node failures are due to energy depletion. The packet loss rate and signal strength indicate the quality of communication, which may become poor due to environmental effects or hardware malfunction. The fault occurrence flag is a binary signal of whether a fault has occurred and is the ground truth for supervised models to learn from. Together, these features enable broad fault analysis and facilitate the efficient design of self-healing mechanisms.

Preprocessing steps were carried out to make the dataset usable and clean. The missing values, which can arise due to faulty sensors or incomplete data collection, are handled using imputation techniques. Numerical features are imputed by the mean or median of the respective feature, whereas categorical features are imputed using the mode. Normalization scales numerical features between 0 and 1 to prepare the data for machine learning models. This is an essential step in ensuring that all features contribute equally to the model’s learning process without bias to features with more extensive numerical ranges.
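The imputation and scaling steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' pipeline; the function names and the convention that missing values are encoded as `None` are assumptions made for the example.

```python
from statistics import mean, median, mode

def impute_numeric(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

def impute_categorical(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For instance, a battery-level column `[80.0, None, 20.0, 50.0]` would first be filled with its median and then rescaled so that the smallest reading maps to 0 and the largest to 1.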

Feature selection was performed to further optimize model performance and reduce computational overhead. Statistical correlation analysis was used to identify highly correlated or redundant features. Features contributing less to fault detection and self-healing were screened out according to exploratory data analysis and domain expertise. For instance, environmental features whose variations had a negligible impact on network conditions were removed. This preprocessing pipeline yields clean, high-quality data suitable for training and testing the fault detection and self-healing framework.
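One simple way to carry out a correlation-based redundancy check is a greedy filter on pairwise Pearson correlation. The sketch below is illustrative only: the 0.95 threshold, the greedy keep-first rule, and the function names are assumptions, since the text does not specify how the correlation analysis was applied.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / (sx * sy)

def drop_redundant(columns, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

With this rule the outcome depends on column order, so in practice the features judged most important by domain knowledge would be listed first.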

The selection of input features for training the LGBM model was driven by both domain knowledge and statistical relevance. Specifically, features such as battery level, signal strength, and packet loss rate are widely recognized as indicators of node health and communication quality in WSNs. For instance, a sudden drop in battery level may indicate energy depletion, which is one of the leading causes of node failure. High packet loss and low signal strength are commonly associated with link disruption or environmental interference. These features were validated using exploratory data analysis and feature correlation tests to ensure they contributed meaningful variance toward fault classification. Moreover, their practical relevance has been emphasized in prior research on WSN reliability and fault detection.

3.2. Fault Detection Using LGBM

For WSN fault detection, it is crucial to choose a model that achieves high accuracy and handles large datasets efficiently enough for real-time applications. Thus, the Light Gradient Boosting Machine (LGBM), a well-known gradient boosting framework optimized for efficiency, speed, and performance, is applied. LGBM is particularly suitable for datasets with a large number of features and observations, as in this study, owing to its histogram-based decision tree learning and leaf-wise growth strategy. Its intrinsic handling of missing values and support for parallel computation make it well suited to fault detection tasks in WSNs.

The dataset was preprocessed before model training to make it consistent, eliminate noise, and make the features suitable for supervised learning. Missing values were replaced with mean, median, or mode values depending on the type of feature. Numerical features were scaled to the range [0, 1] to remove feature magnitude-based bias. Redundant features were removed, and exploratory data analysis and feature filtering were used to retain high-information-gain or domain-relevant features. Stratified sampling was used to maintain fault vs. no-fault ratios in both training and testing splits to handle the issue of class imbalance.

3.2.1. Model Architecture and Loss Function

LGBM is based on gradient boosting, where an ensemble of weak learners (decision trees) is used to minimize a loss function iteratively. The general objective of gradient boosting is to optimize the loss function $L(y, \hat{y})$, which measures the difference between the actual values $y$ and the predicted values $\hat{y}$. For binary classification, as in fault detection, the binary cross-entropy loss is used [9]:

L(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} [y_{i} \log(\hat{y}_{i}) + (1 - y_{i}) \log(1 - \hat{y}_{i})]

(1)

where $N$ is the total number of samples, $y_{i}$ is the actual label, and $\hat{y}_{i}$ represents the predicted probability.
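Equation (1) can be checked numerically with a short function. The sketch below is a plain-Python illustration; the clipping constant `eps`, which keeps the logarithm finite when a predicted probability is exactly 0 or 1, is an assumption added for numerical safety and is not part of the equation.

```python
from math import log

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy of Equation (1); eps clips probabilities away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += y * log(p) + (1 - y) * log(1.0 - p)
    return -total / len(y_true)
```

A model that outputs 0.5 for every node yields a loss of log 2, and the loss falls toward zero as the predicted probabilities approach the true labels.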

Each weak learner attempts to minimize this loss function by adjusting its predictions in the direction of the negative gradient of the loss with respect to the model predictions. The model grows trees leaf-wise, selecting the leaf with the maximum loss reduction at each step, allowing for better optimization than traditional level-wise growth.
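As a worked illustration of the negative-gradient step, assume each tree contributes to a raw score $z_{i}$ that is mapped to a probability through the logistic function. This parameterization is the usual one for binary gradient boosting but is an assumption here, since the text does not state it. Under that assumption, the per-sample gradient and curvature of the loss in Equation (1) take a simple form:

```latex
\hat{y}_{i} = \sigma(z_{i}) = \frac{1}{1 + e^{-z_{i}}}, \qquad
\ell_{i} = -\left[ y_{i} \log \hat{y}_{i} + (1 - y_{i}) \log (1 - \hat{y}_{i}) \right]

\frac{\partial \ell_{i}}{\partial z_{i}} = \hat{y}_{i} - y_{i}, \qquad
\frac{\partial^{2} \ell_{i}}{\partial z_{i}^{2}} = \hat{y}_{i} (1 - \hat{y}_{i})
```

The negative gradient is therefore $y_{i} - \hat{y}_{i}$. For example, a faulty node ($y_{i} = 1$) currently predicted with $\hat{y}_{i} = 0.2$ has a residual of 0.8, so the next tree is fitted to push its score upward.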

3.2.2. Training Strategy and Performance Evaluation

To train and validate the LGBM model, the dataset was divided into two subsets: 80% for training and 20% for testing. The training set was used to fit the model, while the testing set was reserved for evaluating its generalization performance. The split was performed randomly but ensured that the class distribution (i.e., fault vs. no-fault) remained consistent in both subsets to avoid bias.
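A class-preserving 80/20 split of this kind can be sketched as follows. This is an illustrative standard-library version, not the authors' code; the function name, the fixed seed, and the rounding rule for the per-class test size are assumptions.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```

Splitting each class separately guarantees that a rare fault class keeps the same share in the test set as in the training set, which a purely random split does not.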

The performance of the model was evaluated using the following metrics:

1. Accuracy measures the proportion of correct predictions over the total number of predictions [3]:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

(2)

2. Precision evaluates the proportion of correctly identified faults among all instances predicted as faults [3]:

Precision = \frac{TP}{TP + FP}

(3)

3. Recall (sensitivity) measures the ability to detect actual faults among all fault instances [3]:

Recall = \frac{TP}{TP + FN}

(4)

4. F1-score is the harmonic mean of precision and recall, providing a balanced measure [3]:

F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}

(5)

Here, TPs (true positives), TNs (true negatives), FPs (false positives), and FNs (false negatives) are computed based on the model’s predictions.
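Equations (2) to (5) translate directly into code. The sketch below assumes binary labels where 1 means "fault" and 0 means "no fault"; returning 0.0 when a denominator is zero is a convention chosen for the example, not something the text prescribes.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from binary labels (1 = fault)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```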

3.2.3. Hyperparameter Tuning and Optimization

Hyperparameter tuning used a grid search strategy to optimize the LGBM model. The key hyperparameters adjusted during this process include the following:

Number of leaves (num_leaves) determines the maximum number of leaves in each tree. A higher value allows for complex decision boundaries but may lead to overfitting.
Learning rate $(η)$ controls the step size at each iteration of gradient descent. Smaller values result in more robust convergence but require more iterations.
Maximum depth (max_depth) sets the maximum depth of trees to prevent overfitting.
Feature fraction (feature_fraction) specifies the proportion of features to consider when building each tree, promoting diversity and reducing overfitting.
Bagging fraction (bagging_fraction) is similar to feature fraction but applies to the training data: it specifies the proportion of samples used when building each tree.
Minimum data in leaf (min_data_in_leaf) specifies the minimum number of samples required in a leaf node.

The hyperparameters were selected based on cross-validation results, chosen such that the F1 score would be maximized, representing an optimal balance between precision and recall. The final LGBM model had strong generalization in fault detection, showing high accuracy and recall to assist the fault detection component of the proposed framework.
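The grid search over these hyperparameters can be sketched as an exhaustive loop. The parameter names follow the list above; the candidate values in `PARAM_GRID` are hypothetical, since the text does not report the grid, and `score_fn` is a placeholder for a routine that would train the model with the given parameters and return its cross-validated F1 score.

```python
from itertools import product

# Hypothetical candidate values; the actual grid is not given in the text.
PARAM_GRID = {
    "num_leaves": [31, 63],
    "learning_rate": [0.05, 0.1],
    "max_depth": [6, 10],
    "feature_fraction": [0.8, 1.0],
    "bagging_fraction": [0.8, 1.0],
    "min_data_in_leaf": [20, 50],
}

def iter_grid(grid):
    """Yield every combination of the grid as a parameter dictionary."""
    keys = list(grid)
    for combo in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))

def grid_search(grid, score_fn):
    """Return (best_params, best_score), where score_fn maps params to a score to maximize."""
    best_params, best_score = None, float("-inf")
    for params in iter_grid(grid):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Even this small grid contains 64 combinations, each of which requires a full cross-validation run, which is why grid search is usually limited to a handful of values per parameter.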

3.3. Self-Healing with Flying Fox Optimization Algorithm

The FFOA is a nature-inspired metaheuristic technique modeled on the foraging behavior of the flying fox, a type of fruit bat. It works effectively when multiple optimization objectives are subject to constraints, providing an effective balance between exploration and exploitation. In WSNs, FFOA has been tailored to optimize self-healing mechanisms, addressing challenges such as routing path reconfiguration and node replacement with minimal recovery time and maximum network resilience.

3.3.1. Overview of the Algorithm

Within WSN self-healing, the Flying Fox Optimization Algorithm (FFOA) is tuned to handle particular network-level issues, e.g., choosing the best routing paths when nodes fail or choosing adequate substitute nodes for faulty ones. Every candidate solution produced by the FFOA corresponds to a potential reconfiguration of the network. It is scored by a fitness function that balances restoration time, energy usage, and the recovered connectivity level. Through iterative improvements to these candidate solutions, the algorithm searches for the best self-healing operation for the observed fault scenario.

FFOA models the flight of a population of flying foxes in a multi-dimensional search space to seek an optimum solution. In this case, each flying fox represents a candidate solution, and its position in the search space maps to a particular configuration of the WSN, which is a routing path or arrangement of nodes. The algorithm iteratively updates the positions of the flying foxes based on their fitness to guide them toward regions with higher-quality solutions.

The mathematical formulation of the Flying Fox Optimization Algorithm (FFOA) draws inspiration from swarm intelligence, where each flying fox represents a candidate solution and its position in a multi-dimensional search space. The update rule of each agent (fox) is a function of both stochastic exploration and convergence toward the global best position.

The position update equation is defined as follows:

X_{i}^{t + 1} = X_{i}^{t} + α \cdot r_{1} \cdot (G^{t} - X_{i}^{t}) + β \cdot r_{2} \cdot (P_{i}^{t} - X_{i}^{t})

(6)

where

$X_{i}^{t}$ is the position of the i-th fox at iteration t;
$G^{t}$ is the global best solution found so far;
$P_{i}^{t}$ is the personal best of the i-th fox;
$α, β$ are learning factors balancing exploration and exploitation;
$r_{1}, r_{2} \in [0,1]$ are random vectors for stochastic behavior.
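A single application of Equation (6) can be sketched as below. Positions are treated as lists of real numbers, and $r_{1}$ and $r_{2}$ are drawn independently for each dimension; both choices are assumptions for the example, as is the function name. How a continuous position is decoded into a routing path is not covered here.

```python
import random

def update_position(x, personal_best, global_best, alpha, beta, rng):
    """One step of Equation (6), applied independently to each dimension."""
    new_x = []
    for xi, pi, gi in zip(x, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()  # fresh random factors in [0, 1)
        new_x.append(xi + alpha * r1 * (gi - xi) + beta * r2 * (pi - xi))
    return new_x
```

A fox that already sits on both its personal best and the global best does not move, because both difference terms vanish; otherwise it is pulled a random fraction of the way toward each.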

The fitness of each solution is evaluated using a multi-objective function:

f (X) = w_{1} \cdot T_{r}^{- 1} + w_{2} \cdot C_{r} + w_{3} \cdot E_{r}

(7)

where

$T_{r}$ is the recovery time (to be minimized);
$C_{r}$ is the connectivity ratio (to be maximized);
$E_{r}$ is the residual energy (to be maximized);
$w_{1}, w_{2}, w_{3}$ are application-dependent weight parameters.
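Equation (7) can be written as a small scoring function in which higher values are better. The sketch assumes that recovery time is expressed in seconds and that connectivity ratio and residual energy are already normalized to [0, 1]; the text does not fix these units, and the default weights are placeholders.

```python
def fitness(recovery_time, connectivity_ratio, residual_energy, weights=(1.0, 1.0, 1.0)):
    """Equation (7): w1 / T_r + w2 * C_r + w3 * E_r, with higher values being better."""
    if recovery_time <= 0:
        raise ValueError("recovery_time must be positive")
    w1, w2, w3 = weights
    return w1 / recovery_time + w2 * connectivity_ratio + w3 * residual_energy
```

Because the first term is the reciprocal of the recovery time, its magnitude depends strongly on the chosen time unit, so the weights would need to be tuned so that no single term dominates the sum.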

The algorithm starts with a randomly initialized population and updates the foxes’ positions iteratively. Early iterations focus on exploration (larger step sizes and randomness), while later iterations emphasize exploitation (convergence to best-known solutions).

Representation of the Optimization Problem

The optimization problem in WSN self-healing is represented as follows:

Routing Path Reconfiguration: Each fox’s position encodes a candidate routing path for data packets. The dimensionality corresponds to the number of nodes in the network.
Node Replacement Strategy: Foxes can represent potential replacements for faulty nodes or alternative connections to mitigate link failures.

3.3.2. Objective Function Design

The objective function evaluates the quality of a candidate solution based on multiple criteria:

f (X) = w_{1} \cdot T_{r} (X) + w_{2} \cdot (1 - C_{r} (X)) + w_{3} \cdot R_{n} (X)

(8)

where

$T_{r} (X)$ is the recovery time for the solution represented by $X$ (minimized);
$C_{r} (X)$ is the connectivity ratio, representing the proportion of nodes successfully reconnected (maximized);
$R_{n} (X)$ is the residual energy of the nodes in the network after reconfiguration (maximized);
$w_{1}, w_{2}, w_{3}$ represent the weights for each objective, determined based on the application’s priorities.

3.3.3. Implementation Details

Initialization of the Population

Each flying fox in the initial population is assigned a random position within the search space, representing a candidate routing configuration or recovery strategy. The initial positions are generated within the bounds of feasible solutions so that they do not violate the physical and logical constraints of the WSN.

Fitness Function Calculation

The objective function f(X) determines each fox’s fitness. Solutions that minimize recovery time while keeping connectivity and energy efficiency high are rewarded with higher fitness values. The fitness function includes penalties for invalid solutions, such as those resulting in disconnected nodes or high energy consumption.

Exploration and Exploitation Mechanisms

Exploration and exploitation are balanced through the adaptive parameters $α$ and $β$. Early iterations focus on exploration, allowing foxes to cover a broad search space. As the algorithm converges, exploitation becomes dominant, refining solutions near the global best position.

Exploration involves random perturbations in the foxes’ positions, ensuring diverse candidate solutions:

X_{i}^{t + 1} = X_{i}^{t} + γ \cdot U

(9)

where $U$ is a uniformly distributed random vector and $γ$ controls the step size. Exploitation leverages information about the global best solution, guiding foxes toward promising regions:

X_{i}^{t + 1} = X_{i}^{t} + δ \cdot (X_{best}^{t} - X_{i}^{t})

(10)

where $δ$ dynamically adjusts based on the iteration number, promoting convergence.

Equation (6) represents the general update rule combining both stochastic exploration and directed movement toward the global best solution. In contrast, Equation (10) is a simplified convergence-focused variant, used during the final phase of the search where exploitation is prioritized. The distinction lies in the phase of the optimization: Equation (6) is used during the early iterations to ensure diversity, while Equation (10) is applied later for local refinement.
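The random-perturbation move and the move toward the best position can be sketched as two small functions. Several details are assumptions made for illustration: the random vector $U$ is drawn uniformly from [-1, 1] in each dimension, and $δ$ follows a linear schedule between hypothetical bounds, since the text says only that it depends on the iteration number.

```python
import random

def explore(x, gamma, rng):
    """Random perturbation: add gamma * U, with U uniform in [-1, 1] per dimension."""
    return [xi + gamma * rng.uniform(-1.0, 1.0) for xi in x]

def exploit(x, x_best, delta):
    """Directed move: travel a fraction delta of the way toward the best position."""
    return [xi + delta * (bi - xi) for xi, bi in zip(x, x_best)]

def delta_schedule(t, max_iter, delta_min=0.1, delta_max=0.9):
    """Hypothetical linear schedule: delta grows from delta_min to delta_max."""
    return delta_min + (delta_max - delta_min) * t / max_iter
```

With `delta = 0.5`, a fox at `[0.0, 4.0]` and a best position of `[2.0, 0.0]` moves to the midpoint `[1.0, 2.0]`; as `delta` approaches 1 the fox jumps almost directly onto the best position.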

The Flying Fox Optimization Algorithm provides a robust framework for self-healing in WSNs. By tailoring the representation, objective function, and movement dynamics to the specific needs of fault recovery, the algorithm efficiently identifies optimal solutions that restore network functionality with minimal disruption.

3.4. System Workflow

The proposed fault detection and self-healing methods in WSNs integrate LGBM, used to detect faults, with the FFOA, used to optimize self-healing actions. Accordingly, the end-to-end workflow is designed to ensure real-time fault detection and efficient recovery with less disruption to the network using both machine learning and metaheuristic optimization. The major workflow stages are as follows:

3.4.1. Fault Detection

In the fault detection phase, the LGBM model classifies the state of every sensor node in the WSN as faulty or normal. The process starts with collecting data from the network, where sensor nodes send their state metrics, such as battery level, signal strength, packet loss rate, and other relevant operational features, at periodic intervals. The data is then preprocessed and normalized into the format the LGBM model expects.

The trained LGBM classifier processes the input data and predicts the likelihood of faults for each node. For each node $n$, the model outputs a probability score $P(fault \mid n)$, which is compared against a predefined threshold $τ$ to make a binary decision:

Fault(n) = \begin{cases} 1, & P(fault \mid n) \geq τ \\ 0, & P(fault \mid n) < τ \end{cases}

(11)

Nodes classified as faulty ($Fault(n) = 1$) are flagged for further action. The model continuously updates its predictions as new data streams in, enabling dynamic and real-time monitoring.
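The thresholding rule of Equation (11) amounts to a one-line filter. In this sketch the per-node probabilities are assumed to arrive as a dictionary keyed by node ID, and the default threshold of 0.5 is a placeholder, since the value of $τ$ is not stated in the text.

```python
def flag_faulty_nodes(fault_probabilities, tau=0.5):
    """Apply Equation (11): return the IDs of nodes whose P(fault | n) >= tau."""
    return [node for node, p in fault_probabilities.items() if p >= tau]
```

Lowering `tau` flags more nodes and so raises recall at the cost of precision, which is the trade-off the F1 score is meant to balance.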

3.4.2. Triggering Self-Healing Actions Optimized by FFOA

When faults are detected, the system triggers the self-healing module, which employs the Flying Fox Optimization Algorithm to determine the optimal recovery actions. The self-healing process involves the following steps:

Fault Characterization: The system categorizes detected faults (e.g., “Node Down,” “Link Failure”) and identifies their locations within the network. This information serves as the input to the optimization module.
Problem Representation: The optimization problem is based on the fault type. For example, a “Node Down” fault may require the selection of a replacement node or the rerouting of traffic, while a “Link Failure” fault may involve reconfiguring communication paths. Each potential solution is represented as a position in the FFOA search space.
Objective Function Evaluation: The FFOA evaluates candidate solutions using the multi-objective function designed to achieve the following goals:
- Minimize recovery time $(T_{r})$;
- Maximize network connectivity $(C_{r})$;
- Maximize residual energy $(R_{n})$ of the network nodes.
As previously defined in Equation (7), the multi-objective function used to evaluate candidate recovery strategies is applied here to guide the optimization process within FFOA. This function simultaneously minimizes recovery time, maximizes connectivity ratio, and enhances residual energy. During the real-time application of self-healing, the same objective function governs the evaluation of candidate solutions.
Optimization Process: The FFOA iteratively searches for the optimal solution. Foxes in the population explore and exploit the search space, guided by fitness scores computed from the objective function. The algorithm converges on the best solution found, representing the most effective self-healing strategy.
Implementation of Recovery Actions: The optimal recovery actions determined by FFOA are executed in the network. These may include rerouting traffic, replacing faulty nodes, or adjusting transmission power levels to restore connectivity and functionality.
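The objective function evaluation step above can be sketched in Python as follows. Equation (7) is not reproduced in this section, so the weighted-sum form, the weights, the normalization constant `t_max_ms`, and the candidate values are all assumptions made for illustration, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate recovery strategy (hypothetical values)."""
    name: str
    recovery_time_ms: float  # T_r, lower is better
    connectivity: float      # C_r in [0, 1], higher is better
    residual_energy: float   # R_n normalized to [0, 1], higher is better


def fitness(c: Candidate, w_t: float = 0.4, w_c: float = 0.4,
            w_e: float = 0.2, t_max_ms: float = 500.0) -> float:
    """Assumed weighted-sum cost: smaller values indicate better strategies."""
    t_norm = min(c.recovery_time_ms / t_max_ms, 1.0)  # scale T_r to [0, 1]
    return w_t * t_norm - w_c * c.connectivity - w_e * c.residual_energy


candidates = [
    Candidate("reroute_via_neighbour", 150.0, 0.97, 0.82),
    Candidate("raise_tx_power", 90.0, 0.90, 0.55),
    Candidate("activate_spare_node", 300.0, 0.99, 0.90),
]

# The optimizer would search a far larger space; here we simply pick the
# lowest-cost strategy among three hand-written candidates.
best = min(candidates, key=fitness)
```

With these assumed weights, the rerouting strategy wins because it balances all three goals, even though another candidate is faster and a third restores slightly more connectivity.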

After the self-healing actions are performed, the system resumes monitoring the network. This feedback loop ensures that any residual faults, or secondary problems caused by the recovery process itself, are detected and resolved promptly. Combining model-based fault detection using LGBM with optimization-driven self-healing yields a robust and adaptive fault management framework.

The end-to-end pipeline integrates machine learning and optimization seamlessly:

The LGBM model processes data from the WSN to identify faults in real time.
Detected faults trigger the self-healing module, which leverages the FFOA to determine and implement recovery actions.
A feedback loop ensures continuous fault monitoring and system resilience.

This integrated approach ensures high fault detection accuracy and minimizes recovery time, making it highly effective for maintaining WSN reliability in dynamic and challenging environments.

4. Experimental Setup

4.1. Hardware and Software Specifications

The fault detection and self-healing experiments were carried out on the test platform described below. Computations were run on high-performance workstations equipped with an Intel Core i7-12700K processor running at 3.6 GHz, 32 GB of DDR4 RAM, and an NVIDIA RTX 3080 GPU with 10 GB of dedicated memory. This configuration was chosen to provide the computing power needed to train an LGBM model on a dataset of almost 100,000 records and to run FFOA for many iterations with a large population.

Software Environment: The operating system was Linux-based, ensuring compatibility with high-performance computing requirements. Python 3.9 was chosen as the programming language because it offers a broad ecosystem of libraries and tools for machine learning, data analysis, and optimization. The LGBM model in the fault detection module was implemented with the LightGBM library, which is optimized for efficient training and inference on large datasets. scikit-learn supplied additional machine learning utilities, including preprocessing and evaluation functions.

The FFOA was implemented as a Python module tailored to WSN self-healing. It used NumPy for efficient numerical computation and Matplotlib 3.10.3 to plot the convergence of the optimization process. LGBM and FFOA were integrated in a modular pipeline so that the fault detection and self-healing components could communicate effectively.

In addition, Pandas was used to load and preprocess the dataset, Seaborn v0.13.2 and Matplotlib were used for exploratory data visualization, and Jupyter Notebook 4.7.3 was used to document and iteratively develop the experimental workflow. Together, this software stack and hardware provided a stable basis for running the experiments and interpreting their outcomes.

4.2. Training and Testing Protocol

To evaluate the proposed fault detection and self-healing system thoroughly and to assess how well it generalizes, an appropriate training and testing protocol is required. This section describes the cross-validation strategy used for fault detection, the benchmarks against state-of-the-art methods, and the test scenarios used to assess self-healing performance under a range of fault conditions.

4.2.1. Cross-Validation for Fault Detection

Stratified 5-fold cross-validation was used to validate the performance of the LGBM model in fault detection. The dataset was split into five equal subsets, each preserving the overall proportion of faulty and non-faulty instances. In every iteration, one subset served as the test set and the remaining four were used for training. The procedure was repeated five times so that each subset was used for testing exactly once, and the results were averaged to obtain performance estimates that are less sensitive to any single split and that help reveal overfitting.
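The splitting procedure can be illustrated with a dependency-free Python sketch. It is a simplified stand-in for a library routine such as scikit-learn's `StratifiedKFold`, not the code used in the study; the label vector is hypothetical, and the sketch omits the within-class shuffling that would normally precede the split.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_folds(labels: Sequence[int], k: int = 5) -> List[List[int]]:
    """Assign sample indices to k folds while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class out round-robin so every fold receives its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(labels: Sequence[int], k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each fold is the test set once."""
    folds = stratified_folds(labels, k)
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        splits.append((train, test))
    return splits


# Hypothetical imbalanced labels: 80 healthy (0) and 20 faulty (1) records.
labels = [0] * 80 + [1] * 20
splits = train_test_splits(labels, k=5)
fault_share = [sum(labels[i] for i in test) / len(test) for _, test in splits]
```

In this example every test fold contains 20 records, of which 20% are faulty, which matches the class balance of the full label vector.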

The evaluation metrics were accuracy, precision, recall, and F1-score. Together they characterize the model’s performance: precision is the share of instances flagged as faulty that are actually faulty, recall is the share of actually faulty instances that the model detects, and the F1-score is the harmonic mean of the two. Stratification prevents the performance estimates from being biased by the imbalanced class distribution that is common in WSN fault detection datasets.
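As a minimal sketch of how these four metrics follow from the confusion counts, consider the Python example below. The label vectors are hypothetical, and the function names are illustrative rather than taken from the study's code.

```python
from typing import Dict, Sequence


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the faulty class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # faults caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # faults missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, unflagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical ground truth and predictions for ten nodes.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

As a consistency check, `f1_from(0.928, 0.935)` evaluates to roughly 0.9315, in line with the F1-score of 93.1% reported in Section 5.1 for a precision of 92.8% and a recall of 93.5%.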

4.2.2. Benchmarks for Comparison Against Existing Methods

To put the performance of the LGBM-based fault detection system in context, several traditional machine learning methods were adopted as benchmarks: Random Forest (RF), support vector machine (SVM), and k-nearest neighbor (k-NN) models. Each model was trained and tested on the same dataset using identical cross-validation splits to ensure a fair comparison, and the hyperparameters of each method were tuned by grid search so that every baseline performed at its best.

The results showed that LGBM outperformed these baseline algorithms in accuracy, precision, recall, and F1-score (Section 5.1). This is attributed to LGBM’s ability to handle high-dimensional datasets and to its boosting procedure, which focuses successive trees on hard-to-classify instances. In addition, the feature importance analysis shows that LGBM captures the key predictors (battery level, packet loss rate, and signal strength), further supporting its suitability for fault detection in WSNs.

4.2.3. Testing Scenarios for Self-Healing

The self-healing component of the system, powered by the Flying Fox Optimization Algorithm (FFOA), was evaluated under diverse fault scenarios to assess its robustness and adaptability. These scenarios included the following:

Node Failures: Nodes were randomly deactivated to simulate battery depletion or hardware malfunction. The algorithm had to reroute data through other nodes with minimal recovery time and without losing connectivity.
Communication Breakdowns: Links between selected nodes were removed to simulate environmental interference or signal degradation. The optimization algorithm had to restore connectivity by adjusting transmission power or finding an alternative route.
Concurrent Faults: Node failures and link disruptions were applied simultaneously to study compound fault conditions that require several recovery strategies at once.

All scenarios were evaluated with three performance metrics: recovery time, connectivity ratio, and energy efficiency. Recovery time, $T_{r}$, is the time taken to regain normal operation. The connectivity ratio, $C_{r}$, is the proportion of nodes that are reconnected once restoration completes. Energy efficiency is measured by the residual energy in the network after the recovery actions have been carried out.

The results showed that FFOA efficiently identified recovery strategies that minimize recovery time while preserving network resilience. It remained applicable under all tested fault conditions, which supports its use in real WSN deployments where fault scenarios may be much larger in scale. This testing ensures that both fault detection and self-healing are rigorously evaluated and benchmarked against existing approaches.

To enhance realism and robustness, additional fault scenarios were incorporated during testing, simulating more complex and dynamic failure conditions. These include the following:

Energy-Threshold-Induced Node Failures: Nodes were programmed to fail when battery levels dropped below a dynamically defined threshold, mimicking real-world power depletion patterns under high-load conditions.
Cascading Faults: Initial failures in critical nodes triggered dependent node and link failures, simulating cascading effects often observed in clustered or hierarchical WSNs.
Intermittent Link Disruptions: Certain communication links were programmed to degrade periodically due to environmental interference patterns (e.g., sudden humidity spikes), introducing unpredictability in fault timing.
Simultaneous Multi-Cluster Outages: Entire segments of the network (multi-hop regions) were disabled simultaneously to evaluate the algorithm’s scalability and capacity to reroute across distant, unaffected regions.

These extended scenarios were critical in testing the system’s adaptability under high-stress conditions, thereby demonstrating the generalizability and resilience of the proposed self-healing strategy.
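Two of the extended scenarios, energy-threshold-induced failures and the cascading isolation that can follow them, can be sketched as follows. The six-node topology, the battery levels, and the fixed threshold are hypothetical (the study describes a dynamically defined threshold); the sketch only shows how one depleted relay can cut off healthy downstream nodes from the sink.

```python
from collections import deque
from typing import Dict, List, Set


def energy_threshold_failures(battery: Dict[str, float], threshold: float) -> Set[str]:
    """Nodes whose battery level has dropped below the threshold fail."""
    return {node for node, level in battery.items() if level < threshold}


def reachable_from_sink(adjacency: Dict[str, List[str]], sink: str,
                        failed: Set[str]) -> Set[str]:
    """Breadth-first search over live nodes, starting at the sink."""
    if sink in failed:
        return set()
    seen = {sink}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in failed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical topology: sink "s" with two branches, s-a-c-e and s-b-d.
adjacency = {
    "s": ["a", "b"], "a": ["s", "c"], "b": ["s", "d"],
    "c": ["a", "e"], "d": ["b"], "e": ["c"],
}
battery = {"s": 1.0, "a": 0.05, "b": 0.7, "c": 0.6, "d": 0.4, "e": 0.8}

failed = energy_threshold_failures(battery, threshold=0.1)
connected = reachable_from_sink(adjacency, "s", failed)
# Live nodes that lost their route to the sink: the cascading effect.
isolated = set(adjacency) - failed - connected
```

Here only relay `a` fails outright, yet nodes `c` and `e` are isolated as a consequence. A self-healing strategy would need to find them a new route, for example through increased transmission power.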

5. Results and Discussion

This section presents the experimental results for the fault detection and self-healing mechanisms in WSNs. It reports the performance of the LGBM model for fault detection and of FFOA for self-healing, compares both against baseline techniques, and then presents sensitivity studies and visualizations of the integrated system.

5.1. Fault Detection Results

The LGBM model performed strongly on the fault detection task. Compared with the traditional machine learning methods RF, SVM, and k-NN, LGBM ranked first on accuracy, precision, recall, and F1-score. Table 1 gives the full comparison of the performance metrics. LGBM achieved an accuracy of 94.6%, a precision of 92.8%, a recall of 93.5%, and an F1-score of 93.1%, indicating that it is well suited to fault identification on this dataset.

Figure 1 shows the Receiver Operating Characteristic (ROC) curve of the LGBM model, which plots the trade-off between the true positive rate (sensitivity) and the false positive rate (1 − specificity) across decision thresholds. The Area Under the Curve (AUC) summarizes this curve as a single value between 0 and 1 and is computed by integrating the area beneath the ROC curve with the trapezoidal rule. The model attains an AUC of 0.96, which indicates that it discriminates between faulty and fault-free nodes very reliably and supports its use for fault detection in WSNs.
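The trapezoidal computation of the AUC can be illustrated with a short, dependency-free Python sketch. The labels and scores are hypothetical and much smaller than the study's dataset; the sketch sweeps the decision threshold from high to low, records the resulting (false positive rate, true positive rate) points, and integrates between consecutive points.

```python
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Samples with identical scores cross the threshold together.
        while i < len(order) and scores[order[i]] == current:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Integrate TPR over FPR using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = faulty) and predicted fault probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_trapezoid(roc_points(y_true, scores))
```

For this toy input the AUC is 8/9, which equals the fraction of (faulty, healthy) pairs in which the faulty node receives the higher score: 8 of the 9 possible pairs.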

Feature importance analysis was conducted to gain insight into the LGBM model’s decision-making process by identifying the most influential predictors. Figure 2 shows the relative feature importance. Battery level, packet loss rate, and signal strength contribute most to the model’s performance. These findings align with the basic understanding of WSNs: energy levels and communication quality are critical indicators of node health and functionality. Here, feature importance is computed with LightGBM’s ‘gain’ measure, which sums the decrease in the loss function (binary cross-entropy) achieved by every split that uses a given feature, across all trees. The x-axis in Figure 2 shows these gain-based importance values, scaled to reflect the relative contribution of each input feature.

These top-ranked features were not only statistically significant but were also selected based on their physical correlation to WSN health, reinforcing the reliability of the model’s learning behavior.

5.2. Self-Healing Results

The results demonstrate the capability of the Flying Fox Optimization Algorithm to optimize recovery actions in fault scenarios. Table 2 summarizes the improvement achieved by FFOA relative to traditional heuristic approaches such as the genetic algorithm (GA) and particle swarm optimization (PSO). FFOA obtained an average recovery time of 120 ms, a connectivity ratio of 98.5%, and a residual energy efficiency of 85.2%, outperforming GA and PSO on all three metrics. These results confirm that the algorithm restores network functionality quickly and efficiently.

In addition to convergence speed, the computational complexity of the Flying Fox Optimization Algorithm (FFOA) was analyzed to evaluate its scalability. The time complexity of FFOA can be expressed as the following:

$$O(N \cdot D \cdot T)$$

where

$N$: population size (number of foxes);
$D$: dimensionality of the solution space (number of WSN nodes or decision variables);
$T$: number of iterations.

This complexity is comparable to that of GA and PSO, but the adaptive exploration-exploitation mechanism in FFOA allows for faster convergence in fewer iterations, thereby reducing overall execution time in practical deployments. The convergence analysis (Figure 3) reflects this by showing how FFOA reaches near-optimal solutions earlier than GA and PSO.
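To make the origin of the $O(N \cdot D \cdot T)$ term concrete, the sketch below counts coordinate updates in a generic population-based search. It is not the FFOA update rule, which is described in Section 3.3; the move-toward-best step, the noise level, and the sphere objective are assumptions chosen only to expose the three nested loops.

```python
import random
from typing import Callable, List, Tuple


def population_search(objective: Callable[[List[float]], float], n_agents: int,
                      dim: int, n_iters: int, seed: int = 0) -> Tuple[List[float], int]:
    """Generic population search that also counts coordinate updates."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(n_agents)]
    best = min(population, key=objective)[:]
    updates = 0
    for _ in range(n_iters):                  # T iterations
        for agent in population:              # N agents
            for d in range(dim):              # D decision variables
                # Assumed update: drift toward the best solution plus noise.
                agent[d] += 0.5 * (best[d] - agent[d]) + rng.gauss(0.0, 0.05)
                updates += 1
            if objective(agent) < objective(best):
                best = agent[:]
    return best, updates


def sphere(x: List[float]) -> float:
    """Simple convex test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, updates = population_search(sphere, n_agents=20, dim=10, n_iters=50)
_, updates_double_dim = population_search(sphere, n_agents=20, dim=20, n_iters=50)
```

With 20 agents, 10 dimensions, and 50 iterations the loop performs 10,000 coordinate updates, and doubling the dimensionality doubles that count. This is why reaching a good solution in fewer iterations directly reduces execution time.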

Figure 3 presents FFOA’s convergence behavior relative to GA and PSO. FFOA converged to optimal solutions faster, especially in early iterations, reflecting a better balance of exploration and exploitation. This rapid convergence makes FFOA well suited to real-time self-healing in dynamic WSN environments.

The operational details of FFOA, including the encoding of candidate solutions, fitness evaluation, and exploration–exploitation strategy, have already been discussed in Section 3.3. Here, we focus solely on the algorithm’s empirical performance under different fault scenarios.

Figure 4 shows a visual comparison of the network topology before and after applying the FFOA-based self-healing strategy. For clarity, the legend indicating node status and routing paths is placed outside the main plot area so that it does not overlap key network elements.

Sensitivity analysis was conducted to study how key parameters affect the performance of FFOA. Figure 5 plots recovery time against population size, showing that larger populations yield marginally better solutions at greater computational cost. This trade-off means that parameters must be tuned to balance solution quality against computation when resources are limited.

Overall, LGBM for fault detection and FFOA for self-healing are effective and robust techniques for maintaining the reliability of WSNs. The results confirm that the proposed system outperforms the traditional methods, delivering high accuracy, low recovery time, and resilient networks. This supports the approach’s practical feasibility and efficiency in addressing critical challenges in WSN management.

5.3. Comparative Analysis

To evaluate the proposed system properly, we compared its performance with representative models and algorithms used in the literature for network fault detection and self-healing. For fault detection, we used RF, SVM, and k-NN as comparison models because they are widely used in WSN fault analysis. For self-healing, we compared FFOA with GA and PSO because they are widely used for network reconfiguration problems. All models were run in the same experimental setup on the same dataset, and hyperparameters were tuned by grid search to keep the comparison fair. This setup allows the benefits of the LGBM–FFOA system to be assessed on equal terms.

In this regard, an extensive comparison was performed against the traditional heuristic-based approaches and state-of-the-art machine learning models to validate the proposed system’s overall effectiveness. The key focus of the comparison was based on three metrics: fault detection accuracy, recovery time, and network resilience. Table 3 presents the results showing the superior performance of the proposed system, which integrates LGBM for fault detection and FFOA for self-healing.

With a fault detection accuracy of 94.6%, the proposed system outperforms both the heuristic-based combination of a genetic algorithm with Random Forest (90.2%) and the state-of-the-art combination of PSO with a support vector machine (about 85.7%), gains of 4.4 and about 8.9 percentage points, respectively. This advantage is attributed to LGBM’s ability to handle the high-dimensional data in which complex WSN fault features are represented.

The proposed system was also efficient in recovery, taking 120 ms on average to execute recovery actions. This is a 25% reduction relative to the 160 ms of the heuristic-based approach and about 17% relative to the 145 ms of the state-of-the-art counterpart. The improvement stems mainly from the efficiency with which FFOA converges on good solutions: the algorithm balances exploration and exploitation to permit timely recovery even from complex faults.

Network resilience, measured as the proportion of nodes successfully reconnected after restoration, was also highest for the proposed system, at 98.5%. By comparison, the heuristic-based approach achieved 94.7% and the state-of-the-art system 96.3%. This result reflects the close coupling between fault identification by LGBM and optimization by FFOA, which together support robust and adaptive network restoration.

Table 3 summarizes these results numerically, and Figure 6 shows graphically the differences in detection accuracy and recovery time among the compared systems. The chart illustrates the proposed system’s advantage in both fault detection and recovery efficiency.

Resilience ($R_{s}$) is defined as the proportion of total nodes that regain full operational status after a fault event has been mitigated. It quantitatively captures the system’s ability to recover from disruptions. The metric is calculated as follows:

$$R_{s} = \frac{N_{recovered}}{N_{total}}$$

where $N_{recovered}$ is the number of nodes successfully reconnected or restored to functional status after the self-healing process, and $N_{total}$ is the total number of nodes in the WSN. A higher $R_{s}$ value indicates greater fault tolerance and system robustness.
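As a small worked example of the resilience metric, the Python sketch below computes $R_{s}$ for a hypothetical network. The node counts are assumptions chosen for illustration; the network size used in the experiments is not stated in this section.

```python
def resilience(n_recovered: int, n_total: int) -> float:
    """R_s = N_recovered / N_total, the share of nodes restored after healing."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_recovered <= n_total:
        raise ValueError("n_recovered must lie between 0 and n_total")
    return n_recovered / n_total


# Hypothetical case: 197 of 200 nodes are operational after self-healing.
r_s = resilience(197, 200)
r_s_percent = round(100.0 * r_s, 1)
```

In this hypothetical 200-node network, losing three nodes permanently corresponds to a resilience of 98.5%, which gives a sense of scale for the value reported for the proposed system.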

Figure 6 visually compares detection accuracy and recovery time across different systems.

Although Figure 6 presents the same data as Table 3, it shows detection accuracy and recovery time graphically, which makes performance differences quicker to interpret. Providing both the table and the figure makes the results accessible in numeric and graphical form for readers with different preferences.

These results indicate that the proposed system is a substantial advance in WSN fault management, combining the predictive power of LGBM with the optimization efficiency of FFOA. It provides prompt fault detection followed by self-healing, which sustains network performance and reliability.

Overall, the findings in Table 3 and Figure 6 indicate that the proposed system performs better than the conventional and modern solutions on all the metrics considered. Compared with the RF-GA and SVM-PSO hybrid frameworks, the LGBM–FFOA combination achieves higher accuracy, shorter recovery time, and greater resilience. The improvement results from LGBM’s capability to learn intricate WSN fault patterns and from the adaptive search behavior of FFOA, which converges quickly to optimal recovery settings.

While the proposed system demonstrates clear improvements over traditional heuristic approaches (GA, PSO), we acknowledge that deep reinforcement learning (DRL) has recently emerged as a powerful technique for fault detection and adaptive self-healing in dynamic networks. Techniques such as Deep Q-Networks (DQNs), Proximal Policy Optimization (PPO), and Actor-Critic models have shown promising results in real-time reconfiguration and fault-tolerant control.

However, the implementation of DRL often requires significantly more training data, longer convergence time, and higher computational resources, which may limit its feasibility in resource-constrained WSN environments. As part of future work, we plan to extend this study by integrating DRL-based agents for adaptive routing and autonomous self-healing and benchmark their performance against the current LGBM–FFOA framework under similar fault scenarios. This will allow for a more comprehensive performance comparison with emerging state-of-the-art methods.

5.4. Insights and Challenges

The experimental results highlight the effectiveness and practical viability of the proposed system for fault detection and self-healing in WSNs. Using the Light Gradient Boosting Machine for fault detection and the Flying Fox Optimization Algorithm for self-healing yielded clear gains over conventional techniques. The LGBM model’s superior fault detection accuracy, confirmed by its high precision, recall, and F1-score, attests to its robustness on such complex data. In addition, the feature importance analysis confirmed that the engineered features, such as battery level, packet loss rate, and signal strength, were well chosen, supporting the validity of the dataset design and feature preprocessing.

The FFOA was very efficient in optimizing the self-healing actions, significantly reducing recovery time for enhanced network resilience compared to baseline approaches. Its fast convergence and the ability to identify optimal recovery actions, such as rerouting the traffic through high-connectivity nodes and selecting energy-efficient alternatives for the failed nodes, further illustrate the practical value of FFOA in dynamic WSN environments. These results show the synergy between LGBM and FFOA, contributing to the proposed system’s robust performance.

However, some challenges were encountered during the study. First, the computational cost of optimizing self-healing actions increased significantly with network size, which indicates that scalable optimization techniques are needed to manage larger networks without compromising efficiency. Second, real-world WSNs are dynamic: changing environmental conditions, variable node behavior, and communication delays introduce uncertainties that may affect the performance of the proposed system. These factors show the need for adaptive mechanisms that respond in real time to changing network conditions.

Future work will address these challenges to make the system more robust and adaptable. Hybrid optimization algorithms that combine the strengths of multiple metaheuristic techniques might overcome the scalability challenges, and incorporating real-time environmental data into the decision-making process may enhance adaptability. Addressing these limitations would make the proposed system suitable for diverse and complex WSN scenarios, improving the reliability and efficiency of network operations.

6. Conclusions and Future Work

This paper proposed an event-driven framework that integrates LGBM for fault detection with a recent metaheuristic, FFOA, for self-healing in WSNs. The results indicate that combining the two approaches improves fault detection accuracy and considerably improves the recovery process, strengthening the resilience of WSNs. The high precision and recall of the LGBM model attest to its fault identification ability, and the fast convergence and efficiency of FFOA demonstrate its capability for dynamic self-recovery. Together, these provide a strong framework for addressing many important challenges in WSN fault management.

The work makes a practical contribution to network reliability and fault tolerance. By combining advanced machine learning with metaheuristic optimization, the proposed system improves the detection and mitigation of faults while providing a methodology that can be extended to various network configurations. In addition, the feature importance analysis and sensitivity evaluation allow the system to be adapted to changing network conditions in a data-driven manner, which supports its use in practical deployments.

Despite the encouraging results, some limitations need further investigation. One crucial challenge is scalability: as the network size increases, fault detection and self-healing optimization become computationally expensive, which limits the application of the system to large-scale networks. In addition, real-world deployments may be constrained by environmental variability, hardware limits, and communication delays, which were only partially addressed here. Finally, the potentially high computational overhead of FFOA, especially for complex fault scenarios, may affect real-time performance in resource-constrained environments.

Future research will address these limitations to enhance the proposed system’s robustness and scalability. A key direction is to capture the full spectrum of challenges in practical WSN deployment by expanding the dataset with real-world scenarios. Data from diverse environments, such as industrial, urban, and remote settings, will be included to enhance the proposed system’s generalization capability.

Another direction for future work will be the design and investigation of hybrid optimization techniques. Merging strengths of several metaheuristics, such as Genetic Algorithms, Particle Swarm Optimization, and FFOA, could yield self-healing mechanisms that are more efficient and scalable. These hybrid approaches might provide the proper balance in the trade-offs of convergence speed versus solution quality inherent in this study’s findings on scalability.

Finally, integrating additional factors, such as real-time environmental changes, into the fault detection and self-healing processes will make the system more adaptable to dynamic conditions. Incorporating weather data, interference patterns, and temporal variability may improve decision-making and allow the system to remain effective under highly variable scenarios. These directions aim to make the proposed system robust, scalable, and readily adaptable for fault detection and self-healing in WSNs, opening avenues toward more dependable and resilient network operations.

Author Contributions

Conceptualization, A.A. and A.A.-H.; methodology A.A.; software A.A.-H.; validation, A.A. and A.A.-H.; formal analysis, A.A.; investigation, A.A.; resources, A.A.-H.; data curation, A.A.; writing—original draft preparation, A.A.; writing—review and editing A.A.; visualization, A.A.-H.; supervision, A.A.; project administration, A.A.-H.; funding acquisition, A.A. and A.A.-H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to restrictions related to participant privacy and ethical considerations.

Acknowledgments

We would like to convey our sincere gratitude to everyone who helped and contributed to completing this research paper. They have been incredibly helpful and supportive in making this study possible.

Conflicts of Interest

The authors declare no conflict of interest.

Zhang, Y.; Guo, X. Research on Smart City Road Network Capacity Optimization Configuration Based on Deep Learning Algorithms. Int. J. High-Speed Electron. Syst. 2024, 34, 456–478. [Google Scholar] [CrossRef]
Gholizadeh, N.; Musílek, P. A Generalized Deep Reinforcement Learning Model for Distribution Network Reconfiguration with Power Flow-Based Action-Space Sampling. Energies 2024, 17, 5187. [Google Scholar] [CrossRef]
Mokhtari, Y.; Coirault, P.; Moulay, E.; Le Ny, J.; Larraillet, D. Distributed ADMM Approach for Power Distribution Network Reconfiguration. arXiv 2024. [Google Scholar] [CrossRef]
Jia, T.; Yang, G.; Yao, L. The Low-Carbon Path of Active Distribution Networks: A Two-Stage Model from Day-Ahead Reconfiguration to Real-Time Optimization. Energies 2024, 17, 4989. [Google Scholar] [CrossRef]
Dehghany, N.; Asghari, R. Multi-objective Optimal Reconfiguration of Distribution Networks Using a Novel Meta-Heuristic Algorithm. Int. J. Electr. Comput. Eng. 2024, 14, 3557–3569. [Google Scholar] [CrossRef]
Ramachandra, B.; Surekha, T.P. Development and Evaluation of a Network Intrusion Detection System for DDoS Attack Detection Using Machine Learning. Bull. Tek. Elektro Dan Informatika 2024, 13, 4207–4213. [Google Scholar] [CrossRef]
Zhou, J.; Ren, J.; He, C. Improved Medical Waste Plasma Gasification Modelling Based on Implicit Knowledge-Guided Machine Learning. Waste Manag. 2024, 163, 35–56. [Google Scholar] [CrossRef]
Venket, K.; Ambica, A.; Freeda, R.A. Meta-Heuristics Algorithm for Computer Communications. In Metaheuristic and Machine Learning Optimization Strategies for Complex Systems; IGI Global: Hershey, PA, USA, 2024. [Google Scholar] [CrossRef]
Bencheikh, G. Metaheuristics and Machine Learning Convergence. In Advances in Systems Analysis and Software Engineering; IGI Global: Hershey, PA, USA, 2024. [Google Scholar] [CrossRef]
Li, D.; Xu, P.; Gu, J.; Zhu, Y. A Review of Reliability Research in Regional Integrated Energy Systems: Indicator, Modeling, and Assessment Methods. Buildings 2024, 14, 3428. [Google Scholar] [CrossRef]
Brandeau, M.L.; Collins, R.; Carter, A.D.S. Research on Network Time Reliability Evaluation Method Based on Uncertainty Theory. J. Appl. Artif. Intell. 2024, 1, 46–62. [Google Scholar] [CrossRef]
Zhukabayeva, T.; Zholshiyeva, L.; Ven-Tsen, K.; Mardenov, Y.; Adamova, A.; Karabayev, N.; Abdildayeva, A.; Baumuratova, D. Towards Robust Security in WSN: A Comprehensive Analytical Review and Future Research Directions. Indones. J. Electr. Eng. Comput. Sci. 2024, 36, 318–337. [Google Scholar] [CrossRef]
Cheng, Y.; Petrides, K.V. Evaluating the Predictive Reliability of Neural Networks in Psychological Research with Random Datasets. Educ. Psychol. Meas. 2024, 88, 564–580. [Google Scholar] [CrossRef]
Singh, K.; Yadav, M.; Singh, Y.; Barak, D. Finding Security Gaps and Vulnerabilities in IoT Devices. In Advances in Environmental Engineering and Green Technologies; IGI Global: Hershey, PA, USA, 2024. [Google Scholar] [CrossRef]
Zhang, L.; et al. Multi-Layered Machine Learning Framework for Fault Localization in Wireless Mesh Networks. IEEE Trans. Ind. Inform. 2023, 19, 2234–2245. [Google Scholar] [CrossRef]
Li, J.; Yu, T. Hybrid Deep Learning for Real-Time Fault Detection in Large-Scale Wireless Sensor Networks. IEEE Trans. Netw. Serv. Manag. 2023, 20, 88–102. [Google Scholar] [CrossRef]
Bhatnagar, R.; Singh, M.; Sahai, A.R. Self-Healing WSNs via Decentralized Reinforcement Learning. Proc. ACM Sen. Syst. 2024, 202–215. [Google Scholar] [CrossRef]
Dong, S.; Qi, Y. MPDV-HOP: An improved localization algorithm for wireless sensor networks. WSEAS Trans. Commun. 2015, 14, 390–398. [Google Scholar]
Dong, S.; Zhang, X.; Li, Y. Energy Consumption Model of Wireless Sensor Networks Based on Bang-Bang Optimal Time Control Theory; ResearchGate: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
Wang, H.; Chen, Y.; Dong, S. Research on efficient routing protocol for WSNs based on improved artificial bee colony algorithm. IET Wirel. Sens. System. 2017, 7, 15–20. [Google Scholar] [CrossRef]
Dong, S.; Zhang, X.; Zhou, W. A security localization algorithm based on DV-hop against Sybil attack in wireless sensor networks. J. Electr. Eng. Technol. 2020, 15, 919–926. [Google Scholar] [CrossRef]

Figure 1. ROC curve for LGBM model.

Figure 2. Feature importance analysis for LGBM model.

Figure 3. Convergence of FFOA compared to GA and PSO.

Figure 4. Network connectivity before and after recovery. The legend was repositioned outside the plot area to enhance clarity and avoid overlap with visualized network elements.

Figure 5. Sensitivity analysis for population size in FFOA.

Figure 6. Comparison of fault detection and recovery metrics.

Table 1. Comparison of performance metrics for fault detection models.

Model	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
LGBM	94.6	92.8	93.5	93.1
Random Forest	90.2	88.1	89.3	88.7
SVM	85.7	84.3	83.5	83.9
k-NN	81.3	80.5	79.6	80.0
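
As a quick consistency check on Table 1, the F1-score can be recomputed from the reported precision and recall using the standard harmonic-mean definition, F1 = 2PR/(P + R). The sketch below is illustrative only and is not the authors' evaluation code: the names `TABLE_1`, `f1_score`, and `check_table` are hypothetical, and it assumes the reported F1 values were derived from the rounded precision and recall shown in the table, which may not hold if the metrics were averaged per fold.

```python
# Table 1 values copied from the source: (precision %, recall %, reported F1 %).
TABLE_1 = {
    "LGBM": (92.8, 93.5, 93.1),
    "Random Forest": (88.1, 89.3, 88.7),
    "SVM": (84.3, 83.5, 83.9),
    "k-NN": (80.5, 79.6, 80.0),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def check_table(rows):
    """Map each model to (F1 recomputed and rounded to 1 decimal, reported F1)."""
    return {
        model: (round(f1_score(p, r), 1), reported)
        for model, (p, r, reported) in rows.items()
    }


if __name__ == "__main__":
    for model, (recomputed, reported) in check_table(TABLE_1).items():
        print(f"{model}: recomputed F1 = {recomputed}, reported F1 = {reported}")
```

Under that assumption, the recomputed values agree with the reported F1 column to one decimal place for all four models.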

Table 2. Comparison of recovery metrics for self-healing algorithms.

Algorithm	Average Recovery Time (ms)	Connectivity Ratio (%)	Residual Energy (%)
FFOA	120	98.5	85.2
Genetic Algorithm	160	94.7	79.4
Particle Swarm Optimization	145	96.3	82.8
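
The gaps in Table 2 are easier to compare as relative differences. The following sketch computes the percentage reduction in average recovery time and the percentage-point gains in connectivity ratio and residual energy of FFOA over each baseline. It only performs arithmetic on the values reported in the table; the names `TABLE_2`, `relative_reduction`, and `compare` are hypothetical and not part of the original study.

```python
# Table 2 values copied from the source:
# (average recovery time in ms, connectivity ratio %, residual energy %).
TABLE_2 = {
    "FFOA": (120, 98.5, 85.2),
    "Genetic Algorithm": (160, 94.7, 79.4),
    "Particle Swarm Optimization": (145, 96.3, 82.8),
}


def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


def compare(rows, reference="FFOA"):
    """Compare the reference algorithm against every other row of the table."""
    ref_time, ref_conn, ref_energy = rows[reference]
    summary = {}
    for name, (time_ms, conn, energy) in rows.items():
        if name == reference:
            continue
        summary[name] = {
            "recovery_time_reduction_pct": relative_reduction(time_ms, ref_time),
            "connectivity_gain_points": ref_conn - conn,
            "residual_energy_gain_points": ref_energy - energy,
        }
    return summary


if __name__ == "__main__":
    for name, stats in compare(TABLE_2).items():
        print(name)
        for key, value in stats.items():
            print(f"  {key}: {value:.1f}")
```

With the reported values, FFOA's average recovery time is 25% lower than the Genetic Algorithm's (120 ms versus 160 ms) and roughly 17% lower than Particle Swarm Optimization's (120 ms versus 145 ms).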

Table 3. Comparative analysis of fault detection and self-healing systems.

System	Detection Accuracy (%)	Recovery Time (ms)	Resilience (%)
Proposed (LGBM + FFOA)	94.6	120	98.5
Heuristic (GA + RF)	90.2	160	94.7
State-of-the-Art (PSO + SVM)	85.7	145	96.3

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
