Novel Prognostic Methodology of Bootstrap Forest and Hyperbolic Tangent Boosted Neural Network for Aircraft System



Introduction
The effectiveness of aircraft systems in industrial performance is contingent upon reliability, availability, and safety considerations, necessitating improved maintenance assistance. The conventional notions of preventive and corrective techniques are being supplanted by emerging concepts such as predictive and proactive maintenance [1,2].
The fundamental maintenance principles encompass fault detection, failure diagnosis, and response development, which are essential for perceiving phenomena, comprehending them, and taking appropriate action. Rather than merely comprehending an incident that has already occurred, such as a failure, it is advantageous to foresee it proactively in order to initiate appropriate measures promptly. This may be characterised as a prognostic procedure. According to the International Organisation for Standardisation (ISO), prognostics refers to the process of estimating the time to failure and assessing the associated risk for one or more failure modes, both current and potential. An alternative way to express this concept is to estimate a system's lifespan or remaining useful life (RUL), taking into account its present condition and prior performance [3,4]. Prognostic maintenance methodologies have the potential to reduce costs and improve dependability.
Numerous methodologies have been suggested to facilitate RUL anticipation and prognosis. However, traditional limitations such as data accessibility, system intricacy, implementation prerequisites, and the availability of monitoring equipment constrain the selection of an appropriate strategy. There is a lack of consensus regarding a universally accepted set of metrics suitable for prognostics. The process of selecting appropriate metrics is inherently uncertain for both researchers and practitioners in the field of condition-based monitoring (CBM). Therefore, when utilising decisional metrics to determine preventive actions, prognostic tools must account for this inherent uncertainty.
There are a multitude of potential methodologies for data analysis in the context of predictive maintenance; the most widely favoured options are enumerated as follows. The physics-based methodology necessitates a precise physical representation of system behaviour, encompassing both normal and defective states [5]. By comparing the data obtained from sensors with the model's predictions, one can infer the health of the system [6]. Physical techniques encompass the utilisation of physics-of-failure models. One approach to failure mode analysis entails integrating experiments, observation, geometry, and condition monitoring data in order to assess the extent of damage caused by a specific failure mechanism.
Data-driven methods use historical data to make predictions about a system's future state or to identify similar past trends in order to estimate RUL [7]. These methods encompass statistical methodologies, reliability functions, and artificial intelligence (AI) techniques [8]. The following is a list of data-driven prognostic approaches:
• AI techniques, including neural networks, fuzzy logic, decision trees, support vector machines (SVMs), anomaly detection algorithms, reinforcement learning, classification, clustering, and Bayesian methods.
• Conventional numerical techniques commonly employed in academic research and practical applications, such as wavelets, Kalman filters, particle filters, regression, and statistical methods.
• Statistical approaches utilised in research, such as the gamma process, the hidden Markov model, regression-based models, the relevance vector machine (RVM), and numerous others.
Data-driven approaches rely on the analysis and utilisation of historical run-to-failure data. These strategies are frequently employed to make estimations relative to a pre-established failure threshold. The combination of wavelet packet decomposition and hidden Markov models (HMMs) can be beneficial, as the incorporation of time-frequency information enables more accurate outcomes than considering temporal factors alone. Nevertheless, methodologies that rely on historical data for asset lifespan prediction necessitate a comprehensive understanding of the asset's physical characteristics [9,10]. This paper employs the N-CMAPSS dataset, which provides simulated run-to-failure data for aircraft engines under real flight conditions, and describes a new prognostic method that combines bootstrap forests and hyperbolic tangent boosted neural networks, intended to fill gaps in current methods for predicting when aircraft systems will need maintenance. Despite the success of traditional prognostic approaches, their capacity to manage complex, nonlinear deterioration patterns in aviation components remains restricted. The proposed method improves the accuracy and reliability of predictions by combining the collective power of bootstrap forests with the adaptive learning capability of hyperbolic tangent neural networks. This hybrid model greatly enhances the accuracy of RUL predictions in different operational settings, a crucial improvement over previous models, which often face difficulties with the changing characteristics of aircraft system data. It offers a more precise and dependable approach to forecasting the RUL of crucial components, facilitating improved maintenance methods and helping to ensure safer and more efficient aircraft operations. This work specifically focuses on modelling the deterioration of selected fault types that flow and efficiency modulation can accurately represent, acknowledging that this limitation is important to ensure the precision and applicability of the model.
The structure of the paper is as follows: chapter one offers a comprehensive review of the prognostic methods currently employed in the field of RUL prediction. Chapter two provides an overview of the suggested data-driven prognostic approaches and their underlying assumptions. Chapter three presents a comparative examination of a hyperbolic tangent integrated neural network, which uses datasets from aviation engines, against other data-driven techniques. Chapter four provides an overview of the RUL prediction results. The conclusion is outlined in chapter five.

Bootstrap Forest
Bootstrap forest pertains to the statistical methodology called bootstrap aggregation, or bagging, which is applied to a collection of decision trees inside an ensemble model, resulting in the averaging of several trees [11]. The training of each decision tree involves a bootstrap sample derived from the training data. The splitting process in each tree entails the evaluation of a randomly chosen set of predictors. Through this methodology, a multitude of models possessing restricted individual capacities are combined to produce a more resilient and impactful model. The final prediction for an observation is derived by computing the average of the expected values associated with that observation across all decision trees.
The predictive capacity of a bootstrap forest is attained by aggregating the anticipated response values from many decision trees, which are then averaged [12]. Each individual tree in the model is grown by employing a bootstrap sample derived from the training data. A bootstrap sample is a subset of observations selected randomly and with replacement from a given dataset. Additionally, the predictors are sampled at each split point within the decision tree. The construction of the decision tree is accomplished by utilising the recursive partitioning technique [13]. The subsequent steps outline the methodology utilised for fitting the training set [14].

• In order to enhance the robustness of the analysis, a bootstrap sampling strategy is used to select the data for each individual tree.
• Each decision tree is constructed using a recursive partitioning algorithm. To perform each split, a set of predictors is randomly chosen, and the order of the predictors within the set is randomised. The partitioning procedure is iterated until a predetermined termination criterion, as delineated in the bootstrap forest configuration, is met.
• Steps one and two are repeated until the desired number of trees is attained, or until early stopping is activated.
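The fitting loop above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (each tree is reduced to a one-split stump, and a single randomly chosen predictor stands in for the random predictor subset), not the implementation used in the paper.

```python
import random

def fit_stump(X, y, feature):
    """Split on the median of one feature; each leaf predicts its mean."""
    xs = sorted(row[feature] for row in X)
    thr = xs[len(xs) // 2]
    left = [yi for row, yi in zip(X, y) if row[feature] <= thr]
    right = [yi for row, yi in zip(X, y) if row[feature] > thr]
    overall = sum(y) / len(y)
    lmean = sum(left) / len(left) if left else overall
    rmean = sum(right) / len(right) if right else overall
    return (feature, thr, lmean, rmean)

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        # step 1: a bootstrap sample, i.e. n draws with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        # step 2: split on a randomly chosen predictor
        trees.append(fit_stump(Xb, yb, rng.randrange(p)))
    return trees

def predict(trees, row):
    # final prediction: average the per-tree predictions
    preds = [(lm if row[f] <= thr else rm) for f, thr, lm, rm in trees]
    return sum(preds) / len(preds)
```

Averaging many weak stumps already yields a usable regression surface, which is the point of the bagging step.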
Boosting refers to the methodology of constructing a substantial tree through the iterative fitting of a series of smaller decision trees, known as layers. The tree at each layer is composed of a limited number of splits [15]. Each layer is fitted by evaluating the residuals of the preceding layers, enabling it to rectify the fit for data inadequately fitted by the layers before it. The ultimate forecast for an observation is obtained by aggregating the individual predictions for that observation across all layers.
The bootstrap forest generates an ensemble model consisting of many decision trees, which are developed in a sequential manner to form an additive model [16]. Each tier of the tree is composed of a limited number of splits. The recursive fitting process mentioned above is employed to fit each layer; the sole distinction is that the fitting process concludes upon reaching a predetermined number of splits. The predicted value for an observation in a leaf of a given tree is determined by calculating the mean of all observations included within that leaf. Figure 1 below shows the first branch of the first tree of the proposed 100 distributed bootstrap forests.
The following steps delineate the methodology for the training process:
1. Apply an initial layer.
2. Compute the residuals. These values are obtained by subtracting the expected mean of the observations contained inside a leaf from their actual values.
3. Fit a layer to the residuals.
4. Construct the additive tree. To obtain the total expected value across several layers for a given observation, aggregate the predicted values obtained from each individual layer.
5. Repeat steps two through four until the appropriate number of layers has been attained. If a validation measure is utilised, continue the iterative process until the addition of another layer ceases to improve the validation statistics.
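Steps one to five can be sketched as follows. This is a hedged toy version in which each layer is a single-split stump on one feature, the initial layer is simply the first pass of the loop, and the layer predictions are scaled by a learning rate; it is not the paper's actual boosted-tree implementation.

```python
def fit_layer(x, y):
    """One split at the median of x; each leaf predicts its mean residual."""
    thr = sorted(x)[len(x) // 2]
    left = [yi for xi, yi in zip(x, y) if xi <= thr]
    right = [yi for xi, yi in zip(x, y) if xi > thr]
    lmean = sum(left) / len(left) if left else 0.0
    rmean = sum(right) / len(right) if right else 0.0
    return thr, lmean, rmean

def boost(x, y, n_layers=30, rate=0.1):
    layers = []
    resid = list(y)                      # before any layer, residual = y
    for _ in range(n_layers):
        thr, lmean, rmean = fit_layer(x, resid)      # steps 2-3
        layers.append((thr, rate * lmean, rate * rmean))
        # step 4: subtract this layer's scaled prediction from the residuals
        resid = [r - (rate * lmean if xi <= thr else rate * rmean)
                 for xi, r in zip(x, resid)]
    return layers

def predict(layers, xi):
    # the additive prediction: sum the per-layer contributions
    return sum(lm if xi <= thr else rm for thr, lm, rm in layers)
```

Each new layer shrinks the remaining residuals, so the additive sum converges toward the leaf means of the data.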
The probability estimates for the decision tree and bootstrap forest approaches are computed using the prob statistic [17]. Equation (1), used to compute the probability for the i th response level at a specific node, is defined as follows:

Prob_i = (n_i + prior_i) / Σ_j (n_j + prior_j),     (1)

where the summation is taken over all response levels; the variable n_i represents the number of observations at the node for the i th response level; and the variable prior_i denotes the prior probability for the i th response level, which is derived using the following Equation (2):

prior_i = λ p_i + (1 − λ) P_i,     (2)

where p_i is the prior_i from the parent node, P_i represents the Prob_i from the parent node, and λ is a weighting factor assigned a fixed value so as to ensure that the predicted probabilities are never zero. The comprehensive forecast is derived by consolidating the individual projections for a given observation across all levels. The model's overall fit can be improved by progressively incorporating layers and fitting them to the residuals derived from the preceding layers [18]. The existing configuration solely supports binary responses within categorical variables. For a categorical response, the residuals fitted at each layer are represented as offsets of the linear logits. The final prediction is derived by applying a logistic transformation to the sum of the linear logits across all layers.
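Equations (1) and (2) can be illustrated directly. The two helper functions below are a sketch assuming the blended-prior form prior_i = λ p_i + (1 − λ) P_i described in the text; the numbers in the usage are illustrative.

```python
def node_prob(counts, priors):
    """Equation (1): Prob_i = (n_i + prior_i) / sum_j (n_j + prior_j)."""
    total = sum(n + p for n, p in zip(counts, priors))
    return [(n + p) / total for n, p in zip(counts, priors)]

def child_prior(parent_prior, parent_prob, lam):
    """Equation (2): prior_i = lam * p_i + (1 - lam) * P_i."""
    return [lam * p + (1.0 - lam) * P
            for p, P in zip(parent_prior, parent_prob)]
```

Note how a nonzero prior keeps the probability of an empty response level strictly positive, which is exactly the role the text assigns to λ.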

Hyperbolic Tangent Neural Network
Neural networks can be defined as a mathematical function that manipulates a set of derived inputs, commonly known as hidden nodes. The hidden nodes can be represented as non-linear transformations of the initial inputs. The researcher is provided with the choice to define how many hidden layers they prefer, where each layer can be customised to include any desired quantity of hidden nodes. The neural network architecture encompasses a fully connected multi-layer perceptron, consisting of either one or multiple layers. Neural networks are utilised to predict one or more response variables by employing a flexible function of the input variables. Neural networks demonstrate robust predictive ability in scenarios where there is no need to explicitly define the functional structure of the response surface or establish a clear description of the relationship between the input variables and the response variable [19]. Figure 2 displays the proposed neural network design, which has 45 inputs derived from the dataset and a single output: the prediction result for RUL.

The activation function is a mathematical operation that applies a non-linear transformation to the weighted sum of the input variables, sometimes denoted as the x variables. The function used for the response entails either a linear combination in the case of continuous responses or a logistic transformation for nominal or ordinal responses [20]. One significant advantage of utilising a neural network model is its capacity to efficiently capture and depict a wide range of response surfaces [21]. By employing an ample quantity of hidden nodes and layers, it becomes feasible to approximate any given surface with a high degree of accuracy. One significant constraint linked to neural network models is the difficulty of interpreting their results. This behaviour can be attributed to the inclusion of intermediate layers, which differ from the typical regression model, where a direct link exists from the x variables to the y variables.

Hidden Layer Structure
The neural network can adapt to either a single-layer or a multiple-layer configuration. Adding nodes to the first layer, or including an extra layer, improves the flexibility of the neural network. It is feasible to integrate any number of nodes into either layer. The second-layer nodes are functions of the x variables; the first-layer nodes are functions of the second-layer nodes; and the dependent variables, also known as the y variables, are defined by functions of the first-layer nodes.
This research paper examines the process of function approximation through the use of feedforward artificial neural networks, specifically restricting connections to adjacent layers only. In the subsequent section, we present a formal exposition of the notion of a neural network and the associated nomenclature.
Let L ∈ N, and let l_0, . . ., l_L ∈ N. Let σ : R → R be an activation function, and define the parameter space in Equation (3) [22]:

θ = ((W_1, b_1), . . ., (W_L, b_L)), with W_k ∈ R^(l_k × l_(k−1)) and b_k ∈ R^(l_k),     (3)

where x is the input and the output of the affine map A_k is given by A_k(x) = W_k x + b_k, defined for all values of k such that 1 ≤ k ≤ L. Furthermore, denote by Ψ_θ the function that maps from R^(l_0) → R^(l_L), x → Ψ_θ(x), in Equation (4):

Ψ_θ(x) = (A_L ∘ σ ∘ A_(L−1) ∘ σ ∘ · · · ∘ σ ∘ A_1)(x),     (4)

where σ is applied elementwise. The term Ψ_θ is used to denote the instantiation of the neural network linked to the parameter θ, which consists of L layers and has widths (l_0, l_1, . . ., l_L). The initial L − 1 layers are commonly denoted as hidden layers. For each integer k such that 1 ≤ k ≤ L, the width of layer k is defined as l_k. Additionally, the weights and biases associated with layer k are denoted as W_k and b_k, respectively. The width of Ψ_θ is formally defined as max(l_0, l_1, . . ., l_L).
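The instantiation Ψ_θ of Equation (4) can be written compactly. The widths and random parameters below are illustrative, and the sketch assumes no activation after the final affine map A_L, as in the definition above.

```python
import numpy as np

def psi(theta, x, sigma=np.tanh):
    """Psi_theta(x) = A_L . sigma . A_{L-1} . ... . sigma . A_1 (x)."""
    *hidden, last = theta
    for W, b in hidden:
        x = sigma(W @ x + b)          # sigma applied elementwise
    W, b = last
    return W @ x + b                  # no activation after the final layer

rng = np.random.default_rng(0)
widths = [4, 3, 3, 1]                 # (l_0, l_1, l_2, l_3): two hidden layers
theta = [(rng.standard_normal((lk, lprev)), rng.standard_normal(lk))
         for lprev, lk in zip(widths, widths[1:])]
out = psi(theta, np.ones(4))
```

With L = 3 and widths (4, 3, 3, 1), this is a deep (two-hidden-layer) tanh network mapping R^4 to R.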
When L = 2, it is appropriate to refer to Ψ_θ as a shallow neural network, and when L ≥ 3, as a deep neural network. Therefore, a shallow neural network consists of a single hidden layer, while deep neural networks are characterised by the presence of two or more hidden layers. This paper primarily examines neural networks that employ the hyperbolic tangent as an activation function. The hyperbolic tangent function is defined as σ(x) = tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)) for x ∈ R. These networks shall be denoted as tanh neural networks. While it is possible to apply the concepts to various smooth activation functions, concentrating on a specific activation function enables the establishment of precise and explicit bounds without compromising the clarity of the arguments [23]. It is worth mentioning that the findings are specifically applicable to the sigmoid or logistic activation function, which may be understood as a shifted and scaled version of the hyperbolic tangent function. This section concludes by revisiting the fundamental features of neural network calculus that will be employed throughout the subsequent discussion, albeit without explicit mention.
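The remark that the logistic function is a shifted and scaled hyperbolic tangent can be checked numerically via the identity sigmoid(x) = (1 + tanh(x/2)) / 2:

```python
import math

def tanh_def(x):
    """The definition above: (e^x - e^-x) / (e^x + e^-x)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def sigmoid(x):
    """The logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))
```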
It is noteworthy that, in certain circumstances, two other activation functions are often employed, namely the linear and Gaussian functions [24]. The term linear refers to the identity function: no transformation is applied to the linear combination of the x variables. The linear activation function is frequently used in conjunction with one of the non-linear activation functions; in that case, the linear activation function is placed in the second layer, while the non-linear activation functions are placed in the first layer. This methodology is effective when the aim is first to reduce the dimensionality of the x variables and subsequently to fit a nonlinear model for the y variables [25]. For a continuous dependent variable y, if the model exclusively utilises linear activation functions, the resulting model for y can be mathematically represented as a linear combination of the independent variables x. For a dependent variable y that is nominal or ordinal, the model reduces to logistic regression.
The term Gaussian pertains to a mathematical distribution named in honour of Carl Friedrich Gauss [26]. The Gaussian function, often known as the normal distribution, finds widespread application in disciplines such as mathematics, physics, and statistics. This option should be employed when one wishes to obtain radial basis function behaviour or when the response surface exhibits a Gaussian (normal) form [27]. The Gaussian function is commonly represented as e^(−x²), where x is a linear combination of the x variables.
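The Gaussian activation mentioned above is simply the bump e^(−x²), peaking at zero and symmetric about it:

```python
import math

def gaussian(x):
    """The radial-basis activation e^(-x^2)."""
    return math.exp(-x * x)
```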

Prognostic Problem
The degradation pattern consists of multivariate time-series data obtained from condition monitoring sensors, X_s_i = [x_s_i^1, . . ., x_s_i^(m_i)], along with the accompanying RUL values Y_i. These measurements were collected from a fleet of N units, where each measurement x_s_i^(t) ∈ R^p consists of a vector containing p raw measurements taken under specific operating conditions ω_i^(t) ∈ R^s. The duration of the sensory signal for the i th unit is denoted as m_i, which may vary among different units. The cumulative length of the dataset is m = Σ_(i=1)^N m_i. More concisely, the available dataset is denoted as D = {W_i, X_s_i, Y_i}, i = 1, . . ., N. Given this scenario, the objective is to develop a predictive model G that can accurately estimate the RUL ( Ŷ) on a test dataset consisting of M units, D_T* = {X_s_j*}, j = 1, . . ., M. These units are represented by multivariate time-series data X_s_j* = [x_s_j*^1, . . ., x_s_j*^(k_j)], which capture readings from various sensors [28]. The cumulative length of the test data is m* = Σ_(j=1)^M k_j.
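The notation above can be made concrete with a toy fleet; the shapes below (N units, p sensor columns, s condition columns, unit-specific lengths m_i, and a RUL label counting down to failure) mirror the definitions, while all numbers are illustrative.

```python
import numpy as np

def make_fleet(N=3, p=4, s=2, seed=0):
    """Toy fleet: per-unit conditions W_i, sensors X_s_i, RUL labels Y_i."""
    rng = np.random.default_rng(seed)
    fleet = []
    for _ in range(N):
        m_i = int(rng.integers(50, 100))              # m_i varies per unit
        fleet.append({
            "W": rng.standard_normal((m_i, s)),       # operating conditions in R^s
            "X_s": rng.standard_normal((m_i, p)),     # raw sensor readings in R^p
            "Y": np.arange(m_i, 0, -1, dtype=float),  # RUL counting down to 1
        })
    return fleet

fleet = make_fleet()
m_total = sum(len(unit["Y"]) for unit in fleet)       # m = sum_i m_i
```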

Prediction Dataset and Boosting Process
The NASA N-CMAPSS dataset offers simulated degradation trajectories of a fleet of turbofan engines, designed to mimic run-to-failure scenarios under real flight conditions. The baseline health status of these engines is unknown. The dataset comprises eight distinct sets of simulated data collected from a total of 128 individual engines. These units are subject to five distinct failure modes, all of which have the potential to impact the flow and efficiency of the rotating sub-components. Table 1 presents a comprehensive summary of the flight classes and their corresponding failure modes in the available datasets [28].
Figure 3 depicts the intricate connections between various important operating factors, such as altitude (alt), throttle-resolver angle (TRA), flight cycles, total temperature, and the relative temperature decrease at the fan inlet (T2). This figure illustrates the interplay and impact of these variables, presenting data obtained from a dataset comprising six aircraft engines operating under different flight conditions. This representation emphasises the complex and interconnected aspects of the analysis and the detailed relationships that the model takes into account during the predictive process.
The datasets used for this research are referred to as DS01 and DS02, which depict simulated examples of typical flight cycles, as evidenced by the traces of the scenario-descriptor variables. Each simulated flight cycle consists of recordings of varying durations that capture the climb, cruise, and descent phases of flight once the aircraft's altitude exceeds 10,000 feet. These recordings correspond to the numerous flight paths undertaken by the aircraft, and the flight trajectories of the remaining units of the fleet exhibit similarities. Both the training dataset and the validation dataset are contained within each data file. The dataset comprises six distinct types of variables: the operative conditions ω, the measured signals x_s, the virtual sensors x_v, the engine health parameters θ, the RUL label, and auxiliary data such as the unit number u, flight class number Fc, flight cycle number c, and the binary health state. Additionally, the parameters ω, x_s, x_v, and θ contain specified variable names. A summary of these variables can be found in [28].
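The six variable groups can be summarised as a plain mapping a loader could iterate over. The group letters follow the N-CMAPSS naming summarised in [28]; the abbreviated descriptions here are a sketch, not the full per-group variable lists.

```python
# Sketch only: group letters per the N-CMAPSS summary in [28].
N_CMAPSS_GROUPS = {
    "W":   "operative conditions (alt, Mach, TRA, T2, ...)",
    "X_s": "measured signals",
    "X_v": "virtual sensors",
    "T":   "engine health parameters theta",
    "Y":   "RUL label",
    "A":   "auxiliary data: unit u, flight class Fc, cycle c, health state",
}
```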
The bootstrap forest encompasses the concepts of in-bag (IB) and out-of-bag (OOB) observations. The training set observations used to form a tree are referred to as IB observations; observations in the training set that are not used in constructing a tree are referred to as OOB observations. The model reports RASE values, which are the mean values for both IB and OOB observations, calculated over all trees. The OOB RASE is calculated for each tree by taking the square root of the sum of squared errors divided by the number of OOB observations. The data used to train an individual tree in the bootstrap method are randomly selected with replacement. When the sampling procedure draws as many observations as the training set contains, sampling with replacement leads to an expected proportion of unused observations equal to the reciprocal of the mathematical constant e. Every tree consists of over 1000 splits, together with rankings, losses, and observations associated with the OOB and IB data. The tree that exhibits the lowest OOB loss is assigned Rank 1.
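The 1/e claim can be verified by simulation: an observation is out-of-bag for a tree with probability (1 − 1/n)^n, which approaches e^(−1) ≈ 0.368 as n grows.

```python
import math
import random

def oob_fraction(n, n_trials=2000, seed=1):
    """Average fraction of observations left out of a bootstrap sample."""
    rng = random.Random(seed)
    unused = 0
    for _ in range(n_trials):
        in_bag = {rng.randrange(n) for _ in range(n)}  # n draws with replacement
        unused += n - len(in_bag)
    return unused / (n * n_trials)
```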
Table 2 presents the first 21 per-tree summaries of the bootstrap forest training process. OOB loss is a metric that quantifies the overall prediction error of a decision tree when applied to the rows that were not used for training; smaller values imply greater prediction accuracy. Among the 21 trees, tree 19 shows the highest prediction accuracy. Therefore, tree 19 and its splits are prioritised and chosen for inclusion in the fusion with the proposed neural networks. In the context of a continuous response, the predicted value for a specific observation is obtained by computing the average of its predicted values across the ensemble of individual trees. Within a categorical framework, the predicted likelihood of an observation is determined by averaging its predicted probabilities across the collection of individual trees; the observation is then assigned to the level with the highest predicted likelihood. The cumulative validation plot is a graphical representation of the fit statistics for the validation set, illustrating their fluctuation with the number of trees; it is available only when validation is deployed. The R-Square statistic serves as the sole measure of adequacy when evaluating a continuous response variable.
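The ranking step can be sketched as a simple sort on the per-tree OOB losses; the loss values below are illustrative, not those of Table 2.

```python
def rank_trees(oob_losses):
    """Indices sorted so the lowest OOB loss comes first (Rank 1)."""
    return sorted(range(len(oob_losses)), key=lambda i: oob_losses[i])

losses = [0.42, 0.31, 0.55, 0.28]   # illustrative per-tree OOB losses
ranking = rank_trees(losses)
best_tree = ranking[0]              # the tree carried into the fusion
```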

Discussion and Results
The methodology employed in generating the dataset involves the identification and analysis of the failure modes associated with the sub-components of the main engine. Figure 4 depicts a comprehensive methodological profiler for predicting the RUL of engines, which clearly showcases the positive, neutral, and negative properties and profilers that influence the entire RUL prediction process.

The components that exhibit a consistent decrease are the fan, low-pressure compressor (LPC), high-pressure compressor (HPC), low-pressure turbine (LPT), and high-pressure turbine (HPT). The simulation of deterioration effects involves the modification of the flow capacity and efficiency of the engine sub-components, which are denoted by the engine health parameters θ. The given number of hyperbolic tangent boosted nodes is three, while the sequence of additive models is scaled by a boosting learning rate set to 0.1. The penalty method is defined as a squared penalty, and the number of iterations is fixed at one. Figure 5 presents visual depictions of the highest-rated response contours for pairs of variables, obtained by extracting notable positive and negative features through profiler manipulation. The interactive contour profiling capability is used to visually improve the response surfaces. Every contour is accompanied by a dashed line that shows the direction of higher response values, giving a distinct indicator of orientation.
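The stated boosting settings can be collected into an explicit configuration. The dictionary keys below are illustrative names (not from any particular library), and the squared penalty is sketched as an L2 sum over coefficients.

```python
BOOST_CONFIG = {
    "tanh_nodes": 3,          # hyperbolic tangent boosted nodes
    "learning_rate": 0.1,     # scale applied to each additive model
    "penalty": "squared",
    "iterations_per_step": 1,
}

def squared_penalty(coefficients):
    """The squared (L2) penalty: sum of squared coefficients."""
    return sum(c * c for c in coefficients)
```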
The prediction profiler displays profile traces for every x variable. A profile trace is the predicted response as a single variable is altered while all other variables are held at their current levels. The profiler recalculates the traces and predicted responses instantaneously when the value of an x variable is modified. The vertical dashed line for each x variable marks its current value or setting, and the value shown above each factor name is the current value of the corresponding x variable. The horizontal dashed line represents the current predicted value of each y variable given the current values of the x variables. The profiles reveal how the x factors affect the predicted values: some variables show profiles with positive slopes, while others display negative slopes. For instance, the variable T40 has an upward trend, implying a clear relationship between the total temperature at the burner outlet, T40, and the predicted median value: as the total temperature rises, the predicted median value rises with it. By contrast, T48, the total temperature at the outlet of the HPT, has a negative slope, indicating that as the temperature rises, the median value declines.

The fitting procedure of the neural network incorporates a validation technique. Various validation methods can be utilised, such as holdback, K-fold, and the incorporation of a validation column. To develop the model, the model parameters are subjected to a penalty, and the penalties imposed on these parameters are optimised using the validation set. The validation method chosen here is a random holdback: the original data are divided at random into training and validation sets, with a specified proportion of the original dataset, known as the holdback, allocated to validation. The random selection in this study is grounded in the principles of stratified sampling, which divides the model factors into distinct strata. This approach yields training and validation sets that are more balanced than those obtained through simple random sampling. Figure 6 displays plots comparing the actual values with the predicted values for the training and validation procedures, with the bootstrap forest method shown in red, the boosted tree method in green, and the neural boosted method in blue. The data points lie close to the line, indicating strong agreement between the predicted and actual values.

Boosting constructs a large neural network model by training a sequence of smaller models, each fitted to the scaled residuals of the preceding model; the models are then aggregated to form the overall final model. The technique employs validation to determine the appropriate number of component models to fit, ensuring that it does not exceed the set limit. Boosting often exhibits faster computational speed than training a single large model; however, the base model must consist of a single layer with one or two nodes, and the advantage of quicker fitting may be negated if a substantial number of models is fitted. Learning rates approaching 1 lead to quicker convergence on a final model but also have a greater tendency to overfit the data. To address the susceptibility of neural networks to overfitting, the fitting procedure includes a penalty on the likelihood. The penalty term is λp(βi), where λ is the regularisation parameter and p(·) is the penalty function of the estimated parameters. Validation is used to determine the optimal value of the penalty parameter, and Table 3 provides an overview of several penalty techniques. Each of the training, validation, and test sets is reported with a range of prognostic measures, which vary depending on whether the response is categorical or continuous. For continuous responses, the cross-validation statistics comprise a generalised R², an entropy R², the (negative) loglikelihood, the root mean squared error (RMSE), and the mean absolute deviation. The measures of fit for both the training and validation processes are summarised in Table 5 as follows.
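The stratified random holdback described above can be sketched in a few lines. This is an illustrative assumption only: the synthetic data, the choice of a 33% holdback, and the quartile binning used to form strata are not taken from the paper.

```python
# Sketch of a stratified random-holdback validation split: a key model
# factor is binned into strata so that the training and validation sets
# stay balanced relative to simple random sampling.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))                  # stand-in model factors
y = X @ rng.normal(size=5) + rng.normal(scale=0.2, size=1000)

# Divide one factor into quartile strata (an assumed stratification scheme).
strata = np.digitize(X[:, 0], np.quantile(X[:, 0], [0.25, 0.5, 0.75]))

X_tr, X_val, y_tr, y_val = train_test_split(
    X, y,
    test_size=0.33,           # holdback proportion (assumed)
    stratify=strata,          # balance the split across the strata
    random_state=0)
```

Stratifying on the binned factor guarantees each stratum is represented in both sets in roughly the holdback proportion, which is the balance property the text attributes to stratified sampling.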

Measures Description

Generalised R²: Extends the concept of the coefficient of determination (R²) to models that do not fit within the framework of linear regression. It quantifies the proportion of variation accounted for by the model while adjusting for the number of predictors, and is a useful metric for assessing goodness of fit across different models.

Entropy R²: A statistical measure commonly employed in the analysis of categorical data. It is founded on the concept of entropy from information theory and measures the decrease in uncertainty achieved by the model; higher values indicate a model that explains the variability in the categories more effectively.

R²: Gives the RSquare for the model.

RASE: Gives the root average squared error. When the response is nominal or ordinal, the differences are between 1 and p, where p is the predicted probability of the response level.

Mean Abs Dev (MAD): Quantifies the average absolute difference between predicted and observed values, irrespective of direction. It is a robust indicator of how these errors are distributed.

Misclassification rate (MR): The proportion of inaccurate predictions out of the total number of predictions made in a classification problem; a simple measure of how often the model errs.

-Loglikelihood: Gives the negative of the loglikelihood.

SSE: Gives the error sum of squares. It is only available when the response is continuous.

Sum Freq: Represents the total number of observations analysed. When a neural network frequency variable Freq is supplied, Sum Freq sums the values of the frequency column.

The congruity in RASE and MAD values between the training and validation stages implies that the model generalises well to unfamiliar input. This is a crucial property of prognostic models, as it indicates their dependability in practical situations where the model must generate precise forecasts on data it was not exposed to during training. The low SSE shows that the model's predictions closely align with the actual observations, providing further evidence of its efficiency.
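For a continuous response, the fit measures listed above reduce to simple formulas. The following sketch computes the main ones with plain NumPy on small illustrative values (the example numbers are ours, not from the paper).

```python
# Continuous-response fit measures from the table above:
# SSE, RASE (root average squared error), MAD, and RSquare.
import numpy as np

def fit_measures(actual, predicted):
    resid = actual - predicted
    sse = float(np.sum(resid ** 2))              # error sum of squares
    rase = float(np.sqrt(np.mean(resid ** 2)))   # root average squared error
    mad = float(np.mean(np.abs(resid)))          # mean absolute deviation
    ss_tot = float(np.sum((actual - actual.mean()) ** 2))
    r2 = 1.0 - sse / ss_tot                      # RSquare
    return {"R2": r2, "RASE": rase, "MAD": mad, "SSE": sse}

actual = np.array([10.0, 12.0, 14.0, 16.0])      # illustrative observations
predicted = np.array([10.5, 11.5, 14.5, 15.5])   # illustrative predictions
m = fit_measures(actual, predicted)
# here SSE = 1.0, RASE = 0.5, MAD = 0.5, R2 = 0.95
```

Because RASE and MAD are both averages of the residuals, close agreement between their training and validation values is exactly the generalisation signal the text describes.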
To determine the effectiveness and robustness of the proposed methodology, a comparative analysis was performed against existing methodologies described in the literature:

• A method was presented by [5] that uses data fusion with stage division to forecast the RUL of a system. This ensemble RUL prediction method has been shown to achieve considerable gains in accuracy. The utilisation of bootstrap forests and hyperbolic tangent neural networks in our approach yields similar results while providing increased adaptability and resilience in managing nonlinear deterioration patterns.
• Ref. [2] conducted a comprehensive analysis of predictive maintenance for defence fixed-wing aircraft, emphasising the importance of flexible and precise prognostic models. The hybrid model we have developed successfully integrates numerous prediction algorithms to ensure both high accuracy and reliability.
• Ref. [29] criticised the reliability prediction techniques used in avionics applications, highlighting the constraints of conventional methods in dynamic and intricate settings. The proposed neural boosted model overcomes these constraints by providing a prediction framework that is more dynamic and adaptive.
• The study conducted by [30] investigated the use of multi-task learning to forecast the RUL in different working settings, showing that advanced learning approaches can enhance the accuracy of RUL predictions. Our findings demonstrate that the combination of bootstrap forests and neural networks can produce comparable or better performance, especially when dealing with diverse operational scenarios.
• The significance of feature fusion and similarity-based prediction models for precise estimation of RUL was highlighted by [1]. Our model improves upon current approaches by using hyperbolic tangent functions and bootstrap aggregation, resulting in increased prediction accuracy.
Table 7 demonstrates the enhanced performance of the suggested methodology compared to established models through statistical comparisons.
The proposed approach addresses engines with lifespans of hundreds of cycles. It incorporates a sophisticated combination of bootstrap forest and hyperbolic tangent boosted neural networks and has been developed and researched extensively.
Overall, the NtanH(3)NBoost(20) model represents a significant advancement in the precision and dependability of data-based predictions for aircraft systems. The model demonstrated a high level of accuracy, correctly predicting observed flight cycles during validation with an accuracy rate of 95-97%. It also paves the way for future improvements: the encouraging outcomes prompt continuous refinement of both the model itself and the underlying physical techniques, with the goal of developing even more accurate hybrid prognostic models. Strengthening predictive maintenance procedures in this way will ultimately lead to safer, more efficient, and more dependable aircraft operations through ongoing improvement and adaptation.

Figure 1. Illustration of the initial branch of a complete bootstrap forest, which is used for predicting the RUL. Each branch in the forest has a comparable structure.

Figure 2. Display of a neural network diagram with 45 inputs.

Figure 3. Interconnected relationships between scenario descriptors such as Mach number, throttle resolver angle (TRA), and altitude (alt), and operative parameters such as flight cycles, HPT efficiency, total temperature, and its relative temperature drop. (a) Cycle versus HPT efficiency. (b) Cycle versus relative temperature drop. (c) Throttle resolver angle time-series variations. (d) Engine Mach number variations. (e) Flight altitude changes. (f) Engine HPT efficiency variation against individual engine units.

Figure 4. Visual representation of prediction profiler traces involving numerous inputs.

Figure 5. Representation of the most highly rated extracted features used for predicting RUL. (a) Interrelated profiler relationship between ratio of fuel flow to Ps30 (phi) and total temperature at burner outlet (T40); (b) Interrelated profiler relationship between flow out of LPT (W50) and total temperature at HPT outlet (T48); (c) Interrelated profiler relationship between flow out of LPC (W22) and static pressure at HPC outlet (Ps30); (d) Interrelated profiler relationship between W50 and T40.

Figure 6. Illustration of the prediction outcomes and their accompanying metrics, which include the bootstrap forest, boosted tree, and neural boosted network. (left) Process of training. (right) The validation process.
The NtanH(3)NBoost(20) model accounts for almost all of the variability in the target variable. The marginal decline in the validation R² indicates a remarkably high degree of model precision and ability to generalise. The RASE scores for training (1.2688147) and validation (1.2710539) are comparably low, indicating that the model's predictions closely match the actual data points, and the agreement between the two values suggests consistent, reliable performance across datasets. The SSE, the sum of squared differences between the predicted and actual values, is 5,649,332.2 for the training data and 2,834,219.7 for the validation data. These figures, together with the strong R² values, support the correctness of the model.
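A quick consistency check on the reported figures, under the standard assumption that RASE = sqrt(SSE / n): the implied row counts of the training and validation sets can be recovered from the RASE and SSE pairs, and their ratio should match the random holdback proportion.

```python
# Back out the implied sample sizes from the reported RASE and SSE,
# assuming RASE = sqrt(SSE / n)  =>  n = SSE / RASE**2.
def implied_n(sse: float, rase: float) -> float:
    return sse / rase ** 2

n_train = implied_n(5_649_332.2, 1.2688147)   # training SSE and RASE from the text
n_valid = implied_n(2_834_219.7, 1.2710539)   # validation SSE and RASE from the text

# Fraction of rows held back for validation, implied by the reported figures.
holdback = n_valid / (n_train + n_valid)
```

The implied holdback comes out very close to one third, consistent with a roughly 2:1 random training/validation split.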

Table 1 .
Overview of the aircraft turbine dataset.

Table 2 .
The description of the initial 21 branch trees in the bootstrap forest training operations.

Table 3 .
Overview of several penalty mechanisms used for differentiation. One approach is recommended when a significant number of x factors are believed to be influential in enhancing the predictive capacity of the model. If a substantial number of x variables is present and only a subset of them is suspected to contribute significantly to the model's predictive capacity, either of two alternative approaches can be employed. A further option is suitable when there is a substantial volume of data and expeditious completion of the fitting procedure is desired; however, it may result in models with inferior predictive accuracy compared to models that incorporate a penalty.

The authors of this research article conduct a comparative analysis between the proposed methodology and established prediction approaches, namely K nearest neighbours (kNN) and the boosted tree. The neural boosted model shows outstanding performance, as indicated by its consistently high training RSquare values across several folds and its superior validation RASE score: the mean validation RSquare for neural boosted is 0.9968, and the mean RASE is 1.2711. The elapsed-time data indicate that the neural boosted models took the longest to fit. Table 4 displays the comprehensive comparison of forecast outcomes in relation to the various methods.
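The cross-validated comparison of candidate learners described above can be sketched as follows. This is an illustrative assumption throughout: the data are synthetic, and scikit-learn's kNN, gradient-boosted tree, and tanh neural network stand in for the paper's implementations.

```python
# Illustrative comparison of candidate learners by cross-validated RSquare,
# mirroring the kNN / boosted-tree / neural-boosted comparison in the text.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 6))                       # stand-in sensor features
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=400)

models = {
    "kNN": KNeighborsRegressor(n_neighbors=5),
    "boosted tree": GradientBoostingRegressor(random_state=0),
    "neural": MLPRegressor(hidden_layer_sizes=(16,), activation="tanh",
                           max_iter=3000, random_state=0),
}

# Mean RSquare across 5 folds for each learner.
scores = {name: float(cross_val_score(m, X, y, cv=5, scoring="r2").mean())
          for name, m in models.items()}
```

Reporting the fold-averaged RSquare (and, analogously, RASE) per learner is the comparison summarised in Table 4; the elapsed-time observation in the text reflects that neural network fits are typically the slowest of the three.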

Table 4 .
A compilation of the outcomes obtained from comparing predictions.

Table 5 .
Training and validation metrics of fit.

Table 6
displays the results of the proposed NtanH(3)NBoost(20) measure of fit for both the training and validation phases. The R² values are remarkably high for both the training (0.9967847) and validation (0.9967703) datasets.