Adaptively Lightweight Spatiotemporal Information-Extraction-Operator-Based DL Method for Aero-Engine RUL Prediction

Accurate prediction of machine remaining useful life (RUL) plays a crucial role in reducing human casualties and economic losses. The ability to handle spatiotemporal information contributes to improving RUL prediction performance. However, most existing models for spatiotemporal information processing are not only structurally complex but also lack adaptive feature extraction capabilities. Therefore, a lightweight operator with adaptive spatiotemporal information extraction ability, named Involution GRU (Inv-GRU), is proposed for aero-engine RUL prediction. Involution, an adaptive feature extraction operator, replaces the information connections in the gated recurrent unit to achieve adaptive spatiotemporal information extraction and reduce the number of parameters. Thus, Inv-GRU can effectively extract the degradation information of the aero-engine. For the RUL prediction task, an Inv-GRU-based deep learning (DL) framework is then constructed, in which features extracted by Inv-GRU and several human-made features are separately processed to generate health indicators (HIs) from multi-source raw aero-engine data. Finally, fully connected layers are adopted to reduce the dimension and regress the RUL based on the generated HIs. By applying the Inv-GRU-based DL framework to the Commercial Modular Aero Propulsion System Simulation (C-MAPSS) datasets, successful prediction of aero-engine RUL has been achieved. Quantitative comparative experiments demonstrate the advantage of the proposed method over other approaches in terms of both RUL prediction accuracy and computational burden.


Introduction
Remaining useful life (RUL) prediction, a significant research domain in prognostics and health management (PHM) [1], offers the potential to forecast the future degradation trajectory of equipment based on its current condition. By transforming scheduled maintenance into proactive operations, it substantially mitigates the risks of personnel casualties and economic losses resulting from mechanical failures.
With the increasing complexity and sophistication of equipment, conventional PHM methods based on dynamic models, expert knowledge, and manual feature extraction have become increasingly limited. Nowadays, fueled by rapid advancements in technologies such as sensors, the Internet of Things, and artificial intelligence, attention has been drawn to DL-based techniques, which offer remarkable performance for RUL prediction [2][3][4]. Therefore, with the accumulation of industrial data, conducting DL-based RUL prediction research for equipment, given DL's powerful feature extraction capabilities, has not only emerged as a hot research topic in academia but also holds significant practical implications for industry. The authors of [34] introduced a new LSTM variant to predict the RUL of aircraft engines by combining autoencoders and RNNs. The proposed method conducted the pooling operation with LSTM's gating mechanism while retaining the convolutional operations, allowing for parallel processing. Dulaimi et al. [35] proposed a parallel DL framework based on CNN and LSTM to extract the temporal and spatial features from raw measurements. To solve the problem of inconsistent inputs, Xia et al. [36] proposed a CNN-BLSTM method with the ability to process different time scales. Xue et al. [37] introduced a data-driven approach for predicting the RUL that incorporates two parallel pathways: one combines a multi-scale CNN and BLSTM, while the other utilizes only BLSTM.
Research works based on LSTM variants and convolution operators have achieved significant success in RUL prediction, but some gaps remain. The convolutional kernel exhibits redundancy in the channel dimension, and the extracted features lack the ability to adapt flexibly to the input itself [38]. Moreover, the ability to capture flexible spatiotemporal features not only saves computational resources but also enables the extraction of rich features, thereby improving the accuracy of mechanical RUL prediction. Additionally, the computational burden is an important consideration for mechanical RUL prediction. Therefore, it is worth investigating how to enhance the spatiotemporal capturing capability of prediction models while minimizing model parameters to improve prediction speed.
Consequently, considering the aforementioned limitations, a lightweight operator with adaptive feature capturing capabilities named involution GRU (InvGRU) is proposed, and a deep learning framework is constructed based on this operator to predict the RUL of aircraft engines. The RUL prediction results of the C-MAPSS dataset [24] demonstrate that the proposed method outperforms other publicly available methods in terms of prediction accuracy and computational burden.
Below are the contributions of the article:
• Introducing InvGRU: We propose a novel operator called InvGRU, which replaces the connection operator in the GRU and allows for adaptive capture of spatiotemporal information based on the input itself. InvGRU demonstrates the ability to extract spatiotemporal information with fewer parameters compared with other models.
• Constructing a deep learning framework: Building upon InvGRU, we construct a deep learning framework that achieves higher prediction accuracy.
• Experimental validation: The experimental results on aircraft engine RUL prediction validate the effectiveness and superiority of the proposed InvGRU-based deep learning framework. It outperforms other models in terms of prediction accuracy and showcases the potential for improved RUL estimation in practical applications.
The outline of the article is as follows. Section 1 introduces the research topic. Section 2 presents a concise explanation of the fundamental principles of the GRU and involution. In Section 3, the novel operator InvGRU, which has the adaptive spatiotemporal information extraction ability, is introduced. Then, the proposed methods are thoroughly validated and compared through experiments on the C-MAPSS dataset in Section 4. Finally, Section 5 presents the conclusion.

Inverse Convolution
Thanks to its spatial invariance and channel specificity, the CNN has been widely employed for feature extraction. The convolution operation can be written as

Y_{i,j,c_o} = \sum_{c_i=1}^{C_i} \sum_{(u,v) \in \Delta_K} F_{c_o, c_i, u+\lfloor K/2 \rfloor, v+\lfloor K/2 \rfloor} X_{i+u, j+v, c_i}

where X ∈ R^{H×W×C_i} and Y ∈ R^{H×W×C_o} are the input tensor and the output tensor, respectively; F ∈ R^{C_o×C_i×K×K} denotes the convolution kernel; C_o, C_i, and K denote the output channel number, input channel number, and kernel size, respectively; and H and W represent the spatial dimensions of the output and input channels. Although sharing spatial parameters alleviates some computational burden, it also introduces certain drawbacks. For instance, the extracted features tend to be relatively simplistic, and the convolution kernel lacks flexibility in adapting to the input data [38]. Furthermore, the convolutional kernel exhibits redundancy in the channel dimension [38]. The recently proposed inverse convolutional neural network (INN) [38] addresses these limitations in a manner that preserves channel invariance and spatial specificity. In the channel dimension, INN shares involution kernels, which allows more flexible modeling of the kernels in the spatial dimension, thereby exhibiting characteristics opposite to those of the CNN. The mathematical expression of INN is

Y_{i,j,c} = \sum_{(u,v) \in \Delta_K} H_{i,j, u+\lfloor K/2 \rfloor, v+\lfloor K/2 \rfloor, \lceil cG/C \rceil} X_{i+u, j+v, c}

where H ∈ R^{H×W×K×K×G} represents the involution kernel and G indicates that all C channels share G involution kernels, with G ≪ C. Unlike the CNN, the INN does not use fixed weight matrices as learnable parameters; instead, it generates the corresponding involution kernels from the input features:

H_{i,j} = W_1 \cdot \mathrm{ReLU}(\mathrm{BN}(W_0 \cdot X_{\Psi_{i,j}}))

where W_0 ∈ R^{(C/r)×C} and W_1 ∈ R^{(K×K×G)×(C/r)} denote the linear transformation matrices, r is the channel reduction rate, BN is batch normalization, ReLU is the ReLU activation function, and X_{Ψ_{i,j}} denotes the features at the index set Ψ_{i,j} of coordinate (i, j). The principle of INN is shown in Figure 1, which illustrates the case G = 1.
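As an illustration, the kernel-generation and multiply-add steps of INN can be sketched in NumPy. This is a simplified sketch (batch normalization omitted, loops instead of vectorized gathers), and all function and variable names are ours, not from the original work:

```python
import numpy as np

def involution2d(x, w0, w1, K, G):
    """Sketch of a 2D involution (INN): the kernel at each pixel is
    generated from that pixel's own feature vector via W0/W1, then
    shared by the C/G channels of each group. BN is omitted."""
    H, W, C = x.shape
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    y = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            h = np.maximum(w0 @ x[i, j], 0.0)        # ReLU(W0 x)
            kernel = (w1 @ h).reshape(K, K, G)       # per-pixel kernel
            patch = xp[i:i + K, j:j + K, :]          # (K, K, C) neighborhood
            for g in range(G):
                cs = slice(g * (C // G), (g + 1) * (C // G))
                y[i, j, cs] = np.einsum('uv,uvc->c',
                                        kernel[:, :, g], patch[:, :, cs])
    return y
```

With C = 4 input channels, G = 2 groups, K = 3, and r = 2, w0 has shape (C/r, C) = (2, 4) and w1 has shape (K·K·G, C/r) = (18, 2); the output keeps the input's H × W × C shape, matching the channel-invariance property described above.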


GRU
The GRU, which has fewer parameters than the LSTM, has only a reset gate r_t and an update gate z_t. The structure of the GRU is demonstrated in Figure 2. The output h_t of the GRU at the current time step t can be represented by the following equations:

z_t = σ(w_z · [h_{t−1}, x_t] + b_z)
r_t = σ(w_r · [h_{t−1}, x_t] + b_r)
h̃_t = tanh(w_h · [r_t ⊙ h_{t−1}, x_t] + b_h)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t

where the w terms denote the weight matrices of the input data x_t and recurrent data h_{t−1}; the b terms are the biases; h̃_t represents the candidate hidden state; ⊙ is the element-wise (dot) product operator; tanh and σ are the activation functions; and h_t denotes the output data. Figure 2. Schematic diagram of GRU, in which ⊕ is the addition operator.
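A minimal NumPy sketch of one GRU time step following these equations (the weight names w_z, w_r, w_h and the dict layout are ours):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, w, b):
    """One GRU step: update gate z_t, reset gate r_t, candidate state,
    and the gated blend that produces the new hidden state h_t."""
    xh = np.concatenate([h_prev, x_t])               # [h_{t-1}, x_t]
    z = sigmoid(w['z'] @ xh + b['z'])                # update gate
    r = sigmoid(w['r'] @ xh + b['r'])                # reset gate
    cand = np.tanh(w['h'] @ np.concatenate([r * h_prev, x_t]) + b['h'])
    return (1.0 - z) * h_prev + z * cand             # new hidden state
```

Because z_t lies in (0, 1) and the candidate state in (−1, 1), the hidden state stays bounded, which is part of what makes gated units stable over long sequences.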



Proposed InvGRU
Using convolutional operations to learn representations from multi-source raw data has been shown to outperform hand-crafted features in machine diagnosis and prognosis [28][29][30]. Recent studies have proposed combining RNN models with CNN representations to capture spatio-temporal information [35][36][37]. This approach improves the model's ability to understand patterns and relationships over space and time, leading to better analysis and prediction in various domains. A novel operator called Involution GRU (InvGRU) is proposed to address the limitations of the convolution operator. InvGRU introduces involution operations in both the input-to-state and state-to-state transitions, enabling adaptive feature extraction from multi-source raw data while reducing model parameters. This approach enhances the model's ability to capture spatio-temporal information effectively. The diagram of InvGRU is shown in Figure 3. To enhance the feature processing of one-dimensional time series data, a one-dimensional involution algorithm based on one-dimensional vectors as inputs, namely 1D-INN, is adopted. The mathematical expression of 1D-INN is

Y_{i,c} = \sum_{u \in \Delta_K} H_{i, u+\lfloor K/2 \rfloor, \lceil cG/C \rceil} X_{i+u, c}

H_i = W_1 \cdot \mathrm{Mish}(\mathrm{BN}(W_0 \cdot X_{\Psi_i}))

where H ∈ R^{H×K×G} is the kernel of 1D-INN, X_{Ψ_i} denotes the features at the index set of coordinate (i, 1), W_0 ∈ R^{(C/r)×C} and W_1 ∈ R^{(K×G)×(C/r)} are the weight connection matrices used to make a linear transformation, and Mish is the Mish activation function. The other parameters are the same as in the raw INN. We enhance the feature representation using INN to incorporate longer temporal convolutions, allowing for the prediction of RUL at a larger temporal scale. In this article, the INN kernel size is set to 5, the max-pooling size to 2, and r to 2. InvGRU, similar to the conventional GRU, comprises update gates, reset gates, and cells.
The forward process of InvGRU, responsible for computing the output, is defined by the following equations:

Update gates: z_t = σ(w_{xz} * x_t + w_{hz} * h_{t−1} + b_z)
Reset gates: r_t = σ(w_{xr} * x_t + w_{hr} * h_{t−1} + b_r)
Cells: h̃_t = tanh(w_{xh} * x_t + r_t ⊙ (w_{hh} * h_{t−1}) + b_h)
Cell outputs: h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t

where * is the 1D-INN operator; the w and b terms are the learnable weights and biases, respectively; and the other parameters are the same as in the GRU.
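The gate computations above can be sketched in NumPy by swapping the GRU's dense products for a 1D involution. This is our own simplified sketch (single channel, G = 1, BN omitted, hypothetical parameter layout), not the authors' implementation:

```python
import numpy as np

def mish(a):
    return a * np.tanh(np.log1p(np.exp(a)))          # Mish activation

def inv1d(x, w0, w1, K):
    """1D-INN sketch for a single channel (G = 1, BN omitted): the
    kernel at each position i is generated from x[i] itself."""
    pad = K // 2
    xp = np.pad(x, pad)
    y = np.zeros_like(x)
    for i in range(x.shape[0]):
        kernel = w1 @ mish(w0 * x[i])                # position-specific kernel
        y[i] = kernel @ xp[i:i + K]
    return y

def invgru_step(x_t, h_prev, p):
    """One InvGRU step: the GRU's dense multiplications are replaced by
    the 1D-INN operator. p maps each gate to its (w0, w1, K) triple."""
    s = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = s(inv1d(x_t, *p['zx']) + inv1d(h_prev, *p['zh']))        # update
    r = s(inv1d(x_t, *p['rx']) + inv1d(h_prev, *p['rh']))        # reset
    cand = np.tanh(inv1d(x_t, *p['hx']) + r * inv1d(h_prev, *p['hh']))
    return (1.0 - z) * h_prev + z * cand
```

The key difference from a plain GRU is that each gate's weights are not fixed: the effective kernel at every time position is generated from the signal at that position, which is what gives InvGRU its adaptive spatiotemporal extraction.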

The Adopted DL Framework
Based on the proposed InvGRU, a DL framework is adopted to estimate the aero-engines' RUL. The framework diagram in Figure 4 integrates HIs from both neural networks (NNs) and human-made features, enabling a comprehensive approach to RUL prediction. First, InvGRU is employed to extract features from multi-source raw measurements, including multiple sensors' data and engine operational condition (OC) information. The attention weights [39] are calculated from the obtained hidden features and combined with them, and the merged features are input into the following FC layers to generate the HIs from the NN. In the next step, commonly used handcrafted features such as the mean and trend coefficient are calculated from the raw data. The mean represents the average value of a window, while the trend coefficient corresponds to the slope coefficient derived from linear regression on the windowed time series. To obtain the HIs of the human-made features, these handcrafted features are fed into a separate fully connected (FC) layer. Finally, the HIs obtained from the neural network and the human-made features are concatenated to form the HI set. This concatenated set is input into the regression layer, which predicts the RUL.
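The two handcrafted features can be computed per window as follows (a straightforward sketch; the function name is ours):

```python
import numpy as np

def handmade_features(window):
    """Mean and trend coefficient of one windowed sensor series: the
    trend coefficient is the slope of a least-squares line fitted to
    the window over its time index."""
    t = np.arange(window.shape[0])
    slope, _ = np.polyfit(t, window, 1)   # degree-1 fit: (slope, intercept)
    return float(window.mean()), float(slope)
```

For a perfectly linear window such as [1, 2, 3, 4], this yields mean 2.5 and trend coefficient 1.0.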
During training, suppose that N represents the number of samples. The loss function is the mean square error (MSE), which evaluates the similarity between the predicted RUL R̂ul_i and the true RUL Rul_i of each sample i. The MSE is calculated using Equation (15) as follows:

MSE = \frac{1}{N} \sum_{i=1}^{N} (R̂ul_i − Rul_i)^2    (15)

Adam is used as the optimization method to tune the parameters θ of the proposed method based on the error gradients during back-propagation. Dropout, a technique for preventing overfitting, is applied to the model during training. Table 1 shows the hyper-parameters of the proposed DL framework based on InvGRU.
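Equation (15) amounts to the following one-liner (a sketch with our own function name):

```python
import numpy as np

def mse_loss(rul_pred, rul_true):
    """MSE of Equation (15): the mean squared difference between the
    predicted and true RUL over the N samples."""
    d = np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float)
    return float(np.mean(d ** 2))
```

For example, predictions [100, 50] against true labels [98, 54] give (4 + 16) / 2 = 10.0.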

Evaluation Indexes
The RUL prediction performance of the method is quantitatively characterized using the score and the root mean square error (RMSE), which are defined by the following formulas:

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d_i^2}

Score = \sum_{i=1}^{N} s_i, \quad s_i = \begin{cases} e^{-d_i/13} - 1, & d_i < 0 \\ e^{d_i/10} - 1, & d_i \geq 0 \end{cases}

where d_i = R̂ul_i − Rul_i is the error between the predicted and actual RUL of sample i. These metric values are inversely proportional to the RUL prediction performance; in other words, a lower value indicates better model performance. The score penalizes delayed predictions more heavily than RMSE, as shown in Figure 5, making it more aligned with engineering practice. Therefore, the score is the more reasonable metric, especially when the RMSE values are close. In the figure, the vertical axis represents the value of RMSE and score, while the horizontal axis represents the error between the predicted and actual RUL.
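Both metrics are easy to compute directly (a sketch with our own function names):

```python
import numpy as np

def rmse(pred, true):
    """Root mean square error over N test engines."""
    d = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def score(pred, true):
    """C-MAPSS scoring function: early predictions (d < 0) contribute
    e^(-d/13) - 1, late predictions (d >= 0) contribute e^(d/10) - 1,
    so late (delayed) predictions are penalized more heavily."""
    d = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    return float(np.sum(np.where(d < 0,
                                 np.exp(-d / 13.0) - 1.0,
                                 np.exp(d / 10.0) - 1.0)))
```

An error of +10 cycles (late) contributes e^1 − 1 ≈ 1.72 to the score, while −10 cycles (early) contributes only e^{10/13} − 1 ≈ 1.16, which is the asymmetry visible in Figure 5.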

The Details of the C-MAPSS Dataset
The C-MAPSS dataset, developed by NASA, simulates degradation data for turbofan engines, whose structure is shown in Figure 6. The C-MAPSS dataset can be divided into four subsets based on different operating conditions and fault modes, as described in Table 2. The 21 simulation outputs of C-MAPSS are listed in Table 3, in which ~, ↑, and ↓ represent stable, upward, and downward trends of the sensor measurements, respectively.


Each subset of the dataset consists of training data, testing data, and corresponding actual RUL values. The training data comprise all engine data from a healthy state to failure, while the testing data include data from engines that were operated before failure. Both the training and testing datasets include a diverse set of engines with varying initial health states. This results in variations in the operating cycles of different engines within the same dataset, reflecting the heterogeneous nature of the engine population. To demonstrate the effectiveness of the proposed method, experiments are conducted on all subsets of the dataset.

Data Preprocessing
Firstly, not all sensor measurements are included as inputs to the RUL prediction model. Some stable measurements (sensors 1, 5, 6, 10, 16, 18, and 19) are excluded in advance; these sensor measurements contain limited degradation information and are not suitable for predicting the RUL. Additionally, operating condition information affects the predictive capability of the model. Therefore, the 14 selected sensor measurements and the operating condition information serve as the final input to the model. Secondly, we segment the data using the technique demonstrated in Figure 7, where T, l, and m represent the total lifecycle, the window size, and the sliding step, respectively. The size of the i-th input is l × n, where n represents the dimension of the final input of the proposed model. The RUL label of the i-th window is T − l − (i − 1) × m. Based on the results of the experiments, the sliding window size l is set to 30 and the sliding step m is set to 1. Finally, the linear piecewise RUL technique is used to construct the RUL labels as follows:

Rul = \begin{cases} Rul_{max}, & Rul > Rul_{max} \\ Rul, & \text{otherwise} \end{cases}    (19)

where the preset Rul_max is 125.
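The sliding-window segmentation and piecewise RUL labeling can be sketched as follows (function and variable names are ours):

```python
import numpy as np

def segment(series, l=30, m=1, rul_max=125):
    """Sliding-window segmentation with linear piecewise RUL labels.
    series is a (T, n) run-to-failure record; the i-th window (i = 1,
    2, ...) covers rows (i-1)*m : (i-1)*m + l and is labeled
    min(T - l - (i - 1) * m, rul_max), per Equation (19)."""
    T = series.shape[0]
    windows, labels = [], []
    i = 1
    while (i - 1) * m + l <= T:
        start = (i - 1) * m
        windows.append(series[start:start + l])
        labels.append(min(T - l - start, rul_max))
        i += 1
    return np.stack(windows), np.array(labels)
```

For a toy record with T = 5 cycles and l = 3, m = 1, this produces three windows labeled 2, 1, and 0; the rul_max clamp only takes effect early in the life of long records, where degradation has not yet begun.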

Figure 7. Processing of data segmentation (full life length T, window size l, step size m).

The Analysis and Comparison of RUL Prediction Results
First, the proposed InvGRU-based DL framework is trained using the training sets of all subsets. Then, the test sets of the subsets are adopted to evaluate the predictive performance of the InvGRU-based DL framework. The prediction results are shown in Figures 8-12. In the figures, the x-axis is the tested aircraft engine unit number and the y-axis denotes the RUL cycles. The predicted RUL and the actual RUL are represented by the solid blue line and dashed green line, respectively.



From Figures 8-12, it can be observed that, across all subsets (FD001, FD002, FD003, and FD004), the proposed model produces RUL predictions that align closely with the actual RUL for the majority of the tested aircraft engine units. This is evident from the substantial overlap between the blue and green data points, indicating the high accuracy of the proposed model in predicting RUL. Upon closer examination, Figure 8 shows a closer proximity between the predicted RUL and the actual RUL compared with Figures 9-11.
This indicates that the proposed model achieves its best performance on the FD001 dataset. The RUL prediction performance of the proposed method is also superior on the FD003 dataset compared with the FD002 dataset, while it performs worst on the FD004 dataset. Moreover, the prediction effectiveness is higher on the FD001 and FD003 datasets than on the FD002 and FD004 datasets, highlighting its superior performance under a single operating condition (FD001 and FD003) compared with multiple operating conditions (FD002 and FD004). This is attributed to the relatively simpler degradation trend of engines under a single operating condition, coupled with significant overlap between the training and testing sets. Furthermore, the accuracy of the RUL prediction results is higher for the FD001 dataset than for the FD003 dataset, and higher for the FD002 dataset than for the FD004 dataset. This suggests that, under consistent operating conditions, the proposed model exhibits better RUL prediction performance for single failure modes (FD001 and FD002) than for composite failure modes (FD003 and FD004). Additionally, the RUL prediction results on the FD003 dataset surpass those on the FD002 dataset, indicating that the composite failure modes in the C-MAPSS dataset have less influence on the RUL prediction of the proposed model than the operating conditions of the aircraft engine units.
To further show the performance of the InvGRU-based DL framework in predicting the RUL of individual engine units over the whole degradation process, four test engine units randomly selected from all subsets were used to showcase the full-life estimation process shown in Figures 12-15. The blue line in the figures represents the predicted RUL (PR) of the engine unit, while the red line represents the actual RUL (AR). The green bars represent the absolute error (AE) between PR and AR for each cycle. Additionally, the mean of the absolute errors (MAE) between PR and AR across all cycles of each engine unit was computed to evaluate the average prediction error. It can be observed from Figures 12-15 that the predicted RUL of the selected test engine units closely aligns with the actual RUL, effectively revealing their degradation trends. Moreover, considering the average MAE values in Figures 12-15, the average MAE on the FD001 dataset is 10.7, while the average MAEs on the FD002, FD003, and FD004 datasets are 12.1, 15.2, and 11.3, respectively. This indicates that the proposed model exhibits better RUL prediction performance on the FD001 dataset than on the FD002, FD003, and FD004 datasets. As the number of engine cycles gradually increases, the degradation process begins to manifest and worsen. For most engines, the accuracy of the RUL prediction in the later stages of the degradation process tends to be higher than in the earlier stages, as is evident in Figure 12c, Figure 13a-c, Figure 14b,d, and Figure 15a,c,d.
To demonstrate the lightweight nature of the proposed method and its lower computational resource consumption, we compare the parameter counts and computational costs of the models. For general validation purposes, INN and CNN are compared in a two-dimensional configuration. The parameter count of INN is (K²GC + C²)/r, while its computational burden can be divided into two parts: the involution kernel generation component, HW × (K²GC + C²)/r, and the multiplication-addition component, HW × K²C. On the other hand, CNN has a parameter count of K²C² and a computational burden of HW × K²C², both higher than those of INN. This indicates that, under the same hyper-parameters, INN has a smaller computational load than CNN. Likewise, GRU has a parameter count of 9 × c_n and a computational burden of 8(c_n × i_l) + 3(c_n × i_l)² + 3(c_n³ × i_l²) + 9(c_n³ × i_l⁴), while LSTM has a parameter count of 12 × c_n and a computational burden of 12(c_n × i_l) + 18(c_n × i_l)² + 54(c_n × i_l)³, where c_n represents the number of hidden neurons and i_l the input length. GRU thus exhibits lower computational costs than LSTM. From these observations, it is evident that the computational complexity of InvGRU is lower than that of ConvLSTM.
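As a quick numeric check of the parameter-count expressions above (reading the INN count as (K²GC + C²)/r; the function names are ours):

```python
def inn_params(K, C, G, r):
    """Parameter count of a 2D involution layer: the two
    kernel-generation matrices W0 (C/r x C) and W1 (K*K*G x C/r)
    contribute (K^2 * G * C + C^2) / r parameters in total."""
    return (K * K * G * C + C * C) // r

def cnn_params(K, C):
    """A standard convolution with C input and C output channels."""
    return K * K * C * C
```

For K = 3, C = 64, G = 1, r = 2, the involution layer needs 2,336 parameters versus 36,864 for the convolution, i.e., roughly 16× fewer, which is the source of InvGRU's lightweight character.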
To evaluate the computational efficiency, we selected the challenging FD004 dataset for performance testing. Specifically, we compared the runtime of InvGRU with that of ConvLSTM on the FD004 dataset. Using the same computing device, consisting of an Nvidia GeForce RTX2060, an Intel(R) Core(TM) i7-10875H, and 16 GB RAM, InvGRU achieved a 16% reduction in time per epoch, taking only 4 s. In the training stage, each epoch required 8 s and a total of 32 epochs were executed, resulting in a cumulative training time of 256 s. In the testing stage, inference was exceptionally fast, with a calculation time of just 0.07 s per sample. Therefore, the proposed method is more computationally efficient.
To further highlight the advantages of the InvGRU-based DL framework in predicting RUL, this study conducted comparative experiments on RUL prediction between the proposed model and several other models, including statistical models [34], shallow machine learning models [39], classical deep models [40][41][42], and recently published deep learning models [4,14,34,43]. To obtain comprehensive performance results, each model was subjected to 10 parallel RUL prediction experiments on each subset. Performance evaluation metrics, namely the score and RMSE values, were then computed from the prediction results and are presented in Tables 4-6. Table 4 displays the evaluation metric values for the compared methods on the FD001 and FD002 datasets, Table 5 presents those for the FD003 and FD004 datasets, and Table 6 reports the mean evaluation metric values across all subsets, providing an average assessment of the predictive capabilities of the compared methods on the C-MAPSS dataset.

Table 6. The comparisons of different methods for RUL prediction based on the C-MAPSS dataset.

Method                                          RMSE     Score
Cox's regression [34]                           49.70    596,603
SVR [39]                                        32.335   108,277
RVR [39]                                        27.96    11,716
RF [39]                                         24.72    29,553
CNN [40]                                        24.42    7006
LSTM [42]                                       21.25    2797
DBN [41]                                        21.73    4461
MODBNE [41]                                     20.32    3225
LSTM + attention + handcrafted features [20]    20.80    2985
Acyclic Graph Network [43]                      16.80    1716
AEQRNN [34]                                     19.85    3908
MCLSTM-based [4]                                17.40    1216
SMDN [14]                                       15.36    900
Proposed                                        13.58    689

Moreover, from Tables 4-6 it can be observed that the proposed model exhibits favorable predictive performance and a significant improvement over the other deep learning models. This demonstrates that exploiting the spatiotemporal information of the input diversifies the extracted features and enhances the model's RUL predictive capability. The proposed Inv-GRU adopts an involution operator in place of the information connection in the gated recurrent unit, which enables adaptive spatiotemporal information extraction, reduces the number of parameters, and further improves the prediction of aircraft engine RUL. Based on the above analysis, it can be concluded that the proposed model exhibits satisfactory universality and accuracy in predicting RUL on the C-MAPSS dataset, and can therefore be successfully applied to aero-engine RUL prediction tasks.
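The two metrics tabulated above can be sketched as follows. RMSE penalizes early and late predictions symmetrically, while the scoring function commonly used in the C-MAPSS literature is asymmetric, penalizing late predictions (predicted RUL larger than the true RUL) more heavily. The input values here are toy data, not results from the paper:

```python
# Sketch of the RMSE and score metrics used in Tables 4-6, with toy data.
import math

def rmse(y_true, y_pred):
    # root-mean-square error over all test engines
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def score(y_true, y_pred):
    # asymmetric C-MAPSS scoring function: late predictions (d >= 0) are
    # penalized more heavily than early ones (d < 0)
    s = 0.0
    for t, p in zip(y_true, y_pred):
        d = p - t  # positive d: predicted RUL is too optimistic
        s += math.exp(d / 10) - 1 if d >= 0 else math.exp(-d / 13) - 1
    return s

y_true = [112, 98, 69, 82]
y_pred = [110, 100, 70, 80]
print(f"RMSE:  {rmse(y_true, y_pred):.3f}")
print(f"Score: {score(y_true, y_pred):.3f}")
```

Lower is better for both metrics; a perfect prediction yields an RMSE of 0 and a score of 0.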

Conclusions
To overcome the complexity and limited feature extraction capability of conventional models used for processing spatiotemporal information, a lightweight operator called InvGRU is introduced to enhance the prediction of RUL for aero-engines. InvGRU replaces the information connection in the gated recurrent unit with an adaptive feature extraction operator known as involution. By doing so, InvGRU can adaptively extract spatiotemporal information while reducing the number of parameters involved. The output of InvGRU is then passed through a neural network (NN) that transforms it into aero-engine health features. These health features, together with manually crafted features, are concatenated and fed into fully connected (FC) layers for dimension reduction and subsequent RUL estimation. The proposed model is trained on existing data and, once trained, can be used to estimate the RUL of aero-engines from new measurements. The proposed method achieves a 23.44% improvement in the score metric and an 11.58% improvement in the RMSE metric compared with other methods, highlighting its superiority. These results demonstrate the advantages of the proposed approach in accurately predicting the RUL of aero-engines.