1. Introduction
With the continuous growth of energy demand and the rapid development of energy infrastructure, gas turbines (GTs) have become increasingly important strategically as core power equipment in aviation, marine propulsion, industrial power generation, and long-distance natural gas pipeline boosting [1,2]. As complex rotating machinery operating under high temperature, high pressure, and high mechanical stress, GTs inevitably undergo progressive performance degradation during long-term service due to mechanisms such as fouling, wear, erosion, and thermal damage [3]. The gradual degradation of key components, including the compressor, combustor, and turbine, alters the thermodynamic matching of the entire unit, leading to reduced efficiency, decreased power output, increased fuel consumption, and even unplanned shutdowns. Therefore, accurate evaluation and prediction of the GT health state are of great significance for intelligent operation and maintenance, predictive maintenance, and the safe and economical operation of GT systems.
However, most current research on GT health management focuses on fault diagnosis and anomaly detection for specific components. For example, Wang et al. [4,5] systematically analyzed gas path fault diagnosis of GTs from the perspectives of thermodynamic models and knowledge-based models, demonstrating the advantages of physics-based methods in locating gas path performance deviations. The core idea is to identify faults from deviations between measured values and reference values; however, both small-deviation methods and nonlinear diagnosis methods suffer from complex modeling, susceptibility to external interference, and low model accuracy. Yan et al. [6] developed a semi-supervised anomaly detection framework for GT combustors based on deep representation learning from multivariate time-series measurement data, showing that data-driven monitoring methods can detect early combustor anomalies even when fault samples are scarce. Tang et al. [7] proposed a fault diagnosis method for GT rotors using deep transfer learning, in which a deep learning model pre-trained on bearing datasets is transferred to the GT rotor domain, demonstrating the potential of cross-domain feature transfer in data-scarce scenarios. Sarwar et al. [8,9] improved the accuracy and efficiency of fault diagnosis using machine learning methods such as artificial neural networks and principal component analysis, demonstrating the applicability of data-driven methods in this field. Bunyan et al. [10] studied gas turbine thermal condition monitoring and predictive maintenance based on machine learning. By extracting features from exhaust temperature data and combining them with the XGBoost model, they achieved effective discrimination between healthy and faulty states, further demonstrating the application potential of data-driven methods in GT health management. An et al. [11] proposed a GT fault diagnosis method based on sliding radar charts. By utilizing the spatial distribution and temporal variation characteristics of combustor outlet temperature sensors and validating the method on real power plant data, they showed that combining spatial and temporal information can improve GT fault diagnosis performance. Fahmi et al. [12] developed an anomaly detection model that combines a temporal convolutional network with an autoencoder and further introduced a multi-head attention mechanism, achieving high-accuracy GT fault detection on real power plant vibration data. Yu et al. [13] proposed a GT fault diagnosis method based on a feature-fusion cascaded neural network, which was used to identify both incipient and abrupt faults with a limited number of measurement points, further indicating the application potential of deep feature fusion in GT fault identification.
Nevertheless, research on modeling the continuous degradation process of overall GT performance and reliably predicting health state remains relatively scarce. Fundamentally, high-precision overall degradation modeling and health assessment rely heavily on complete life-cycle data from initial health status to complete failure [14,15]. Under actual operating conditions, GTs have a design life of tens of thousands to hundreds of thousands of hours, making the collection of complete life-cycle data impractical in terms of time and economic costs. Additionally, operational data often lacks clear health parameter labels [16,17]. The scarcity of full-life-cycle data and the lack of health parameter labels are key issues in research related to GT health assessment.
In sharp contrast, in the field of aero-engines, which share similar structures and operating principles with GTs, remaining useful life (RUL) prediction has developed into a relatively mature research direction. Benefiting from open and detailed benchmark datasets such as C-MAPSS released by NASA, a systematic body of research has been established in this field, with numerous data-driven RUL prediction methods emerging [18]. Wu et al. [19] proposed a method based on a Deep Convolutional Neural Network (DCNN) that can effectively extract deep spatial features of data. Huang et al. [20] adopted Bidirectional Long Short-Term Memory (BiLSTM) neural networks, which characterize health evolution trends more comprehensively by simultaneously learning forward and backward dependencies of time-series data. These methods indicate that deep learning can effectively capture complex temporal degradation patterns, laying a methodological foundation for high-precision health assessment.
Furthermore, to address the cross-domain prediction challenges often faced in actual aero-engine operations (i.e., distribution differences between training data in the source domain and test data in the target domain), solutions centered on transfer learning have gradually been developed. Relevant studies have proposed various methods to mitigate the decline in prediction performance caused by inter-domain distribution inconsistencies, such as weight-optimized transfer learning [21], deep domain adversarial networks [22], and the Domain Adversarial Neural Network (DANN) [23]. Among these, DANN uses adversarial training between the feature extractor and the domain discriminator to force the network to learn domain-invariant features stripped of domain-specific characteristics, significantly improving the model’s generalization capability under distribution differences. This method provides an important theoretical tool for solving cross-equipment and cross-operating-condition prediction tasks. In summary, relying on the C-MAPSS dataset, researchers in the aero-engine field have not only established mature deep learning prediction models but also developed domain adaptation technologies, represented by DANN, to overcome data distribution differences.
Considering the similarities in structure and degradation mechanisms between aero-engines and GTs, as well as the availability of rich labeled health parameter data and mature cross-domain adaptation methods in the aero-engine field, this paper proposes a cross-domain GT health assessment method based on DANN domain adaptation. Using label-rich C-MAPSS data as the source domain, the method leverages a domain adversarial training mechanism to drive the BiLSTM feature extractor to learn domain-invariant health features, constructing an assessment model that can be transferred to the GT target domain, where health parameter labels are scarce, thereby achieving reliable assessment of GT health.
The remainder of this paper is organized as follows. Section 2 presents the BiLSTM-DANN; Section 3 conducts simulations and discussion; Section 4 provides a summary.
3. BiLSTM-DANN-Based Health Assessment
3.1. Health Assessment Based on the C-MAPSS Dataset
To ensure the reliability of the source domain data, this paper selects the C-MAPSS simulation dataset released by NASA for model training and verification. This dataset is obtained through multiple simulation experiments that simulate the degradation process of aero-engines under different operating conditions, and it is widely used in RUL prediction research.
The C-MAPSS dataset includes four subsets (FD001, FD002, FD003, and FD004), each containing a training set and a test set. The training set includes time-series data of state parameters over the complete life-cycle of each aero-engine, while the test set includes time-series data of state parameters over only part of the life-cycle of each aero-engine. As shown in Table 1, the number of operating conditions and failure modes differs among the subsets. Considering that the research object of this paper is a GT with simple operating conditions, FD001 and FD003 are selected for model training and verification. FD001 corresponds to a single operating condition and a single failure mode, whereas FD003 corresponds to a single operating condition and two failure modes. These two subsets are selected for two reasons: on the one hand, both are more consistent with the characteristics of industrial GTs, which typically operate only at ground operating points; on the other hand, they enable a more comprehensive assessment of the model under both single-component degradation and simultaneous multi-component degradation conditions. Both subsets provide full-life-cycle data for aero-engines and include 21 sensor measurements, such as pressure and temperature parameters at multiple sections (e.g., fan inlet, compressor outlet, and turbine outlet), high- and low-pressure rotational speeds, bypass ratio, pressure ratio, and other variables.
3.2. Data Preprocessing
During data preprocessing, parameters that remained unchanged during the health decline period were initially filtered out, retaining only 14 parameters numbered 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 17, 20, and 21.
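The screening step above can be sketched as follows. This is a minimal illustration, assuming that "unchanged" means the range of a sensor column stays within a small tolerance; the function name `varying_columns`, the tolerance `tol`, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def varying_columns(rows: Sequence[Sequence[float]], tol: float = 1e-9) -> List[int]:
    """Return the 1-based indices of columns whose range exceeds tol.

    Columns that stay (numerically) constant over the whole record carry no
    degradation information and are dropped by the caller.
    """
    n_cols = len(rows[0])
    kept = []
    for j in range(n_cols):
        column = [row[j] for row in rows]
        if max(column) - min(column) > tol:
            kept.append(j + 1)  # sensors are numbered from 1 in the text
    return kept


# Toy record with three hypothetical sensors; the second one never changes.
toy_rows = [
    [641.8, 14.62, 1589.7],
    [642.1, 14.62, 1591.8],
    [642.4, 14.62, 1588.0],
]
kept = varying_columns(toy_rows)
```

Applied to the full sensor matrix of a subset, a filter of this kind would return the list of retained sensor numbers.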
To realize effective knowledge transfer from aero-engine simulation data to industrial power generation GT operational data, this paper manually aligns the C-MAPSS source domain parameters with the GT target domain parameters based on thermodynamic principles and data characteristics. The final dataset retained parameters including high-pressure compressor outlet total temperature T30, low-pressure turbine outlet total temperature T50, high-pressure compressor outlet total pressure P30, and low-pressure turbine rotational speed Nf. These correspond to GT compressor outlet temperature Tc, turbine exhaust temperature Tt, compressor discharge pressure Pc, and power output Pe, respectively. These parameters are closely related to GT health status because they reflect the thermodynamic state, compression performance, and power output capability of the unit. In general, increases in Tc and Tt may indicate greater efficiency loss and performance deterioration associated with fouling, whereas a decrease in Pc may imply a weakening of the compressor’s compression capability. A reduction in Pe, on the other hand, directly reflects a decline in the effective output capacity of the unit under the same operating conditions. However, it should be noted that the selection of these parameters is based primarily on their positions in the thermodynamic cycle, their functional roles, and their sensitivity to GT health. Therefore, they represent correspondences only at the level of engineering mechanisms, rather than strictly equivalent relationships in a physical sense. For example, we match Pe with Nf to create a load-reflective feature alignment for health assessment. This correlation serves as an approximation of the dynamic equilibrium between the GT’s shaft power and the propulsive thrust under varying operational conditions.
T30 and P30 of C-MAPSS are aligned with Tc and Pc of the GT because both pairs are measured at the inlet of the combustion chamber in their respective thermodynamic cycles and characterize the compression efficiency of the compressor. The total temperature T50 at the low-pressure turbine outlet of the aero-engine is aligned with the exhaust temperature Tt of the GT because both are end-of-cycle parameters. They integrate the overall efficiency information from the compressor, combustion chamber, and turbine, serving as comprehensive indicators of health. The low-pressure turbine speed Nf of the aero-engine is aligned with the electrical power output Pe of the GT because the subject of this study is a GT used for industrial power generation. Such turbines run at nearly constant rotational speed after grid connection, so their speed carries little degradation information. Therefore, electrical power output, which directly reflects performance degradation, is chosen on the GT side instead of turbine speed. Both Nf and Pe are key parameters for measuring the output load.
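The four correspondences described above can be written down as a simple lookup table. The sketch below is only an illustration of the mapping stated in the text; the names `SOURCE_TO_TARGET`, `FEATURE_ORDER`, and `align_record` are hypothetical.

```python
from typing import Dict

# Source-domain (C-MAPSS) parameter -> target-domain (GT) parameter,
# as stated in the text. These are engineering correspondences, not
# strict physical equivalences.
SOURCE_TO_TARGET: Dict[str, str] = {
    "T30": "Tc",  # HPC outlet total temperature -> compressor outlet temperature
    "T50": "Tt",  # LPT outlet total temperature -> turbine exhaust temperature
    "P30": "Pc",  # HPC outlet total pressure    -> compressor discharge pressure
    "Nf": "Pe",   # low-pressure spool speed     -> electrical power output
}

# A fixed feature order keeps source and target samples column-aligned.
FEATURE_ORDER = ["Tc", "Tt", "Pc", "Pe"]


def align_record(record: Dict[str, float]) -> Dict[str, float]:
    """Keep only the aligned parameters and rename them to target-domain names."""
    return {SOURCE_TO_TARGET[k]: v for k, v in record.items() if k in SOURCE_TO_TARGET}


aligned = align_record({"T24": 642.0, "T30": 1590.0, "Nf": 2388.0})
```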
A similarity transformation is performed on the sensor parameters.
In this transformation, T2 and P2 are the total temperature and total pressure at the aero-engine inlet, and parameters with the subscript “c” represent parameters after similarity transformation.
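For orientation, one conventional form of such a similarity (corrected-parameter) transformation is sketched below. The exact expressions used in this paper are not reproduced in the text, so the formulas, the reference conditions (ISA sea level: 288.15 K and 101.325 kPa), and the function names are assumptions; in particular, how Nf and Pe are corrected in this study is not stated here.

```python
import math

# Assumed reference inlet conditions (ISA sea level). Use the same unit
# system as the measured data, e.g. convert the references if the data
# are recorded in other temperature or pressure units.
T_REF = 288.15   # K
P_REF = 101.325  # kPa


def correct_temperature(t: float, t2: float, t_ref: float = T_REF) -> float:
    """Corrected temperature: T_c = T * (T_ref / T2)."""
    return t * t_ref / t2


def correct_pressure(p: float, p2: float, p_ref: float = P_REF) -> float:
    """Corrected pressure: P_c = P * (P_ref / P2)."""
    return p * p_ref / p2


def correct_speed(n: float, t2: float, t_ref: float = T_REF) -> float:
    """Corrected rotational speed: N_c = N * sqrt(T_ref / T2)."""
    return n * math.sqrt(t_ref / t2)


# On a day hotter than the reference, corrected values are scaled down.
t_hot = correct_temperature(700.0, t2=303.15)
n_hot = correct_speed(3000.0, t2=303.15)
```

Under these assumptions, a measurement taken exactly at the reference inlet conditions is left unchanged, which is the purpose of referring all data to a common inlet state.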
Subsequently, data standardization is performed:

$$X_z = \frac{X - \bar{X}}{\sigma}$$

where $X$ denotes the parameter after similarity transformation, $X_z$ represents the normalized parameter, $\bar{X}$ is the mean value of parameter $X$, and $\sigma$ is the standard deviation.
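The standardization step can be sketched for a single parameter as follows. The use of the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption, as is the function name `zscore`; the text does not specify either.

```python
import statistics
from typing import List, Sequence


def zscore(values: Sequence[float]) -> List[float]:
    """Standardize one parameter: subtract the mean, divide by the std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (assumption)
    if std == 0.0:
        raise ValueError("constant parameter cannot be standardized")
    return [(v - mean) / std for v in values]


z = zscore([1.0, 2.0, 3.0])
```

After this step each parameter has zero mean and unit standard deviation, so parameters with different physical units contribute on a comparable scale.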
Taking the first engine in the FD001 dataset as an example, the trend of parameter changes after similarity transformation and normalization is shown in Figure 4.
To ensure consistent sample lengths, this study employs a sliding window method for data processing. With a window length of 30 and a step size of 1, each sample segment has dimensions of 30 × 4.
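The windowing described above can be sketched as follows, assuming each engine record is a list of per-cycle rows with four aligned parameters. The function name `sliding_windows` and the 100-cycle toy record are hypothetical.

```python
from typing import List, Sequence


def sliding_windows(
    series: Sequence[Sequence[float]], window: int = 30, step: int = 1
) -> List[Sequence[Sequence[float]]]:
    """Cut a (cycles x features) record into overlapping fixed-length segments.

    Records shorter than `window` yield no segments.
    """
    last_start = len(series) - window
    return [series[s:s + window] for s in range(0, last_start + 1, step)]


# Hypothetical record: 100 cycles, 4 parameters per cycle.
record = [[float(t), 0.0, 0.0, 0.0] for t in range(100)]
segments = sliding_windows(record, window=30, step=1)
```

With a step of 1, a record of n cycles (n at least 30) produces n - 29 segments, each of size 30 × 4.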
3.3. Establishing RUL Labels
During the full life-cycle of an aero-engine, the health of the engine can be divided into two stages. In the first stage, the early operation period, the engine health is relatively stable, and the RUL remains at the maximum value RULmax. In the second stage, the engine performance degrades, and the RUL decreases linearly until it drops to 0. Therefore, the change trend of RUL with the operating cycle t is:

$$\mathrm{RUL}(t) = \begin{cases} \mathrm{RUL}_{\max}, & t \le t_{\max} - \mathrm{RUL}_{\max} \\ t_{\max} - t, & t > t_{\max} - \mathrm{RUL}_{\max} \end{cases} \tag{16}$$

where tmax represents the maximum operating cycle count.
As can be seen from Equation (16), a piecewise RUL labeling strategy is adopted for the C-MAPSS dataset in this study, with RULmax set to 125 [24]. This is because engines in the early stage are generally in a healthy condition; although their RUL values are relatively large, the health information contained in the sensor signals is limited. Setting an upper bound for RUL can reduce the influence of these low-information samples, improve training stability, and enhance comparability with existing studies.
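The piecewise labeling strategy can be sketched in a few lines. The function name `piecewise_rul` and the 200-cycle example engine are hypothetical; cycles are assumed to be numbered from 1 to tmax.

```python
from typing import List


def piecewise_rul(t_max: int, rul_max: int = 125) -> List[int]:
    """RUL label for every cycle t = 1..t_max.

    The label is capped at rul_max in the early stage and then decreases
    linearly, reaching 0 at the final cycle.
    """
    return [min(rul_max, t_max - t) for t in range(1, t_max + 1)]


# Hypothetical engine that fails after 200 cycles.
labels = piecewise_rul(200)
```

For this example the label stays at 125 up to cycle 75 and then falls by one per cycle, which is the inflection point produced by the cap rather than by any abrupt physical change.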
Taking the first engine in the FD001 dataset as an example, the RUL change trend is illustrated in Figure 5.
As shown in Figure 5, the inflection point in the RUL curve mainly arises from the piecewise labeling strategy rather than from an abrupt change in the actual physical health decline process. Since the maximum RUL is capped at 125 in this study, the RUL remains constant during the early stage and begins to decrease linearly after reaching the corresponding threshold.
5. Conclusions
To address the challenges posed by scarce and unlabeled full-life-cycle data for GT health assessment, this paper transfers mature deep learning and domain adaptation techniques from the aero-engine field to overall GT health assessment. Leveraging commonalities in aerothermal and structural characteristics between the two, this study uses the fully labeled C-MAPSS dataset as the source domain. Employing domain adaptation techniques, a BiLSTM-DANN transfer learning model is constructed to assess and predict the whole-machine health performance of GTs.
First, based on mechanism analysis and physical properties, we performed physical mapping between the sensor parameters of aero-engines and GTs, screened parameters, and introduced similarity transformation and normalization to eliminate dimensional and operating condition differences. A sliding window method reconstructed samples from the preprocessed time-series data for model input. Subsequently, a dual-layer BiLSTM was used as the feature extractor for the DANN, fully capturing bidirectional dependencies within the time-series data. The adversarial training mechanism between the domain discriminator and feature extractor within the DANN drives the feature extractor to learn domain-invariant, intrinsic health characteristics. Comparative experiments on the FD001 and FD003 test sets demonstrated that the BiLSTM-DANN model developed in this study significantly outperforms the conventional BiLSTM and DCNN models in terms of MAE, RMSE, and RA, thereby achieving higher prediction accuracy and stronger generalization capability. When applied to operational data from target-domain GTs, it achieves satisfactory predictions consistent with physical principles.
This study provides a data transfer solution for assessing and predicting the health of GTs in scenarios with sparse labels and presents a novel approach for fully data-driven GT health assessment. However, given the limited number of currently available measurement parameters for GTs, this paper has used mechanism analysis to select only four alignment parameters for developing the health assessment model. In the future, we will further conduct data-based feature importance analysis, aiming to provide more comprehensive and reliable data validation support for the GT health assessment model. Meanwhile, the current research still lacks the support of full-life-cycle data of GTs, and the analysis conducted in this paper on real GTs mainly leads to qualitative conclusions rather than strictly statistical quantitative metrics. In subsequent work, we will continuously monitor the health changes of the GTs under study and conduct trend analysis and evolutionary pattern prediction of their health throughout the entire life-cycle.