Fault Diagnosis of Oil-Immersed Transformers Based on the Improved Neighborhood Rough Set and Deep Belief Network

Abstract: As one of the essential components in power systems, transformers play a pivotal role in the transmission and distribution of renewable energy generation. Accurate diagnosis of transformer fault types is crucial for maintaining the safety of power systems. Current research focuses on transformer fault diagnosis methods based on Dissolved Gas Analysis (DGA). Traditional diagnostic methods directly use the five fault gases from DGA data as model input features, but this approach does not comprehensively reflect all potential fault types in transformers. In this paper, a non-coding ratio method was employed to generate 35 fault gas ratios from the five fault gases; correlation analysis then eliminated redundant feature variables, leaving 15 representative fault gas ratios. To further streamline the feature variables and remove those that do not contribute to fault diagnosis, an improved Neighborhood Rough Set (INRS) algorithm based on the symmetrical uncertainty measure was introduced. Using the proposed INRS, the eight most representative fault gas ratios were selected as input variables for constructing a Deep Belief Network (DBN) diagnostic model. Experimental results on DGA data confirmed the effectiveness and accuracy of the proposed method.


Introduction
Oil-immersed transformers are essential components of power systems and play a critical role in the transmission and distribution of electrical energy [1,2]. However, prolonged operation and high-load conditions can lead to a deterioration in equipment performance, and even severe damage, posing a threat to the stability and reliability of power systems [3,4]. Traditional transformer maintenance primarily relies on periodic inspections and tests, but this approach may not detect latent internal faults in a timely manner, leading to overlooked or delayed maintenance and increased risk and maintenance costs. To take effective maintenance and preventive measures in time, accurate prediction of fault types becomes increasingly important [5,6].
Dissolved Gas Analysis (DGA) is one of the most commonly used methods for diagnosing faults in oil-immersed transformers [7]. During the operation of transformers, chemical reactions occur in the oil-paper composite insulation materials, releasing low-molecular-weight gases such as hydrogen, hydrocarbons, and carbon-containing gas compounds, which dissolve in the insulating oil [8,9]. Different types of faults or abnormal conditions produce different gases, the most significant being hydrogen (H2), methane (CH4), ethane (C2H6), ethylene (C2H4), and acetylene (C2H2). Based on the type and quantity of fault gases, it is possible to determine the presence of specific fault types in the transformer [10]. Traditional diagnostic methods, such as the IEC three-ratio method, Doernenberg ratio method, and Rogers ratio method, encode the ratios of fault gases and associate the codes with fault types [11-13]. In practice, however, gas ratio combinations can fall outside the coding range, in which case these traditional methods cannot accurately diagnose the transformer fault type.
Learning from data is a core research area in modern artificial intelligence [14]. Machine learning-based fault diagnosis techniques have been successfully applied to predict fault types in oil-immersed transformers. Typical intelligent diagnostic approaches include the BP neural network [15], the Support Vector Machine (SVM) model [16], and other methods. An approach that integrates neural networks with the three-ratio method was introduced in [17]; it passes the samples misdiagnosed by the neural network to the three-ratio method for diagnosis. Nevertheless, the accuracy of neural network judgments relies on the choice of weights and thresholds and demands substantial training data, which complicates operation and compromises stability. The study in [18] presented an intelligent diagnosis approach for transformer faults that combines the empirical wavelet transform with an enhanced convolutional neural network. The findings indicate that this diagnostic model can proficiently recognize the fault states of transformers. In [19], a novel multiclass probabilistic diagnosis framework for dissolved gas analysis, based on Bayesian networks and hypothesis testing, was proposed. This framework learns patterns from data and infers the uncertainty associated with diagnostic outcomes. In [20], SVM was employed to establish a classification system for power transformer faults and to select the most suitable gas signature between traditional DGA methods and a novel extension method. This approach led to significant improvements in the accuracy of power transformer fault classification. It is worth noting that both [19] and [20] used the traditional set of five fault gases (H2, CH4, C2H6, C2H4, and C2H2) as input variables for the diagnostic models. However, these five feature variables contain incomplete fault information, resulting in lower diagnostic accuracy. To fully leverage the fault information embedded in the fault gases, Dai et al. employed a non-coding ratio method to derive nine fault feature gas ratios. These nine features were then used as input variables for a deep belief network, resulting in a notable enhancement in diagnostic accuracy [21]. Fault diagnosis techniques based on machine learning and deep learning are still evolving: continual learning methods are discussed in [22], and integrated approaches, which have demonstrated promising results in fault diagnosis, are highlighted in [23].
This paper constructed 35 fault feature gas ratios based on five fault gases and eliminated redundant features through correlation analysis. To further remove features that contribute insignificantly to transformer fault diagnosis, and consequently simplify the model, an improved neighborhood rough set (INRS) algorithm was proposed. Compared to the traditional approach of directly using the five fault gases as feature variables, the feature reduction method introduced in this study can effectively harness the fault information inherent in these five fault gases. The eight features extracted through the INRS algorithm are more significant and more representative of the transformer fault types. Using the eight selected gas ratios as input variables, a deep belief network (DBN) diagnostic model was constructed. The average accuracy of 10 experiments on the DGA test set reached 90.2%.

Transformer Fault Characteristics Analysis
Currently, traditional power distribution systems extensively use oil-immersed transformers, whose faults are commonly classified into three main types: mechanical, thermal, and electrical. As mechanical faults might appear as thermal or electrical faults, our focus is solely on non-mechanical fault categories. The specific fault categories pertaining to oil-immersed transformers are described in Table 1.
Thermal or electrical faults in transformers are primarily reflected in changes in the concentration of various gases dissolved in the oil. The most significant of these gases include hydrogen (H2), methane (CH4), ethane (C2H6), ethylene (C2H4), and acetylene (C2H2). The distinctive gas concentration features for different fault types are outlined in Table 2. From Table 2, it is apparent that different fault types often lead to the release of specific gases. Analyzing the gases dissolved in the oil both qualitatively and quantitatively provides insight into the operational status of the transformer and the fault types that may be present. Consequently, Dissolved Gas Analysis (DGA) serves as a valuable method for diagnosing fault types in transformers within power distribution systems. Typically, datasets containing concentrations of the five fault gases along with their associated fault types are referred to as DGA data. These data facilitate the identification and assessment of transformer conditions, aiding predictive maintenance and timely fault detection.

Fault Diagnosis of Oil-Immersed Transformers Based on INRS and DBN
Based on the analysis in the previous section, the fault types of oil-immersed transformers can be summarized as six categories: LED, HED, PD, HTO, MLTO, and Normal. Consequently, the fault diagnosis problem for oil-immersed transformers can be treated as a six-class classification task. To accomplish this classification task, we have constructed a DBN diagnostic model based on the proposed INRS algorithm. The overall framework is illustrated in Figure 1. DGA data contain historical data on the content of five fault gases in oil-immersed transformers under different fault types, which can be used for model training in transformer fault diagnosis. The DGA data used in this paper can be obtained from https://github.com/Cliango/DGA.git (accessed on 20 July 2023). The dataset contains a total of 617 samples, including 102 LED, 168 HED, 47 PD, 133 HTO, 77 MLTO, and 90 Normal samples. The specific distribution of data samples can be found in Table 3.

Non-Coding Ratio Processing
Conventional methods for diagnosing transformer faults using fault gases from DGA data (such as the IEC three-ratio method, Doernenberg ratio method, and Rogers ratio method) have demonstrated the utility of gas ratios in fault diagnosis for oil-immersed transformers. Additionally, there is a close connection between changes in the proportions of fault gases and the fault types. Hence, gas ratios among the five fault gases can be used as features to analyze and determine the internal operational status of the transformer. The five basic fault gases alone cannot fully reflect the fault information of the transformer. To further explore the fault information, a total of 35 gas ratios have been constructed using a non-coding ratio method, as outlined in Table 4. Here, C1 represents first-order hydrocarbons (i.e., CH4), and C2 represents the sum of second-order hydrocarbons (i.e., C2H6, C2H4, and C2H2).
Although the non-coding ratio processing of the five fault gases yields 35 ratios that reflect transformer fault types more comprehensively, these features may exhibit linear relationships among themselves. To avoid introducing redundant feature variables, we performed a correlation analysis among the 35 features and eliminated highly correlated feature variables to streamline the input features of the model.
Let $D = \{(x_i, y_i)\}_{i=1}^{617}$ represent the dataset obtained after non-coding ratio processing of the DGA data, where $x_i = [x_{i1}, x_{i2}, \ldots, x_{i,35}]$ is the i-th sample, $x_{ij}$ represents the j-th feature within the sample $x_i$, and $y_i \in$ {Normal, MLTO, HTO, PD, LED, HED}. Using all 35 feature gas ratios as input features may result in high dimensionality, increasing the complexity of the diagnostic model. Moreover, an excessive number of input features can introduce interference from features with low correlation, potentially affecting the diagnostic accuracy. Therefore, before establishing the diagnostic model, feature selection and dimensionality reduction are essential to ensure the model's efficiency and accuracy while avoiding unnecessary interference. To achieve this, a Pearson correlation analysis is first applied to the data D, eliminating features that exhibit linear relationships and thereby preventing the introduction of redundant information or multicollinearity. Let the data matrix be $X = [X_1; X_2; \ldots; X_{35}] \in \mathbb{R}^{35 \times 617}$, where the i-th row of X (i.e., $X_i$) collects the values of the i-th feature over all samples. The correlation coefficient between any two features can be calculated by

$$R(X_i, X_j) = \frac{\mathrm{E}\left[(X_i - \mu_{X_i})(X_j - \mu_{X_j})\right]}{\sqrt{S_{X_i} S_{X_j}}}, \quad (1)$$

where $\mu_{X_i}$ and $S_{X_i}$ represent the mean and variance of $X_i$, respectively. The correlation coefficient R ranges between −1 and 1. When R is close to 1 (−1), it indicates a stronger positive (negative) correlation between features $X_i$ and $X_j$. When R is close to 0, it signifies no linear correlation between the two features. In this paper, we remove the gas ratio features in the data for which $|R| \geq 0.7$. The reason is that coefficients at or above 0.7 may indicate strong linear relationships among features, thereby introducing multicollinearity, which can affect the model's robustness and interpretability. Moreover, through a series of experiments, we found that setting the correlation coefficient threshold to 0.7 effectively streamlined the model, maintaining a high diagnostic accuracy while reducing model complexity by avoiding excessive redundant information. This strategy not only enhanced the model's interpretability but also improved the overall experimental outcomes and diagnostic precision.
The results indicate that there are 20 gas ratio features with correlation coefficient $|R| \geq 0.7$, specifically the features numbered 2, 3, 6, 9, 16, 19, 21-23, and 25-35 in Table 4. These features exhibit strong linear correlations with each other. To avoid introducing redundant information, they are removed from the dataset $D$, resulting in the reduced dataset $\tilde{D}$, which contains the 15 remaining gas ratio features detailed in Table 5.
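The thresholding step can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch: the text does not state which member of a correlated pair is discarded, so the greedy "keep the first, drop later correlated columns" rule, the function name `drop_correlated`, and the synthetic demo data are all assumptions rather than the authors' procedure.

```python
import numpy as np

def drop_correlated(features, threshold=0.7):
    """Return indices of columns kept by a greedy |R| < threshold filter.

    Assumption: a column is kept only if its absolute Pearson correlation
    with every previously kept column is below the threshold.
    """
    corr = np.corrcoef(features, rowvar=False)  # shape (n_features, n_features)
    kept = []
    for j in range(features.shape[1]):
        if all(abs(corr[j, k]) < threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic demo (not DGA data): column 3 is a noisy copy of column 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
demo = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=200)])
kept = drop_correlated(demo)
print(kept)
```

On real data the result of such a greedy filter depends on the column order, which is one reason the rule above should be read as a placeholder for whatever selection rule was actually applied.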

Feature Selection Based on the Improved NRS
Correlation analysis can eliminate redundant information between features, but to comprehensively assess the importance of features, it is essential to examine the relationship between the features and the target variable. In general, features that exhibit a higher correlation with the target variable are more likely to contribute to the predictive capability of the model. The neighborhood rough set (NRS) algorithm is a data mining algorithm based on rough set theory, used for feature selection and data reduction. It evaluates each attribute by calculating attribute importance, thereby eliminating redundant information and unimportant attributes from the dataset while retaining the most valuable attributes.
Consider a decision system $DS = (U, C \cup E, V, f)$, where $U$ is the universe of discourse, $C$ represents the conditional attributes, $E \neq \emptyset$ is the set of decision attributes, $C \cup E \neq \emptyset$, and $V = \{V_a \mid a \in C \cup E\}$ denotes the collection of attribute values. The information function $f: U \times (C \cup E) \to V$ represents the mapping relationship between samples, attributes, and attribute values. In this paper, the set composed of feature gases represents the set of conditional attributes, denoted as $C$, while the set consisting of the six fault types serves as the set of decision attributes. Let $B$ be a subset of conditional attributes, specifically, a subset of all feature gases. For any $B \subseteq C$, the dependency of decision attributes $E$ on conditional attributes $B$ is defined as

$$\gamma_B(E) = \frac{|\mathrm{Pos}_B(E)|}{|U|}, \quad (2)$$

where $\mathrm{Pos}_B(E)$ represents the lower approximation (positive region) of the decision attributes with respect to the attribute subset $B$, and $|\cdot|$ denotes set cardinality. The formula for calculating the importance of a certain conditional attribute $a$ to the decision attribute is

$$\mathrm{SIG}(a, B, E) = \gamma_{B \cup \{a\}}(E) - \gamma_B(E). \quad (3)$$

The NRS algorithm has certain limitations in feature selection. When the number of samples varies significantly across classes within the dataset, NRS might exhibit bias towards classes with larger sample sizes, affecting the feature selection process. Moreover, the method relies heavily on dataset partitioning, so different data splits can lead to different outcomes, which affects the consistency and stability of feature selection. Symmetrical Uncertainty (SU) is a measure based on information theory, designed to quantify the association between features and target variables. As a metric for feature selection, SU helps assess the correlation between features and target variables, enabling the identification of the features that influence the target. By eliminating highly correlated features, it mitigates multicollinearity, reducing the risk of model overfitting and enhancing model generalization. The application of SU facilitates the reduction of feature dimensions while retaining critical features, thus streamlining the model and improving its efficiency. Introducing SU therefore helps overcome some of the limitations associated with neighborhood rough set methods.
Let $\tilde{D} = [\tilde{D}_1, \tilde{D}_2, \ldots, \tilde{D}_{15}] \in \mathbb{R}^{617 \times 15}$ be the data matrix after the correlation analysis in Section 3.1, where $\tilde{D}_i \in \mathbb{R}^{617}$ for $i \in \{1, 2, \ldots, 15\}$ represents the i-th feature after reduction. The SU value between each of the 15 gas ratio features and the label vector can be calculated using the following formula:

$$SU(\tilde{D}_i, Y) = \frac{2\, IG(\tilde{D}_i, Y)}{H(\tilde{D}_i) + H(Y)}, \quad (4)$$

where $Y$ is the vector of class labels of the samples, $IG(\tilde{D}_i, Y) = H(\tilde{D}_i) - H(\tilde{D}_i \mid Y)$ represents information gain, and $H(\cdot)$ represents information entropy. By incorporating the uncertainty measure (4) into the attribute importance (3), we obtain an SU-based attribute importance, denoted SUSIG and referred to as (5). With this measure embedded in the attribute importance assessment of the NRS algorithm, we have developed an improved neighborhood rough set (INRS) algorithm that evaluates the correlation between feature variables and the target variable (i.e., the label vector).
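To make the SU measure concrete, the following Python sketch computes it for one feature column and a label vector using the definitions of entropy and information gain given above. The gas ratio features are continuous, and the text does not say how they are discretized before the entropies are evaluated, so the equal-width binning (parameter `bins`) is an assumption introduced here for illustration, as are the function names.

```python
import numpy as np

def entropy(values):
    """Shannon entropy (in bits) of a discrete array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def conditional_entropy(x, y):
    """H(x | y) for two discrete arrays of equal length."""
    total = 0.0
    for value in np.unique(y):
        mask = y == value
        total += mask.mean() * entropy(x[mask])
    return total

def symmetrical_uncertainty(feature, y, bins=10):
    """SU = 2 * IG / (H(feature) + H(y)), with IG = H(feature) - H(feature | y)."""
    # Assumption: continuous features are discretized by equal-width binning.
    edges = np.histogram_bin_edges(feature, bins=bins)
    x = np.digitize(feature, edges[1:-1])
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0
    info_gain = h_x - conditional_entropy(x, y)
    return 2.0 * info_gain / (h_x + h_y)

# Toy check: a feature that coincides with a balanced binary label.
y = np.array([0, 1] * 50)
su_perfect = symmetrical_uncertainty(y.astype(float), y)

# Toy check: a feature drawn independently of the label.
rng = np.random.default_rng(0)
su_random = symmetrical_uncertainty(rng.normal(size=y.size), y)
```

A feature that determines the label completely gives SU equal to 1, while an unrelated feature gives a value near 0, which is the behavior the ranking in the INRS algorithm relies on.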
The main steps of this algorithm are as follows:
Step 1: Data normalization.
Step 2: Calculate the attribute importance SUSIG for the 15 attributes according to (5) and sort the attributes in descending order of SUSIG. Initialize the reduct as $red_0 = \emptyset$, and denote the sorted attributes by $C = \{a_1, \ldots, a_{15}\}$.
Step 3: Take the attribute $a_1 \in C$ with the highest attribute importance as the initial reduction, denoted as $red_1 = red_0 \cup \{a_1\}$, calculate $\mathrm{Pos}$ according to (2), and set $red = red_1$.
Step 4: Neighborhood construction. Calculate the standard deviation $\mathrm{Std}(a_i)$ for each attribute $a_i$, and construct the neighborhood radius $\delta = \mathrm{Std}(a_i)/\tau$, where $\tau$ is a predetermined parameter used to adjust the neighborhood size, typically ranging from 2 to 4. Based on the importance of attributes, select a set of the most important attributes to form the neighborhood, creating a neighborhood rough set.
Step 5: Set $i = i + 1$ and $red_i = red_{i-1} \cup \{a_i\}$. Calculate $\gamma_B(E)$ according to (2) with $B = red_i$. If $\gamma_{B_{i-1}}(E) < |\gamma_{B_i}(E)|$, then set $red = red_i$ and repeat this step for the next attribute; otherwise, stop.
Step 6: Data reduction. Use the neighborhood rough set for data reduction, eliminating redundant information and unimportant attributes from the dataset while retaining the most valuable attributes.
To minimize the number of retained features, we set the parameter $\tau = 2$. The algorithm steps described above are then applied to the dataset $\tilde{D}$, leading to the removal of low-importance gas ratio features. The result is a set of 8 gas ratio features that exhibit high correlation with the fault labels, as detailed in Table 6.
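The dependency computation and the stopping rule of the forward search can be illustrated with the Python sketch below. It is a simplified reading of the steps above, not the authors' implementation: the SUSIG ranking is taken as a given input (`order`), a sample is treated as belonging to the positive region when every sample in its neighborhood has the same label, and the neighborhood is assumed to be a box with per-attribute radius Std(a_i)/τ. The function names and the four-sample toy data are hypothetical.

```python
import numpy as np

def neighborhood_dependency(X, y, tau=2.0):
    """gamma_B(E): fraction of samples whose neighborhood is label-pure."""
    delta = X.std(axis=0) / tau  # per-attribute radius, Std(a_i) / tau
    consistent = 0
    for i in range(X.shape[0]):
        # Assumption: a neighbor lies within delta on every attribute in B.
        near = np.all(np.abs(X - X[i]) <= delta, axis=1)
        consistent += int(np.all(y[near] == y[i]))
    return consistent / X.shape[0]

def forward_reduct(X, y, order, tau=2.0):
    """Add attributes in the given order while the dependency keeps increasing."""
    red = [order[0]]
    best = neighborhood_dependency(X[:, red], y, tau)
    for a in order[1:]:
        gamma = neighborhood_dependency(X[:, red + [a]], y, tau)
        if gamma <= best:
            break  # stop once the dependency no longer increases
        red.append(a)
        best = gamma
    return red, best

# Toy data: attribute 0 alone cannot separate the classes, attribute 1 can.
X = np.array([[0.0, 0.0], [0.1, 1.0], [1.0, 0.0], [1.1, 1.0]])
y = np.array([0, 1, 0, 1])
red, gamma = forward_reduct(X, y, order=[0, 1])
```

A smaller τ enlarges the neighborhoods, which makes label purity harder to reach and tends to change which attributes survive, so τ acts as the main tuning knob of the reduction.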

Transformer Diagnostic Model Based on DBN
A DBN is a deep learning model constructed by stacking multiple Restricted Boltzmann Machines (RBMs). The network structure is illustrated in Figure 2.
Each RBM consists of two layers of neurons, with the visible layer receiving input data and the hidden layer capturing abstract features of the data. The training process of a DBN comprises two phases: unsupervised pre-training and fine-tuning.
Unsupervised Pre-Training: Starting from the bottom, each RBM is trained layer by layer. The hidden layer's output of each RBM is used as the visible layer input for the next RBM. Through parameter updates, each RBM reconstructs the distribution of its input data. During this process, the network connection weights with the smallest reconstruction error are chosen, resulting in a new hidden layer for RBM1. This new hidden layer is then employed as the visible layer for training RBM2. This process continues, stacking multiple layers of RBMs to extract data features. The goal is to make the final feature representation as close as possible to the distribution of the original input data. Throughout the pre-training process, no labels of the data are used, making this phase an unsupervised learning process. The pseudocode in Algorithm 1 describes the training process of the DBN model.

Algorithm 1. Training process of the DBN model (closing steps).
    for each RBM layer do
        Train RBM layer with input data
    end for
    Fine-Tune DBN:
        Fine-tune the entire DBN using backpropagation or other optimization algorithms
        Update all weights and biases

Fine-Tuning: While the DBN model can establish initial deep features through layerwise pre-training, it cannot guarantee globally optimal deep feature representations, since each RBM is trained independently to minimize its own reconstruction error. To further optimize the entire DBN model and obtain better deep feature representations, it is common to add a back-propagation network connected to a classifier at the end of the DBN for fine-tuning. The fine-tuning process employs supervised learning, using labeled data to adjust the parameters of the entire DBN, including the weights and biases, in order to minimize the classifier's loss function. In this way, the entire DBN model can better adapt to a specific classification task and obtain improved feature representations.
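The layer-by-layer pre-training phase can be sketched in Python with NumPy as follows. The text does not name the RBM update rule, so one-step contrastive divergence (CD-1), a common choice, is assumed here; the class `RBM`, the function `pretrain`, the epoch count, and the random demo data are likewise illustrative, and the supervised fine-tuning phase is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RBM:
    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible-layer bias
        self.c = np.zeros(n_hidden)   # hidden-layer bias
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def cd1_step(self, v0, lr=0.01):
        """One CD-1 update (assumed rule); returns mean squared reconstruction error."""
        h0 = self.hidden_probs(v0)
        h_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ self.W.T + self.b)  # reconstruction
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))

def pretrain(data, hidden_sizes, epochs=50, lr=0.01, seed=0):
    """Greedy unsupervised pre-training: each RBM feeds the next one."""
    rng = np.random.default_rng(seed)
    rbms, layer_input = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(layer_input, lr)
        rbms.append(rbm)
        layer_input = rbm.hidden_probs(layer_input)  # visible input of next RBM
    return rbms, layer_input

# Demo on random inputs scaled to [0, 1]: 8 input features, one hidden layer of 15.
demo = np.random.default_rng(1).random((40, 8))
rbms, features = pretrain(demo, hidden_sizes=[15])
```

In a full model, the returned weights would initialize a feed-forward network with a classifier on top, which is then trained on the labeled data during fine-tuning.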
Step 2: Conduct non-coding ratio processing for the fault gases in the data to obtain 35 gas ratio features.
Step 3: Remove redundant features through correlation analysis and normalize the data. Use the improved Neighborhood Rough Set (INRS) algorithm for feature selection to eliminate features that contribute minimally to the fault types, optimizing the feature set.
Step 4: Split the processed data into training and testing sets in a certain proportion to ensure the independence of model training and evaluation.
Step 5: Use the selected gas ratio features and binary-encoded fault types as the input and output layers of the DBN, respectively. Determine the DBN network parameters, including the number of network layers, learning rate, and the number of neurons in the hidden layers.
Step 6: Pre-train and fine-tune the DBN network until reaching the specified number of training iterations or the desired error threshold to complete the DBN fault diagnosis model. Input the test data into the model to obtain the output results.
When training a DBN, it is necessary to select network parameters such as the number of network layers, the learning rate, and the number of neurons in the hidden layers, as mentioned in Step 5. Properly configuring these parameters can optimize the DBN model and improve its performance. Since there are no fixed rules for determining the best parameters, repeated experimentation is required to find the most suitable configuration.
According to Table 3, in the processed data, each class of samples is divided into a training set and a testing set in a 7:3 ratio, with 70% of the data used for training and 30% for testing the model's performance. In the model, the learning rate for the RBMs is set to 0.01. This learning rate is used during pre-training and controls the rate at which the RBM network weights are updated so that they gradually converge to better feature representations. In the BP fine-tuning algorithm, a dynamic learning rate is used, with an initial value of 0.01. A dynamic learning rate is an adaptive strategy that adjusts the learning rate during training based on the model's performance. The purpose of this approach is to use a larger learning rate in the early stages of training to expedite convergence and to gradually reduce the learning rate in the later stages to stabilize the convergence of the model.
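A per-class 7:3 split of this kind can be sketched in Python as follows. The class counts are those reported for the dataset earlier in this section; the rounding rule, the seed, and the function name `stratified_split` are assumptions, so the resulting set sizes need not match those listed in the paper's tables.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices class by class into training and testing sets."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(train_fraction * idx.size))  # assumed rounding rule
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

# Label vector rebuilt from the class counts stated for the 617-sample dataset.
counts = {"LED": 102, "HED": 168, "PD": 47, "HTO": 133, "MLTO": 77, "Normal": 90}
labels = np.repeat(list(counts), list(counts.values()))
train_idx, test_idx = stratified_split(labels)
print(train_idx.size, test_idx.size)
```

Splitting within each class keeps the class proportions of the training and testing sets close to those of the full dataset, which matters here because PD has far fewer samples than HED. Changing `seed` yields a different random partition for repeated experiments.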
Once the number of hidden layers is determined, the number of neurons (nodes) in each hidden layer becomes a significant factor affecting diagnostic accuracy. If the number of neurons is much larger than the number of input and output nodes, it may result in overfitting during feature extraction, causing the features of the original data to disperse excessively and thereby failing to capture the essential characteristics. Conversely, if the number of neurons is too small compared to the number of input and output nodes, the features of the original signal may be learned insufficiently. Currently, there are four main approaches for determining the number of neurons: fixed-value combination, concave-convex combination, decreasing-value combination, and increasing-value combination. Empirical formulas for selecting the number of neurons are given in (6), where m represents the number of neurons in the input layer, n represents the number of neurons in the output layer, p denotes the number of neurons in the hidden layer, and d stands for an additional compensatory value, typically within the range of [0, 10].
To determine the optimal number of hidden layers and hidden layer nodes, nine different configurations of DBN network models based on (6) were set up: 8-5-6, 8-10-6, 8-15-6, 8-5-5-6, 8-10-10-6, 8-15-15-6, 8-5-5-5-6, 8-10-10-10-6, and 8-15-15-15-6. Each model was run 10 times, and the average diagnostic accuracy was calculated. The specific results are shown in Table 7. From Table 7, it can be observed that as the number of neurons in the hidden layers increases, the diagnostic accuracy of the DBN model gradually improves. This is because more neurons allow for better feature extraction, enhancing the model's fitting capacity. However, when the number of hidden layers increases to 2 or more, the diagnostic accuracy of the DBN model starts to decline. A possible reason is that, for this specific DGA dataset, a DBN with two or more hidden layers becomes too complex and does not generalize well to unseen data. Based on this analysis, we adopt a 3-layer DBN network structure, consisting of the input layer, one hidden layer, and the output layer.

Experiment on DGA Dataset
All algorithms and experiments are conducted on the MATLAB R2022a simulation platform. The computer specifications are as follows: Processor: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz 1.80 GHz; Memory: 8.00 GB RAM; Display adapter: Intel(R) UHD Graphics 620.

Evaluation Metrics and Diagnostic Results
We adopted accuracy to measure the effectiveness of the proposed diagnosis method:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}.$$

As shown in Figure 3, with an increasing number of iterations, the training error of the DBN diagnostic model gradually decreases, falling below 0.1 after 700 iterations.
To reduce experimental variability, the DGA dataset was randomly split ten times, and ten experiments were conducted using the constructed DBN diagnostic model. Table 9 presents the average number of correctly diagnosed samples and the average accuracy for each fault type in the ten experiments. Notably, the diagnostic accuracy for the MLTO fault type is 100%, and the average diagnostic accuracy for LED, HED, PD, HTO, and Normal exceeds 90%. Figure 4 illustrates the confusion matrix of the average prediction results of the DBN diagnostic model over the ten random data partitioning experiments. In the confusion matrix, the blocks off the main diagonal are relatively light, while the blocks on the main diagonal are much darker, indicating that most samples of each class are assigned to the correct class. In summary, the DBN diagnostic model developed in this study demonstrates accuracy and effectiveness in predicting faults in oil-immersed transformers.
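The relation between a confusion matrix and the reported accuracies can be illustrated with a short Python sketch. It assumes that the overall accuracy is the fraction of correctly classified test samples (the diagonal sum divided by the total) and that the per-fault-type accuracy is the fraction of each true class that is predicted correctly; the text does not spell out the latter definition, and the eight toy labels below are invented for illustration only.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def overall_accuracy(cm):
    # Correct predictions lie on the main diagonal.
    return np.trace(cm) / cm.sum()

def per_class_accuracy(cm):
    # Assumed definition: share of each true class that is predicted correctly.
    return np.diag(cm) / cm.sum(axis=1)

# Toy example with three classes (not results from the paper).
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0, 2])
cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc = overall_accuracy(cm)
per_class = per_class_accuracy(cm)
```

Averaging such matrices over repeated random partitions gives an averaged confusion matrix of the kind shown in Figure 4, where a dark main diagonal corresponds to high per-class accuracy.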

Ablation Analysis and Comparative Experiment
To investigate the impact of correlation analysis and neighborhood rough set feature reduction on the performance of the transformer diagnosis model, we conducted ablation analysis on each of the two steps. Table 10 details the results, averaged over 10 experiments, obtained after each method was ablated. From Table 10, it can be seen that feature reduction based on neighborhood rough sets has a positive impact on the model. Under the same correlation analysis, the diagnostic accuracy of the model with INRS feature reduction is higher than that of the model without it. In addition, directly using all 35 characteristic gas ratios as input features of the model leads to a decrease in diagnostic accuracy.
To further demonstrate the effectiveness of the proposed method, we compared it with a Support Vector Machine (SVM) and a Backpropagation Neural Network (BP), conducting 10 experiments for each method using identical training and testing datasets. On the same dataset, the diagnostic accuracies achieved by SVM, BP, and the proposed method are 87.8%, 88.4%, and 90.2%, respectively. The diagnostic accuracy of the proposed method is thus 2.4 and 1.8 percentage points higher than that of SVM and BP, respectively. Therefore, the method proposed in this paper can effectively assess the transformer's condition.

Conclusions
This paper presents a diagnostic model for fault classification in oil-immersed transformers that combines an improved neighborhood rough set with a Deep Belief Network. Through correlation analysis and the improved neighborhood rough set algorithm, eight feature gas ratios that significantly contribute to fault types were extracted. Compared to the features used by traditional methods, these features are more representative and carry more information for identifying transformer fault types.
Utilizing the identified eight feature gas ratios as input variables, a DBN-based diagnostic model was constructed.On the DGA test dataset, this model achieved an impressive average accuracy of 90.2%.This high accuracy signifies the model's effectiveness in diagnosing fault types in oil-immersed transformers.
The practical application of this method holds immense promise for maintenance and operational purposes.Its ability to promptly identify transformer faults and discern their respective types empowers maintenance personnel to implement effective repair and maintenance measures, thereby mitigating potential impacts of faults on the power system.This approach aids in ensuring the reliability and longevity of transformers within power distribution systems.

Figure 1. The diagnostic process for an oil-immersed transformer based on NRS and DBN.

Figure 4. Confusion matrices for the average prediction results of the DBN diagnostic model in 10 random data partition experiments.

Table 1. Types of faults in oil-immersed transformers.

Table 2. Gas content of different fault types in oil-immersed transformers.

Each sample consists of the gas content of five fault gases: H2, CH4, C2H6, C2H4, and C2H2.

Table 3. Distribution of DGA data samples.

Table 4. Ratios of feature gas concentrations.

Table 5. Gas ratio features after reduction through correlation analysis.

Table 6. Gas ratio features after reduction through INRS.

Table 7. Average accuracy for different hidden layers.
In the accuracy formula, TP, FP, TN, and FN represent True Positive, False Positive, True Negative, and False Negative, respectively. According to the data partitioning and parameter settings in Section 3.3, a DBN model with a network structure of 8-15-6 was selected. The DBN model was trained on the training dataset and, upon completion of training, was used to predict the classification of six fault types on the test dataset. Table 8 provides the diagnostic accuracy for each fault type. Figure 3 depicts the training error curve on the test set for a single experiment.

Table 8. Diagnostic accuracy for each fault type in one experiment.

Figure 3. Training error curve of the DBN diagnostic model.

Table 9. Diagnostic accuracy for each fault type in ten experiments.

Table 10. The 10 average experimental results of the proposed method after ablation analysis.