Article

A Mechanical Fault Diagnosis Method for On-Load Tap Changers Based on GOA-Optimized FMD and Transformer

1
Kunming Power Supply Bureau of Yunnan Electric Grid Co., Ltd., Kunming 650011, China
2
School of Electrical Engineering, Shandong University, Jinan 250100, China
*
Author to whom correspondence should be addressed.
Energies 2025, 18(14), 3848; https://doi.org/10.3390/en18143848 (registering DOI)
Submission received: 20 June 2025 / Revised: 14 July 2025 / Accepted: 17 July 2025 / Published: 19 July 2025

Abstract

Mechanical failures frequently occur in On-Load Tap Changers (OLTCs) during operation, potentially compromising the reliability and stability of power systems. The goal of this study is to develop an intelligent and accurate diagnostic approach for OLTC mechanical fault identification, particularly under the challenge of non-stationary vibration signals. To achieve this, a novel hybrid method is proposed that integrates the Gazelle Optimization Algorithm (GOA), Feature Mode Decomposition (FMD), and a Transformer-based classification model. Specifically, GOA is employed to automatically optimize key FMD parameters, including the number of filters (K), filter length (L), and number of decomposition modes (N), enabling high-resolution signal decomposition. From the resulting intrinsic mode functions (IMFs), statistical time domain features—peak factor, impulse factor, waveform factor, and clearance factor—are extracted to form feature vectors. After feature extraction, the resulting vectors are utilized by a Transformer to classify fault types. Benchmark comparisons with other decomposition and learning approaches highlight the enhanced performance of the proposed framework. The model achieves a 95.83% classification accuracy on the test set and an average of 96.7% under five-fold cross-validation, demonstrating excellent accuracy and generalization. What distinguishes this research is its incorporation of a GOA–FMD and a Transformer-based attention mechanism for pattern recognition into a unified and efficient diagnostic framework. With its high effectiveness and adaptability, the proposed framework shows great promise for real-world applications in the smart fault monitoring of power systems.

1. Introduction

The On-Load Tap Changer (OLTC) plays a fundamental role in maintaining voltage stability by allowing tap transitions under load without power interruptions [1,2,3]. The performance and reliability of OLTCs are essential to the overall stability and voltage control of power systems, directly impacting the quality and continuity of electrical service delivery. According to statistical analyses, OLTC-related failures constitute over 20% of total transformer malfunctions, with mechanical issues comprising more than 70% of these cases [4,5,6]. These statistics indicate that OLTC mechanical faults rank among the most frequent and disruptive failure types in power transformers. Mechanical faults can lead to abnormal switching behavior, prolonged tap transition time, contact overheating, or even permanent failure. These faults often result in voltage fluctuations, power quality deterioration, and in severe cases, forced transformer shutdowns or grid instability [7]. As a result, the efficient and precise extraction of vibration signal features for the accurate assessment of equipment condition and fault diagnosis has emerged as a central concern in recent research.
OLTC vibration signals encapsulate abundant information related to mechanical operating states and fault characteristics, effectively reflecting the condition of mechanical components. Xu et al. employed time–frequency analysis in conjunction with signal decomposition techniques to extract diagnostic features from vibration signals [8]. The Empirical Mode Decomposition (EMD) algorithm decomposes the entire signal during OLTC tap transitions to obtain time domain energy features by constructing envelopes based on local extrema and interpolating intrinsic mode functions (IMFs). However, due to its reliance on extrema and symmetric envelopes, EMD is prone to mode mixing and boundary artifacts, which can lead to ambiguous decomposition results and the potential omission of critical information [9]. To address these limitations, Liu et al. introduced the Variational Mode Decomposition (VMD) technique for OLTC fault diagnosis [10]. Theoretically, VMD separates the signal into multiple narrowband components with a primary emphasis on frequency domain characteristics [11,12]. Nevertheless, it fails to adequately capture the impulsive and periodic features inherent in mechanical faults. A recent study introduced Feature Mode Decomposition (FMD), a technique that uses adaptive filters with limited-bandwidth impulse responses to decompose signals into modal components that exhibit minimal correlation with the original signal [13]. FMD retains both periodic and transient signal features and demonstrates notable robustness under noise interference. However, its performance is highly sensitive to parameter configuration, which has a substantial impact on the quality of decomposition. The Gazelle Optimization Algorithm (GOA), a recently developed swarm intelligence technique, offers strong global search capabilities and high adaptability, making it particularly effective for solving complex, multimodal, and high-dimensional optimization tasks [14].
While FMD is capable of decomposing signals into IMFs that capture relevant fault information, its efficacy is heavily dependent on optimal parameter settings [15]. To address this, the present study integrates GOA to automatically optimize FMD parameters, thereby enhancing decomposition accuracy. Precise signal decomposition serves as the foundation for reliable feature extraction and classification, and is essential for realizing the intelligent fault diagnosis of OLTC mechanical systems. To illustrate the strengths and limitations of existing signal decomposition approaches in OLTC vibration analysis, Table 1 summarizes a comparative overview of the methods used in previous studies and the proposed GOA-FMD approach.
In the domain of OLTC fault diagnosis, achieving high-precision fault classification is essential for enhancing diagnostic accuracy and ensuring model generalization. Several classic machine learning techniques—including Support Vector Machines (SVMs), K-Nearest Neighbors (KNN), and Random Forests (RFs)—offer a certain level of diagnostic capability [16,17,18]. However, these methods typically rely on manually engineered features and exhibit limited adaptability and scalability in complex or dynamic environments. By contrast, deep learning techniques leverage neural networks for automatic feature extraction, thereby improving classification performance [19]. As a type of dense neural network with forward-only signal propagation, the Multilayer Perceptron (MLP) is typically effective for processing data with limited complexity and uniform input dimensions. Nonetheless, its dependence on handcrafted features and its inherent limitations in handling high-dimensional, nonlinear, or unstructured data can compromise its generalization ability [20]. In recent years, the Transformer architecture, underpinned by attention mechanisms, has emerged as a powerful model for feature representation [21,22,23]. Its capacity to capture long-range dependencies and integrate multi-source feature interactions makes it highly effective for complex classification tasks involving multiple feature dimensions. To highlight the advantages of the proposed Transformer-based classifier, Table 2 provides a comparative summary of the common classification methods used in OLTC fault diagnosis, outlining their strengths and limitations in terms of feature extraction, generalization, and adaptability.
To address the limitations in diagnostic accuracy caused by incomplete feature extraction from OLTC mechanical vibration signals, this paper presents an adaptive feature extraction scheme that integrates FMD with parameter tuning performed via the GOA. To improve decomposition accuracy, GOA is employed to fine-tune essential FMD parameters, such as filter count (K), filter length (L), and the number of modes (N). The obtained Intrinsic Mode Functions (IMFs) undergo detailed inspection across time, frequency, and joint time–frequency domains to ensure thorough signal characterization. On this basis, a Transformer-based fault classification model is developed to leverage attention mechanisms for effective feature learning. An experimental platform is subsequently established to simulate OLTC mechanical fault conditions. Three representative fault types are designed: (1) gear jamming in drive shaft; (2) loose screw in drive shaft; (3) arc plate looseness and contact wear. Fault identification is conducted using the proposed feature extraction method in conjunction with the Transformer classifier. To validate the proposed approach, comparative experiments are performed against alternative signal decomposition and classification techniques. The results confirm that the proposed approach achieves a high diagnostic precision and strong resilience under varying operating conditions.

2. GOA-FMD-Based Vibration Signal Feature Extraction and Fault Diagnosis Method

2.1. Principles and Characteristics of FMD

When the OLTC operates under mechanical fault scenarios, its vibration responses embed rich diagnostic characteristics that reflect distinct fault behaviors. FMD, initially designed for diagnosing faults in rotating equipment, is adapted in this study to suit OLTC diagnostic requirements and aims to iteratively extract modal components that carry diagnostic features by applying a sequence of limited-bandwidth impulse response filters. This approach suppresses irrelevant components while improving the separability and interpretability of fault-relevant features [13]. FMD has exhibited strong adaptability and reliability in the context of OLTC fault diagnosis. The fundamental procedure of FMD signal decomposition is summarized as follows:
  • Step 1: Provide the input signal x, define the filter length L, and initialize the iteration index i = 1. Specify the total number of filters K, then perform the initial modal decomposition based on the current parameters.
  • Step 2: Output the decomposition components $u_k^i = x * f_k^i$, where $k = 1, 2, \ldots, K$, ‘*’ indicates convolution, and $f_k$ represents the k-th finite impulse response (FIR) filter.
  • Step 3: Utilize the original signal x, the decomposed component $u_k^i$, and the calculated period $T_k^i$ to update the filter coefficients. $T_k^i$ is identified as the time lag at which the autocorrelation function $R_k(\tau)$ reaches its peak after the first zero crossing. The autocorrelation is calculated as follows:
    $$R_k(\tau) = \sum_{t} u_k^i[t]\, u_k^i[t+\tau] \tag{1}$$
    where τ denotes the time delay and t is the time index of the signal. Finally, the iteration index is updated as i = i + 1.
  • Step 4: Return to Step 2 if the predefined iteration limit has not been met; otherwise, move forward to Step 5.
  • Step 5: Compute the correlation coefficient $CC_{pq}$ for every pairwise combination of modal components $u_p$ and $u_q$, using the formula below.
    $$CC_{pq} = \frac{\sum_{t} \left(u_p(t) - \bar{u}_p\right)\left(u_q(t) - \bar{u}_q\right)}{\sqrt{\sum_{t} \left(u_p(t) - \bar{u}_p\right)^2}\,\sqrt{\sum_{t} \left(u_q(t) - \bar{u}_q\right)^2}} \tag{2}$$
    where $\bar{u}$ denotes the mean value of the corresponding modal component. Form a K × K matrix $CC_{(K \times K)}$ containing the pairwise correlation coefficients, and identify the modal component with the lowest correlation, i.e., the one associated with the smallest maximum value in its corresponding row (or column) of $CC_{(K \times K)}$.
  • Step 6: Compute the periodic correlation degree CK of each modal component $u_k$ using the following formula:
    $$CK(u_k) = \frac{\sum_{t} \left(\prod_{m=0}^{M} u_k(t - mT_k)\right)^2}{\sum_{t} u_k(t)^2} \tag{3}$$
    where M is the correlation order, representing the number of periodic points used in the computation of CK. The mode exhibiting the highest degree of periodic correlation is designated as the final extracted component. Then, increment the mode index: K = K + 1.
  • Step 7: Assess whether the total number of extracted components K has attained the set value N. If this requirement is met, continue to Step 8; otherwise, revert to Step 2 and repeat the decomposition.
A schematic representation of the FMD algorithm is provided in Figure 1.
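The period estimate used in the filter-update step can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names `autocorrelation` and `estimate_period` are hypothetical, the autocorrelation is the plain lagged product sum, and the period is taken as the lag of the largest autocorrelation value found after the first zero crossing.

```python
import math

def autocorrelation(u, tau):
    """Lagged product sum R(tau) = sum_t u[t] * u[t + tau]."""
    return sum(u[t] * u[t + tau] for t in range(len(u) - tau))

def estimate_period(u, max_lag=None):
    """Return the lag (in samples) of the autocorrelation peak found
    after the first zero crossing, or None if no crossing occurs."""
    if max_lag is None:
        max_lag = len(u) // 2
    r = [autocorrelation(u, tau) for tau in range(max_lag + 1)]
    # First lag at which the autocorrelation drops from positive to <= 0.
    first_zero = next(
        (tau for tau in range(1, len(r)) if r[tau - 1] > 0 >= r[tau]), None
    )
    if first_zero is None:
        return None
    # The period is the lag of the largest value beyond that crossing.
    return max(range(first_zero, len(r)), key=lambda tau: r[tau])

# Hypothetical test signal: a sinusoid that repeats every 50 samples.
signal = [math.sin(2 * math.pi * t / 50) for t in range(1000)]
period = estimate_period(signal, max_lag=80)
```

For this synthetic sinusoid the estimate recovers the 50-sample repetition interval. Real OLTC vibration records are impulsive and noisy, so the lag search range would need to be chosen with more care.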
The performance of FMD in analyzing OLTC vibration signals is highly sensitive to the parameter settings, which significantly affect the decomposition results [24,25,26]. A short filter length L may result in under-decomposition, whereas an excessively long filter may introduce significant noise into the decomposed components. Similarly, setting the number of modes N too low may cause the loss of critical features, while too many modes may result in redundant information [15,27]. The number of filters K also plays a crucial role: if improperly chosen, it can either reduce the decomposition resolution or increase computational complexity unnecessarily [28]. Since FMD lacks an inherent mechanism for adaptive parameter selection, optimization algorithms are required to perform parameter tuning and ensure the accurate extraction of critical fault features from OLTC vibration signals.

2.2. FMD Parameter Optimization via GOA

GOA draws inspiration from natural predator–prey dynamics and operates within the framework of swarm intelligence techniques [14,29]. The optimization process begins with the random initialization of a gazelle population, which serves as candidate solutions in the search process. The population is represented by an n × d position matrix X, where n denotes the number of agents and d is the dimensionality of the problem. Each row in X denotes the position vector of a candidate gazelle in the search domain. The position matrix X is given in Equation (4):
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d-1} & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d-1} & x_{2,d} \\ \vdots & \vdots & x_{i,j} & \vdots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d-1} & x_{n,d} \end{bmatrix} \tag{4}$$
where $x_{i,j}$ indicates the location of the i-th gazelle in the j-th dimension. Each element $x_{i,j}$ is initialized using the following:
$$x_{i,j} = rand \times (UB_j - LB_j) + LB_j \tag{5}$$
where $rand \in [0, 1]$ is a random variable, and $UB_j$ and $LB_j$ represent the upper and lower bounds of the permissible values along the j-th search dimension. After each iteration, all individuals are evaluated using the fitness function. The top-performing gazelles are selected as elites, and their positions are stored in the elite matrix E, which guides subsequent generations and is defined as follows:
$$E = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d-1} & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d-1} & x_{2,d} \\ \vdots & \vdots & x_{i,j} & \vdots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d-1} & x_{n,d} \end{bmatrix} \tag{6}$$
where $x_{i,j}$ denotes the coordinate of the i-th elite individual along the j-th axis.
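The initialization and elite selection described above can be sketched in a few lines. This is a minimal illustration, not the authors' code. The helper names `init_population` and `select_elite`, the search bounds, and the placeholder fitness are all hypothetical, and a single best candidate stands in for the elite matrix.

```python
import random

def init_population(n, lower, upper, rng):
    """Draw n candidates uniformly inside the per-dimension bounds:
    x[i][j] = rand * (UB_j - LB_j) + LB_j."""
    return [
        [rng.random() * (ub - lb) + lb for lb, ub in zip(lower, upper)]
        for _ in range(n)
    ]

def select_elite(population, fitness):
    """Return the candidate with the lowest fitness (minimization)."""
    return min(population, key=fitness)

# Hypothetical bounds for a three-dimensional search space (L, K, N).
lower = [10.0, 2.0, 2.0]
upper = [100.0, 12.0, 8.0]
rng = random.Random(0)
population = init_population(20, lower, upper, rng)

# Placeholder fitness used only for illustration (not envelope entropy).
def toy_fitness(x):
    return sum((v - lb) ** 2 for v, lb in zip(x, lower))

elite = select_elite(population, toy_fitness)
```

In a parameter search over integer-valued settings such as filter counts, each sampled coordinate would additionally have to be rounded to the permitted step size.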
During the development phase, each gazelle updates its position through a stochastic movement governed by the following rule:
$$g_{i+1} = g_i + v \cdot r_1 \cdot (E_i - r_2 \cdot g_i) \tag{7}$$
where $g_i$ represents the present location of the i-th gazelle; v refers to the movement scaling coefficient; and $r_1, r_2 \in [0, 1]$ are random values drawn from a uniform distribution. $E_i$ corresponds to the position vector of the associated elite agent from matrix E.
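A small numerical example makes the movement rule concrete. The function name `development_step` and all input values are hypothetical. The random factors r1 and r2 are passed in explicitly so the result is reproducible, whereas in the algorithm they are fresh uniform draws.

```python
def development_step(g, elite, v, r1, r2):
    """Apply g_new = g + v * r1 * (elite - r2 * g) element-wise."""
    return [gj + v * r1 * (ej - r2 * gj) for gj, ej in zip(g, elite)]

# Illustrative values only: a two-dimensional position and its elite.
new_position = development_step([2.0, 4.0], [3.0, 5.0], v=0.88, r1=0.5, r2=1.0)
```

With r2 = 1 the bracket reduces to the plain difference between the elite and the current position, so each coordinate moves a fraction v·r1 = 0.44 of the way towards the elite.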
When a predator is detected, the gazelle initiates an escape response modeled by either Lévy flight or Brownian motion. This process involves two stages: a global exploration phase characterized by large step movements and a local exploitation phase involving fine-tuned adjustments. The corresponding position update equations are defined as follows:
$$g_{i+1} = g_i + S \cdot \lambda \cdot R \cdot R_L \cdot (E_i - R_L \cdot g_i) \tag{8}$$
$$g_{i+1} = g_i + S \cdot \lambda \cdot CF \cdot R \cdot R_B \cdot (E_i - R_L \cdot g_i) \tag{9}$$
$$CF = \left(1 - \frac{m}{T}\right)^{2m/T} \tag{10}$$
where S represents the upper bound of the gazelle’s movement speed; λ is the directional control factor; $R_L$ and $R_B$ are Lévy-distributed and Brownian-distributed random vectors, respectively; $R \in [0, 1]$ is a uniformly distributed random number; $E_i$ is the position of the i-th elite gazelle; $g_i$ is the current position of the i-th gazelle; m and T denote the current and the maximum allowed iterations; and CF decreases in a nonlinear manner as iterations proceed, aiming to achieve a trade-off between exploration and exploitation.
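The nonlinear decay of CF is easiest to see numerically. The sketch below evaluates the control factor at a few iteration indices. The function name `control_factor` is hypothetical, and T = 80 is chosen to mirror the GOA iteration budget reported in Section 4.1.

```python
def control_factor(m, T):
    """CF = (1 - m/T) ** (2 * m / T), for 0 <= m <= T."""
    return (1 - m / T) ** (2 * m / T)

T = 80  # maximum number of iterations
schedule = {m: round(control_factor(m, T), 4) for m in (0, 20, 40, 60, 80)}
# schedule == {0: 1.0, 20: 0.866, 40: 0.5, 60: 0.125, 80: 0.0}
```

CF stays close to 1 in the early iterations, which keeps step sizes large, and falls to 0 at the final iteration, which damps the movement as the search converges.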
To improve the algorithm’s potential for escaping from local optima, a predator–prey strategy is employed. This strategy simulates the natural behavioral shift between escape and social movement observed in prey animals. A decision is made by comparing a random variable $r \in [0, 1]$, sampled uniformly, with the predation success rate (PSRs). If $r \le PSRs$, the gazelle performs a random relocation within the search space boundary using a controlled scaling factor. Otherwise, a differential movement is executed based on randomly selected individuals from the population. The position update rule, which mathematically models this predator–prey strategy, is defined in Equation (11) below.
$$g_{i+1} = \begin{cases} g_i + CF\,[LB + R \cdot (UB - LB)] \cdot U, & r \le PSRs \\ g_i + [PSRs\,(1 - r) + r]\,(g_{r1} - g_{r2}), & r > PSRs \end{cases} \tag{11}$$
where CF is the control factor from Equation (10), and $R \in [0, 1]$ is a uniformly distributed random number. The binary control variable U is defined as follows:
$$U = \begin{cases} 0, & r < 0.34 \\ 1, & r \ge 0.34 \end{cases} \tag{12}$$
In Equation (11), $g_{r1}$ and $g_{r2}$ are two randomly selected position vectors from the current population. The variable U is used to probabilistically suppress position updates in certain dimensions, thereby introducing adaptive stochastic behavior during the escape process.
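The two branches of the predator–prey update can be traced with a short sketch. It is an illustration under stated assumptions rather than the authors' implementation. The function name `predator_prey_step` and all inputs are hypothetical, and the random draws are passed in explicitly so the example is deterministic. The sketch also simplifies in two ways: the draw behind U (`u_rand`) is treated as separate from r, and one scalar U is used for all dimensions.

```python
def predator_prey_step(g, g_r1, g_r2, lower, upper, cf, r, R, u_rand, psr=0.34):
    """One predator-prey update for a single candidate position g.

    r      : uniform draw compared against the predation success rate
    R      : uniform draw scaling the random relocation
    u_rand : uniform draw that sets the binary switch U (assumed here
             to be independent of r)
    """
    if r <= psr:
        # Random relocation, switched on or off by the binary variable U.
        U = 0 if u_rand < 0.34 else 1
        return [gj + cf * (lb + R * (ub - lb)) * U
                for gj, lb, ub in zip(g, lower, upper)]
    # Differential move along the line between two random individuals.
    coeff = psr * (1 - r) + r
    return [gj + coeff * (a - b) for gj, a, b in zip(g, g_r1, g_r2)]

args = dict(g=[1.0], g_r1=[3.0], g_r2=[1.0], lower=[0.0], upper=[10.0], cf=0.5, R=0.5)
relocated = predator_prey_step(r=0.2, u_rand=0.9, **args)      # U = 1
suppressed = predator_prey_step(r=0.2, u_rand=0.1, **args)     # U = 0
differential = predator_prey_step(r=0.5, u_rand=0.9, **args)   # r > PSRs
```

In the first call the candidate jumps by CF·[LB + R·(UB − LB)] = 2.5. In the second call U = 0 leaves it in place. In the third call the step is 0.67 times the difference between the two reference individuals.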
This work utilizes the GOA to optimize three parameters of the FMD process: the number of filters K, the filter length L, and the mode number N. The algorithmic flow for GOA-based parameter optimization is depicted in Figure 2. In the diagram, $r \in [0, 1]$ denotes a uniformly distributed random number, and T represents the predefined maximum number of iterations, while m indicates the index of the current iteration.
Each gazelle represents a parameter combination {L, K, N} corresponding to the filter length, the number of filters, and the mode number, respectively. FMD is applied to the original signal using these parameters, and the envelope entropy of the decomposed signal is used as the fitness function, which is defined as follows:
$$E_n = -\sum_{k=1}^{N} p_k \lg p_k \tag{13}$$
$$p_k = a_k \Big/ \sum_{k=1}^{N} a_k \tag{14}$$
In these equations, $a_k$ corresponds to the amplitude value of the k-th component in the IMF set after envelope extraction, and $p_k$ is the normalized amplitude. A lower envelope entropy implies that the extracted features are more concentrated and that the noise is reduced, which is advantageous for accurate fault identification [30,31].
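The fitness computation can be sketched as follows. This is a minimal illustration, not the authors' code. The function name `envelope_entropy` is hypothetical, the envelope amplitudes are assumed to be already available as non-negative values, and "lg" is read as the base-10 logarithm.

```python
import math

def envelope_entropy(envelope):
    """Envelope entropy -sum(p_k * lg(p_k)) with p_k = a_k / sum(a)."""
    total = sum(envelope)
    probs = [a / total for a in envelope]
    # Terms with p_k = 0 contribute nothing (limit of p * lg p as p -> 0).
    return -sum(p * math.log10(p) for p in probs if p > 0)

flat = envelope_entropy([1.0, 1.0, 1.0, 1.0])    # amplitude spread evenly
peaked = envelope_entropy([4.0, 0.0, 0.0, 0.0])  # amplitude in one sample
```

The evenly spread envelope gives the maximum entropy for four values, lg 4 ≈ 0.602, while the fully concentrated envelope gives zero. That is why a lower value is read as a more concentrated, less noisy decomposition.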

2.3. Time Domain Feature Extraction from Vibration Signals

The FMD process generates IMFs, each corresponding to specific frequency bands of the OLTC vibration signals, thereby capturing critical information reflective of the system’s operational state. In order to characterize the impulsive and oscillatory behavior of the signal, four typical statistical indices in the time domain—namely peak factor, impulse factor, waveform factor, and margin factor—are computed from the IMFs [32,33,34,35].
The peak factor (PF), which quantifies the signal’s sharpness or impulsiveness, is mathematically described as follows:
$$PF = \frac{\max\left(|x(n)|\right)}{\sqrt{\frac{1}{N}\sum_{n=1}^{N} x^2(n)}} \tag{15}$$
A high peak factor typically reveals sudden transient phenomena within OLTC vibration signals, often caused by mechanical shocks or abrupt contacts. The impulse factor (IF) reflects the impulsiveness of the signal and is expressed as follows:
$$IF = \frac{\max\left(|x(n)|\right)}{\frac{1}{N}\sum_{n=1}^{N} |x(n)|} \tag{16}$$
For impact-type faults in OLTC systems, the impulse factor tends to exhibit significantly elevated values. The waveform factor (WF) describes the waveform characteristics and distortion level of the signal. It is computed by the following:
$$WF = \frac{\sqrt{\frac{1}{N}\sum_{n=1}^{N} x^2(n)}}{\frac{1}{N}\sum_{n=1}^{N} |x(n)|} \tag{17}$$
This metric is frequently employed to assess the extent of waveform deformation and to distinguish periodic shock patterns from stochastic noise elements in the signal. The margin factor (MF) reflects whether the signal contains isolated high peaks, and is defined as follows:
$$MF = \frac{\max\left(|x(n)|\right)}{\left(\frac{1}{N}\sum_{n=1}^{N} x^4(n)\right)^{1/4}} \tag{18}$$
This indicator evaluates whether the signal is approaching an extreme operating condition. A relatively large margin factor implies the presence of a prominent, isolated peak in the signal, which often corresponds to a transient event or an abnormal operating state.
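The four indicators can be computed directly from the samples of one IMF. The sketch below follows the definitions given above, with the margin factor using the fourth-moment form stated in the text. The function name `time_domain_features` and the two test signals are hypothetical.

```python
import math

def time_domain_features(x):
    """Return (PF, IF, WF, MF) for a sequence of samples x."""
    n = len(x)
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    mean_abs = sum(abs(v) for v in x) / n
    fourth_root = (sum(v ** 4 for v in x) / n) ** 0.25
    return (peak / rms, peak / mean_abs, rms / mean_abs, peak / fourth_root)

# A constant-magnitude square wave versus a single isolated spike.
square_wave = time_domain_features([1.0, -1.0, 1.0, -1.0])
single_spike = time_domain_features([0.0, 0.0, 0.0, 2.0])
```

The square wave yields 1.0 for every indicator because its peak, RMS, and mean absolute value coincide. The isolated spike raises the peak factor to 2 and the impulse factor to 4, which illustrates why these ratios respond to impact-type events.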

2.4. Transformer-Based Classification Model Architecture and Implementation

The multiple intrinsic mode functions (IMFs) derived from the decomposition of the OLTC vibration signals exhibit varying levels of significance and distinct time domain characteristics. In addition, potential correlations may exist among the IMFs, which necessitate that classification models account for inter-feature dependencies. The Transformer, recognized as a cutting-edge deep learning framework, excels in modeling sequences and capturing global feature relationships, rendering it highly effective for handling intricate temporal data [21,22,36].
The proposed Transformer-based classification model comprises an encoder module and a classification head. The model receives a 24-dimensional input vector, derived by calculating four time domain statistical indicators from each of the six extracted IMFs. Positional encoding is initially added to preserve the sequential information. The encoded input is then processed by a multi-head self-attention mechanism, consisting of two encoder layers with four attention heads each and a model dimensionality of 128. The processed output is passed through a dense layer containing 64 neurons, followed by a Dropout operation set at a rate of 0.1 to reduce the risk of overfitting. Finally, a Softmax activation layer is employed to produce the classification probabilities. The schematic diagram of the designed model architecture is shown in Figure 3.
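For reference, the self-attention operation inside each encoder layer is commonly written in the scaled dot-product form below. This is the standard Transformer formulation rather than an equation taken from this paper. Here Q, K, and V denote the query, key, and value matrices (K is unrelated to the number of FMD filters), and the per-head dimension assumes that the model dimensionality of 128 is split evenly across the four heads.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V, \qquad d_k = \frac{d_{\mathrm{model}}}{h} = \frac{128}{4} = 32$$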
To ensure both robust feature representation and architectural simplicity, the model is tailored for fault diagnosis, particularly in cases with scarce annotated data [37]. The configuration parameters employed in the Transformer model are summarized in Table 3.
Based on the aforementioned model architecture and training configuration, the proposed OLTC mechanical fault diagnosis framework consists of five key stages: data acquisition, modal decomposition, feature extraction, sample construction, and fault classification. The overall process is illustrated in Figure 4.

3. Simulation of Mechanical Faults in OLTC

3.1. Experimental Platform Setup and Vibration Signal Acquisition

In this study, an experimental configuration for the KM-type OLTC was deployed to simulate vibration responses under both normal operating and representative mechanical fault conditions. A domestically manufactured KM-III800Y/126C-10193W OLTC prototype served as the test subject. Although the platform is based on a KM-type OLTC, the proposed method is not device-specific and is applicable to other OLTC models with similar mechanical structures. Vibration signals were acquired during the tap changing operation, providing empirical data to support the subsequent fault diagnosis research. The platform was instrumented with accelerometers (model YD38D, 100 mV/g sensitivity). The sensors were mounted in a triangular configuration on the top insulating cover of the OLTC unit, with their sensing axes oriented vertically (axial direction) as illustrated in Figure 5. This layout was chosen based on preliminary tests indicating that vibrations in the vertical (axial) direction consistently exhibited the highest amplitude in response to mechanical disturbances. In contrast, lateral directions were more susceptible to structural shaking and installation-related deviations during the tap changing operation, which could introduce measurement bias. Therefore, aligning the sensor axes vertically ensures measurement stability for vibration-based diagnostics. The triangular layout improves the spatial coverage and helps capture the localized vibration responses caused by different fault types.
The data acquisition system is capable of performing the multi-channel synchronous capture of vibration signals, offering a peak sampling frequency of 102.4 kHz. Considering that most of the energy in OLTC vibration signals lies below 10 kHz [38,39], a sampling frequency of 25 kHz was adopted in accordance with the Nyquist criterion to maintain signal fidelity and measurement accuracy [40].

3.2. Fault Type Setup and Analysis of Vibration Response Characteristics

To analyze the vibration characteristics of the OLTC under different fault conditions, three types of typical mechanical faults were introduced in the experiment (see Figure 6): (a) gear jamming in the drive shaft, (b) loose screw in the drive shaft, (c) arc plate looseness and contact wear. These fault types were selected based on their high occurrence frequency in engineering practice, ease of simulation on the experimental platform, and their capacity to generate distinguishable vibration features [6]. Specifically, gear jamming and screw looseness affect the torque transmission and stability of the drive mechanism, while arc plate and contact wear degrade electrical continuity and switching performance—both contributing to observable mechanical anomalies.
Under normal operating conditions, the structural components of the OLTC remain intact, and the contact mechanism functions without obstruction or abnormal friction. To simulate the gear jamming fault of the drive shaft, metal shims were inserted into the gear gaps to artificially introduce abnormal frictional torque between the gear teeth. The screw loosening fault was replicated by partially loosening the locking screw mounted on the drive shaft. The cam plate loosening fault was simulated by loosening its mounting screw, resulting in relative displacement between the cam and adjacent mechanical components. In addition, contact head wear was simulated under the same setup to reflect the degradation that typically co-occurs with cam plate looseness.
The vibration signal was recorded beginning at the instant the tap changer was actuated, with a total acquisition duration of 0.2 s and a sampling rate of 25 kHz, thereby capturing the complete switching process. To obtain a unified and noise-reduced signal representation, the raw signals from the three accelerometers were first synchronized and then averaged point-by-point across all sampling instances. This signal-level fusion approach helps suppress sensor-specific disturbances while preserving the global vibration characteristics, thereby enhancing the quality of the subsequent decomposition and feature extraction process. The corresponding waveform plots for this process are presented in Figure 7a–d, which clearly illustrate the variations in vibration response under different mechanical fault conditions.
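The record length and the point-by-point fusion can be illustrated briefly. The function name `fuse_channels` and the three short channel excerpts are hypothetical, and the channels are assumed to be already synchronized and of equal length.

```python
def fuse_channels(channels):
    """Average synchronized channels sample by sample."""
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("channels must have equal length")
    return [sum(samples) / len(channels) for samples in zip(*channels)]

SAMPLING_RATE_HZ = 25_000
DURATION_S = 0.2
samples_per_record = round(SAMPLING_RATE_HZ * DURATION_S)  # 5000 samples

# Three short hypothetical excerpts, one per accelerometer.
fused = fuse_channels([[1.0, 2.0, -1.0], [2.0, 2.0, 0.0], [3.0, 2.0, 1.0]])
# fused == [2.0, 2.0, 0.0]
```

Components shared by all three sensors survive the averaging unchanged, while sensor-specific deviations of opposite sign cancel, as in the third sample of the excerpt.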
Under standard working conditions, as depicted in Figure 7a, the vibration signal displays a smooth and low-amplitude waveform, suggesting that the system operates in a steady and well-balanced state. In contrast, Figure 7b illustrates that gear jamming in the drive shaft induces sharp transient spikes and significant amplitude fluctuations, indicative of high-frequency impulse components caused by mechanical impact. In Figure 7c, the loosening of the drive shaft screw results in an overall increase in vibration amplitude; however, the observed peaks are less pronounced than those in the gear jamming scenario. Figure 7d reveals that cam plate looseness combined with contact head wear leads to waveform irregularities, the presence of sharp peaks, and substantial amplitude variations, reflecting persistent fault characteristics throughout the switching process. The comparative waveform analysis indicates that both the amplitude and duration of vibration signals are considerably greater under fault conditions than in the normal state. To further quantify these distinctions, Table 4 presents the peak value, root mean square (RMS), and dominant frequency band associated with each operating condition [41].
As observed from Table 4, all fault conditions exhibit significantly higher root mean square (RMS) and peak values compared with the normal operating condition. In addition, the dominant frequency band demonstrates an upward shift under fault scenarios. These results suggest that the evolution of mechanical faults leads to an increased vibration intensity and the presence of higher-frequency components, thereby highlighting the distinct dynamic features corresponding to various fault categories.

4. Experimental Data Analysis

4.1. GOA-FMD-Based Feature Extraction and Results Analysis

In this study, the maximum number of iterations for FMD was set to 10. The parameters of the GOA were configured as follows: the velocity factor S was set to 0.88, the predation success rate PSRs was 0.34, the initial population size was 20, and the maximum number of GOA iterations was set to 80. The parameter search range and step size for FMD are summarized in Table 5, where the constraint N ≤ K is satisfied to ensure decomposition feasibility.
Envelope entropy served as the fitness function during GOA optimization, guiding the search towards the global optimum. The convergence curve in Figure 8 shows that the fitness value declines gradually with successive iterations and eventually settles at a steady value, indicating that the algorithm approaches the optimal solution. This behavior confirms both the convergence and the effectiveness of the proposed optimization strategy.
The vibration signal was decomposed using FMD, and multiple IMFs were extracted. Taking the signal under normal operating conditions as an example, its time domain and frequency domain representations can be seen in Figure 9. It is evident that the frequency bands of the individual IMFs have been effectively separated, with clearly defined boundaries and no mode aliasing. These observations confirm that the decomposition achieved a satisfactory performance under the current parameter configuration.
To ensure both computational efficiency and diagnostic accuracy, only the first six IMFs were retained for feature extraction. As shown in Figure 10, the cumulative energy contribution rate of the first six IMFs exceeds 99%, which means that these components preserve almost all the meaningful vibration information while eliminating redundant or noise-dominated higher-order IMFs.
For each decomposed IMF, four representative time domain statistical indicators—peak factor, impulse factor, waveform factor, and margin factor—were calculated to characterize vibration features. Table 6 presents the results using IMF3 as a representative case.
Under fault conditions, all four time domain indicators generally exhibit higher values compared with the normal operating condition. Notably, the peak factor values for the “Loose Screw in Drive Shaft” and “Arc Plate Looseness and Contact Wear” cases both exceed 23, indicating the presence of strong impact components in the vibration signal. The margin factor also shows a significant increase, reflecting greater deviation from steady-state operation and a higher likelihood of transient disturbances. These four time domain features, extracted from each of the six IMFs—resulting in a 24-dimensional feature vector—are subsequently used as input for the Transformer-based fault classification model.

4.2. Fault Classification Model Construction and Performance Comparison

In this study, vibration signals associated with OLTC contact operation were collected under four distinct conditions: (1) normal operation, (2) fault 1, gear jamming in the drive shaft, (3) fault 2, loose screw in the drive shaft, (4) fault 3, arc plate looseness and contact wear. For each condition, 150 signal samples were acquired, leading to a cumulative total of 600 samples. Among them, 120 samples per class were assigned to the training dataset, while the remaining 30 were used for testing, resulting in 480 training instances and 120 testing instances overall.
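The per-class partition described above can be sketched as a stratified split. The text does not state how samples were assigned to each subset, so the seeded random shuffle below is an assumption, and the sample identifiers are placeholders.

```python
import random

CLASSES = ["normal", "fault 1", "fault 2", "fault 3"]


def stratified_split(samples_by_class, n_train=120, seed=0):
    """Shuffle each class independently and cut it into train/test parts."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        train += [(s, label) for s in shuffled[:n_train]]
        test += [(s, label) for s in shuffled[n_train:]]
    return train, test


# Placeholder sample IDs stand in for the 150 recorded signals per condition.
data = {label: [f"{label}-{i}" for i in range(150)] for label in CLASSES}
train_set, test_set = stratified_split(data)
print(len(train_set), len(test_set))
```

Splitting within each class keeps the test set balanced at 30 samples per condition, so per-class and overall accuracies remain directly comparable.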

4.2.1. Comparison of Feature Extraction Performance Across Signal Decomposition Methods

To validate the effectiveness of FMD, a comparative analysis was conducted using three benchmark signal decomposition methods: Empirical Mode Decomposition (EMD), Ensemble Empirical Mode Decomposition (EEMD), and Variational Mode Decomposition (VMD). GOA was employed to optimize the key parameters of each decomposition technique, and, to ensure a fair comparison, the GOA settings were held constant across all methods. The optimized parameter configurations are summarized in Table 7. The baseline decomposition methods (EMD, EEMD, and VMD) were re-implemented by the authors based on the algorithmic descriptions and parameter settings reported in the relevant literature [9,11,42].
The OLTC vibration signals were decomposed using different signal decomposition methods (FMD, VMD, EMD, and EEMD). For each method, the same set of time domain features was extracted and used as the input to an identical Transformer-based classification model. To evaluate the fault identification performance of each decomposition technique, classification accuracy and confusion matrices were analyzed under consistent model configurations. The comparative classification results are presented in Figure 11 and summarized numerically in Table 8.
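Because each class contributes 30 test samples, the overall accuracy in Table 8 is the pooled fraction of correctly classified samples. The short sketch below recovers the correct-sample counts from the per-class percentages in Table 8 and recomputes the overall figures; the helper names are hypothetical.

```python
# Per-class test accuracies (%) from Table 8: Normal, Fault 1, Fault 2, Fault 3.
TABLE_8 = {
    "GOA-FMD": [100.00, 90.00, 93.33, 100.00],
    "GOA-VMD": [93.33, 76.67, 80.00, 96.67],
    "GOA-EMD": [80.00, 73.33, 70.00, 73.33],
    "GOA-EEMD": [90.00, 70.00, 80.00, 80.00],
}


def correct_counts(per_class_pct, n_per_class=30):
    """Convert rounded per-class percentages back to whole correct-sample counts."""
    return [round(p / 100 * n_per_class) for p in per_class_pct]


def overall_accuracy_pct(per_class_pct, n_per_class=30):
    """Pooled accuracy over all classes, in percent."""
    counts = correct_counts(per_class_pct, n_per_class)
    return 100 * sum(counts) / (n_per_class * len(counts))


for method, row in TABLE_8.items():
    print(method, correct_counts(row), round(overall_accuracy_pct(row), 2))
```

For GOA-FMD, for example, 30 + 27 + 28 + 30 = 115 of 120 test samples are classified correctly, which gives 95.83%.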
Among the tested decomposition methods, GOA-FMD achieved the highest classification performance, with an overall accuracy of 95.83%. Notably, it reached 100% accuracy under both the normal operating condition and fault 3 (arc plate looseness and contact wear). In comparison, the GOA-VMD approach attained an overall accuracy of 86.67% but exhibited noticeable performance degradation in identifying fault 1 (gear jamming), at 76.67%. The GOA-EMD and GOA-EEMD methods yielded lower overall accuracies of 74.17% and 80.00%, respectively. These results suggest that GOA-FMD offers superior classification accuracy and robustness for OLTC vibration signal analysis, making it more suitable for practical fault identification applications. A five-fold cross-validation was performed for all methods to ensure a consistent and unbiased performance evaluation, as shown in Figure 12.
As shown in Figure 12, the GOA-FMD method achieved the best performance under five-fold cross-validation, with an average classification accuracy of 96.7%. This result highlights its effectiveness in suppressing mode mixing and enhancing both feature extraction and fault recognition accuracy. The GOA-VMD method attained an average accuracy of 87.3%, outperforming both GOA-EMD (73.5%) and GOA-EEMD (79.7%). However, it still exhibited noticeable fluctuations across folds, indicating limited adaptability to varying signal characteristics. In contrast, the EMD-based methods suffered from considerable drops in accuracy in certain folds, suggesting a relatively weaker capability in handling signal interference and preserving fault-relevant features. In summary, the GOA-FMD framework demonstrates a superior classification performance, enhanced stability, and strong generalization ability, rendering it particularly effective for diagnosing OLTC faults when only limited data is available.
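The five-fold protocol can be sketched as below. This is a generic k-fold partition under the assumption that all 600 samples are shuffled once and divided into five equal folds; the study does not specify whether the folds were stratified or which samples they covered, and the function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def mean_accuracy(fold_accuracies):
    """Average of the per-fold accuracies, as reported for cross-validation."""
    return sum(fold_accuracies) / len(fold_accuracies)


splits = list(k_fold_indices(600, k=5))
print([len(val) for _, val in splits])
```

Each model would be retrained on the training indices of every split and scored on the held-out fold, and the reported average is the mean of the five fold accuracies.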

4.2.2. Performance Comparison of Classification Models in Fault Identification

To gain deeper insight into the effectiveness of the Transformer model for diagnosing OLTC faults, this study conducted comparative experiments with several classic classification models, including SVM, RF, and MLP. The goal of the comparison was to assess the classification accuracy and robustness of each model when trained and tested with the same feature inputs and experimental settings. These baseline models were reproduced by the authors based on established methods reported in prior studies [18,20,43]. The hyperparameter settings for all baseline models, detailed in Table 9, were selected based on commonly accepted practices in the literature and fine-tuned through preliminary experiments.
For consistency and a fair comparison, all classification models were trained using the same input features—namely, the 24-dimensional time domain feature vectors (comprising four statistical indicators extracted from six IMFs) obtained via GOA-FMD signal decomposition. The corresponding classification performance of each model is summarized in Table 10 and visualized in Figure 13.
According to Table 10, the Transformer model delivered the best performance in OLTC fault classification, achieving an overall accuracy of 95.83%. It reached 100% accuracy in identifying both the normal condition and fault 3 (arc plate looseness and contact wear). In comparison, the MLP model attained an overall accuracy of 87.50%, but its accuracy for fault 2 (loose screw in drive shaft) dropped to 80.00%. The Random Forest (RF) model yielded an overall accuracy of 80.83%, with relatively poor classification results for fault 1 and fault 2 (both at 76.67%). The Support Vector Machine (SVM) model achieved an overall accuracy of 83.33%: it performed well on the normal condition (96.67%) but reached only 76.67–80.00% on the three fault classes. These results suggest that the Transformer model exhibits superior capability in capturing global and discriminative features from complex vibration signals. Its enhanced representation power makes it more suitable for high-precision fault classification tasks, particularly in the context of OLTC diagnostics.
As shown in Figure 14, the Transformer model obtained the highest average classification accuracy, 96.8%, across the five cross-validation folds, outperforming all benchmark models. This result highlights its strong ability to capture both the temporal dynamics and global representations inherent in OLTC vibration signals, thereby enhancing diagnostic precision. In contrast, the MLP model achieved an average accuracy of 89.5%, while the Random Forest (RF) and Support Vector Machine (SVM) models obtained 81.1% and 86.6%, respectively. All three models fell short of the Transformer’s performance in terms of both accuracy and stability. These findings support the superior representational capacity and generalization ability of the Transformer model and its effectiveness as a high-precision diagnostic framework for complex OLTC fault classification tasks.
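For reference, the attention operation underlying the Transformer classifier is the scaled dot-product attention of Vaswani et al. [21]. Assuming that the total attention dimension of 128 listed in Table 3 is split evenly across the four heads (an assumption, since the per-head size is not stated), each head would operate with $d_k = 128/4 = 32$. Here $Q$, $K$, and $V$ denote the query, key, and value matrices; $K$ in this expression is unrelated to the number of FMD filters.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,
\qquad
\mathrm{head}_i = \mathrm{Attention}\!\left(Q W_i^{Q},\, K W_i^{K},\, V W_i^{V}\right),
\qquad
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_4)\, W^{O}.
```

Because every position attends to every other position in a single step, features drawn from different IMFs can be weighted against one another directly, which is the sense in which the model captures global dependencies.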

5. Conclusions

This study investigates the mechanical fault diagnosis problem of the KM-type OLTC and proposes a novel classification framework that integrates GOA-FMD with a Transformer-based deep learning model. By incorporating vibration signal analysis, robust feature extraction, and advanced classification techniques, the proposed approach aims to enhance diagnostic accuracy and equipment reliability. The core conclusions drawn from the study are listed below:
1. A comprehensive experimental setup was established to simulate both normal and faulty OLTC operating conditions. Three representative mechanical fault types were introduced: fault 1—gear jamming in the drive shaft; fault 2—loose screw in the drive shaft; and fault 3—arc plate looseness and contact wear. A total of 600 vibration signal samples were acquired, providing a reliable data foundation for subsequent signal characterization and model training.
2. The vibration signals were decomposed using the GOA-optimized FMD method, in which key parameters—including the number of filters K, filter length L, and the mode number N—were adaptively optimized. This approach effectively suppressed mode aliasing and improved the separability of signal components. Subsequently, four time domain features (peak factor, impulse factor, waveform factor, and margin factor) were extracted from each IMF to construct a discriminative feature vector.
3. Comparative experiments among multiple decomposition methods (GOA-VMD, GOA-EMD, and GOA-EEMD) demonstrated that GOA-FMD achieved the best signal decomposition and classification performance. It obtained the highest average classification accuracy of 96.7%, significantly outperforming GOA-VMD (87.3%), GOA-EMD (73.5%), and GOA-EEMD (79.7%). These results confirm the superior fault feature preservation and anti-noise capability of GOA-FMD.
4. To perform fault classification, a Transformer model was adopted, which takes advantage of attention-based processing to extract global and long-range features from the input data. A comparative evaluation with classic models (SVM, RF, and MLP) revealed that the Transformer achieved the best diagnostic performance both on the test set (95.83%) and in the five-fold cross-validation (96.8%), demonstrating a superior generalization ability and robustness under varying data conditions.
In conclusion, the proposed GOA-FMD + Transformer framework provides a feasible and scalable solution for intelligent condition monitoring and fault diagnosis in complex power equipment. The core innovation of this study lies in employing the GOA to adaptively optimize the key parameters of FMD, thereby achieving a superior signal decomposition performance. In addition, a Transformer-based classifier is introduced to enhance fault identification by leveraging its capability to model long-range dependencies in vibration signals. The integrated approach offers both a high diagnostic accuracy and practical engineering applicability. However, this study is limited to three predefined mechanical fault types. Further research is required to extend the approach to more diverse and complex fault scenarios to enhance its generalizability and deployment in broader real-world settings.

Author Contributions

Conceptualization, R.W. and Z.C.; Methodology, R.W. and D.L.; Software, Q.W. and F.J.; Validation, Z.C., Y.D., and F.J.; Formal analysis, R.W.; Investigation, Q.W.; Resources, Y.D. and H.W.; Data curation, H.W.; Writing—original draft preparation, D.L.; Writing—review and editing, Z.C. and X.W.; Visualization, F.J.; Supervision, X.W.; Project administration, X.W.; Funding acquisition, R.W. and Z.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Yunnan Province “Xingdian Talent Support Program” Chief Technician Special Fund and the China Southern Power Grid Science and Technology Project (YNKJXM20230542).

Data Availability Statement

The data that support the findings of this study are not publicly available due to privacy concerns and legal restrictions, but can be provided by the authors upon reasonable request.

Acknowledgments

The authors gratefully acknowledge Kunming Power Supply Bureau of Yunnan Electric Grid Co., Ltd. for its support in supplying the experimental samples and data used in this study. Appreciation is also extended to Shandong University for its valuable technical guidance and assistance throughout the research process.

Conflicts of Interest

The authors Ruifeng Wei, Zhenjiang Chen, Qingbo Wang, Yongsheng Duan, Hui Wang, and Feiming Jiang are affiliated with Kunming Power Supply Bureau, Yunnan Electric Grid Co., Ltd. No other authors report any commercial or financial ties that might be interpreted as potential conflicts of interest. This research received funding from Kunming Power Supply Bureau of Yunnan Electric Grid Co., Ltd. The funding organization was involved in the provision of experimental data and supporting documents, participated in the investigation and formal analysis, and contributed to the preparation and technical validation of the original manuscript draft.

Abbreviations

The following abbreviations are used in this manuscript:
OLTC: On-Load Tap Changer
EMD: Empirical Mode Decomposition
VMD: Variational Mode Decomposition
FMD: Feature Mode Decomposition
RF: Random Forest
MLP: Multilayer Perceptron
SVM: Support Vector Machine
EEMD: Ensemble Empirical Mode Decomposition
GOA: Gazelle Optimization Algorithm
PF: Peak Factor
IF: Impulse Factor
WF: Waveform Factor
MF: Margin Factor
IMF: Intrinsic Mode Function

References

  1. Geng, J.; Zhang, Z.; Wang, X.; Gao, S.; Wang, P. On-load tap-changer fault mode recognition based on the singular value of Hilbert energy spectrum time-frequency matrix and spectrum entropy. IET Gener. Transm. Distrib. 2022, 16, 3256–3266.
  2. Ismail, F.B.; Mazwan, M.; Al-Faiz, H.; Marsadek, M.; Hasini, H.; Al-Bazi, A.; Yang Ghazali, Y.Z. An offline and online approach to the OLTC condition monitoring: A review. Energies 2022, 15, 6435.
  3. Secic, A.; Krpan, M.; Kuzle, I. Vibro-acoustic methods in the condition assessment of power transformers: A survey. IEEE Access 2019, 7, 83915–83931.
  4. Nadolny, Z. Design and Optimization of Power Transformer Diagnostics. Energies 2023, 16, 6466.
  5. Yang, R.; Zhang, D.; Li, Z.; Yang, K.; Mo, S.; Li, L. Mechanical fault diagnostics of power transformer on-load tap changers using dynamic time warping. IEEE Trans. Instrum. Meas. 2019, 68, 3119–3127.
  6. Wang, S.; Hong, Z.; Min, Q.; Zou, D.; Zhao, Y.; Qi, R.; Zhao, T. Diagnosis of Power Transformer On-Load Tap Changer Mechanical Faults Based on SABO-Optimized TVFEMD and TCN-GRU Hybrid Network. Energies 2025, 18, 2934.
  7. Shi, Y.; Ruan, Y.; Li, L.; Zhang, B.; Huang, Y.; Xia, M.; Yuan, K.; Luo, Z.; Lu, S. A Mechanical Fault Identification Method for On-Load Tap Changers Based on Hybrid Time–Frequency Graphs of Vibration Signals and DSCNN-SVM with Small Sample Sizes. Vibration 2024, 7, 970–986.
  8. Xu, Y.; Chen, B.; Ma, H.; Xu, H.; Wang, L.; Wang, C. Vibration signal feature extraction method of the onload tap changer based on EMDPSD. J. Electr. Power Sci. Technol. 2021, 35, 3–10.
  9. Huang, N.E. New method for nonlinear and nonstationary time series analysis: Empirical mode decomposition and Hilbert spectral analysis. In Wavelet Applications VII; SPIE: Bellingham, WA, USA, 2000; pp. 197–209.
  10. Liu, J.; Wang, G.; Zhao, T.; Zhang, L. Fault diagnosis of on-load tap-changer based on variational mode decomposition and relevance vector machine. Energies 2017, 10, 946.
  11. Dragomiretskiy, K.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. 2013, 62, 531–544.
  12. Lei, Y.; Lin, J.; He, Z.; Zuo, M.J. A review on empirical mode decomposition in fault diagnosis of rotating machinery. Mech. Syst. Signal Process. 2013, 35, 108–126.
  13. Miao, Y.; Zhang, B.; Li, C.; Lin, J.; Zhang, D. Feature mode decomposition: New decomposition theory for rotating machinery fault diagnosis. IEEE Trans. Ind. Electron. 2022, 70, 1949–1960.
  14. Agushaka, J.O.; Ezugwu, A.E.; Abualigah, L. Gazelle optimization algorithm: A novel nature-inspired metaheuristic optimizer. Neural Comput. Appl. 2023, 35, 4099–4131.
  15. Li, Z.; Zhou, Z.; Zhou, X. A sensor-based modified FMD method to identify fault feature for mechanical fault diagnosis of ship-borne antennae. IEEE Access 2023, 11, 40018–40028.
  16. Duan, X.; Zhao, T.; Li, T.; Liu, J.; Zou, L.; Zhang, L. Method for diagnosis of on-load tap changer based on wavelet theory and support vector machine. J. Eng. 2017, 2017, 2193–2197.
  17. Peterson, L.E. K-nearest neighbor. Scholarpedia 2009, 4, 1883.
  18. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  19. Fernandes, M.; Corchado, J.M.; Marreiros, G. Machine learning techniques applied to mechanical fault diagnosis and fault prognosis in the context of real industrial manufacturing use-cases: A systematic literature review. Appl. Intell. 2022, 52, 14246–14280.
  20. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J. Mlp-mixer: An all-mlp architecture for vision. Adv. Neural Inf. Process. Syst. 2021, 34, 24261–24272.
  21. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
  22. Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A survey of transformers. AI Open 2022, 3, 111–132.
  23. Xu, Z.; Guan, H.; Kang, J.; Lei, X.; Ma, L.; Yu, Y.; Chen, Y.; Li, J. Pavement crack detection from CCD images with a locally enhanced transformer network. Int. J. Appl. Earth Obs. Geoinf. 2022, 110, 102825.
  24. He, X.; Zhou, X.; Li, J.; Mechefske, C.K.; Wang, R.; Yao, G.; Liu, Q. Adaptive feature mode decomposition: A fault-oriented vibration signal decomposition method for identification of multiple localized faults in rotating machinery. Nonlinear Dyn. 2023, 111, 16237–16270.
  25. Li, S.; Dou, L.; Li, H.; Li, Z.; Kang, Y. An Innovative Electromechanical Joint Approach for Contact Pair Fault Diagnosis of Oil-Immersed On-Load Tap Changer. Electronics 2023, 12, 3573.
  26. Meng, T.; Jiang, X.; Song, Q.; Hu, Z.; Guo, J.; Zhu, Z. Modified Feature Mode Decomposition Guided by Spectral Structure Information for Machinery Fault Diagnosis. IEEE Trans. Instrum. Meas. 2024, 73, 1–10.
  27. Qin, S.; Zeng, H.; Sun, W.; Wu, J.; Yang, J. Multi-strategy improved particle swarm optimization algorithm and gazelle optimization algorithm and application. Electronics 2024, 13, 1580.
  28. Chauhan, S.; Vashishtha, G.; Kumar, R.; Zimroz, R.; Gupta, M.K.; Kundu, P. An adaptive feature mode decomposition based on a novel health indicator for bearing fault diagnosis. Measurement 2024, 226, 114191.
  29. Zhou, C.; Xiong, Z.; Bai, H.; Xing, L.; Jia, Y.; Yuan, X. Parameter-adaptive TVF-EMD feature extraction method based on improved GOA. Sensors 2022, 22, 7195.
  30. Chen, Z.; Yang, Y.; He, C.; Liu, Y.; Liu, X.; Cao, Z. Feature extraction based on hierarchical improved envelope spectrum entropy for rolling bearing fault diagnosis. IEEE Trans. Instrum. Meas. 2023, 72, 1–12.
  31. Yang, Y.; Liu, H.; Han, L.; Gao, P. A feature extraction method using VMD and improved envelope spectrum entropy for rolling bearing fault diagnosis. IEEE Sensors J. 2023, 23, 3848–3858.
  32. Yang, Y.; Zhang, Y.; Zeng, Q. Research on coal gangue recognition based on multi-layer time domain feature processing and recognition features cross-optimal fusion. Measurement 2022, 204, 112169.
  33. Yan, X.; Jia, M. Bearing fault diagnosis via a parameter-optimized feature mode decomposition. Measurement 2022, 203, 112016.
  34. Zhang, X.; Li, L.; Liu, S.; Lei, J. Empirical wavelet transform based on energy peak location and its application in weak bearing fault diagnosis. J. Xi’an Jiaotong Univ. 2021, 55, 1–8.
  35. Rivas, E.; Burgos, J.C.; Garcia-Prada, J.C. Vibration analysis using envelope wavelet for detecting faults in the OLTC tap selector. IEEE Trans. Power Deliv. 2010, 25, 1629–1636.
  36. Hua, M.; Yan, K.; Li, X. A Transformer-based self-supervised learning model for fault diagnosis of air-conditioning systems with limited labeled data. Eng. Appl. Artif. Intell. 2025, 146, 110331.
  37. Trujillo-Guerrero, M.F.; Román-Niemes, S.; Jaén-Vargas, M.; Cadiz, A.; Fonseca, R.; Serrano-Olmedo, J.J. Accuracy comparison of CNN, LSTM, and transformer for activity recognition using IMU and visual markers. IEEE Access 2023, 11, 106650–106669.
  38. Cichoń, A.; Włodarz, M. OLTC Fault detection based on acoustic emission and supported by machine learning. Energies 2023, 17, 220.
  39. Ezzaidi, H.; Fofana, I.; Picher, P.; Gauvin, M. On the Feasibility of Detecting Faults and Irregularities in On-Load Tap Changers (OLTCs) by Vibroacoustic Signal Analysis. Sensors 2024, 24, 7960.
  40. Shannon, C.E. Communication in the presence of noise. Proc. IRE 2006, 37, 10–21.
  41. Shang, R.; Peng, C.; Fang, R. A segmented preprocessing method for the vibration signal of an on-load tap changer. Electronics 2021, 10, 131.
  42. Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
  43. Sun, J.; Xiao, Q.; Wen, J.; Wang, F. Natural gas pipeline small leakage feature extraction and recognition based on LMD envelope spectrum entropy and SVM. Measurement 2014, 55, 434–443.
Figure 1. Flowchart of FMD process.
Figure 2. Flowchart of the GOA-optimized FMD.
Figure 3. Transformer-based architecture.
Figure 4. Flowchart of OLTC mechanical fault diagnosis.
Figure 5. Accelerometer placement on the KM-type OLTC for vibration monitoring.
Figure 6. Three typical mechanical fault types simulated on the KM-type OLTC: (a) gear jamming in the drive shaft; (b) loose screw in the drive shaft; (c) arc plate looseness and contact wear.
Figure 7. Time domain waveforms of OLTC contact operation under different conditions: (a) normal condition; (b) gear jamming in drive shaft; (c) loose screw in drive shaft; (d) arc plate looseness and contact wear.
Figure 8. Convergence curve of the fitness function value (envelope entropy) during GOA-based optimization of FMD parameters.
Figure 9. Time and frequency domain representations of IMF components extracted by FMD under normal operating conditions.
Figure 10. Energy contribution rate and cumulative contribution rate of each IMF component obtained via FMD.
Figure 11. Confusion matrices comparing diagnostic results using different GOA-optimized signal decomposition methods: (a) GOA-FMD; (b) GOA-VMD; (c) GOA-EMD; and (d) GOA-EEMD. The classification results are based on the same Transformer model.
Figure 12. Five-fold cross-validation accuracy of Transformer-based fault classification using different GOA-optimized decomposition methods.
Figure 13. Confusion matrices of fault classification results using different models based on GOA-FMD features: (a) Transformer classifier; (b) MLP classifier; (c) RF classifier; (d) SVM classifier.
Figure 14. Five-fold cross-validation results of different classification models based on GOA-FMD features.
Table 1. Comparison of representative signal decomposition methods for OLTC vibration analysis.
| Method | Reference | Advantages | Limitations |
| --- | --- | --- | --- |
| EMD | Xu et al. [8,9] | Simple implementation; captures nonlinear and non-stationary signals | Mode mixing; boundary effect; sensitive to extrema; may miss key features |
| VMD | Liu et al. [10,11,12] | Theoretically sound; decomposes into narrowband modes in frequency domain | Inadequate in capturing transient/impulse features of mechanical faults |
| FMD | Miao et al. [13,15] | Retains both periodic and impulsive features; noise-resilient | Highly sensitive to parameter settings; requires fine-tuned configuration |
| GOA-FMD | This study | Automatically optimized parameters via GOA; adaptive and accurate decomposition | None reported in this study |
Table 2. Comparison of classification models for OLTC mechanical fault diagnosis.
| Method | Model Type | Advantages | Limitations |
| --- | --- | --- | --- |
| SVM [16] | Machine Learning | Effective in small datasets; good at binary classification | Requires manual feature extraction; poor scalability in high-dimensional data |
| KNN [17] | Machine Learning | Simple implementation; intuitive | Sensitive to noise; high computational cost for large datasets |
| RF [18] | Machine Learning | Handles nonlinearity; robust to overfitting | Feature engineering needed; lacks interpretability |
| MLP [20] | Deep Learning | Learns nonlinear mappings; handles fixed-size input vectors | Relies on handcrafted features; poor at capturing temporal/spatial structure |
| Transformer [21,22,23] | Deep Learning | Captures long-range dependencies; automatic feature integration; highly generalizable | High training cost |
Table 3. Hyperparameter settings of the Transformer model used for OLTC fault classification.
| Parameter Name | Value/Type |
| --- | --- |
| Input Feature Dimension | 24 |
| Learning Rate | 0.002 |
| Number of Attention Heads | 4 |
| Total Attention Dimension | 128 |
| Optimizer | Adam |
| Number of Attention Layers | 2 |
| Feedforward Hidden Dimension | 64 |
| Output Category Dimension | 4 |
| Dropout Rate | 0.1 |
| Loss Function | Cross-Entropy |
| Classification Function | Softmax |
| Maximum Training Epochs | 150 |
Table 4. Comparison of temporal and spectral attributes of vibration signals.
| Condition | Peak Value (m/s²) | RMS (m/s²) | Dominant Frequency Band (Hz) |
| --- | --- | --- | --- |
| Normal Condition | 4.05 | 0.65 | 630–1760 |
| Gear Jamming in Drive Shaft | 7.58 | 1.10 | 1210–2265 |
| Loose Screw in Drive Shaft | 8.15 | 1.03 | 1235–2365 |
| Arc Plate Looseness and Contact Wear | 9.46 | 1.11 | 1225–2180 |
Table 5. Parameter search space, step size, and optimal values for GOA-optimized FMD.
| Parameter Name | Search Range | Step Size | Optimal Value |
| --- | --- | --- | --- |
| Mode Number N | [3, 10] | 1 | 6 |
| Filter Length L | [10, 100] | 2 | 32 |
| Number of Filters K | [5, 20] | 1 | 12 |
Table 6. Comparative analysis of statistical time domain indicators extracted from vibration signals under various OLTC operating conditions.
| Condition | PF | IF | WF | MF |
| --- | --- | --- | --- | --- |
| Normal Condition | 17.94359 | 52.11423 | 2.64016 | 155.10672 |
| Gear Jamming in Drive Shaft | 21.81183 | 68.66109 | 3.14788 | 180.01859 |
| Loose Screw in Drive Shaft | 23.72575 | 63.28335 | 2.66728 | 170.87625 |
| Arc Plate Looseness and Contact Wear | 23.95823 | 63.25377 | 2.90436 | 161.58832 |
Table 7. Parameter names, search ranges, and GOA-optimized values for each decomposition method (VMD, EMD, and EEMD).
| Decomposition Method | Parameter Name | Search Range | Optimal Value |
| --- | --- | --- | --- |
| VMD | Number of Modes | [2, 10] | 6 |
| VMD | Penalty Factor | [100, 5000] | 2621 |
| EMD | Stopping Criterion | [1 × 10⁻⁶, 1 × 10⁻²] | 1 × 10⁻⁴ |
| EMD | Maximum Number of IMFs Allowed | [3, 10] | 6 |
| EEMD | Noise Amplitude | [0.1, 0.5] | 0.2 |
| EEMD | Number of Ensemble Trials (Additive Noise) | [50, 500] | 200 |
Table 8. Classification accuracy of OLTC fault diagnosis using different GOA-optimized signal decomposition methods with Transformer-based classification.
| Decomposition Method | Normal | Fault 1 | Fault 2 | Fault 3 | Overall |
| --- | --- | --- | --- | --- | --- |
| GOA-FMD | 100% | 90.00% | 93.33% | 100% | 95.83% |
| GOA-VMD | 93.33% | 76.67% | 80.00% | 96.67% | 86.67% |
| GOA-EMD | 80.00% | 73.33% | 70.00% | 73.33% | 74.17% |
| GOA-EEMD | 90.00% | 70.00% | 80.00% | 80.00% | 80.00% |
Table 9. Hyperparameter configurations of comparative classification models (SVM, RF, and MLP).
| Classification Model | Parameter Name | Value/Type |
| --- | --- | --- |
| SVM | Penalty Coefficient | 0.0009 |
| SVM | Kernel Function | Linear Kernel |
| RF | Minimum Samples Split | 5 |
| RF | Number of Trees | 100 |
| RF | Maximum Depth | 10 |
| MLP | Hidden Layer Size | (128, 64) |
| MLP | Activation Function | ReLU |
| MLP | Learning Rate | 0.001 |
| MLP | Optimizer | Adam |
Table 10. Classification accuracy of different models on the test set using GOA-FMD features.
| Classification Model | Normal | Fault 1 | Fault 2 | Fault 3 | Overall |
| --- | --- | --- | --- | --- | --- |
| Transformer | 100% | 90.00% | 93.33% | 100% | 95.83% |
| MLP | 90.00% | 90.00% | 80.00% | 90.00% | 87.50% |
| RF | 86.67% | 76.67% | 76.67% | 83.33% | 80.83% |
| SVM | 96.67% | 76.67% | 80.00% | 80.00% | 83.33% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wei, R.; Chen, Z.; Wang, Q.; Duan, Y.; Wang, H.; Jiang, F.; Liu, D.; Wang, X. A Mechanical Fault Diagnosis Method for On-Load Tap Changers Based on GOA-Optimized FMD and Transformer. Energies 2025, 18, 3848. https://doi.org/10.3390/en18143848
