1. Introduction
With the rapid development and large-scale integration of renewable energy, multi-energy complementary systems have emerged as an important approach in modern power system evolution [1]. However, this architecture also presents considerable challenges, particularly power quality disturbances caused by widespread power electronic integration and changing load characteristics [2]. Power quality disturbances typically manifest as deviations in voltage, current, or frequency from nominal values [3]. These disturbances are varied, ranging from isolated events to complex disturbance combinations [4]. On the power source side, large-scale integration of power electronic devices, such as distributed photovoltaics and energy storage, causes serious issues, including harmonics and voltage fluctuations [5,6]. On the load side, emerging high-energy-consuming industries, such as Power-to-Hydrogen, have rendered large-scale rectifiers major sources of grid harmonics and disturbances [7]. These disturbances threaten grid stability, cause distributed source disconnections, damage equipment, interrupt production lines, and result in substantial economic losses [8]. Therefore, efficient and accurate power quality disturbance detection technologies are essential for stable smart grid operation and reliable modern industrial systems.
The study of power quality disturbance identification methods primarily involves two components: feature extraction and disturbance classification. For feature extraction, signal processing techniques, including the Fourier Transform (FT) [9], Short-Time Fourier Transform (STFT) [10], Continuous Wavelet Transform (CWT) [11,12], and S-transform (ST) [13,14,15,16,17], are widely applied. While the traditional FT analyzes signal spectra and harmonics, it struggles with non-stationary power quality disturbances due to inherent limitations [9]. Although windowing techniques partially improve FT performance for non-stationary signals, the fixed resolution imposed by the window function remains restrictive [10]. The Continuous Wavelet Transform excels at analyzing non-stationary signals with abrupt changes but suffers from poor noise immunity and high computational complexity [11,12]. In contrast, the S-transform employs a Gaussian window whose width scales inversely with frequency. This approach overcomes fixed-window limitations and provides superior noise suppression, offering clear benefits for power quality disturbance feature extraction [14]. However, a single window function in the S-transform cannot simultaneously achieve optimal resolution in both the time and frequency domains. To address this limitation, Liu et al. proposed time–frequency segmentation with dynamic correction of the S-transform window function, enabling accurate power quality disturbance localization [15]. Subsequent studies further enhanced the S-transform by integrating singular value decomposition (SVD) to improve feature extraction performance [16]. Moreover, feature characterization was improved by directly matching the Gaussian window to the signal spectrum, accelerating the S-transform optimization process [17]. However, although these advanced time–frequency methods generate high-dimensional feature sets with rich information, they inevitably introduce substantial feature redundancy.
In the study of disturbance classification, classical methods primarily include decision trees (DT) [18], partial least squares (PLS) [19,20], extreme learning machines (ELM) [21,22], and random forests (RF) [23,24,25]. Duc et al. modeled relationships among power quality efficiency, safety, and reliability using partial least squares structural equation modeling (SEM), with artificial neural networks (ANN) employed for analysis [20]. Subudhi optimized extreme learning machine parameters using a Grey Wolf Optimizer (GWO) to categorize seventeen power quality disturbance events [21]; experimental results demonstrated accurate detection and classification performance. Random forests have attracted increasing attention due to their high classification performance and parallelized training. Ravi et al. implemented power quality disturbance classification by constructing a random forest based on decision trees and recursive feature importance partitioning [23]. Jampana further introduced gradient boosting to select key features, enhancing the random forest's ability to recognize complex grid disturbances [24]. Ismael combined continuous wavelet transform time–frequency localization with random forest classifiers for effective classification [25]. Jiang et al. detected principal frequency points using an iterative circular filtering envelope extremum algorithm and optimized Gaussian window parameters for resolution enhancement [26]. However, random forest classification performance strongly depends on input feature quality, and high-dimensional transformed features introduce redundancy and substantial computational costs. To address these challenges, research has shifted toward multi-dimensional feature fusion. A mainstream approach converts signals into images and applies deep learning techniques [27]. Li et al. integrated statistical features with convolutional neural network (CNN)-based image features [28]. Liu et al. and subsequent studies fused one-dimensional signals with two-dimensional images generated using the continuous wavelet transform or Gramian angular field (GAF), processing these representations with CNN or transfer learning models [29,30]. Another research direction focuses on multi-sensor data fusion and graphical analysis. Izadi et al. proposed a synchronized Lissajous figure method, converting multi-sensor waveform differences into images for CNN-based identification, demonstrating strong robustness [31,32].
In summary, advanced time–frequency analysis methods generate high-dimensional feature sets with comprehensive signal characterization but inevitably introduce substantial redundancy. The performance of classifiers such as RF largely depends on input feature quality; when processing high-dimensional features derived from signal transformations, redundancy reduces classification accuracy and increases computational burden. Meanwhile, deep learning methods based on image fusion exhibit strong feature learning capability but suffer from high computational complexity, long training times, and strict hardware requirements. Consequently, neither traditional feature extraction–classification approaches nor deep learning-based image fusion methods effectively balance performance, efficiency, and engineering practicality. To address these limitations, this paper proposes a multi-scale power quality disturbance (PQD) identification model based on time–frequency feature fusion. The proposed model forgoes image-based recognition and complex deep learning architectures; by integrating time-domain and frequency-domain numerical features, it constructs a lightweight and efficient recognition framework. The Maximum Relevance Minimum Redundancy (mRMR) algorithm is incorporated into the feature extraction–classification workflow: by selecting an optimal feature subset with high discriminative ability and low redundancy, the model reduces training time and computational overhead. To validate effectiveness and robustness, experiments were conducted on a dataset containing twenty-one typical power quality disturbance types, with Gaussian white noise at varying signal-to-noise ratios and varying signal loss rates introduced to simulate real-world noise and data incompleteness. The main contributions of this study are summarized as follows.
1. Construct a multi-scale time–frequency feature set with complementary information by selectively extracting spectral, low-, mid-, and high-frequency components from the S-transform matrix and fusing them with time-domain features. This approach overcomes single-feature limitations by forming an informative multi-scale feature set that provides high-quality input for high-precision power quality disturbance recognition.
2. Propose a power quality identification framework integrating mRMR feature selection to address high dimensionality and redundancy in multi-scale feature sets. The mRMR algorithm selects an optimal feature subset with strong discriminative power and minimal mutual information, improving classification efficiency while preserving core discriminative information.
3. Construct perturbation datasets reflecting real-world scenarios by generating twenty-one standard perturbation waveforms based on IEEE Std 1159-2019 [33]. This work addresses the limitation that most existing studies focus on white noise environments, which inadequately represent practical detection conditions. To overcome this limitation, signal loss mechanisms with varying proportions are introduced. This design enables comprehensive and rigorous evaluation of power quality disturbance recognition robustness under complex real-world scenarios.
4. Conduct systematic comparative experiments to evaluate the proposed framework under different feature extraction methods, feature selection strategies, and varying noise and data incompleteness conditions.
The remainder of this paper is organized as follows. Section 2 presents the proposed methodology. Section 3 presents a case study validating the effectiveness of the ST-mRMR-RF method. Section 4 discusses the findings. Finally, Section 5 concludes the paper.
3. Results and Discussion
3.1. Dataset Description
In this paper, twenty-one power quality disturbances are studied and labeled from C1 to C21. These disturbances include seven typical single disturbances (C1–C7) and fourteen composite disturbances (C8–C21). Based on IEEE Std 1159-2019, a total of 21,000 waveform samples are generated using MATLAB 2021a, with 1000 samples per power quality disturbance class. The twenty-one power quality disturbance signals are summarized in
Table 1.
The simulation experiments were conducted on a computer with an AMD Ryzen 7 7840H CPU (Santa Clara, CA, USA), 32 GiB Kingston DDR4-3200 RAM (Fountain Valley, CA, USA), and an AMD Radeon 780M integrated GPU. The software environment comprised MATLAB 2021a running on Windows 11.
Robustness Evaluation Under Noisy Conditions
To evaluate model robustness in noisy environments, additive white Gaussian noise (AWGN) is introduced at different signal-to-noise ratio (SNR) levels. The SNR is defined as the ratio of clean signal power to noise power, expressed in decibels (dB):

$$\mathrm{SNR} = 10\log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}\right)$$

where $P_{\mathrm{signal}}$ and $P_{\mathrm{noise}}$ represent the power of the clean signal and the noise, respectively. Given a desired SNR level and the original clean signal $x(t)$, the power of the required additive noise $P_{\mathrm{noise}}$ can be derived from the equation above. A white Gaussian noise sequence $n(t)$, with the same length as the original signal $x(t)$ and a power of $P_{\mathrm{noise}}$, is generated as

$$n(t) = \sqrt{P_{\mathrm{noise}}}\,\varepsilon(t)$$

where $\varepsilon(t)$ is a random number drawn from a standard Gaussian distribution with a mean of 0 and a variance of 1. Finally, the generated noise sequence is added to the original clean signal to obtain the noisy signal $\tilde{x}(t)$:

$$\tilde{x}(t) = x(t) + n(t)$$
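The noise-injection procedure above can be sketched as follows (a minimal illustration; the function and parameter names are ours, not the paper's implementation):

```python
import numpy as np

def add_awgn(x, snr_db, rng=None):
    """Add white Gaussian noise to signal x to reach a target SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    p_signal = np.mean(x**2)                  # clean-signal power
    p_noise = p_signal / 10**(snr_db / 10)    # required noise power from the SNR definition
    noise = np.sqrt(p_noise) * rng.standard_normal(len(x))
    return x + noise
```

Calling `add_awgn(x, 20)` yields a signal whose measured SNR is close to 20 dB, with the discrepancy shrinking as the signal length grows.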
To verify that the added noise simulates uncertainty in real measurement environments,
Figure 4 compares the original signal with signals contaminated by noise at 20 dB, 30 dB, and 40 dB SNRs. As
Figure 4 shows, as the SNR decreases and noise intensity increases, characteristic disturbances in the waveform are progressively obscured by noise, increasing waveform roughness and blurring local details. Such behavior reflects interference commonly encountered during signal acquisition in practical industrial environments.
3.2. Comparison of Feature Extraction Methods and Performance Evaluation
The feature datasets obtained through three different feature extraction methods are fed into the classification models—RF, PLS, ELM, and CNN—for training. To comprehensively assess a classifier’s effectiveness, four primary metrics are commonly used: accuracy, precision, recall, and F1 score. The resulting classification performances are presented in
Table 2.
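For concreteness, the four metrics can be computed from a confusion matrix as in the following sketch (macro averaging across classes is assumed here; the exact averaging convention used in the experiments is not stated):

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                                   # rows: true, cols: predicted
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)      # per-class; guard empty columns
    recall = tp / np.maximum(cm.sum(axis=1), 1)         # per-class; guard empty rows
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision.mean(),                  # macro average
        "recall": recall.mean(),
        "f1": f1.mean(),
    }
```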
From
Table 2, it can be observed that the features extracted using the ST method yield the best performance in the classification model, while the STFT extraction method demonstrates the poorest performance. This limitation arises because the STFT method requires preset fixed-length windows, resulting in fixed time-frequency resolution. Consequently, it is unsuitable for non-stationary signals. In contrast, the ST method employs adaptive windows that dynamically adjust their size based on signal frequency, achieving high resolution in both time and frequency domains. The RF model achieves the highest classification accuracy, reaching 99.98%, 99.68%, and 99.90% for the ST, STFT, and CWT extracted feature sets, respectively.
Power quality disturbances include seven typical single disturbances (C1–C7) and fourteen composite disturbances (C8–C21). From the confusion matrix in
Figure 5, most samples are correctly classified. Specifically, the RF model shown in
Figure 5a correctly classifies all single disturbances, with only one composite disturbance misclassified, achieving 99.98% accuracy. The CNN model in
Figure 5d exhibits more misclassifications among single disturbances but still attains 99.81% accuracy. In contrast, the PLS model in
Figure 5b performs worst, particularly for category C7, yielding an overall accuracy of 91.84%.
To identify power quality disturbance types that are difficult to classify using the RF model, classification tests were conducted under noise-free conditions and noise levels of 20 dB, 30 dB, and 40 dB. The corresponding classification results are presented in
Figure 6.
Under noisy conditions, classification accuracy decreases. Confusion matrix analysis reveals a systematic misclassification trend from C2 (voltage sag) to C4 (voltage interruption). Specifically, six misclassifications occur in
Figure 6d, seven in
Figure 6c, and fifteen in
Figure 6b. This behavior arises because voltage sags and interruptions both exhibit decreasing voltage magnitudes over time. Such similarities lead to closely aligned feature representations, differing mainly in voltage drop magnitude. Increasing noise progressively compromises feature extraction precision. As shown in
Figure 6b at 20 dB and
Figure 6c at 30 dB, C2-to-C4 misclassifications increase markedly. This trend occurs because noise obscures critical distinguishing features, thereby degrading overall classification accuracy.
In summary, the ST-RF method demonstrates superior overall performance. However, a key question remains whether the S-transform improvement arises from time-domain features, frequency-domain features, or their interaction effect. To address this issue, ablation studies are conducted in the following section to provide definitive verification.
3.3. Ablation Experiment
To validate the selected frequency band partitioning strategy, ablation experiments were conducted. The proposed scheme was compared with two control approaches: a three-part division (0–166 Hz, 166–333 Hz, 333–500 Hz) and a bisection method (0–250 Hz, 250–500 Hz). All experiments used the same dataset and a unified experimental framework to ensure fair comparisons. The experimental results are presented as follows:
Table 3 compares Random Forest performance under different frequency band allocation strategies. The results show that the proposed segmentation strategy achieves optimal performance, with accuracy, precision, recall, and F1 score all reaching 99.98%. This surpasses the uniform three-band partitioning strategy (99.95%) and the two-band partitioning strategy (99.89%). These results indicate that fundamental frequency-based partitioning is more effective than uniform frequency partitioning for extracting highly discriminative features.
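A minimal sketch of band-wise feature extraction from an S-transform magnitude matrix is given below. The band edges and the particular statistics (band energy, envelope mean and spread, peak magnitude) are illustrative assumptions, not the paper's exact feature list:

```python
import numpy as np

def band_features(S_mag, freqs, bands):
    """Per-band statistics from an S-transform magnitude matrix.

    S_mag : (n_freqs, n_samples) magnitude of the S-transform
    freqs : frequency in Hz of each row of S_mag
    bands : list of (f_lo, f_hi) tuples defining the partitioning
    """
    feats = []
    for f_lo, f_hi in bands:
        rows = S_mag[(freqs >= f_lo) & (freqs < f_hi), :]
        env = rows.max(axis=0)             # band envelope over time
        feats += [rows.sum(),              # band energy proxy
                  env.mean(), env.std(),   # envelope level and variability
                  rows.max()]              # peak magnitude in the band
    return np.array(feats)
```

Applying this with, e.g., the uniform three-band split `[(0, 166), (166, 333), (333, 500)]` from the ablation study yields four features per band, twelve in total.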
To validate the effectiveness of time–frequency fusion features on classification performance, three ablation experiments were conducted. These experiments evaluated four classifiers (RF, PLS, ELM, and CNN) using time-domain features, frequency-domain features, and time–frequency fusion features, respectively. Model parameters were configured as follows. For CNN, a kernel size of [2 × 1] with 16 and 32 kernels was used, optimized by the Adam optimizer with a 0.001 learning rate and 0.0001 L2 regularization. For ELM, the hidden layer contained 1850 nodes with a Sigmoid activation function. For PLS, 100 principal components were retained. The experimental results are presented in
Table 4.
As shown in
Table 4, the time–frequency fusion feature set achieves superior overall classification performance among RF, PLS, ELM, and CNN models. The corresponding accuracies are 99.98%, 91.89%, 97.84%, and 99.81%, respectively. When comparing single feature sets, RF, PLS, and ELM achieve higher accuracy using frequency-domain features. Conversely, CNN performs better with time-domain features than with frequency-domain features. Further analysis of RF performance shows that the fused feature set achieves the highest accuracy at 99.98%. This is followed by time-domain features at 99.25%, while frequency-domain features yield the lowest accuracy at 98.36%. This is because time-frequency fusion is not merely a simple concatenation of features. Rather, it involves the effective integration of time-domain and frequency-domain feature vectors to construct a more comprehensive joint representation space. This process enables the model to simultaneously capture the temporal dynamics and spectral structural information of the signal, thereby overcoming the informational limitations inherent in features from either domain alone.
3.4. Dimensionality Reduction Analysis
While the effectiveness of S-transform-based time–frequency fusion features was validated previously, the resulting 3600-dimensional feature set exhibits substantial redundancy. To address this issue, this study employs three feature reduction techniques: Maximum Relevance Minimum Redundancy (mRMR), an Autoencoder (AE), and Lasso regression. The autoencoder uses a single hidden layer with a Sigmoid activation function, a regularization coefficient of 0.004, and 100 training iterations. In the Lasso method, the regularization penalty coefficient is set to 0.01, and features are selected based on the largest absolute regression coefficients. These approaches produce feature subsets with 100, 120, and 200 dimensions, respectively.
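The mRMR selection step can be sketched as a greedy search that maximizes relevance to the class label while penalizing redundancy with already-selected features. The following is a minimal sketch using a histogram-based mutual information estimate and the additive (MID) criterion; the implementation used in the study may differ in estimator and criterion:

```python
import numpy as np

def mutual_info(a, b, bins=8):
    """Histogram-based mutual information estimate (in nats) between two 1-D arrays."""
    c_ab, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = c_ab / c_ab.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    nz = p_ab > 0
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

def mrmr_select(X, y, k):
    """Greedy mRMR: pick k feature columns of X with max relevance, min redundancy."""
    n_feat = X.shape[1]
    relevance = np.array([mutual_info(X[:, j], y) for j in range(n_feat)])
    selected = [int(np.argmax(relevance))]      # start with the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            # mean redundancy with the features chosen so far
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy   # additive MID criterion
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

A near-duplicate of an already-selected feature scores poorly even if its relevance is high, which is exactly the redundancy-suppression behavior exploited here to shrink the 3600-dimensional set.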
Table 5 summarizes the parameter configurations for the mRMR algorithm and the Random Forest classifier. The mRMR feature selection algorithm employs 100 selected features, determined through feature selection experiments. This strategy reduces the original 3600-dimensional feature space while maintaining high recognition accuracy and improving computational efficiency. The Random Forest classifier is configured with 50 decision trees. Validation indicates that further increasing the tree number yields no noticeable performance improvement. This configuration optimizes computational overhead while preserving effective ensemble performance. Gini impurity is adopted as the splitting criterion due to its suitability and efficiency for classification tasks. The maximum tree depth is unrestricted, and the feature number follows the square-root rule to capture data characteristics effectively.
As shown in
Figure 7, the CNN model achieves its highest accuracy of 99.83% using the Lasso-selected feature set. On the AE-selected feature set, PLS demonstrates superior performance. For the mRMR-selected feature set, the ST-mRMR-RF model performs best, achieving 100% accuracy. The mRMR-selected feature set not only reduces Random Forest training time but also improves classification accuracy. Notably, feature selection does not always enhance performance. For example, after AE-based reduction, the ST-AE-RF model accuracy decreases to 80.48%.
Computational efficiency is a critical metric for evaluating the performance of dimensionality reduction methods.
Table 6 compares experimental results of RF models using feature sets with and without the mRMR feature selection algorithm.
As shown in
Table 6, the mRMR algorithm reduces feature dimensionality from 3600 to 100, lowering feature count, parameter scale, and overall computational complexity. RF model training time decreases from 95.15 s to 10.56 s, corresponding to an 88.91% reduction, while batch inference time decreases by 88.32%. Although mRMR introduces additional one-time computational overhead, the sustained acceleration during iterative training outweighs this cost, improving deployment feasibility on edge devices.
To determine the minimum dataset size required for acceptable accuracy, data ablation experiments were conducted. By progressively reducing training data size, model accuracy dependency on data volume was quantified, as shown in
Figure 8. In noise-free conditions, the model requires only 30% of the original data to maintain near-optimal performance. Under 20 dB noise, accuracy gradually decreases as data volume reduces, declining from 99.24% to 98.10%. In low-to-moderate noise environments of 30 dB and 40 dB, accuracy remains at 98.99% and 99.63%, even when data volume decreases to 300 samples. These results indicate that under low-to-moderate noise, the model effectively suppresses noise and retains efficient learning with limited data.
3.5. Comprehensive Assessment of Model Robustness
To ensure practical applicability of the proposed ST-mRMR-RF model in real-world power systems, this section evaluates robustness across two dimensions: additive noise interference and data loss. These scenarios simulate common signal transmission challenges, including random interference and sensor or data acquisition equipment failures.
To evaluate model noise robustness, additive noise at signal-to-noise ratios of 20 dB, 30 dB, and 40 dB is added to the original signal. Classification training is then performed using both the original feature set and the mRMR-selected feature set, with results shown in
Figure 9.
As illustrated in
Figure 9, the ST-mRMR-RF method achieves 100% accuracy using the optimized feature set under noise-free conditions. Under noisy conditions with SNRs of 20 dB, 30 dB, and 40 dB, it maintains accuracies of 99.24%, 99.75%, and 99.84%, respectively. In contrast, the ST-mRMR-PLS and ST-mRMR-ELM methods exhibit performance degradation at 20 dB SNR, with accuracies decreasing to 84.86% and 94.87%, respectively. This result indicates their limited noise robustness. The ST-mRMR-RF method consistently demonstrates the highest overall performance, regardless of noise presence. Overall, although classification accuracy decreases with increasing noise, the ST-mRMR-RF method remains robust, maintaining 99.24% accuracy even under severe 20 dB noise.
As shown in
Figure 10, under noise-free conditions, ROC curves for all classes approach the top-left region, yielding an AUC of 1.000. This result indicates near-perfect classification capability. Even under 20 dB noise interference, AUC values for most classes remain high, such as 0.999 for Class 2. This performance demonstrates the strong noise robustness of the proposed model’s feature representation.
Figure 11 shows that classification accuracy increases and gradually converges as the number of decision trees increases. Under noise-free conditions, accuracy reaches 100% when the number of trees equals 21. With SNRs of 20 dB, 30 dB, and 40 dB, accuracies reach 99.24%, 99.75%, and 99.84% at 52, 33, and 46 trees, respectively.
Although prior experiments confirm ST-mRMR-RF robustness under noise, real industrial scenarios often include simultaneous signal interference and data loss. To evaluate practicality and stability under adverse conditions, a two-factor experiment was designed. The experiment examines the combined effects of Signal-to-Noise Ratios (no-noise, 30 dB, 40 dB) and data loss ratios (0%, 5%, 10%, 15%). Classification accuracy results are reported in
Table 7.
As presented in
Table 7, the model demonstrates strong robustness against data loss under noise-free or high-SNR (40 dB) conditions, maintaining an accuracy above 96% even with a 15% data loss ratio. However, under a 30 dB noise environment, the impact of data loss is markedly amplified, leading to an approximately 6 percentage-point drop in accuracy at the same loss ratio. This result reveals a significant coupling effect between noise interference and data loss.
To validate the proposed method,
Table 8 presents comparative results against existing approaches. The ST-mRMR-RF method shows strong performance in noisy environments, especially under challenging 20 dB signal-to-noise ratio conditions. Under noise-free conditions, advanced methods such as SMST-DCNN and DWT-CNN achieve classification accuracies of 99.5% or above. As noise intensity increases, the performance of all methods degrades. At a 20 dB signal-to-noise ratio, ST-mRMR-RF achieves 99.24% accuracy, outperforming all comparison methods. These results indicate that compact, discriminative feature subsets constructed using mRMR provide strong noise robustness.
4. Discussion
This study proposes an improved method integrating mRMR feature selection with the ST-RF framework, enhancing computational efficiency and classification performance. However, the ST-mRMR-RF framework shows limitations in discriminating feature-similar categories and handling extreme composite scenarios.
For feature-similar categories, this study improves discrimination by constructing a multi-source feature representation, applying mRMR for low-redundancy selection, and leveraging ensemble learning. Experiments show perfect separation of voltage sags and short interruptions under noise-free conditions. However, as noise increases (
Figure 6b–d), misclassifications rise, indicating strong noise interference as a major contributor to category confusion. Future work should enhance feature extraction robustness under strong noise conditions.
Regarding robustness in extreme composite scenarios, experiments evaluated the method under Gaussian white noise and partial data loss. However, real-world power grids often involve colored noise and high-proportion data loss simultaneously. As shown in
Table 7, under 40 dB Gaussian white noise, accuracy decreases to 97.62% when the sampling omission rate reaches 10% and degrades further as the omission rate increases. This indicates that severe data missingness, compounded by noise, challenges feature stability and class boundary identification. This study has not yet sufficiently explored highly complex real-world power grid environments. Future research should test more realistic noise and missing-data models and incorporate adaptive time–frequency analysis methods, such as Gaussian window optimization or spectral peakedness statistics, enabling systematic evaluation and enhancement of theoretical performance limits and practical robustness under demanding conditions.
In summary, the limitations of this study mainly lie in insufficient adaptability to complex real-world environments. First, feature discriminability decreases under strong noise and compound interference, reducing the ability to distinguish feature-similar disturbances. Second, the study relies on synthetic datasets and simplified noise models, and robustness under extreme conditions, including non-stationary colored noise and high data loss rates, remains unverified. Engineering deployment also faces challenges from embedded device computational constraints and multi-module integration. To address these limitations, future work will follow several directions. First, the model will be validated using real-world data, and feature enhancement strategies resistant to non-stationary noise will be developed. Second, algorithm lightweighting and edge optimization will be implemented for deployment. Finally, an intelligent monitoring framework integrating online recognition and adaptive learning will be constructed.
5. Conclusions
This study proposes an S-transform-based multi-scale feature fusion method, termed ST-mRMR-RF, for power quality disturbance classification. The method extracts spectral, low-, mid-, and high-frequency components from the S-transform time–frequency matrix as frequency-domain features. These features are fused with time-domain features to construct a comprehensive power quality disturbance feature set. To address feature redundancy and computational inefficiency, the Maximum Relevance Minimum Redundancy (mRMR) algorithm selects discriminative feature subsets. This strategy reduces feature dimensionality while preserving critical information, improving computational efficiency and classification accuracy. Experimental results verify the effectiveness and robustness of the proposed ST-mRMR-RF framework.
Feature sets extracted using the S-transform consistently outperform those derived from STFT and CWT across different classifiers. Using the S-transform multi-scale fusion feature set, RF, PLS, ELM, and CNN achieve accuracies of 99.98%, 91.89%, 97.84%, and 99.81%, respectively. These accuracies exceed those obtained using single feature sets. In particular, RF shows accuracy improvements of 0.73% over time-domain features and 1.62% over frequency-domain features.
Following mRMR-based feature selection, the Random Forest classifier achieves 100% accuracy. Model training time decreases from 95.15 s to 10.59 s, representing an 88.91% reduction. Inference time decreases from 6.48 s to 0.69 s, corresponding to an 88.32% reduction, demonstrating the effectiveness of mRMR within this framework.
Under noise conditions with signal-to-noise ratios of 20 dB, 30 dB, and 40 dB, ST-mRMR-RF maintains high performance, achieving accuracies of 99.24%, 99.75%, and 99.84%, respectively. Additionally, under 40 dB noise with signal loss rates of 5%, 10%, and 15%, ST-mRMR-RF achieves accuracies of 98.27%, 97.62%, and 96.00%, respectively. These results demonstrate robustness against data loss.
In summary, compared with traditional methods, ST-mRMR-RF enhances feature representation through multi-scale fusion and reduces feature dimensionality using mRMR, improving computational efficiency. The method maintains high classification accuracy under noise and signal loss, making it suitable for real-world power grids with complex interference and incomplete data.