A Dual-Channel Fault Diagnosis Method for Rolling Bearings Based on VMD-BiGRU and GADF-ResNet-CBAM

Niu, Maoyuan; Wan, Xiaojing; Sheng, Yuzhou

doi:10.3390/app16104968


by Maoyuan Niu, Xiaojing Wan * and Yuzhou Sheng

College of Mechanical Engineering, Xinjiang University, Urumqi 830047, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(10), 4968; https://doi.org/10.3390/app16104968 (registering DOI)

Submission received: 4 April 2026 / Revised: 1 May 2026 / Accepted: 12 May 2026 / Published: 16 May 2026


Abstract

To address the drawbacks of traditional convolutional neural network-based rolling bearing fault diagnosis techniques, including poor feature extraction, low diagnostic accuracy, and poor generalization capability, a dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM is proposed. In the first channel, variational mode decomposition (VMD) decomposes and reconstructs the original vibration signal, and the reconstructed signal is fed into a bidirectional gated recurrent unit (BiGRU) network to extract temporal features. In the second channel, the Gramian angular difference field (GADF) transforms the one-dimensional vibration signal into a two-dimensional image, which is fed into a residual network combined with the convolutional block attention module (CBAM) to extract spatial features. The features from the two channels are concatenated and fused, and Softmax is applied at the output layer to classify the fault types. The Case Western Reserve University (CWRU) bearing dataset and a self-collected independent dataset from the Xinjiang University experimental rig were used for validation, on which the model achieved diagnostic accuracies of 99.39% and 99.58%, respectively. These results indicate that the proposed method is robust and practically applicable to data acquired from distinct hardware sources and experimental environments, and that it outperforms the alternative approaches considered.

Keywords: rolling bearings; fault diagnosis; variational mode decomposition; Gramian angular difference field; convolutional block attention module

1. Introduction

Rolling bearings are critical supporting components in rotating machinery, and their performance directly affects the operating efficiency, precision, and service life of equipment. When a bearing issue arises, it can result in production disruption, equipment stoppage, and even safety incidents, while also directly affecting production costs and the economic benefits of enterprises [1].

Rolling bearing fault diagnosis has advanced significantly in recent years due to the quick development of deep learning networks and signal processing techniques. Neural network models like convolutional neural networks, recurrent neural networks, and generative adversarial networks, as well as signal processing techniques like empirical mode decomposition, Fourier transform, continuous wavelet transform, and Gramian angular field transformation, have been extensively used [2].

Currently, the main method for diagnosing bearing faults is the use of one-dimensional vibration signals. Variational mode decomposition (VMD) and a one-dimensional convolutional neural network based on the Pearson correlation coefficient are combined in an early defect identification technique presented by Deng et al. [3]. A rolling bearing defect diagnostic model was created by Wang et al. [4] using an upgraded one-dimensional convolutional neural network with a parametric rectified linear unit and better variational mode decomposition based on grey wolf optimization. A hybrid defect diagnostic technique called GGRU-1DCNN-AdaBN was presented by Sun et al. [5]. Kumar et al. [6] propose an intelligent fault diagnosis framework that integrates deep features with spatio-temporal modeling to capture fault characteristics. By employing a dual-condition cost-sensitive strategy that combines full-data and few-shot learning, the framework enhances adaptability and generalization. Sharma et al. [7] integrated a convolutional neural network (CNN) with a support vector machine (SVM). They implemented an adaptive cutoff strategy that enables the system to automatically determine when to transition from CNN-based feature extraction to SVM-based classification without human intervention.

Although the aforementioned techniques have made significant progress in rolling bearing fault identification, the working conditions of rolling bearings in real-world applications are frequently more complicated and changeable. In these situations, feature extraction from one-dimensional vibration signals remains difficult and can be strongly affected by different kinds of noise, which makes it hard to fully represent fault-related information such as the fault type, location, and severity. With developments in computer technology and image processing techniques, converting one-dimensional vibration signals into two-dimensional images and feeding them into network models for fault classification has become a research hotspot and an emerging trend in this field [8,9].

Qu et al. [10] presented a cross-condition bearing fault identification technique that combines a residual deep subdomain adaptation network (RDSAN) with the Gramian angular difference field (GADF). Xiao et al. [11] created a fault diagnostic model based on the Markov transition field and an enhanced capsule network, in which two-stage feature extraction with an enhanced selective kernel network and a capsule network is used to classify faults. Gilbert et al. [12] designed a two-dimensional convolutional neural network (2D-CNN) architecture that uses images generated from “Morlet1” continuous wavelets, with group normalization (GN) employed to enhance the stability and generalization of the bearing diagnostic model. Khan et al. [13] proposed a VMD-CWT-ViT framework that integrates variational mode decomposition (VMD), the continuous wavelet transform (CWT), and the Vision Transformer (ViT), leveraging the complementary strengths of the three techniques to enhance feature representation and achieve accurate fault classification.

However, the aforementioned single-branch network-based fault diagnostic techniques still have certain drawbacks. For instance, they frequently fail to adequately capture the essential features of vibration signals; the extracted information may be excessively redundant, leading to insufficient information utilization; and the generalization capability of the network models remains limited.

To overcome these problems, this work develops a dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM. In the first channel, the VMD parameters are first optimized to obtain the best decomposition. The decomposed intrinsic mode components are then evaluated using the kurtosis criterion, and the components with the highest kurtosis values are selected for signal reconstruction, which reduces noise interference and redundant features before the signal is fed into the BiGRU network for temporal feature extraction. In the second channel, the original vibration signals are encoded into two-dimensional images using GADF, and a ResNet combined with the convolutional block attention module (CBAM) extracts spatial features from them. The CBAM strengthens the network’s emphasis on important fault information and further improves feature representation. A feature fusion layer and a fully connected layer then combine the two channels, and a Softmax classifier identifies the fault category. By combining the temporal features of fault signals with the spatial features of two-dimensional images, the proposed approach analyzes the data from complementary perspectives, which yields more thorough fault feature extraction and improves diagnostic accuracy and reliability. Experiments indicate that the proposed method outperforms the compared models and is practicable.

2. Basic Theory

2.1. VMD

Variational mode decomposition (VMD) is an adaptive signal processing technique. By iteratively searching for the optimal solution, it determines the center frequency and bandwidth of each component and constitutes a fully non-recursive model [14]. The VMD algorithm mainly consists of two parts: the construction of the constrained variational objective and the iterative computation process. The constrained variational objective constructed in VMD is given as follows:

$$
\begin{cases}
\min\limits_{\{v_k\},\{\omega_k\}} \left\{ \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * v_k(t) \right] e^{-j \omega_k t} \right\|_2^2 \right\} \\
\text{s.t.} \ \sum_{k} v_k = p(t)
\end{cases}
\tag{1}
$$

where $\{v_k\}$ and $\{\omega_k\}$ denote the sets of intrinsic mode function (IMF) components and their center frequencies, respectively; $p(t)$ is the original signal; $\partial_t$ denotes the partial derivative with respect to time; $\delta(t)$ represents the Dirac distribution; and $*$ denotes the convolution operation.

To obtain the optimal solution to the above variational problem, the Lagrange multiplier $\lambda(t)$ and the quadratic penalty factor $\alpha$ were introduced, and the original objective function was transformed into:

$$
\begin{aligned}
L(\{v_k\}, \{\omega_k\}, \lambda) ={}& \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * v_k(t) \right] e^{-j \omega_k t} \right\|_2^2 \\
&+ \left\| p(t) - \sum_{k} v_k(t) \right\|_2^2 + \left\langle \lambda(t),\ p(t) - \sum_{k} v_k(t) \right\rangle
\end{aligned}
\tag{2}
$$

The variables $\hat{v}_k$, $\omega_k$, and $\hat{\lambda}$ were iteratively updated using the alternating direction method of multipliers (ADMM) until the convergence criterion for the optimal solution was satisfied, at which point the iteration was terminated.

To achieve the most informative decomposition, the mode number $K$ and the quadratic penalty factor $\alpha$ were determined by a grid search that minimizes the average envelope entropy of the resulting IMFs. Envelope entropy reflects the sparsity and periodicity of a signal; a low value indicates prominent fault-related transients. The search range was $[2, 12]$ (step 1) for $K$ and $[500, 5000]$ (step 500) for $\alpha$. The noise tolerance $\tau$ was fixed at 0, and the convergence tolerance of the ADMM solver was set to $1 \times 10^{-7}$. The combination $K = 10$, $\alpha = 2000$ yielded the lowest average envelope entropy and was therefore adopted. These optimized values are used throughout the remainder of the study.
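As a rough illustration of this selection criterion, the sketch below computes an envelope entropy with NumPy only and wraps it in a grid search over the ranges given above. It assumes the common definition in which the Hilbert envelope is normalized to a probability distribution and its Shannon entropy is taken; the exact formula is not stated in the text. The names `analytic_signal`, `envelope_entropy`, and `grid_search` are hypothetical, as is the `decompose` callback, which stands for any VMD implementation that returns the K modes.

```python
import numpy as np


def analytic_signal(x):
    """FFT-based analytic signal (the standard Hilbert-transform construction)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def envelope_entropy(x, eps=1e-12):
    """Shannon entropy of the normalized Hilbert envelope (assumed definition)."""
    envelope = np.abs(analytic_signal(x))
    p = envelope / (envelope.sum() + eps)
    return float(-np.sum(p * np.log(p + eps)))


def grid_search(signal, decompose,
                k_values=range(2, 13), alpha_values=range(500, 5001, 500)):
    """Return the (K, alpha) pair with the lowest mean envelope entropy.

    `decompose(signal, K, alpha)` is a hypothetical callback that must return
    an array of shape (K, len(signal)) holding the IMFs.
    """
    best = None
    for k in k_values:
        for alpha in alpha_values:
            imfs = decompose(signal, k, alpha)
            score = np.mean([envelope_entropy(imf) for imf in imfs])
            if best is None or score < best[0]:
                best = (score, k, alpha)
    return best[1], best[2]
```

A constant-amplitude sinusoid has a flat envelope and therefore the maximum entropy, log(n), whereas a signal dominated by short impulsive bursts scores lower, which is why a low value is read as a sign of prominent transients.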

2.2. GADF

The Gramian angular field (GAF) is an image encoding method that transforms one-dimensional time-series data into two-dimensional images [15]. Through polar coordinate mapping and Gramian matrix computation, the dynamic characteristics and nonlinear relationships of the time-series data can be effectively preserved and converted into two-dimensional image texture features.

The Gramian angular field transformation includes the Gramian angular summation field (GASF) and the Gramian angular difference field (GADF). Compared with GASF, GADF exhibits superior performance in terms of image color representation, detail characterization, and cross-boundary expression [16]. Therefore, the Gramian angular difference field was adopted for encoding in this study.

Given a time series $X = \{x_1, x_2, \dots, x_n\}$ of $n$ points, $X$ is first rescaled to the interval $[-1, 1]$, as shown in Equation (3).

$$
\tilde{x}_i = \frac{[x_i - \min(X)] + [x_i - \max(X)]}{\max(X) - \min(X)}
\tag{3}
$$

where $x_i$ is the $i$th value in the original time series $X$ and $\tilde{x}_i$ is the $i$th value in the rescaled time series $\tilde{X}$, $i = 1, \dots, n$.

Each value $\tilde{x}_i$ in $\tilde{X}$ is encoded as the cosine of an angle, and the timestamp is encoded as the radius, as shown in Equation (4).

$$
\begin{cases}
\phi_i = \arccos(\tilde{x}_i), & -1 \le \tilde{x}_i \le 1,\ \tilde{x}_i \in \tilde{X} \\
r_i = \dfrac{t_i}{N}, & t_i \in \mathbb{N}
\end{cases}
\tag{4}
$$

where $\phi_i$ denotes the angle in polar coordinates; $r_i$ denotes the radius; $t_i$ denotes the timestamp; and $N$ denotes the constant factor that regularizes the span of the polar coordinate system.

After the time series is transformed into the polar coordinate system, the temporal correlations over different time intervals can be identified by considering the angular differences between individual points. The Gramian angular difference field (GADF) is built from the sine of these angular differences, as expressed in Equation (5).

$$
GADF = \begin{bmatrix}
\sin(\phi_1 - \phi_1) & \cdots & \sin(\phi_1 - \phi_n) \\
\vdots & \ddots & \vdots \\
\sin(\phi_n - \phi_1) & \cdots & \sin(\phi_n - \phi_n)
\end{bmatrix}
= \sqrt{I - \tilde{X}^2}^{\,T} \tilde{X} - \tilde{X}^T \sqrt{I - \tilde{X}^2}
\tag{5}
$$

where $I$ is the unit row vector $[1, 1, \dots, 1]$.
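The encoding in Equations (3) to (5) can be sketched in a few lines of NumPy. The function name `gadf` is hypothetical, and the sketch assumes a non-constant input series so that the rescaling in Equation (3) does not divide by zero. It follows the closed form on the right-hand side of Equation (5), in which entry (i, j) equals the sine of the difference between the two angles.

```python
import numpy as np


def gadf(x):
    """Gramian angular difference field of a 1-D series (Equations (3)-(5))."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    # Equation (3): rescale to [-1, 1] (assumes x_max > x_min)
    x_tilde = ((x - x_min) + (x - x_max)) / (x_max - x_min)
    x_tilde = np.clip(x_tilde, -1.0, 1.0)  # guard against rounding error
    # Equation (4): polar angle of each sample
    phi = np.arccos(x_tilde)
    # Equation (5): entry (i, j) is sin(phi_i - phi_j)
    return np.sin(phi[:, None] - phi[None, :])
```

The result is an n × n antisymmetric matrix with a zero diagonal and values in [−1, 1], so a 1024-point window would give a 1024 × 1024 matrix before any resizing or color mapping.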

2.3. BiGRU

The gated recurrent unit (GRU) introduces an update gate and a reset gate, thereby overcoming the problems of gradient vanishing and gradient explosion encountered by traditional recurrent neural networks (RNNs) in long-sequence processing. As a result, long-term dependencies can be captured efficiently. Its specific structure is shown in Figure 1. However, when processing extremely long sequences, GRU may still suffer from memory degradation and sensitivity to noise.

Based on the GRU architecture, the bidirectional gated recurrent unit (BiGRU) further incorporates both forward and backward state information, thereby enhancing the model’s comprehensive understanding of the sequence and improving its robustness and adaptability in temporal feature extraction. The corresponding calculation process is given in Equations (6)–(8).

$$
\overrightarrow{h}_t = \mathrm{GRU}(x_t, \overrightarrow{h}_{t-1})
\tag{6}
$$

$$
\overleftarrow{h}_t = \mathrm{GRU}(x_t, \overleftarrow{h}_{t-1})
\tag{7}
$$

$$
\tilde{h}_t = \tanh(w_h c_t + U_h [r_t \odot h_{t-1}])
\tag{8}
$$

where $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ represent the forward and backward outputs generated by the unit at time $t$, respectively; $W_t$ and $V_t$ represent the weights corresponding to the forward and backward states, respectively; $h_t$ represents the output of the BiGRU network at time step $t$; $\tilde{h}_t$ is the candidate hidden state; $c_t$ denotes the context vector associated with time step $t$; $U_h$ is the weight matrix associated with the hidden state; and $r_t$ is the reset gate at time step $t$.
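To make the forward and backward passes concrete, the following NumPy sketch runs one GRU over a sequence in each direction and concatenates the hidden states per time step. It is illustrative only: the weights are random rather than trained, biases are omitted, the input $x_t$ is used where Equation (8) writes $c_t$, and the update rule follows one common GRU convention. The names `GRUCell`, `run_direction`, and `bigru` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class GRUCell:
    """Single GRU cell with randomly initialized weights (illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        scale = 1.0 / np.sqrt(hidden_size)
        shape_w = (hidden_size, input_size)
        shape_u = (hidden_size, hidden_size)
        self.W_z, self.W_r, self.W_h = (rng.uniform(-scale, scale, shape_w) for _ in range(3))
        self.U_z, self.U_r, self.U_h = (rng.uniform(-scale, scale, shape_u) for _ in range(3))
        self.hidden_size = hidden_size

    def step(self, x_t, h_prev):
        z_t = sigmoid(self.W_z @ x_t + self.U_z @ h_prev)             # update gate
        r_t = sigmoid(self.W_r @ x_t + self.U_r @ h_prev)             # reset gate
        h_cand = np.tanh(self.W_h @ x_t + self.U_h @ (r_t * h_prev))  # candidate state
        return (1.0 - z_t) * h_prev + z_t * h_cand


def run_direction(cell, sequence):
    """Run the cell over the sequence and return all hidden states."""
    h = np.zeros(cell.hidden_size)
    outputs = []
    for x_t in sequence:
        h = cell.step(x_t, h)
        outputs.append(h)
    return np.stack(outputs)


def bigru(sequence, forward_cell, backward_cell):
    """Concatenate forward states with time-aligned backward states."""
    fwd = run_direction(forward_cell, sequence)
    bwd = run_direction(backward_cell, sequence[::-1])[::-1]
    return np.concatenate([fwd, bwd], axis=1)
```

For a sequence of shape (T, input_size) the output has shape (T, 2 × hidden_size), so each time step sees context from both the past and the future of the signal.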

2.4. ResNet

The residual neural network (ResNet), proposed by He et al. [17], addresses the degradation problem in deep networks, and its core idea is the introduction of the residual block structure, as shown in Figure 2. This paper uses ResNet-18 as the base model. ResNet is composed of multiple stacked residual blocks, each of which contains two paths: a skip-connection path and a conventional path with nonlinear mapping. Each residual block has two convolutional layers, each followed by a batch normalization layer and a ReLU activation function.

Here, $a$ is the input to the residual block, $F(a)$ is the residual mapping function, and $H(a) = F(a) + a$ is the block output, in which the skip connection contributes the identity mapping $a$.
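The relation $H(a) = F(a) + a$ can be illustrated with a deliberately simplified sketch in which two dense layers stand in for the convolution, batch normalization, and ReLU stack; this substitution and the name `residual_block` are assumptions made for brevity, not the ResNet-18 layers themselves.

```python
import numpy as np


def relu(v):
    return np.maximum(v, 0.0)


def residual_block(a, w1, w2):
    """H(a) = F(a) + a, with dense layers standing in for the convolutions."""
    f = w2 @ relu(w1 @ a)  # residual mapping F(a)
    return f + a           # skip connection adds the input back
```

When the weights of F are zero the block reduces to the identity, which is the property that lets very deep stacks avoid degradation: a block only has to learn a correction to its input.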

2.5. CBAM

Woo et al. [18] presented the convolutional block attention module (CBAM) in 2018. It is a lightweight attention mechanism that can be easily added into CNN architectures. CBAM allows the network to adaptively focus on critical feature channels and spatial areas, improving feature representation capabilities. The implementation process of CBAM is described in Equation (9), and its structure is illustrated in Figure 3.

$$
\begin{aligned}
F' &= M_1(F) \otimes F \\
F'' &= M_2(F') \otimes F'
\end{aligned}
\tag{9}
$$

where $F$ is the input feature map; $M_1$ is the channel attention weight; $M_2$ is the spatial attention weight; $F'$ is the feature map weighted by channel attention; $F''$ is the feature map weighted by spatial attention; and $\otimes$ denotes element-wise multiplication.

In the channel attention module, the input feature map is passed through max-pooling and average-pooling layers, and each pooled descriptor is fed to a multilayer perceptron (MLP) that compresses the number of channels to $C/r$ and then expands it back to $C$, where $C$ is the number of channels and $r$ is the reduction ratio. The two MLP outputs are summed and activated by $\sigma$ to give the channel attention $M_1(F)$, as in Equation (10); the module structure is shown in Figure 4.

$$
M_1(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))) = \sigma(W_1(W_0(F_{avg}^{c})) + W_1(W_0(F_{max}^{c})))
\tag{10}
$$

where $\sigma$ stands for the sigmoid activation function; $F$ denotes the input feature; and $W_0$ and $W_1$ are the MLP weights.

The spatial attention module enhances important spatial regions in the feature map and complements channel attention. To build the spatial attention map, the input is average-pooled and max-pooled, and the two pooled maps are concatenated and passed through a convolution followed by $\sigma$. The spatial attention $M_2(F)$ is defined in Equation (11), and the module structure is shown in Figure 5.

$$
M_2(F) = \sigma(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])) = \sigma(f^{7 \times 7}([F_{avg}^{s}; F_{max}^{s}]))
\tag{11}
$$

where $f^{7 \times 7}$ indicates a convolution with a kernel size of $7 \times 7$.
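A minimal NumPy sketch of Equations (9) to (11) for a single feature map of shape (C, H, W) is given below. The weights are placeholders supplied by the caller, the ReLU between $W_0$ and $W_1$ follows the original CBAM design rather than anything stated here, and the convolution is written as an explicit loop for readability rather than speed. All function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(F, W0, W1):
    """Equation (10): shared MLP on average- and max-pooled channel descriptors."""
    f_avg = F.mean(axis=(1, 2))  # shape (C,)
    f_max = F.max(axis=(1, 2))   # shape (C,)

    def mlp(v):
        # C -> C/r -> C, with a ReLU in between (assumed, as in the original CBAM)
        return W1 @ np.maximum(W0 @ v, 0.0)

    return sigmoid(mlp(f_avg) + mlp(f_max))  # shape (C,)


def spatial_attention(F, kernel):
    """Equation (11): k x k convolution over channel-wise average and max maps."""
    pooled = np.stack([F.mean(axis=0), F.max(axis=0)])  # shape (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    height, width = F.shape[1:]
    out = np.zeros((height, width))
    for i in range(height):
        for j in range(width):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)  # shape (H, W)


def cbam(F, W0, W1, kernel):
    """Equation (9): channel attention first, then spatial attention."""
    F1 = channel_attention(F, W0, W1)[:, None, None] * F
    F2 = spatial_attention(F1, kernel)[None, :, :] * F1
    return F2
```

Here `W0` has shape (C/r, C), `W1` has shape (C, C/r), and `kernel` has shape (2, 7, 7). Because both attention maps lie between 0 and 1, the module can only rescale activations, never amplify them beyond their input magnitude.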

3. Dual-Channel Fault Diagnosis of Rolling Bearings

3.1. Structure of the Dual-Channel Bearing Fault Diagnosis Model

The dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM is depicted in Figure 6. In the first channel, the original vibration signal is decomposed into intrinsic mode functions (IMFs) using the optimized VMD, and the kurtosis criterion is then used to choose the IMF components for signal reconstruction. The reconstructed signal is fed to the BiGRU network, which extracts temporal features by capturing the signal’s long-term dependencies. The BiGRU network consists of 2 stacked bidirectional GRU layers, each with 128 hidden units, followed by a Dropout layer (rate = 0.5) to prevent overfitting. In the second channel, the original one-dimensional vibration signal is encoded into a two-dimensional GADF feature image. The image first passes through a convolutional layer with a kernel size of $7 \times 7$ and then a max-pooling layer. The backbone network consists of four cascaded residual layers, each containing two residual blocks, and each residual block contains two $3 \times 3$ convolutional layers, each followed by batch normalization and a ReLU activation function. The convolutional block attention module (CBAM) is then applied; it improves feature representation in both the channel and spatial dimensions, allowing the network to better capture essential image information. An average pooling layer and a fully connected (FC) layer complete spatial feature extraction. The features from the two channels are then concatenated and passed to a fully connected layer to reduce dimensionality. Finally, a Softmax classifier classifies the fault signals and outputs the diagnostic results.
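The final fusion and classification step can be sketched as follows. The feature widths used in the usage note (256 for the temporal branch, 512 for the spatial branch) are assumptions chosen for illustration, and `fuse_and_classify` is a hypothetical name; only the concatenation, the fully connected layer, and the Softmax output follow the description above.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_and_classify(temporal_feat, spatial_feat, W_fc, b_fc):
    """Concatenate both channel features, apply one FC layer, then Softmax."""
    fused = np.concatenate([temporal_feat, spatial_feat], axis=-1)
    logits = fused @ W_fc + b_fc
    return softmax(logits)
```

With a batch of 4 samples, temporal features of shape (4, 256), spatial features of shape (4, 512), and `W_fc` of shape (768, 10), the function returns a (4, 10) array of class probabilities whose rows sum to 1.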

3.2. Fault Diagnosis Procedure of the Dual-Channel Bearing Model

The dual-channel bearing fault diagnosis procedure based on VMD-BiGRU and GADF-ResNet-CBAM is illustrated in Figure 7. The specific diagnostic procedure is described as follows:

(1) A bearing test rig was used to collect bearing vibration acceleration signals under various fault conditions.
(2) The original vibration signals were decomposed using the optimized VMD, and appropriate IMF components were selected according to the kurtosis criterion for signal reconstruction, thereby constructing a one-dimensional dataset.
(3) The GADF encoding method was used to transform the original vibration signals into two-dimensional feature images, thereby constructing a two-dimensional dataset.
(4) The dataset was divided into a training set and a test set at a ratio of 7:3, after which the dual-channel bearing fault diagnosis model was constructed and the relevant parameters were configured.
(5) The reconstructed signals and two-dimensional feature images in the training set were separately input into the BiGRU and ResNet-CBAM networks to extract temporal and spatial features, respectively. These features were then fused for model training, and the convergence of the model was evaluated. If convergence was not achieved, training was continued and the parameters were adjusted until the optimal performance was obtained, after which the model parameters were saved.
(6) The trained model was finally evaluated on the test set, and the bearing fault classification results were output to complete the fault diagnosis task.
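The kurtosis-based reconstruction used when building the one-dimensional dataset can be sketched as below. The number of retained components, `n_keep`, is a hypothetical parameter because the procedure does not state how many IMFs are kept, and the function names are likewise illustrative.

```python
import numpy as np


def kurtosis(x):
    """Fourth standardized moment (equals 3 for a Gaussian signal)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered ** 4) / np.mean(centered ** 2) ** 2)


def reconstruct_by_kurtosis(imfs, n_keep):
    """Sum the n_keep IMFs with the largest kurtosis.

    Returns the reconstructed signal and the indices of the retained IMFs.
    """
    imfs = np.asarray(imfs, dtype=float)
    scores = np.array([kurtosis(imf) for imf in imfs])
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])
    return imfs[keep].sum(axis=0), keep
```

A pure sinusoid has a kurtosis of 1.5 and Gaussian noise about 3, while impulsive components score much higher, so ranking by kurtosis favors the modes that carry fault impacts.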

To ensure the reproducibility of the experiments and to provide a comprehensive overview of the model configuration, the specific hyperparameters used in this study are summarized in Table 1. This includes the settings for the VMD decomposition, the GADF image encoding, the network architectures (BiGRU and ResNet-CBAM), and the training configurations.

Based on the above methods, a comprehensive fault diagnosis framework has been established. To clearly and formally describe the implementation process, the algorithm shown in Table 2 summarizes the detailed steps from the input of raw signals to the final classification.

4. Validation and Analysis of Experiments

4.1. Case Western Reserve University Bearing Dataset

The bearing dataset from Case Western Reserve University (CWRU), USA, was used for experimental validation. Fan-end bearing data at 12 kHz and drive-end bearing data at 12 kHz and 48 kHz are included in this collection. Figure 8 shows the CWRU bearing test bench, whose main components include a motor, a coupling, a torque sensor, and a dynamometer.

The drive-end bearing data recorded at 12 kHz were used in this study. The bearing was an SKF 6205 deep-groove ball bearing. The data acquisition platform covered four load conditions (0, 0.735, 1.47, and 2.25 kW), corresponding to rotating speeds of 1797, 1772, 1750, and 1730 r/min. The defects were artificially introduced by electro-discharge machining (EDM) with specific diameters. For each operating condition, the dataset includes three fault states (rolling-element, outer-race, and inner-race faults) and one healthy state. Each fault state was further divided into three severity levels, with fault diameters of 0.1778, 0.3556, and 0.5334 mm. Based on the fault type and fault diameter, the dataset was therefore divided into ten categories: nine fault categories and one healthy category. To generate the samples, the sensor signals were split using a sliding window with a window length of 1024 and a step size of 1024. A total of 200 samples were selected for each category, giving 2000 samples in all; 70% were chosen at random as the training set, and the remaining 30% served as the test set. The results of the dataset splitting are shown in Table 3.
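The segmentation and split described here can be sketched as follows. The window length, step size, and 70/30 ratio come from the text; the function names, the fixed random seed, and the use of a plain (non-stratified) random permutation are assumptions of this sketch, which also assumes that the signal is at least one window long.

```python
import numpy as np


def sliding_windows(signal, window=1024, step=1024):
    """Cut a 1-D signal into windows; with step == window they do not overlap."""
    signal = np.asarray(signal)
    n = (len(signal) - window) // step + 1
    return np.stack([signal[i * step:i * step + window] for i in range(n)])


def train_test_split(samples, labels, train_ratio=0.7, seed=0):
    """Random split of samples and labels into training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(samples))
    cut = int(round(train_ratio * len(samples)))
    train, test = order[:cut], order[cut:]
    return samples[train], labels[train], samples[test], labels[test]
```

Applied to 2000 samples, the split yields 1400 training and 600 test samples.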

Two-dimensional feature images were generated using the GADF encoding method, and the image resolution used as the input to the network model was $512 \times 512$ pixels. The GADF-encoded images corresponding to each bearing fault category are shown in Figure 9.

4.2. Experimental Results and Analysis

A hardware configuration featuring an NVIDIA GeForce RTX 4060 Ti graphics card and an Intel(R) Core(TM) i7-12700H CPU was used for the research. The PyTorch 1.9.0 deep learning framework (Facebook AI Research, Menlo Park, CA, USA), based on the Python 3.8 programming language (Python Software Foundation, Wilmington, DE, USA), was used to create the model. The Adam optimizer was used to update the network parameters, the batch size was 64, the initial learning rate was 0.001, and there were 100 training epochs.

Figure 10 displays the accuracy and loss curves of the proposed dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM at a rotational speed of 1750 r/min. After about 20 training epochs, the accuracy and loss stabilized, indicating that the model had converged; training nevertheless continued until all 100 epochs were completed. The training accuracy was 99.71%, the test accuracy was 99.50%, and the training loss was 0.0015. These findings show that the proposed model performs very well in fault identification on the experimental dataset.

The confusion matrix in Figure 11 was produced to examine the detailed classification results of the proposed model for each fault category. It shows which fault categories were misclassified and how many misclassifications occurred. The vertical axis represents the true labels and the horizontal axis represents the predicted labels. The values on the diagonal give the number of samples that were correctly classified for each fault category.

As can be seen from the confusion matrix, no misclassification occurred for labels 0, 2, 3, 4, 5, 6, 7, and 9, so the diagnostic accuracy for these categories reached 100%. Two label 8 samples (outer-race fault with a fault diameter of 0.3556 mm) were misclassified as label 1 (rolling-element fault with a fault diameter of 0.1778 mm), and one label 1 sample was misclassified as label 8. These errors arise because the vibration characteristics of the two fault types are very similar in the early stages of failure and because weak fault impulses are partially masked by noise. Overall, only three test samples were misclassified.
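The confusion matrix layout and the resulting accuracies can be reproduced with a short NumPy sketch. The label arrays below are synthetic: they assume 60 test samples per class (30% of 200) and simply encode the three misclassifications reported above, so they illustrate the bookkeeping rather than the model's actual predictions.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Synthetic test set: 60 samples per class (assumed), with the three reported errors.
y_true = np.repeat(np.arange(10), 60)
y_pred = y_true.copy()
y_pred[np.flatnonzero(y_true == 8)[:2]] = 1  # two label 8 samples predicted as label 1
y_pred[np.flatnonzero(y_true == 1)[:1]] = 8  # one label 1 sample predicted as label 8

cm = confusion_matrix(y_true, y_pred, 10)
per_class_accuracy = np.diag(cm) / cm.sum(axis=1)
overall_accuracy = np.trace(cm) / cm.sum()
print(overall_accuracy)
```

Under these assumptions 597 of 600 samples lie on the diagonal, which matches the reported test accuracy of 99.50%.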

To make the classification performance of the proposed model easier to interpret, the learned fault features were visualized using t-distributed stochastic neighbor embedding (t-SNE), as shown in Figure 12. Samples from the different fault categories form distinct clusters with no discernible overlap, and only a very small fraction of samples fall into the wrong cluster.

4.3. Comparative Analysis of Different Algorithmic Models

To show the improved performance of the proposed model in rolling bearing fault diagnosis, comparative tests were carried out against a number of benchmark models, including a BP neural network, 1D-CNN, BiGRU, MTF-CNN, GADF-ResNet, VMD-BiGRU, and GADF-ResNet-CBAM. To guarantee a fair comparison, all models were trained on the same dataset under the same experimental conditions, with the number of training epochs fixed at 100. Each model was tested in ten repeated experiments to account for random error. As seen in Figure 13, the accuracy of the proposed model was consistently about 99% across all ten trials, higher than that of the other models.

The average value from the ten experiments was used to determine the test accuracy. Table 4 shows that Model 1 (1D-CNN) and Model 2 (BiGRU), which are conventional fault classification models employing original vibration signals as input, achieved accuracy rates of 95.54 ± 0.42% and 95.89 ± 0.38%, respectively. Model 3 and Model 4, which use MTF and GADF encoding to convert one-dimensional vibration signals into two-dimensional feature images, obtained accuracies of 96.78 ± 0.35% and 97.37 ± 0.30%, respectively. This transformation allows the network models to extract more detailed fault features and greatly enhances fault classification capability. Model 5 (VMD-BiGRU) and Model 6 (GADF-ResNet-CBAM), which use single signal processing techniques, still show comparatively good diagnostic performance with accuracies of 97.92 ± 0.25% and 98.63 ± 0.20%. On the other hand, the proposed dual-channel model concurrently combines the signal processing techniques mentioned above, enabling data analysis via both spatial and temporal feature extraction. With an average accuracy of 99.39 ± 0.15%, the two channels work in tandem to further enhance the model’s fault identification accuracy while minimizing variance. The improved bearing defect diagnostic ability of the proposed model is confirmed by comparative studies with the six models mentioned above.

To further validate the fault diagnosis capabilities of the proposed model, two advanced bearing fault diagnosis models, MSCN-LSTM [19] and CWT-IDenseNet [20], were included for comparison. Using the same dataset as before and training for 100 epochs, these models achieved accuracies of 99.18% and 99.26%, respectively, both lower than the 99.39% achieved by the proposed model. This indicates that the proposed model offers better fault diagnosis performance on this dataset.

4.4. Ablation Study

To rigorously validate the contribution of each component and the rationality of the proposed design, a detailed ablation study was conducted, with the results summarized in Table 5.

Effectiveness of VMD Optimization: A comparison between the Baseline and Variant A reveals that replacing fixed VMD parameters with the optimized strategy yields a significant improvement, boosting accuracy from 95.80 ± 0.45% to 97.10 ± 0.32%. This confirms that adaptive parameter selection allows for more effective feature extraction from raw signals.

Impact of Fusion Strategies: Variant B achieves an accuracy of 97.95 ± 0.28%, outperforming Variant A and demonstrating that weighted fusion is superior to simple addition. Furthermore, when comparing Variant E, which utilizes Late Decision Fusion (97.90 ± 0.30%), with Variant C, the proposed feature-level approach reaches 98.65 ± 0.21%. This indicates that feature-level fusion retains more discriminative information than decision-level fusion.

Contribution of CBAM and Reconstruction: The performance gain observed in Variant C highlights the effectiveness of the CBAM module in focusing on critical features. Additionally, the Proposed model attains the highest accuracy of 99.39 ± 0.15%, surpassing Variant D (without reconstruction) at 98.10 ± 0.25%. This verifies the necessity of the VMD reconstruction step in preserving signal integrity.

Overall Performance: Ultimately, the Proposed model achieves the highest accuracy of 99.39 ± 0.15%, validating that the synergistic integration of optimized VMD, CBAM, and feature-level weighted fusion provides the most robust solution for fault diagnosis.

To evaluate the robustness of the method described in this paper, we conducted five independent runs for each experimental setup. The target model achieved an average accuracy of 99.39 ± 0.15%. The small standard deviation indicates its stability.

4.5. Cross-Load Generalization Study

To verify the model’s generalization capability under different operating conditions, we conducted rigorous cross-load validation. Unlike random partitioning, this approach ensures that the test data originates from load conditions that did not appear during training.
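One way to realize such a split is a leave-one-load-out scheme, sketched below. The exact protocol is an assumption of this sketch (the text only requires that test loads be unseen during training), and `leave_one_load_out` is a hypothetical helper that works on an array holding the load label of every sample.

```python
import numpy as np


def leave_one_load_out(loads):
    """Yield (held_out_load, train_idx, test_idx) for each distinct load value."""
    loads = np.asarray(loads)
    for held_out in np.unique(loads):
        test_idx = np.flatnonzero(loads == held_out)
        train_idx = np.flatnonzero(loads != held_out)
        yield held_out, train_idx, test_idx
```

Each fold trains on the samples from all remaining loads and tests only on the held-out load, so no operating condition appears on both sides of the split.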

The results are shown in Table 6. We observed that the diagnostic accuracy naturally decreased compared to the 99.39% accuracy achieved by the model under random load distribution. This decline was expected due to the domain shift in load distribution. However, the proposed method still maintained a high average accuracy of 95.93% ± 0.51%, demonstrating strong adaptability to changes in operating conditions.

It is worth noting that in tests under 1 HP and 2 HP loads, the model demonstrated excellent adaptability, with accuracy rates exceeding 96% in both cases. In contrast, performance declined slightly under 3 HP and 0 HP loads. This may be attributed, respectively, to increased signal complexity under heavy-load conditions and the attenuation of fault characteristic signals under no-load conditions.

As shown in Table 6, the proposed method maintains a high diagnostic accuracy even when the load changes, which validates the effectiveness of the learned features.

Although our model achieves high accuracy in cross-domain tasks, the interpretability of the transferred features remains a challenge. Future work will explore explainable domain adaptation.

4.6. Performance Analysis Under Different Sample Sizes

To evaluate the fault diagnosis performance of the proposed model under different numbers of training samples, comparative experiments were conducted using BiGRU, MTF-CNN, GADF-ResNet-CBAM, and the proposed model. The average diagnosis accuracy was used as the final assessment parameter after each model was evaluated ten times.

As seen in Figure 14, the proposed model achieved the highest average recognition accuracy under all sample-size settings. With only 30 training samples per fault category, its average accuracy was 95.07% ± 0.80%, which is 6.55 percentage points higher than BiGRU, 3.82 higher than MTF-CNN, and 1.91 higher than GADF-ResNet-CBAM. The accuracy of all models improved to varying degrees as the number of training samples increased. The proposed model reached 96.44% ± 0.60% with 60 samples, 97.62% ± 0.40% with 90 samples, and 98.75% ± 0.25% with 120 samples. With 140 samples, all models attained comparatively high diagnostic accuracy, and the proposed model reached 99.39% ± 0.15%. These results show that the proposed model continues to outperform the other models in bearing fault identification across training sample sizes, combining high accuracy with stable performance even when data are limited.

4.7. Noise Immunity Analysis

Vibration signals are frequently contaminated by noise in real-world operating environments. To evaluate the adaptability and robustness of the proposed model under such interference, Gaussian white noise with varying signal-to-noise ratios (SNRs) was added to the original signals. The SNR is computed according to Equation (12).

$R_{SN} = 10 \lg \frac{P_{S}}{P_{N}}$ (12)

where $P_{S}$ denotes the power of the original signal, and $P_{N}$ denotes the noise power.

Gaussian white noise with SNRs of −2 dB, 0 dB, 2 dB, and 4 dB was added to the signals at an operating speed of 1750 r/min, with 60 training samples for each fault category. To further confirm the noise immunity of the proposed model, it was compared against three other models; the outcomes are shown in Table 7.
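One way to generate noise at a prescribed SNR follows directly from Equation (12): solve for the noise power, $P_{N} = P_{S} / 10^{R_{SN}/10}$, and draw zero-mean Gaussian samples with that variance. The sketch below is a dependency-free illustration under that assumption. The sine test signal and the function names are hypothetical and are not part of the paper's pipeline.

```python
import math
import random
from typing import List


def signal_power(x: List[float]) -> float:
    """Mean squared amplitude of a discrete signal."""
    return sum(v * v for v in x) / len(x)


def add_gaussian_noise(x: List[float], snr_db: float, rng: random.Random) -> List[float]:
    """Add zero-mean white Gaussian noise so that 10*lg(P_S/P_N) is close to snr_db."""
    p_noise = signal_power(x) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(p_noise)
    return [v + rng.gauss(0.0, sigma) for v in x]


def measured_snr_db(clean: List[float], noisy: List[float]) -> float:
    """Evaluate Equation (12) from the clean signal and the noise that was added."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


rng = random.Random(0)
# Hypothetical test signal: a 100 Hz sine sampled at 12 kHz for one second
clean = [math.sin(2 * math.pi * 100 * n / 12000) for n in range(12000)]
for snr in (-2, 0, 2, 4):
    noisy = add_gaussian_noise(clean, snr, rng)
    print(f"target {snr:+d} dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

The measured SNR fluctuates slightly around the target because the noise power is estimated from a finite number of samples.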

As the signal-to-noise ratio (SNR) increases, noise interference gradually decreases, and the performance of all models improves accordingly. Under all tested noise conditions, the proposed model consistently outperforms the comparison models and exhibits the smallest standard deviation. Specifically, even at a low SNR of −2 dB, the proposed model still achieves an accuracy of 90.64 ± 0.80%, which is 10.61 percentage points higher than that of the BiGRU model (80.03 ± 1.25%). At SNRs of 0 dB and 2 dB, the proposed model achieved accuracies of 93.81 ± 0.60% and 96.02 ± 0.40%, respectively, which are 2.08 and 1.86 percentage points higher than the second-best model (GADF-ResNet-CBAM). Furthermore, at an SNR of 4 dB, the proposed model achieved an accuracy of 98.79 ± 0.25%. This is within 0.60 percentage points of the noise-free test accuracy of 99.39 ± 0.15% reported in Table 4, indicating that noise at this level has only a minor impact on the proposed model. These results confirm that the proposed method possesses excellent robustness and noise resistance.
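The margins quoted above can be reproduced from Table 7. The following sketch copies the mean accuracies from that table and computes, for each SNR, the gap between the proposed model and the best comparison model; the variable names are illustrative only.

```python
SNRS_DB = [-2, 0, 2, 4]
# Mean accuracies (%) copied from Table 7, in the order of SNRS_DB
TABLE7 = {
    "BiGRU": [80.03, 85.46, 89.55, 93.12],
    "MTF-CNN": [87.75, 90.21, 93.35, 96.24],
    "GADF-ResNet-CBAM": [89.08, 91.73, 94.16, 97.39],
    "Proposed Model": [90.64, 93.81, 96.02, 98.79],
}
PROPOSED = "Proposed Model"

margins = {}
for i, snr in enumerate(SNRS_DB):
    baselines = {m: acc[i] for m, acc in TABLE7.items() if m != PROPOSED}
    runner_up = max(baselines, key=baselines.get)
    margins[snr] = (runner_up, round(TABLE7[PROPOSED][i] - baselines[runner_up], 2))

gap_to_bigru = round(TABLE7[PROPOSED][0] - TABLE7["BiGRU"][0], 2)  # at -2 dB
gap_to_clean = round(99.39 - TABLE7[PROPOSED][3], 2)               # 4 dB vs noise-free

for snr, (model, margin) in margins.items():
    print(f"{snr:+d} dB: +{margin:.2f} points over {model}")
```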

4.8. Validation on an Independent Laboratory Dataset

To further validate the effectiveness and applicability of the proposed method on an independent dataset, experiments were carried out using data gathered from the rolling bearing test rig at Xinjiang University. Figure 15 depicts the test setup. The test bearing was an ER-16K rolling bearing, and the sampling frequency was 20.48 kHz. The experiments were conducted at a rotating speed of 1500 r/min and a load of 1.2 A.

The specific operational parameters of the dataset used in this experiment are listed in Table 8. The dataset contains four bearing scenarios: rolling-element fault, inner-race fault, outer-race fault, and normal condition. To maintain consistency with the single-fault sample size of the previously mentioned public CWRU bearing dataset, 200 samples were selected for each type of fault, for a total of 800 samples. Thirty percent of the total data were used as the test set for validation, while the remaining seventy percent were chosen at random as the training set.
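The 140/60 per-class counts in Table 8 are consistent with a random 70%/30% split applied within each class. The sketch below illustrates such a stratified split on label indices only. Whether the authors stratified by class or split the pooled data is not stated, so the per-class shuffling, the seed, and the function name are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def stratified_split(labels: List[int], train_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle the sample indices of each class and cut them at train_fraction."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, test = [], []
    for _, idxs in sorted(by_class.items()):
        rng.shuffle(idxs)
        cut = round(train_fraction * len(idxs))
        train += idxs[:cut]
        test += idxs[cut:]
    return train, test


# Four bearing conditions with 200 samples each, as described for Table 8
labels = [c for c in range(4) for _ in range(200)]
train_idx, test_idx = stratified_split(labels, 0.7, seed=0)
print(len(train_idx), len(test_idx))
```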

The proposed model performed exceptionally well in fault diagnosis on the Xinjiang University rolling bearing dataset, with an accuracy of 99.58 ± 0.10% on the test set after 100 training epochs. These results demonstrate that the proposed model can reliably diagnose faults in rolling bearings of different types, confirming its effectiveness in practical applications and strong adaptability to data from distinct hardware sources. The corresponding accuracy curves of the proposed model are shown in Figure 16.

4.9. Limitations and Future Work

While the proposed method achieves high diagnostic accuracy, there are several limitations that warrant discussion regarding the validity and scope of our results.

First, the model depends on a specific signal encoding. GADF transforms one-dimensional signals into two-dimensional images, and the encoding is highly sensitive to the length and scaling range of the input data. If the length of the vibration signal changes in practical applications, the topological structure of the GADF image may be distorted, potentially leading to a drop in recognition accuracy. This limits the model’s robustness to variable-length signals.
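The dependence on signal length and scaling can be seen in a minimal GADF sketch. The 32 × 32 image size and the [−1, 1] rescaling range come from Table 1, and the encoding $\sin(\phi_{i} - \phi_{j})$ with $\phi = \arccos(\tilde{x})$ is the standard GADF definition. Reducing the signal to 32 points by piecewise aggregate approximation (PAA) is an assumption made here for illustration, as are the function names and test signals.

```python
import math
from typing import List


def paa(x: List[float], n_segments: int) -> List[float]:
    """Piecewise aggregate approximation: mean of x over n_segments bins."""
    n = len(x)
    out = []
    for k in range(n_segments):
        seg = x[k * n // n_segments:(k + 1) * n // n_segments]
        out.append(sum(seg) / len(seg))
    return out


def rescale(x: List[float]) -> List[float]:
    """Min-max rescale to [-1, 1]; assumes the signal is not constant."""
    lo, hi = min(x), max(x)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in x]


def gadf(x: List[float], size: int = 32) -> List[List[float]]:
    """Gramian angular difference field: G[i][j] = sin(phi_i - phi_j)."""
    scaled = rescale(paa(x, size))
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.sin(a - b) for b in phi] for a in phi]


def toy_signal(n: int) -> List[float]:
    """Hypothetical two-tone signal used only to exercise the encoder."""
    return [math.sin(0.05 * t) + 0.3 * math.sin(0.4 * t) for t in range(n)]


img_short = gadf(toy_signal(1024))
img_long = gadf(toy_signal(2048))
print(len(img_short), len(img_short[0]), img_short != img_long)
```

Both inputs yield a 32 × 32 image, but each pixel of the longer signal summarizes twice as many samples, and the min-max rescaling is recomputed per segment. The same underlying waveform can therefore map to a visibly different image when the window length changes.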

Second, the computational cost restricts real-time application. The dual-channel architecture combining ResNet-18 and BiGRU results in a large number of parameters (approximately [Insert Number] Million). While this complexity aids in feature extraction, it increases the computational burden, making it difficult to deploy the model directly on resource-constrained edge devices for real-time monitoring.

Third, the scope of generalization requires clarification. Although the model achieved high accuracy on both the CWRU dataset and our self-collected dataset (Section 4.8), demonstrating cross-device applicability, we acknowledge that the current validation is primarily under fixed operating conditions. While Section 4.5 validates cross-load performance, the model’s robustness against complex environmental and operational variability (e.g., rapid speed changes or temperature fluctuations) has not been fully established. Therefore, we moderate our conclusion: the model exhibits excellent performance under controlled and fixed-condition scenarios, but further research into domain adaptation methods is needed to achieve true strong generalization across diverse industrial environments.

Future work will focus on developing adaptive encoding mechanisms to handle variable-length signals, exploring model compression techniques for edge deployment, and integrating domain adaptation algorithms to mitigate the impact of environmental variability.

5. Conclusions

To address the challenges of incomplete feature extraction and insufficient robustness in traditional single-channel deep learning models for rolling bearing fault diagnosis, this study proposed a dual-channel diagnostic framework integrating VMD-BiGRU and GADF-ResNet-CBAM. By leveraging the CWRU public dataset and a self-collected independent dataset from Xinjiang University, the validity and superiority of the proposed method were rigorously validated. The primary conclusions are summarized as follows:

(1): To overcome the limitations of single-signal processing methods, the proposed model employs a complementary strategy: The VMD-BiGRU channel utilizes optimized Variational Mode Decomposition to reconstruct one-dimensional temporal signals, allowing the BiGRU network to effectively capture long-term dependencies and deep temporal features. Concurrently, the GADF-ResNet-CBAM channel transforms raw signals into two-dimensional texture images via Gramian Angular Difference Field encoding. The integration of the Convolutional Block Attention Module (CBAM) within ResNet enables the model to focus on critical spatial features. By concatenating and fusing these dual-domain features, the model achieves a more comprehensive representation of fault characteristics than single-channel approaches.
(2): Experimental results demonstrate that the proposed dual-channel model achieves an average diagnostic accuracy of 99.39% on the CWRU dataset and 99.58% on the independent Xinjiang University dataset. These results confirm that the model not only outperforms traditional models (such as 1D-CNN, BiGRU, and GADF-ResNet-CBAM) in terms of accuracy but also exhibits strong cross-device applicability. The high performance on the independent dataset verifies the model’s robustness to variations in hardware and experimental environments under fixed operating conditions.
(3): The model demonstrates strong environmental adaptability. Even under a low Signal-to-Noise Ratio (SNR) of −2 dB, the diagnostic accuracy remains as high as 90.64%, validating its excellent noise immunity. Furthermore, in small-sample scenarios (e.g., only 30 samples per class), the proposed model achieves an accuracy of 95.07%, significantly outperforming comparison models, which highlights its capability for reliable diagnosis with limited training data.
(4): While the model exhibits high performance under fixed conditions, the cross-load validation reveals that its generalization capability is context-dependent. The model maintains a high average accuracy (95.93%) when tested on unseen load conditions, demonstrating preliminary adaptability to operational variability. However, as discussed in the limitations, this generalization is currently constrained by fixed-speed scenarios and the computational complexity of the architecture.

Author Contributions

Conceptualization, M.N. and X.W.; methodology, M.N.; software, M.N. and Y.S.; validation, M.N., X.W. and Y.S.; formal analysis, M.N.; investigation, M.N.; resources, X.W.; data curation, M.N. and Y.S.; writing—original draft preparation, M.N.; writing—review and editing, X.W. and Y.S.; visualization, M.N.; supervision, X.W.; project administration, X.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Special Project of Xinjiang Uygur Autonomous Region (Department-Region Linkage), grant number 2024B04003-2.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Case Western Reserve University (CWRU) bearing dataset analyzed in this study is publicly available. The self-collected rolling bearing dataset from Xinjiang University presented in this study is available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

Wang, X.; Liu, C.; Jia, Z.; Xu, F. Cross-location transfer diagnosis method for rolling bearings based on source domain selection. J. Ordnance Equip. Eng. 2025, 46, 319–327.
Zhang, X.; Luo, X.; Li, M.; Wang, L.; Wan, F. Research on rolling bearing fault diagnosis method based on GADF-CWT-GCNN. J. Northwest. Polytech. Univ. 2024, 42, 866–874.
Deng, Z.; Zhang, Q.; Yu, J. Early fault diagnosis of bearings based on PCC-VMD and one-dimensional convolutional neural network. Mach. Tool Hydraul. 2025, 53, 9–15.
Wang, X.; Liu, X.; Wang, J.; Xiong, X.; Bi, S.; Deng, Z. Improved variational mode decomposition and one-dimensional CNN network with parametric rectified linear unit (PReLU) approach for rolling bearing fault diagnosis. Appl. Sci. 2022, 12, 9324.
Sun, L.; Zhu, X.; Xiao, J.; Cai, W.; Ma, Q.; Zhang, R. A hybrid fault diagnosis method for rolling bearings based on GGRU-1DCNN with AdaBN algorithm under multiple load conditions. Meas. Sci. Technol. 2024, 35, 076201.
Kumar, S.; Sinha, B.B.; Das, P. A dual-condition cost-sensitive optimization framework for intelligent bearing fault diagnosis. J. Braz. Soc. Mech. Sci. Eng. 2026, 48, 364.
Sharma, V.; Jigyasu, R.; Singh, S. Efficient rolling bearing fault diagnosis with a CNN-SVM system enabled by automatic cut-off conditions for small sample data. J. Vib. Eng. Technol. 2025, 14, 17.
Chen, Z.; Mauricio, A.; Li, W.; Gryllias, K. A deep learning method for bearing fault diagnosis based on cyclic spectral coherence and convolutional neural networks. Mech. Syst. Signal Process. 2020, 140, 106683.
Xiao, X.; Wang, J.; Zhang, Y.; Guo, Q.; Zong, S. An optimization method of two-dimensional convolutional neural network for bearing fault diagnosis. Proc. Chin. Soc. Electr. Eng. 2019, 39, 4558–4568.
Qu, H.; Han, S.; Jia, B.; Ma, W.; Zhan, Y.; Tai, H. Cross-condition bearing fault diagnosis based on GADF fused with RDSAN. Modul. Mach. Tool. Autom. Manuf. Tech. 2024, 182–187.
Xiao, Y.; Yasenjiang, J.; Wang, K.; Cui, P. Research on rolling bearing fault diagnosis based on MTFSK-CapsNet. Mach. Tool Hydraul. 2024, 52, 206–215.
Gilbert Chandra, D.; Srinivasulu Reddy, U.; Uma, G.; Umapathy, M. Group normalization-based 2D-convolutional neural network for intelligent bearing fault diagnosis. J. Braz. Soc. Mech. Sci. Eng. 2023, 45, 584.
Khan, M.J.; Sun, B.; Xiao, S.; Hou, J. Enhanced fault diagnosis of rolling bearings using a VMD-CWT-ViT integrated framework. Meas. Sci. Technol. 2026, 37, 116104.
Zhao, T.; Zhai, X.; Liu, C.; Chen, Y.; Zheng, B. Research on laser detection and feature quantitative recognition method of rail corrugation based on GWO-VMD-NLM. Chin. J. Sci. Instrum. 2026, 1–15.
Wang, Z.; Oates, T. Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015.
Hou, D.; Zhou, Z.; Cheng, R.; Yan, S. Rolling bearing fault diagnosis method based on GADF-TL-ResNeXt. Acta Metrol. Sin. 2023, 44, 1534–1542.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
Chen, X.; Zhang, B.; Gao, D. Bearing fault diagnosis based on multi-scale CNN and LSTM model. J. Intell. Manuf. 2021, 32, 971–987.
Jia, G.; Liang, H.; Yang, J.; Wu, Z.; Han, Y. Rolling bearing fault diagnosis method based on CWT-IDenseNet. J. Hebei Univ. Sci. Technol. 2025, 46, 129–140.

Figure 1. Structure of the GRU.

Figure 2. Structure of the residual block.

Figure 3. CBAM.

Figure 4. Structure of Channel Attention Module.

Figure 5. Structure of the spatial attention module.

Figure 6. Dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM.

Figure 7. Flowchart of the dual-channel bearing fault diagnosis procedure.

Figure 8. CWRU bearing test rig.

Figure 9. GADF-encoded images for each fault type. (a) Normal; (b) Ball; (c) Inner race; (d) Outer race.

Figure 10. (a) Accuracy curve; (b) Loss function curve.

Figure 11. Confusion matrix.

Figure 12. Visualization of t-SNE dimensionality reduction.

Figure 13. Detailed accuracy comparison among different models.

Figure 14. Accuracy of different models under different sample sizes.

Figure 15. Rolling bearing test rig at Xinjiang University.

Figure 16. Accuracy curves of the proposed model.

Table 1. Complete hyperparameter configuration of the proposed model.

Module	Parameter	Setting/Value
VMD	Mode number K	10
	Penalty factor $α$	2000
	Noise tolerance $τ$	0
	Convergence tolerance	$1 \times 10^{- 7}$
	Optimization method	Grid search minimizing average envelope entropy
GADF	Image size	$32 \times 32$ pixels
	Rescaling range	$[- 1, 1]$
	Encoding method	Gramian angular difference field
BiGRU	Number of layers	2 (bidirectional)
	Hidden units per layer	128
	Dropout rate	0.5
	Output feature dimension	256
ResNet-CBAM	Base architecture	ResNet-18
	Input channels	3 (grayscale GADF replication)
	Residual stage structure	4 stages, 2 blocks each
	Convolution kernel size	$3 \times 3$
	Activation function	ReLU + batch normalization
	CBAM reduction ratio r	16
	CBAM spatial kernel size	$7 \times 7$
	Output feature dimension	512
Fusion	Method	Concatenation
	Fused feature dimension	256 (BiGRU) + 512 (ResNet-CBAM) = 768
Classifier	Fully connected layer	Dense (768 → 10)
	Output activation	Softmax
Training	Optimizer	Adam
	Initial learning rate	0.001
	Batch size	64
	Epochs	100
	Loss function	Cross-entropy
	Train/test split	70%/30%
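The fusion and classifier rows of Table 1 can be read as a simple shape computation: a 256-dimensional BiGRU feature and a 512-dimensional ResNet-CBAM feature are concatenated into a 768-dimensional vector, which a dense layer maps to 10 class scores before Softmax. The sketch below illustrates only these shapes. The feature values and weights are random placeholders, not trained parameters, and the helper names are hypothetical.

```python
import math
import random
from typing import List


def dense(x: List[float], w: List[List[float]], b: List[float]) -> List[float]:
    """Fully connected layer: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bj for row, bj in zip(w, b)]


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


rng = random.Random(0)
f_t = [rng.gauss(0.0, 1.0) for _ in range(256)]   # placeholder temporal features
f_s = [rng.gauss(0.0, 1.0) for _ in range(512)]   # placeholder spatial features
fused = f_t + f_s                                  # concatenation: 256 + 512 = 768

w = [[rng.gauss(0.0, 0.01) for _ in range(768)] for _ in range(10)]  # untrained weights
b = [0.0] * 10
probs = softmax(dense(fused, w, b))
predicted_label = max(range(10), key=probs.__getitem__)
print(len(fused), len(probs), predicted_label)
```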

Table 2. The Proposed Fault Diagnosis Algorithm.

Input: Raw vibration signals $X$, fault labels $Y$

Output: Predicted fault category $\hat{Y}$

1: Initialize optimized VMD parameters $(\alpha, K)$, BiGRU network, ResNet-CBAM network, and fusion weights.
2: Decompose: Utilize VMD to decompose $X$ into $K$ IMFs: $\{u_{k}\}_{k=1}^{K}$.
3: Select: Calculate kurtosis for each $u_{k}$ and select high-kurtosis IMFs $\{u_{k}^{high}\}$.
4: Reconstruct: Reconstruct the 1D signal $X_{recon}$ using selected IMFs.
5: Encode: Transform $X$ into the 2D GADF image $I_{GADF}$.
6: Split: Partition $\{(X_{recon}, I_{GADF}), Y\}$ into the training set and test set (7:3).
7: Train.
8: Input $X_{recon}$ into BiGRU to extract temporal features $F_{t}$.
9: Input $I_{GADF}$ into ResNet-CBAM to extract spatial features $F_{s}$.
10: Fuse: Concatenate $F_{t}$ and $F_{s}$ to obtain fused feature $F_{fused}$.
11: Update network parameters by minimizing the loss function until convergence.
12: Test: Feed the test set into the trained model.
13: Classify: Output the final diagnosis result $\hat{Y}$.
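Steps 3 and 4 of Table 2 can be illustrated with a small kurtosis-ranking sketch. It assumes the modes have already been produced by a VMD implementation, which is not shown. The source does not state how "high-kurtosis" is decided, so keeping the top `n_keep` modes is an assumption, and the three toy modes are hypothetical.

```python
import math
import random
from typing import List, Tuple


def kurtosis(x: List[float]) -> float:
    """Pearson kurtosis m4 / m2^2 (about 3 for Gaussian data)."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum((v - mu) ** 2 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    return m4 / (m2 * m2)


def reconstruct(imfs: List[List[float]], n_keep: int) -> Tuple[List[int], List[float]]:
    """Sum the n_keep modes with the highest kurtosis (assumed selection rule)."""
    ranked = sorted(range(len(imfs)), key=lambda k: kurtosis(imfs[k]), reverse=True)
    keep = sorted(ranked[:n_keep])
    recon = [sum(imfs[k][t] for k in keep) for t in range(len(imfs[0]))]
    return keep, recon


rng = random.Random(0)
n = 1000
sine_mode = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]   # smooth component
impulse_mode = [1.0 if t % 100 == 0 else 0.0 for t in range(n)]     # fault-like impacts
noise_mode = [rng.gauss(0.0, 0.1) for _ in range(n)]                # broadband noise
toy_imfs = [sine_mode, impulse_mode, noise_mode]

kept, x_recon = reconstruct(toy_imfs, n_keep=1)
print(kept, [round(kurtosis(m), 2) for m in toy_imfs])
```

Impulsive components, which are typical of localized bearing defects, have far higher kurtosis than smooth or Gaussian components, so this kind of ranking tends to retain the impact-bearing modes.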

Table 3. CWRU bearing data composition.

Fault Location	Fault Diameter (mm)	Training Set	Test Set	Label
Normal	0	140	60	0
Ball	0.1778	140	60	1
Ball	0.3556	140	60	2
Ball	0.5334	140	60	3
Inner race	0.1778	140	60	4
Inner race	0.3556	140	60	5
Inner race	0.5334	140	60	6
Outer race	0.1778	140	60	7
Outer race	0.3556	140	60	8
Outer race	0.5334	140	60	9

Table 4. Average accuracy of different models.

No.	Model	Accuracy/%
1	1D-CNN	95.54 ± 0.42
2	BiGRU	95.89 ± 0.38
3	MTF-CNN	96.78 ± 0.35
4	GADF-ResNet	97.37 ± 0.30
5	VMD-BiGRU	97.92 ± 0.25
6	GADF-ResNet-CBAM	98.63 ± 0.20
7	Proposed Model	99.39 ± 0.15

Table 5. Ablation study of the proposed method.

Model Variant	VMD Strategy	CBAM Module	Fusion Strategy	Accuracy (%)
Baseline	Fixed Parameters	w/o CBAM	Addition	95.80 ± 0.45
Variant A	Optimized VMD	w/o CBAM	Addition	97.10 ± 0.32
Variant B	Optimized VMD	w/o CBAM	Weighted Fusion	97.95 ± 0.28
Variant C	Optimized VMD	w/ CBAM	Weighted Fusion	98.65 ± 0.21
Variant D	w/o Reconstruction	w/ CBAM	Weighted Fusion	98.10 ± 0.25
Variant E	Optimized VMD	w/ CBAM	Late Decision Fusion	97.90 ± 0.30
Proposed	Optimized VMD	w/ CBAM	Feature Concatenation	99.39 ± 0.15

Table 6. Cross-load generalization performance.

Training Loads (Source Domain)	Testing Load (Target Domain)	Proposed Method (%)
1 HP, 2 HP, 3 HP	0 HP	95.42 ± 0.55
0 HP, 2 HP, 3 HP	1 HP	97.15 ± 0.42
0 HP, 1 HP, 3 HP	2 HP	96.30 ± 0.48
0 HP, 1 HP, 2 HP	3 HP	94.85 ± 0.60
Average Accuracy	-	95.93 ± 0.51

Table 7. Noise immunity analysis of different models (Unit: %).

Model	−2 dB	0 dB	2 dB	4 dB
BiGRU	80.03 ± 1.25	85.46 ± 0.95	89.55 ± 0.65	93.12 ± 0.40
MTF-CNN	87.75 ± 1.10	90.21 ± 0.80	93.35 ± 0.55	96.24 ± 0.35
GADF-ResNet-CBAM	89.08 ± 0.95	91.73 ± 0.70	94.16 ± 0.45	97.39 ± 0.30
Proposed Model	90.64 ± 0.80	93.81 ± 0.60	96.02 ± 0.40	98.79 ± 0.25

Table 8. Partitioning results of the Xinjiang University rolling bearing dataset.

Fault Type	Number of Training Samples	Number of Test Samples	Label
Normal	140	60	0
Inner race	140	60	1
Outer race	140	60	2
Ball	140	60	3

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

