Complex Working Condition Bearing Fault Diagnosis Based on Multi-Feature Fusion and Improved Weighted Balance Distribution Adaptive Approach

Yang, Jing; Bai, Yanping; Xu, Ting; Cheng, Rong; Zhang, Wendong; Zhang, Guojun

doi:10.3390/lubricants13050221


by Jing Yang ¹,², Yanping Bai ³,*, Ting Xu ³, Rong Cheng ³, Wendong Zhang ⁴ and Guojun Zhang ⁴

¹ Department of Science, Taiyuan Institute of Technology, Taiyuan 030008, China

² School of Information and Communication Engineering, North University of China, Taiyuan 030051, China

³ School of Mathematics, North University of China, Taiyuan 030051, China

⁴ State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan 030051, China

* Author to whom correspondence should be addressed.

Lubricants 2025, 13(5), 221; https://doi.org/10.3390/lubricants13050221

Submission received: 14 April 2025 / Revised: 4 May 2025 / Accepted: 11 May 2025 / Published: 15 May 2025


Abstract

In order to improve the accuracy and generalization ability of fault diagnosis for rotating machinery bearings under complex working conditions, a new model based on multi-feature fusion and improved weighted balance distribution adaptation is proposed. Firstly, an optimized variational mode decomposition algorithm is introduced to denoise the fault signal. Secondly, in order to complement fault information from multiple dimensions, thirteen frequency features and four entropy features are extracted. Then, the 17 features are directly concatenated by dimension to form a high-dimensional feature vector that better adapts to complex working conditions and multiple fault modes. Finally, the improved weighted balance distribution adaptive algorithm is used to reduce the distribution difference between the source domain and the target domain. K-nearest neighbors is used as a classifier to determine the fault category. Using the Case Western Reserve University dataset for validation, the experimental results show that the proposed model achieves an average diagnostic accuracy of 99.34% under 12 complex working conditions.

Keywords:

fault diagnosis; feature extraction; multi-feature fusion; transfer learning; variable working conditions

1. Introduction

The modernization of the manufacturing industry has made bearings a core component of industry, and their stability directly affects the operation and socio-economic benefits of enterprises [1]. Therefore, achieving predictive maintenance of bearings through real-time monitoring and fault warning has become the key to ensuring the safety and reliability of mechanical systems and is of great significance for promoting the transformation of industrial intelligence [2].

Early research on bearing failures focused on traditional signal analysis methods, extracting fault features through time-domain statistics. Tahir et al. investigated the sensitivity of time-domain statistical features to vibration signal fluctuations [3]. Junior et al. demonstrated a method of using time-domain and frequency-domain response analysis to diagnose the health status of motors [4]. In addition, envelope spectrum analysis [5], power spectrum analysis [6], and high-order spectrum analysis [7] have also been used for bearing fault diagnosis. The commonly used features in time-domain analysis, such as peak value, mean value, root mean square value, etc., are relatively easy to calculate. However, bearing fault signals often contain a large amount of noise components, which can interfere with the extraction of time-domain features and lead to inaccurate diagnostic results. Compared with time-domain analysis, frequency-domain analysis can extract the fault characteristic frequencies of vibration signals and has stronger noise resistance.

Time-frequency analysis technology breaks through the limitations of steady-state signal processing. The most representative ones are short-time Fourier transform [8], wavelet transform [9], empirical mode decomposition (EMD) [10], and its variants EEMD [11], CEEMDAN [12], etc. Among these methods, short-time Fourier transform has a faster calculation speed and is more suitable for analyzing stationary signals or signals with slow changes. Wavelet analysis is suitable for analyzing non-stationary signals, but it depends on the choice of wavelet basis [13]. EMD adaptively decomposes complex signals into a series of IMFs without the need for preset basis functions. However, there are issues such as mode mixing and endpoint effects that may affect the accuracy of the analysis results [14]. Through improvement, although EEMD and CEEMD effectively suppress modal aliasing, the calculation process is cumbersome. Compared to these methods, VMD has been widely used in bearing fault diagnosis due to its rigorous theory, strong noise resistance, and no mode mixing [15]. At the same time, machine learning methods represented by support vector machines (SVM) [16] and random forests [17] have achieved automatic recognition of bearing faults by constructing a series model of “feature extraction + classifier”. Quan Zhenya et al. performed pattern recognition on weak faults of rolling bearings by constructing a multi-label K-nearest neighbor classifier [18]. Song et al. used a dual-strategy particle swarm optimization algorithm to optimize the BP neural network for bearing fault diagnosis [19]. Yu et al. proposed a new method for rolling bearing fault diagnosis based on mean multi-granularity decision theory, rough set, and non-naive Bayes classifier [20]. Although traditional machine learning methods rely on manual selection of features, they have significant advantages in scenarios with small samples and high interpretability requirements.

With the rapid development of deep learning, more and more researchers are applying it to bearing fault diagnosis. Yuan et al. proposed a rolling bearing fault diagnosis method based on convolutional neural networks and support vector machines [21]. Tang et al. proposed a multi-scale recursive semi-supervised deep learning fault diagnosis method with attention gates to ensure more accurate feature extraction [22]. However, these methods only achieved good diagnostic results under single working conditions. Subsequently, researchers began to explore the performance of deep learning methods under variable working conditions. Zhou C et al. proposed an improved Wasserstein generative adversarial network model with gradient penalty for automatic identification of early bearing faults [23]. Wang B et al. proposed a robust fault diagnosis framework based on an improved domain adversarial neural network (DANN) and multi-module fusion to solve the domain transfer problem between devices [24]. Through adversarial training, the adaptability and robustness of the model under different operating conditions were significantly improved. Deng C et al. used a target decoupling network to extract domain-invariant features while introducing a pseudo-label dual-constraint feature alignment method to ensure efficient and accurate cross-domain alignment [25]. Although these methods significantly improve diagnostic accuracy under different working conditions, they depend strongly on data and have high computational complexity.

In recent years, some scholars have begun to study bearing fault diagnosis based on information fusion. Common information fusion methods include data-level fusion, feature-level fusion, and decision-level fusion, which have the advantages of improving diagnostic accuracy, enhancing robustness, and strong adaptability. He et al. designed a multi-sensor data fusion method to comprehensively extract fault feature information and also proposed a new intelligent diagnosis method based on a dual-scale residual network [26]. He et al. proposed a train bearing unbalance fault diagnosis method based on extended combination cleaning, resampling, and multi-scale feature fusion network [27]. Wang et al. first obtained local fault diagnosis results through bidirectional long short-term memory networks and then used evidence theory for decision fusion for final fault diagnosis [28]. Currently, the most commonly used method for bearing fault diagnosis is feature fusion. For complex conditions such as variable working conditions, traditional methods are usually used for feature transfer after feature fusion to reduce distribution differences between data under different working conditions.

In response to this situation, this paper uses traditional machine learning methods, which are better suited to small-sample scenarios, to construct the model. A bearing fault diagnosis method based on multi-feature fusion and improved weighted balance distribution adaptation (MFF-IWBDA) is proposed. Firstly, in terms of feature selection, multi-feature fusion is adopted to enhance the representation ability of fault signals. Secondly, the WBDA feature transfer technique is improved so that it can be applied successfully to specific variable working condition scenarios. Compared to other models, even with a small sample size, the proposed model can achieve higher accuracy and better generalization performance in variable working condition experiments. For example, on the same dataset, the sample size used in the proposed model is only one-tenth of that in reference [29], but the accuracy is improved by about 1%. In addition, reference [29] is only applicable to three complex working conditions, while the proposed model is applicable to at least 12 conditions.

Specifically, the main contributions are as follows:

(1) Frequency features and entropy features are extracted to ensure relatively high computational efficiency. The extracted features are then concatenated and fused along dimensions to enhance the representation of fault features.

(2) A progressive parameter selection method is proposed to improve the weighted balanced distribution adaptation transfer method, enabling it to reduce the distribution difference between the source domain and the target domain better and faster.

(3) Compared with other models, the proposed model achieves higher fault diagnosis accuracy with fewer samples while maintaining better generalization performance.

The rest of this paper is organized as follows: Section 2 introduces theoretical knowledge related to the MFF-IWBDA model. In Section 3, the framework of the MFF-IWBDA model is described in detail. In Section 4, the performance of the MFF-IWBDA model is validated using the CWRU dataset. Finally, Section 5 presents the conclusion.

2. Methodology

The process of bearing fault diagnosis usually includes three parts: signal denoising, feature extraction, and fault classification. For variable working condition fault diagnosis, since the training data and testing data come from different working conditions, feature transfer is required after feature extraction. This section follows the bearing fault diagnosis process and introduces, in turn, the theoretical knowledge related to the proposed model, mainly signal denoising, feature extraction, and feature transfer methods.

2.1. Signal Denoising

Due to the complexity of industrial environments, the collected bearing signals are often accompanied by a large amount of noise. Therefore, it is necessary to denoise the fault signal before performing feature extraction. Variational mode decomposition (VMD) [30], as a powerful signal processing tool, can effectively extract local features and frequency components from bearing fault signals through adaptive decomposition and optimization algorithms. In addition, this paper chooses VMD for fault signal denoising mainly because of its rigorous theory, strong noise resistance, and no modal aliasing. Furthermore, VMD is sensitive to parameter selection, such as decomposition level and penalty factor. Different parameter choices will directly affect the denoising effect of fault signals. Therefore, optimizing the parameters of VMD is essential. Due to the variety of bearing fault signals, after comprehensive consideration, this paper chooses wild horse optimizer (WHO) [31] to optimize the parameters of VMD. The main steps of the WHO algorithm are as follows:

Step 1: Create an initial wild horse population, form multiple horse herds, and select the leader, who is the stallion, for each herd.

Step 2: Update the position of each foal in the horse herd through grazing and mating behavior.

Step 3: The leader leads the horse herd to a suitable habitat; that is, updates the leader’s position.

Step 4: Communication and selection of leaders, selecting the leader in the horse herd by comparing fitness.

Step 5: After repeating steps 2–4 for each horse herd, save the best solution.

2.2. Feature Extraction

Although VMD can effectively separate noise and fault components, the reconstructed signal is still time series data and may contain redundant information. By extracting key information and compressing high-dimensional signals into low-dimensional feature vectors, computational complexity can be reduced and diagnostic efficiency can be improved. In this article, thirteen frequency features and four entropy features are extracted to characterize fault signals. Frequency features are preferred because they map directly to fault mechanisms and have clear physical meanings. Compared to frequency features, entropy features can effectively capture nonlinear dynamic changes caused by faults and exhibit stronger robustness to working condition fluctuations. Integrating the two allows them to complement each other and jointly improve the adaptability and reliability of diagnosis. The theoretical background of the relevant features is introduced in turn below.

2.2.1. Frequency Features

The frequency-domain feature extraction of the bearing fault signal requires converting the signal from the time domain to the frequency domain, after which useful information is extracted. Common frequency-domain features mainly include spectral features, spectral shape features, and frequency-domain statistical features. The spectral features mainly include spectral density, power spectral density, spectral average, etc. Features such as peak frequency, spectrum width, and spectrum slope describe the shape of the spectrum and are therefore called spectral shape features. Frequency-domain statistical features extract useful information from the statistical distribution of the spectrum. For example, statistical features such as average frequency, frequency variance, and frequency skewness can be used to describe how concentrated the frequency distribution is. This paper uses the Fourier transform to convert the signal from the time domain to the frequency domain and then extracts the 13 frequency features shown in Table 1. In Table 1, $k \in \{1, 2, \dots, K\}$ indexes the spectral lines of the Fourier spectrum, where $K$ is the number of spectral lines; $X(k)$ is the amplitude of the $k$-th spectral line; and $f_{k}$ is the frequency corresponding to the $k$-th spectral line.
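As an illustration of how statistics of this kind can be computed from $X(k)$ and $f_{k}$, the following minimal sketch evaluates three frequency-domain quantities. Table 1 is not reproduced here, so the three features chosen (mean amplitude, frequency centroid, and spectral spread) and the function names are assumptions for illustration and may differ from the 13 features actually used. A naive DFT is used only so that the sketch needs nothing beyond the standard library.

```python
import cmath
import math


def amplitude_spectrum(signal, fs):
    """Return (freqs, amps) of the one-sided DFT of a real signal.

    A naive O(N^2) DFT keeps the sketch dependency-free; the DC line is skipped.
    """
    n = len(signal)
    freqs, amps = [], []
    for k in range(1, n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        freqs.append(k * fs / n)   # f_k: frequency of the k-th spectral line
        amps.append(abs(s) / n)    # X(k): amplitude of the k-th spectral line
    return freqs, amps


def frequency_features(signal, fs):
    """Illustrative spectral statistics (assumed; not necessarily those of Table 1)."""
    f, X = amplitude_spectrum(signal, fs)
    total = sum(X)
    mean_amp = total / len(X)
    centroid = sum(fk * xk for fk, xk in zip(f, X)) / total
    spread = math.sqrt(sum((fk - centroid) ** 2 * xk for fk, xk in zip(f, X)) / total)
    return {"mean_amplitude": mean_amp, "centroid_hz": centroid, "spread_hz": spread}


# Usage: a pure 50 Hz sine sampled at 400 Hz falls exactly on one spectral line.
fs = 400
sig = [math.sin(2 * math.pi * 50 * i / fs) for i in range(200)]
feats = frequency_features(sig, fs)
```

For this test tone, the frequency centroid sits at the tone frequency and the spread is close to zero, which is the behavior expected of a spectrum concentrated on one line.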

2.2.2. Dispersion Entropy and Its Variants

Dispersion entropy (DE) [32] is an entropy method for measuring the complexity and uncertainty of time series. It evaluates the disorder of the system by analyzing the dispersion patterns of the signal in phase space. As a powerful tool for signal analysis, DE, like other entropies, is used to quantify the amount of information in a signal. However, DE focuses on reflecting the randomness and diversity of the signal by examining the degree of data dispersion. The calculation process of DE is as follows:

Step 1: Assume the time series is $X = \{x_{1}, x_{2}, \dots, x_{N}\}$. Formula (1) maps the sequence to the range [0, 1] through the normal cumulative distribution function:

$$y_{j} = \frac{1}{\sigma \sqrt{2 \pi}} \int_{-\infty}^{x_{j}} e^{-\frac{(t - u)^{2}}{2 \sigma^{2}}} \, dt \tag{1}$$

where $\pi \approx 3.14$, and $u$ and $\sigma$ are the mean and standard deviation of the sequence $X$, respectively.

Step 2: Use Formula (2) to map each element $y_{j}$ to an integer in the range [1, $c$]:

$$z_{j}^{c} = \operatorname{Round}(c \cdot y_{j} + 0.5) \tag{2}$$

where $\operatorname{Round}(\cdot)$ is the rounding function, and $c$ is the number of classes into which the values are mapped.

Step 3: The embedding vector $z_{i}^{m, c}$ can be calculated using Formula (3):

$$z_{i}^{m, c} = \{z_{i}^{c}, z_{i + d}^{c}, \dots, z_{i + (m - 1) d}^{c}\}, \quad i = 1, 2, \dots, N - (m - 1) d \tag{3}$$

where $m$ is the embedding dimension and $d$ is the delay.

Step 4: Assuming that $z_{i}^{c} = v_{0}$, $z_{i + d}^{c} = v_{1}$, ..., and $z_{i + (m - 1) d}^{c} = v_{m - 1}$, the dispersion pattern corresponding to $z_{i}^{m, c}$ is $\pi_{v_{0} v_{1} \dots v_{m - 1}}$. According to Formula (4), the probability of each dispersion pattern is calculated as follows:

$$p(\pi_{v_{0} v_{1} \dots v_{m - 1}}) = \frac{\operatorname{Number}(\pi_{v_{0} v_{1} \dots v_{m - 1}})}{N - (m - 1) d} \tag{4}$$

where $\operatorname{Number}(\cdot)$ counts the embedding vectors assigned to the given pattern.

Step 5: The DE value is obtained via Formula (5):

$$DE(X, m, c, d) = -\sum_{\pi = 1}^{c^{m}} p(\pi_{v_{0} v_{1} \dots v_{m - 1}}) \ln \left( p(\pi_{v_{0} v_{1} \dots v_{m - 1}}) \right) \tag{5}$$
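The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the default values of `m`, `c`, and `d` are placeholders, `c` is treated as the number of mapping classes, and the clamping of classes to 1..c is a defensive choice added here for the edge case where the normal CDF saturates.

```python
import math
import random
from collections import Counter


def dispersion_entropy(x, m=2, c=6, d=1):
    """Dispersion entropy of sequence x, following Formulas (1)-(5)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0.0:
        return 0.0  # constant signal: one pattern only, zero entropy

    # Formula (1): the normal CDF maps each sample into the range [0, 1].
    y = [0.5 * (1.0 + math.erf((v - mean) / (std * math.sqrt(2.0)))) for v in x]

    # Formula (2): round-half-up of c*y + 0.5, clamped to the classes 1..c.
    z = [min(c, max(1, math.floor(c * yj + 0.5 + 0.5))) for yj in y]

    # Formula (3): embedding vectors of dimension m with delay d.
    n_vec = n - (m - 1) * d
    patterns = Counter(tuple(z[i + j * d] for j in range(m)) for i in range(n_vec))

    # Formulas (4)-(5): pattern probabilities and their Shannon entropy.
    return -sum((k / n_vec) * math.log(k / n_vec) for k in patterns.values())


# Usage: white noise should be more "dispersed" than a regular sine wave.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(1024)]
sine = [math.sin(2 * math.pi * i / 32) for i in range(1024)]
de_noise = dispersion_entropy(noise)
de_sine = dispersion_entropy(sine)
```

Because at most $c^{m}$ patterns exist, the value is bounded above by $m \ln c$; the noise signal approaches this bound while the sine stays well below it.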

Since its proposal, researchers have made various improvements to DE. For example, MDE [18,33] adds scale information to DE, while RCMDE improves on MDE to reduce computational errors. The following will introduce the calculation methods of MDE, RCMDE, and RCMFDE.

Unlike DE, which only considers the single-scale data information, MDE extends signal analysis to different time scales, obtaining more comprehensive information. The specific calculation process of MDE is as follows:

Step 1: As above, assume the time series is $X = \{x_{1}, x_{2}, \dots, x_{N}\}$. The coarse-grained time series $Y_{j}^{\tau}$ can be obtained through Formula (6):

$$Y_{j}^{\tau} = \frac{1}{\tau} \sum_{i = \tau (j - 1) + 1}^{\tau j} x_{i}, \quad 1 \leq j \leq \left\lfloor \frac{N}{\tau} \right\rfloor \tag{6}$$

where $\tau$ is the scale factor, and $N$ is the length of the time series.

Step 2: Calculate the DE value of each sequence $Y_{j}^{\tau}$ and obtain the MDE value through Formula (7):

$$MDE(X, m, c, d, \tau) = \frac{1}{\tau} \sum_{j = 1}^{\tau} DE(Y^{\tau}, m, c, d) \tag{7}$$

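RCMDE [34] is an improved method based on MDE. The main difference between the two lies in how the coarse-grained sequences of the original signal $X$ are constructed. As shown in Formula (8), RCMDE builds $\tau$ coarse-grained sequences with different starting points $k$, and this refined processing effectively avoids the loss of information.

$$x_{k, j}^{\tau} = \frac{1}{\tau} \sum_{i = k + \tau (j - 1)}^{k + \tau j - 1} x_{i}, \quad 1 \leq j \leq \left\lfloor \frac{N}{\tau} \right\rfloor, \quad 1 \leq k \leq \tau \tag{8}$$

Then, the RCMDE value is obtained through Formula (9):

$$RCMDE(X, m, c, d, \tau) = -\sum_{\pi = 1}^{c^{m}} \bar{p}(\pi_{v_{0} v_{1} \dots v_{m - 1}}) \ln \left( \bar{p}(\pi_{v_{0} v_{1} \dots v_{m - 1}}) \right) \tag{9}$$

where $\bar{p}(\pi_{v_{0} v_{1} \dots v_{m - 1}})$ is the probability of the dispersion pattern averaged over the $\tau$ coarse-grained sequences.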
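The coarse-graining of Formula (8) and the averaged pattern probabilities of Formula (9) can be sketched as follows. This is an illustrative sketch, not the authors' code. Two assumptions are made: the mean and standard deviation used in the normal CDF mapping are those of the original signal for every shifted sequence (a common choice for RCMDE), and only windows that fit entirely inside the signal are kept, so shifted sequences may be one element shorter than $\lfloor N / \tau \rfloor$. The function names and default parameters are hypothetical.

```python
import math
from collections import Counter


def coarse_grain(x, tau, k=1):
    """Formula (8): means of length-tau windows starting at 1-based offset k.

    With k = 1 this is the ordinary coarse-graining used by MDE.
    Only windows that fit entirely inside x are kept.
    """
    start = k - 1
    n_win = (len(x) - start) // tau
    return [sum(x[start + j * tau: start + (j + 1) * tau]) / tau for j in range(n_win)]


def pattern_probabilities(series, m, c, d, mean, std):
    """Dispersion-pattern probabilities of one series (Formulas (1)-(4))."""
    y = [0.5 * (1.0 + math.erf((v - mean) / (std * math.sqrt(2.0)))) for v in series]
    z = [min(c, max(1, math.floor(c * yj + 1.0))) for yj in y]
    n_vec = len(z) - (m - 1) * d
    counts = Counter(tuple(z[i + j * d] for j in range(m)) for i in range(n_vec))
    return {pattern: k / n_vec for pattern, k in counts.items()}


def rcmde(x, m=2, c=6, d=1, tau=2):
    """Formula (9): entropy of pattern probabilities averaged over tau shifts."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0.0:
        return 0.0  # constant signal
    averaged = Counter()
    for k in range(1, tau + 1):
        probs = pattern_probabilities(coarse_grain(x, tau, k), m, c, d, mean, std)
        for pattern, p in probs.items():
            averaged[pattern] += p / tau  # a pattern absent from a shift adds 0
    return -sum(p * math.log(p) for p in averaged.values() if p > 0.0)


# Usage on a deterministic two-tone test signal.
signal = [math.sin(0.3 * i) + 0.5 * math.sin(1.7 * i) for i in range(600)]
value = rcmde(signal, tau=3)
```

Averaging the probabilities before taking the entropy, rather than averaging $\tau$ separate entropy values, is what distinguishes the refined composite variant in this sketch.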

Unlike RCMDE, RCMFDE [35,36] considers the differences between adjacent elements in the dispersion pattern:

$$RCMFDE(X, m, c, d, \tau) = -\sum_{\pi = 1}^{(2 c - 1)^{m - 1}} \bar{p}(\pi_{v_{0} v_{1} \dots v_{m - 1}}) \ln \left( \bar{p}(\pi_{v_{0} v_{1} \dots v_{m - 1}}) \right) \tag{10}$$

2.2.3. Refined Composite Multivariate Generalized Multiscale Fuzzy Entropy

Fuzzy entropy is an important concept in fuzzy information theory, used to measure the complexity and uncertainty of fuzzy systems. Fuzzy entropy uses an exponential function to fuzzify similarity measures, making entropy values continuously smooth with parameter changes, enhancing robustness to noise and baseline drift. The main improvements to fuzzy entropy include multiscale fuzzy entropy, refined composite multiscale fuzzy entropy, etc. On the basis of these improvements, Azami H et al. proposed refined composite multivariate generalized multiscale fuzzy entropy (RCmvMFE) [37,38] based on variance and standard deviation.

Step 1: Assume that the multivariate time series is $U = \{y_{k, b}\}_{b = 1}^{N}$, $k = 1, 2, \dots, p$, where $k$ denotes the $k$-th channel, $p$ is the number of channels, and $N$ is the length of the signal. The coarse-grained sequences $x_{\alpha, k, i}^{(\beta)}$ can be obtained through Formula (11), where $\beta$ is the time series scale:

$$x_{\alpha, k, i}^{(\beta)} = \frac{1}{\beta} \sum_{b = (i - 1) \beta + \alpha}^{i \beta + \alpha - 1} y_{k, b}, \quad 1 \leq i \leq \left\lfloor \frac{N}{\beta} \right\rfloor = L, \quad 1 \leq k \leq p, \quad 1 \leq \alpha \leq \beta \tag{11}$$

Step 2: Assume the embedding dimension is $m$, with $M = [m_{1}, m_{2}, \dots, m_{p}]$. Performing phase space reconstruction on the coarse-grained sequences yields $Y_{m}(i)$. The distance between two reconstructed vectors is given by Formula (12):

$$d[Y_{m}(i), Y_{m}(j)] = \max_{l = 1, 2, \dots, m} \{ |x(i + l - 1) - x(j + l - 1)| \}, \quad i \neq j \tag{12}$$

Step 3: Calculate $\phi^{m}(r)$ based on the given threshold $r$ and membership function $\theta(d, r)$:

$$\phi^{m}(r) = \frac{1}{L - n} \sum_{i = 1}^{L - n} \frac{\sum_{j = 1, j \neq i}^{L - n} \theta(d, r)}{L - n - 1} \tag{13}$$

Step 4: Let $m = m + 1$ and repeat the above steps. Then, RCmvMFE can be calculated through Formula (14):

$$RCmvMFE(Y, \alpha, M, n, r) = -\ln \left( \frac{\bar{\phi}_{\alpha}^{m + 1}}{\bar{\phi}_{\alpha}^{m}} \right) \tag{14}$$
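To make the roles of the distance in Formula (12), the similarity average in Formula (13), and the log-ratio in Formula (14) concrete, the following sketch computes a plain fuzzy entropy for a single channel at a single scale. It is deliberately a simplification: the multivariate embedding and the refined composite coarse-graining of Formula (11) are omitted. The exponential membership function $\exp(-d^{\text{power}} / r)$ and the default threshold of 0.2 times the standard deviation are assumptions made here for illustration; the function names are hypothetical.

```python
import math
import random


def _phi(x, dim, count, r, power):
    """Formula (13): mean fuzzy similarity between `count` templates of length dim."""
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i == j:
                continue
            # Formula (12): Chebyshev distance between templates i and j.
            dist = max(abs(x[i + l] - x[j + l]) for l in range(dim))
            total += math.exp(-(dist ** power) / r)  # assumed membership function
    return total / (count * (count - 1))


def fuzzy_entropy(x, m=2, r=None, power=2):
    """Single-channel, single-scale analogue of Formula (14)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0.0:
        return 0.0  # constant signal: all templates identical
    if r is None:
        r = 0.2 * std  # assumed default threshold
    count = n - m  # same number of templates for dimensions m and m + 1
    return -math.log(_phi(x, m + 1, count, r, power) / _phi(x, m, count, r, power))


# Usage: irregular noise should yield a larger value than a regular sine.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
sine = [math.sin(2 * math.pi * i / 25) for i in range(200)]
fe_noise = fuzzy_entropy(noise)
fe_sine = fuzzy_entropy(sine)
```

Using the same number of templates for dimensions $m$ and $m + 1$ keeps the ratio in Formula (14) at or below one, so the resulting entropy is non-negative in this sketch.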

2.3. WBDA Feature Transfer

For fault diagnosis under a single working condition, classification can be performed after feature extraction. However, this paper focuses on complex working conditions, which means that the training data and testing data come from different working conditions. To address this, researchers have introduced feature transfer techniques to reduce the differences in data distribution between different working conditions. Experimental analysis of various transfer learning techniques showed that the weighted balanced distribution adaptation model (WBDA) [39] has higher diagnostic accuracy than the other transfer techniques. Hence, WBDA is selected to perform feature transfer on the extracted frequency and entropy features. Specific experimental details can be found in Section 4.5.

WBDA calculates weight adjustment coefficients based on the number of samples and classification error rate, thereby dynamically adjusting the weights of each category. Because of this, poorly performing categories can receive more attention during training. Therefore, the accuracy and generalization ability of the model are improved. The algorithm steps of WBDA are shown in Figure 1.

The optimization problem of WBDA is shown in Formula (15):

$$\begin{aligned} \min \; & \operatorname{tr}\left( A^{\top} X \left( (1 - \mu) M_{0} + \mu \sum_{c = 1}^{C} W_{c} \right) X^{\top} A \right) + \lambda \| A \|_{F}^{2} \\ \text{s.t.} \; & A^{\top} X H X^{\top} A = I, \quad 0 \leq \mu \leq 1 \end{aligned} \tag{15}$$

where $X$ represents the input data; $A$ is the mapping matrix; $M_{0}$ is the MMD matrix; $H$ is the centering matrix; $c \in \{1, 2, \dots, C\}$ is the category of bearing fault; $W_{c}$ is the weight matrix of the $c$-th class; $\mu$ is the balance factor; and $\lambda$ is the regularization parameter.
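The role of the MMD matrix $M_{0}$ in the trace term can be illustrated with a small numerical check. The sketch below builds $M_{0}$ in the form commonly used for marginal distribution matching (entries $1 / n_{s}^{2}$ for source pairs, $1 / n_{t}^{2}$ for target pairs, and $-1 / (n_{s} n_{t})$ for cross-domain pairs) and verifies that, for a single feature with $A = 1$, the quadratic form equals the squared gap between the source and target means. The sample values are made up for illustration, and the class-wise weighted matrices $W_{c}$ are not sketched here.

```python
def mmd_matrix(n_s, n_t):
    """Marginal MMD matrix M_0 for n_s source samples followed by n_t target samples."""
    n = n_s + n_t
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < n_s and j < n_s:
                M[i][j] = 1.0 / (n_s * n_s)    # source-source block
            elif i >= n_s and j >= n_s:
                M[i][j] = 1.0 / (n_t * n_t)    # target-target block
            else:
                M[i][j] = -1.0 / (n_s * n_t)   # cross-domain blocks
    return M


def quadratic_form(x, M):
    """x^T M x for a single-feature data vector x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


# Illustrative (made-up) one-dimensional source and target features.
source = [1.0, 2.0, 3.0]
target = [4.0, 6.0]
value = quadratic_form(source + target, mmd_matrix(len(source), len(target)))
mean_gap_sq = (sum(source) / len(source) - sum(target) / len(target)) ** 2
```

Minimizing the trace term over $A$ therefore pulls the projected source and target means together, while the constraint $A^{\top} X H X^{\top} A = I$ prevents the trivial solution $A = 0$.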

3. The MFF-IWBDA Model

The flowchart of the model proposed is shown in Figure 2. It mainly includes the following five steps: data preprocessing, signal denoising, feature extraction and fusion, feature transfer, and fault diagnosis. Below are detailed explanations for each section.

Step 1: Data preprocessing. Firstly, obtain the original fault signal. Secondly, cut an appropriate length for subsequent sample production. The length of each type of fault signal here is 102,400.

Step 2: Signal denoising. Considering the impact of computational efficiency, 5000 points are selected for VMD parameter optimization for each type of fault signal. Afterwards, the optimized parameters will be used for the entire fault type signal to complete denoising. Finally, for each type of fault signal, 100 non-overlapping samples are selected, each containing 1024 points. This corresponds to intercepting 102,400 points for each type of fault signal in the first step.

Step 3: Feature extraction and fusion. Firstly, thirteen frequency domain features and four entropy features are extracted separately. The four entropy features are MDE, RCMDE, RCMFDE, and RCmvMFE. Secondly, the 17 features are concatenated by dimension to form the final fused feature vector. It should be noted that each category of fault signal in the source and target domains forms its own category feature vectors. Figure 2 shows the detailed process.

Step 4: Feature transfer. This part is mainly achieved through improvements to WBDA. The main improvement is the selection of the parameter Dim. Because optimization algorithms are computationally complex and time-consuming, this paper proposes a progressive method for parameter selection. As shown in Figure 2, the horizontal axis represents the parameter values, and the vertical axis represents accuracy. Firstly, take the parameter value with the maximum accuracy together with its neighboring values on the left and right, three values in total, as the new range, and delete the remaining two parameter values. Then, average each pair of adjacent retained values to obtain the fourth and fifth parameter values. As the graph shows, the range of parameter values is significantly reduced after one pass. This process is repeated until the optimal parameter is selected. It should be noted that, in addition to the situation shown in Figure 2, the accuracy may also be monotonically ascending or descending across the candidate values. In such cases, the maximum accuracy and the corresponding Dim are output directly.

Step 5: Fault diagnosis. This section uses KNN as the classifier. This is because KNN is not only easy to understand and effective but also has strong adaptability to nonlinear tasks. The fault signal is often nonlinear, so the KNN algorithm can handle such problems well. In addition, compared to other classifiers such as decision trees, support vector machines, etc., the KNN algorithm does not require explicit model training. This means that the KNN algorithm can adapt more quickly to new fault modes or data.
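The progressive selection of Dim described in Step 4 can be sketched as follows. This is one possible reading of the procedure, not the authors' implementation: it assumes five integer candidates per round, integer midpoints, and a stop once no new integer lies between the retained neighbors. The `evaluate` callback, which would return the diagnostic accuracy for a given Dim, and the toy accuracy curve in the usage lines are hypothetical.

```python
def progressive_search(evaluate, candidates, max_rounds=20):
    """Narrow a list of integer candidates around the best-scoring value."""
    values = sorted(set(candidates))
    cache = {}

    def score(v):
        # Evaluate each parameter value at most once.
        if v not in cache:
            cache[v] = evaluate(v)
        return cache[v]

    for _ in range(max_rounds):
        best = max(values, key=score)
        i = values.index(best)
        if i == 0 or i == len(values) - 1:
            break  # monotone ascending/descending case: output the maximum
        left, right = values[i - 1], values[i + 1]
        # Keep the best value and its two neighbours, then add the midpoints.
        refined = sorted({left, (left + best) // 2, best, (best + right) // 2, right})
        if len(refined) == 3:
            break  # no integer left between the neighbours
        values = refined

    best = max(cache, key=cache.get)
    return best, cache[best]


# Usage with a hypothetical unimodal accuracy curve peaking at Dim = 23.
best_dim, best_acc = progressive_search(lambda dim: -(dim - 23) ** 2, [10, 20, 30, 40, 50])
```

Compared with an exhaustive scan, each round evaluates at most two new parameter values, which is where the saving in computation comes from in this reading.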

4. Experimental Analyses

4.1. The Dataset Description

Figure 3 displays ten types of bearing fault signals under four working conditions from the Case Western Reserve University (CWRU) Bearing Data Center [40]. These ten types comprise the normal bearing signal, three types of inner ring faults, three types of outer ring faults, and three types of ball faults. The fault diameters are 0.1778 mm, 0.3556 mm, and 0.5334 mm, respectively. All experiments used non-overlapping sampling. From each fault signal, 100 samples were extracted; 90 samples were randomly selected for training, and the rest were used for testing, as shown in Table 2.
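The sample construction can be sketched as follows: a signal of 102,400 points is cut into 100 non-overlapping samples of 1024 points, and 90 of them are drawn at random for training. The function names, the `seed` argument, and the placeholder signal are assumptions for illustration; loading the actual CWRU files is not shown.

```python
import random


def segment(signal, n_samples=100, length=1024):
    """Cut a signal into n_samples non-overlapping windows of the given length."""
    if len(signal) < n_samples * length:
        raise ValueError("signal too short for the requested segmentation")
    return [signal[i * length:(i + 1) * length] for i in range(n_samples)]


def split(samples, n_train=90, seed=None):
    """Randomly pick n_train samples for training; the rest are for testing."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    train = [samples[i] for i in idx[:n_train]]
    test = [samples[i] for i in idx[n_train:]]
    return train, test


# Placeholder signal standing in for one denoised fault signal.
signal = [0.0] * 102400
samples = segment(signal)
train, test = split(samples, seed=0)
```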

4.2. Analysis of Experimental Results

Because the experiment includes four working conditions, a total of twelve different variable working condition experiments can be obtained. Figure 4 presents the results of these 12 experiments. The blue bars represent experiments in which 0HP data are used for training and the data of the remaining loads are used for testing. Similarly, the orange, yellow, and green bars represent 1HP, 2HP, and 3HP data used for training, respectively, with the remaining loads used for testing. The average accuracy of the 12 variable working condition experiments is over 99%. Among them, the accuracy of nine experiments is above 99%, and the 1HP-2HP and 3HP-2HP experiments even reach 100%. It should be noted that the experimental results in all charts are averages over ten repeated experiments, and the training and testing samples are randomly selected anew in each repetition. For example, in the 0-1 variable working condition experiment of the proposed model, training samples are randomly selected and the rest are used as test samples to obtain the first result; this process is repeated ten times, and the ten results are averaged to obtain the final accuracy of the 0-1 experiment.
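The count of twelve experiments follows from taking every ordered pair of distinct working conditions as (source, target), and the reported accuracy of each experiment is a mean over repeated runs. The short sketch below enumerates the pairs and averages a run function over ten seeds; `run_once` is a hypothetical callback that would perform one randomized train/test run and return its accuracy.

```python
from itertools import permutations
from statistics import mean

loads = ["0HP", "1HP", "2HP", "3HP"]

# Every ordered (source, target) pair of distinct loads: 4 * 3 = 12 tasks.
tasks = [f"{src}-{tgt}" for src, tgt in permutations(loads, 2)]


def repeated_accuracy(run_once, n_repeats=10):
    """Average the accuracy of run_once(seed) over n_repeats random seeds."""
    return mean(run_once(seed) for seed in range(n_repeats))
```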

In order to verify the effectiveness of the proposed model, comparative experiments are conducted as shown in Table 3. As shown in this table, the first column shows the KNN classification of the original data after IWBDA feature transfer. The second column adds denoising on the basis of the first column, the third column adds feature extraction fusion on the basis of the second column, and the last column presents the complete model proposed in this paper. Comparing the data in the first and second columns, it can be seen that the overall accuracy of the denoised data has improved by 1.28%, indicating that the impact of denoising on the experimental results is relatively small. Comparing the data in the second and fourth columns, it can be seen that feature extraction and fusion have a significant impact on the experimental results. From the data perspective, the average experimental results before and after feature extraction and fusion increased by 62.5%. This is also why we mainly focus on feature fusion and transfer. Comparing the data in the third and fourth columns, it can be seen that the improvement for WBDA is effective, with an overall average accuracy increase of 2.1%. The experiment with 3HP-0HP increased by 11.66%, which is the highest accuracy improvement.

Although the proposed model has achieved good diagnostic results, there are still shortcomings. In terms of computational efficiency, feature extraction and transfer are relatively fast. However, in the denoising stage, the proposed model uses WHO-VMD. This part has the highest computational complexity, $O(T_{WHO} \cdot M \cdot T_{VMD} \cdot K \cdot N \log N)$, and is quite time-consuming. Here, $T_{WHO}$ and $T_{VMD}$ are the numbers of iterations of WHO and VMD, respectively, $M$ is the population size, $K$ is the number of modes, and $N$ is the signal length. The main reason is the multiplicative effect of the number of parameter-optimization evaluations and the complexity of a single VMD decomposition, especially the growth of the cost with the signal length $N$ and the mode number $K$.

4.3. Feature Extraction and Fusion Analysis

In order to provide a more detailed explanation of the method of feature selection and fusion, experiments are conducted as shown in Table 4 and Table 5. Among them, Table 4 mainly presents the experimental results of 12 variable working conditions with only a single feature selected. For very few variable working conditions experiments, using a single feature can achieve high accuracy; for example, in the 1HP-2HP experiment, RCMFDE features are chosen. However, it is not difficult to find that these are all cases where specific features are selected for specific working conditions. This also means that the stability and generalization ability of the model cannot be guaranteed. Therefore, the feature fusion method is proposed to ensure that the model can achieve good generalization performance. In addition, from the experimental mean of 12 different variable working conditions, it can be seen that the proposed feature fusion method has an overall accuracy improvement of 7.5–22.63% compared to a single feature. This further demonstrates the effectiveness of the feature fusion method.

From Table 4, it can be seen that among single features, RCMDE has the highest accuracy, followed by RCMFDE, and then RCmvMFE, MDE, and the frequency features. Combining Table 4 and Table 5, it can be seen that features are added to the fusion in order of their individual accuracy. The fusion combinations are therefore, in sequence: (i) RCMDE and RCMFDE; (ii) RCMDE, RCMFDE, and RCmvMFE; (iii) MDE, RCMDE, RCMFDE, and RCmvMFE; and (iv) MDE, RCMDE, RCMFDE, RCmvMFE, and the frequency features. As shown in Table 5, when combining RCMDE and RCMFDE features, some variable working condition experiments achieve high accuracy, such as 0HP-1HP and 1HP-0HP. However, others still have lower accuracy; for example, 1HP-3HP only achieves an accuracy of 82.62%, which makes it necessary to add more features to the fusion. Unexpectedly, the accuracy decreased rather than increased after adding the RCmvMFE feature but improved slightly after further adding the MDE feature. Finally, the accuracy of the 12 variable working condition experiments improved significantly after adding the frequency features. We believe that this phenomenon may be caused by the mutual influence between features. This also indicates that selecting appropriate features and an appropriate number of features is crucial in feature fusion.

4.4. Model Stability Analysis

Because of the nature of variable working conditions, the proposed model must perform well in multiple experiments simultaneously, so its stability and generalization ability are very important. The four subgraphs in Figure 5 show the accuracy over ten repetitions of each experiment when 0HP, 1HP, 2HP, and 3HP are used in turn as the source domain and the other three working conditions as target domains. The results are most stable and accurate when the 0HP data serve as the source domain. When the 0HP data serve as the target domain, the results of the three experiments fluctuate more noticeably, especially 3HP-0HP.

Figure 5 shows the trend of diagnostic accuracy across the 12 variable working condition experiments. To make the stability analysis more intuitive, Figure 6 reports, for each experiment, the difference between the maximum and minimum accuracy over the 10 repetitions. This difference stays within 0–1% for all 12 experiments, with a mean of 0.38%. The 1HP-2HP experiment has the smallest difference and the 3HP-0HP experiment the largest, which is consistent with the trends in Figure 5. In addition, Table 6 shows that the standard deviation of the 10 repeated results ranges from 0 to 0.29% across the 12 experiments. This indicates that the diagnostic results of the model are relatively stable.
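The two stability measures used here (the max-min difference of Figure 6 and the standard deviation of Table 6) can be computed as in the following minimal sketch. The ten run accuracies are hypothetical values chosen for illustration only, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which estimator is used. The list `table6_std` is copied from Table 6.

```python
import numpy as np

def stability(accuracies):
    """Return (max - min, sample standard deviation) of repeated-run accuracies in %."""
    a = np.asarray(accuracies, dtype=float)
    return a.max() - a.min(), a.std(ddof=1)  # ddof=1 is an assumption

# Hypothetical accuracies (%) of 10 repetitions; illustrative values, not from the paper.
runs = [99.1, 99.3, 99.2, 99.0, 99.4, 99.2, 99.1, 99.3, 99.2, 99.2]
spread, std = stability(runs)
print(f"max - min = {spread:.2f} %, std = {std:.2f} %")

# Standard deviations (%) reported in Table 6, in the order 0-1HP ... 3-2HP.
table6_std = [0.09, 0.05, 0.03, 0.29, 0, 0.1, 0.22, 0.11, 0.07, 0.28, 0.2, 0]
print("Table 6 range:", min(table6_std), "to", max(table6_std))
```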

4.5. Comparison of Transfer Models

Table 7 compares 10 existing feature transfer methods with the proposed IWBDA. The 10 methods are KMM (Kernel Mean Matching), CORAL (CORrelation ALignment), GFK (Geodesic Flow Kernel) [41], MEDA (Manifold Embedded Distribution Alignment) [42], Easy-TL (Easy Transfer Learning without intra-domain comparison), Ea-CORAL (Easy Transfer Learning with CORAL as intra-domain comparison), Ea-PCA (Easy Transfer Learning with PCA as intra-domain comparison) [43], TCA (Transfer Component Analysis), JDA (Joint Distribution Adaptation) [44], and BDA (Balanced Distribution Adaptation) [39]. In these comparative experiments, only the transfer method is changed; everything else is kept the same. Table 7 shows that KMM and CORAL have the lowest accuracy. BDA has the highest mean accuracy among the 10 baseline methods, but its results are unsatisfactory in some experiments; in the 3HP-0HP experiment, for example, its accuracy is only 87.52%. Similar to BDA, JDA maintains relatively good accuracy in approximately half of the experiments. The remaining methods (GFK, MEDA, Easy-TL, Ea-CORAL, Ea-PCA, and TCA) mostly fall within the range of 75–95%, and their mean accuracies over the 12 experiments lie within 80–87%. Figure 7 presents the data of Table 7 graphically. IWBDA matches or exceeds every other method in all 12 experiments, with an average accuracy of 99.34%. Based on the above analysis, it can be concluded that the model has good stability and generalization ability.
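For readers unfamiliar with the BDA family, the sketch below illustrates the kind of quantity these methods reduce: a balanced combination of the marginal and the class-conditional distribution distance between source and target, in the spirit of BDA [39]. It is a deliberately simplified illustration and not the authors' IWBDA: it uses a linear-kernel MMD (squared distance between feature means), learns no projection, and applies no class weighting. The function name `balanced_mmd`, the balance factor `mu`, and the use of target pseudo-labels are assumptions made for this example.

```python
import numpy as np

def balanced_mmd(Xs, ys, Xt, yt_pseudo, mu=0.5):
    """Linear-kernel estimate of (1 - mu) * marginal MMD^2 + mu * conditional MMD^2."""
    Xs, Xt = np.asarray(Xs, dtype=float), np.asarray(Xt, dtype=float)
    ys, yt_pseudo = np.asarray(ys), np.asarray(yt_pseudo)

    # Marginal term: squared distance between the overall feature means.
    marginal = np.sum((Xs.mean(axis=0) - Xt.mean(axis=0)) ** 2)

    # Conditional term: squared distance between per-class feature means.
    conditional = 0.0
    for c in np.unique(ys):
        source_c, target_c = Xs[ys == c], Xt[yt_pseudo == c]
        if len(target_c) == 0:  # class not predicted in the target domain
            continue
        conditional += np.sum((source_c.mean(axis=0) - target_c.mean(axis=0)) ** 2)

    return (1.0 - mu) * marginal + mu * conditional

# Toy data: two classes in two dimensions; the target is the source shifted by 1.
Xs = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0], [5.0, 5.0]])
ys = np.array([0, 0, 1, 1])
Xt = Xs + 1.0
print(balanced_mmd(Xs, ys, Xs, ys))           # identical domains
print(balanced_mmd(Xs, ys, Xt, ys, mu=0.5))   # shifted target domain
```

In practice the target labels are unknown, so the pseudo-labels would come from a classifier trained on the source domain and would be refined iteratively.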

4.6. Comparative Analysis of Models

To verify the effectiveness of the proposed model, it is compared with recent research, as shown in Table 8. The compared studies also use the CWRU bearing dataset. In Table 8, “*” indicates approximate values obtained from the literature. Reference [29] selected the optimal settings for the TTLN model: 300 epochs, a learning rate of 0.0005, and a batch size of 128. The DFCNN proposed in Reference [45] uses a wide kernel to extract features in the first convolutional layer, which has 10 × 10 sampling points. Reference [46] extracts 50 LPC coefficients from each signal segment as inputs to a simple feedforward artificial neural network for classification. References [47,48] use, respectively, an adaptive denoising residual network and a deep convolutional neural network with a wide first-layer kernel to diagnose rolling bearing faults.

As shown in Table 8, the model in Reference [29] has a lower average accuracy than the proposed model in the three experiments that use the same data. Compared with References [45,47,48], the proposed model improves the average accuracy by 8.84, 2.35, and 3.44 percentage points, respectively, while also generalizing across more working conditions (12 versus 6). Reference [46] also performs fault diagnosis across 12 different working conditions, but its average accuracy is 5.45 percentage points lower than that of the proposed model. We also recorded the number of samples per fault type used by each model. Only Reference [45] uses fewer samples (50) than the proposed model (100); the other studies that report this number use far more (500–1000). It should be noted, however, that each sample in Reference [45] contains 25,600 points, whereas each sample in the proposed model contains only 1024 points. These comparisons indicate that deep learning methods depend heavily on the amount of data, a problem that the proposed model addresses. In addition, the proposed model requires fewer computational resources than deep learning methods. Both properties are conducive to the development of bearing fault diagnosis.
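The accuracy gaps quoted above follow from the averages in Table 8, as the short check below shows. It also adds a like-for-like view that restricts the proposed method to the three experiments reported by Reference [29]; this restricted mean is computed here for illustration and is not a figure stated in the paper. All values are copied from Table 8, and the helper `subset_mean` is a name introduced for this example.

```python
# Accuracies (%) of the proposed method, copied from Table 8.
proposed = {
    "0-1": 99.58, "0-2": 99.83, "0-3": 99.77, "1-0": 98.29,
    "1-2": 100.00, "1-3": 99.00, "2-0": 98.71, "2-1": 99.08,
    "2-3": 99.74, "3-0": 99.18, "3-1": 98.87, "3-2": 100.00,
}
proposed_avg = 99.34  # average reported in Table 8

# Average accuracies (%) of the compared studies, copied from Table 8.
reference_avg = {"[29]": 98.8, "[45]": 90.5, "[46]": 93.89, "[47]": 96.99, "[48]": 95.9}

# Gap between the reported averages, in percentage points.
gaps = {ref: round(proposed_avg - avg, 2) for ref, avg in reference_avg.items()}

def subset_mean(keys):
    """Mean accuracy of the proposed method restricted to the given experiments."""
    return round(sum(proposed[k] for k in keys) / len(keys), 2)

# Like-for-like view: the three experiments that Reference [29] also reports.
same_as_29 = subset_mean(["0-1", "0-2", "0-3"])

print(gaps)
print("proposed method on 0-1, 0-2, 0-3:", same_as_29)
```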

5. Conclusions

A model based on multi-feature fusion and improved weighted balance distribution adaptation is proposed for bearing fault diagnosis under variable working conditions. The model is validated on the CWRU dataset, and the results show that it performs excellently in the experiments. By analyzing the results, the following conclusions can be drawn:

(1) The MFF-IWBDA model can accurately identify the fault status of bearings under variable working conditions. This is demonstrated by the mean accuracy of 99.34% achieved across the 12 variable working condition experiments.

(2) Both the ablation experiments on feature extraction and fusion and the comparative experiments with different feature fusion combinations show that the proposed feature extraction and fusion method performs best, which confirms its effectiveness.

(3) The feature transfer experiments show that the proposed IWBDA improves the mean accuracy by 2.10 percentage points over the original WBDA, and by up to 11.66 percentage points in the 3HP-0HP experiment. This indicates that the proposed IWBDA is beneficial for improving the accuracy of fault diagnosis under variable working conditions.

(4) The proposed model has good generalization ability and stability. The difference between the maximum and minimum accuracy over repeated runs is within 0–1% for every variable working condition experiment, and its average over the 12 experiments is only 0.38%.

In summary, the proposed model provides a new approach for the diagnosis of bearing faults under complex working conditions. However, the proposed method has only been tested on the CWRU dataset, and its generalization ability still needs to be verified and improved on other datasets. In addition, reducing model complexity and improving computational efficiency are also focuses of our future research.

Author Contributions

Conceptualization, J.Y. and Y.B.; methodology, J.Y. and Y.B.; validation, J.Y.; writing—original draft preparation, J.Y.; writing—review and editing, J.Y., Y.B., T.X., R.C., W.Z. and G.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the National Natural Science Foundation of China (Grant No. 61774137), the National Natural Science Foundation of China (Grant No. 12401647), the Fundamental Research Program of Shanxi Province (Grant Nos. 202103021224195 and 202203021212336), and Taiyuan Institute of Technology Scientific Research Initial Funding (Grant Nos. 2024KJ038, 2025LJ009, 2024KJ007, 2023LJ004, 2023KJ026, and 2024KJ039).

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found at https://github.com/815808034/CWRU-Dataset (accessed on 10 May 2025).

Acknowledgments

We especially thank the fund of Shanxi ‘1331 Project’ Key Subject Construction and Innovation Special Zone Project, China.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Dong, X.; Zhang, C.; Liu, H.; Wang, D.; Chen, Y.; Wang, T. A new cross-domain bearing fault diagnosis method with few samples under different working conditions. J. Manuf. Process. 2025, 135, 359–374.
2. He, D.; Wu, J.; Jin, Z.; Huang, C.G.; Wei, Z.; Yi, C. AGFCN: A bearing fault diagnosis method for high-speed train bogie under complex working conditions. Reliab. Eng. Syst. Saf. 2025, 258, 110907.
3. Tahir, M.M.; Khan, A.Q.; Iqbal, N.; Iqbal, N.; Hussain, A.; Badshah, S. Enhancing fault classification accuracy of ball bearing using central tendency based time domain features. IEEE Access 2016, 5, 72–83.
4. Junior, R.F.R.; dos Santos Areias, I.A.; Gomes, G.F. Fault detection and diagnosis using vibration signal analysis in frequency domain for electric motors considering different real fault types. Sens. Rev. 2021, 41, 311–319.
5. Ruiz-Sarrio, J.E.; Antonino-Daviu, J.A.; Martis, C. Comprehensive Diagnosis of Localized Rolling Bearing Faults during Rotating Machine Start-Up via Vibration Envelope Analysis. Electronics 2024, 13, 375.
6. Zheng, J.; Huang, S.; Pan, H.; Tong, J.; Wang, C.; Liu, Q. Adaptive power spectrum Fourier decomposition method with application in fault diagnosis for rolling bearing. Measurement 2021, 183, 109837.
7. Shao, H.; Lin, J.; Zhang, L.; Wei, M. Compound fault diagnosis for a rolling bearing using adaptive DTCWPT with higher order spectral. Qual. Eng. 2020, 32, 342–353.
8. Hao, Y.; Zhang, C.; Lu, Y.; Zhang, L.; Lei, Z.; Li, Z. A novel autoencoder modeling method for intelligent assessment of bearing health based on Short-Time Fourier Transform and ensemble strategy. Precis. Eng. 2024, 85, 89–101.
9. Liang, P.; Tian, J.; Wang, S.Y.X. Multi-source information joint transfer diagnosis for rolling bearing with unknown faults via wavelet transform and an improved domain adaptation network. Reliab. Eng. Syst. Saf. 2024, 242, 109788.
10. Wang, W.; Yuan, H. Bearing Fault Feature Extraction Method Based on Adaptive Time-Varying Filtering Empirical Mode Decomposition and Singular Value Decomposition Denoising. Machines 2025, 13, 50.
11. Tu, Z.; Gao, L.; Wu, X.; Liu, Y.; Zhao, Z. Rotate vector reducer fault diagnosis model based on EEMD-MPA-KELM. Appl. Sci. 2023, 13, 4476.
12. Yan, J.; Zhou, F.; Zhu, X.; Zhang, D. AFSA-FastICA-CEEMD Rolling Bearing Fault Diagnosis Method Based on Acoustic Signals. Mathematics 2025, 13, 884.
13. Dai, X.; Yi, K.; Wang, F.; Cai, C.; Tang, W. Bearing fault diagnosis based on POA-VMD with GADF-Swin Transformer transfer learning network. Measurement 2024, 238, 115328.
14. Shen, J.; Wang, Z.; Wang, Y.; Zhu, H.; Zhang, L.; Tang, Y. AGWO-PSO-VMD-TEFCG-AlexNet bearing fault diagnosis method under strong noise. Measurement 2025, 242, 116259.
15. Li, H.; Wu, X.; Liu, T.; Li, S.; Zhang, B.; Zhou, G.; Huang, T. Composite fault diagnosis for rolling bearing based on parameter-optimized VMD. Measurement 2022, 201, 111637.
16. Qiu, W.; Wang, B.; Hu, X. Rolling bearing fault diagnosis based on RQA with STD and WOA-SVM. Heliyon 2024, 10, e26141.
17. Gao, L.; Wang, X.; Wang, T.; Chang, M. WDBM: Weighted deep forest model based bearing fault diagnosis method. Comput. Mater. Contin. 2022, 72, 4742–4754.
18. Quan, Z.; Zhang, X. Rolling bearing fault diagnosis based on CS-optimized multiscale dispersion entropy and ML-KNN. J. Braz. Soc. Mech. Sci. Eng. 2022, 44, 430.
19. Song, X.; Wang, H.; Liu, Y.; Wang, Z.; Cui, Y. A fault diagnosis method of rolling element bearing based on improved PSO and BP neural network. J. Intell. Fuzzy Syst. 2022, 43, 5965–5971.
20. Yu, J.; Ding, B.; He, Y. Rolling bearing fault diagnosis based on mean multigranulation decision-theoretic rough set and non-naive Bayesian classifier. J. Mech. Sci. Technol. 2018, 32, 5201–5211.
21. Yuan, L.; Lian, D.; Kang, X.; Chen, Y.; Zhai, K. Rolling bearing fault diagnosis based on convolutional neural network and support vector machine. IEEE Access 2020, 8, 137395–137406.
22. Tang, S.; Wang, C.; Zhou, F.; Hu, X.; Wang, T. Multi-scale recursive semi-supervised deep learning fault diagnosis method with attention gate. Machines 2023, 11, 153.
23. Zhou, C.; Wang, Q.; Xiao, Y.; Xiao, W.; Shu, Y. Research on an Improved Auxiliary Classifier Wasserstein Generative Adversarial Network with Gradient Penalty Fault Diagnosis Method for Tilting Pad Bearing of Rotating Equipment. Lubricants 2023, 11, 423.
24. Wang, B.; Liang, P.; Zhang, L.; Wang, X.; Yuan, X.; Zhou, Z. Enhancing robustness of cross-machine fault diagnosis via an improved domain adversarial neural network and self-adversarial training. Measurement 2025, 250, 117113.
25. Deng, C.; Tian, H.; Miao, J.; Deng, Z. Domain adaptation method based on pseudo-label dual-constraint targeted decoupling network for cross-machine fault diagnosis. Reliab. Eng. Syst. Saf. 2025, 256, 110786.
26. He, D.; Lao, Z.; Jin, Z.; He, C.; Shan, S.; Miao, J. Train bearing fault diagnosis based on multi-sensor data fusion and dual-scale residual network. Nonlinear Dyn. 2023, 111, 14901–14924.
27. He, C.; He, D.; Wei, Z.; Xu, K.; Chen, Y.; Shan, S. A train bearing imbalanced fault diagnosis method based on extended CCR and multi-scale feature fusion network. Nonlinear Dyn. 2024, 112, 13147–13173.
28. Wang, X.; Li, A.; Han, G. A deep-learning-based fault diagnosis method of industrial bearings using multi-source information. Appl. Sci. 2023, 13, 933.
29. Wu, M.; Zhang, J.; Xu, P.; Liang, Y.; Dai, Y.; Gao, T.; Bai, Y. Bearing Fault Diagnosis for Cross-Condition Scenarios Under Data Scarcity Based on Transformer Transfer Learning Network. Electronics 2025, 14, 515.
30. Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
31. Naruei, I.; Keynia, F. Wild horse optimizer: A new meta-heuristic algorithm for solving engineering optimization problems. Eng. Comput. 2022, 38 (Suppl. 4), 3025–3056.
32. Rostaghi, M.; Azami, H. Dispersion Entropy: A Measure for Time-Series Analysis. IEEE Signal Process. Lett. 2016, 23, 610–614.
33. Wang, X.; Si, S.; Li, Y. Multiscale diversity entropy: A novel dynamical measure for fault diagnosis of rotating machinery. IEEE Trans. Ind. Inform. 2020, 17, 5419–5429.
34. Azami, H.; Rostaghi, M.; Abásolo, D.; Escudero, J. Refined Composite Multiscale Dispersion Entropy and its Application to Biomedical Signals. IEEE Trans. Bio-Med. Eng. 2017, 64, 2872–2879.
35. Rostaghi, M.; Khatibi, M.M.; Ashory, M.R.; Azami, H. Refined Composite Multiscale Fuzzy Dispersion Entropy and Its Applications to Bearing Fault Diagnosis. Entropy 2023, 25, 1494.
36. Liu, X.; Shu, L. Fault detection of rotating machinery based on marine predator algorithm optimized resonance-based sparse signal decomposition and refined composite multiscale fluctuation dispersion entropy. Rev. Sci. Instrum. 2022, 93, 114703.
37. Yang, J.; Bai, Y.; Cheng, Y.; Cheng, R.; Zhang, W.; Zhang, G. A new model for bearing fault diagnosis based on optimized variational mode decomposition correlation coefficient weight threshold denoising and entropy feature fusion. Nonlinear Dyn. 2023, 111, 17337–17367.
38. Azami, H.; Escudero, J. Refined composite multivariate generalized multiscale fuzzy entropy: A tool for complexity analysis of multichannel signals. Phys. A Stat. Mech. Its Appl. 2017, 465, 261–276.
39. Wang, J.; Chen, Y.; Hao, S.; Feng, W.; Shen, Z. Balanced distribution adaptation for transfer learning. In Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017; pp. 1129–1134.
40. Smith, W.A.; Randall, R.B. Rolling element bearing diagnostics using the Case Western Reserve University data: A benchmark study. Mech. Syst. Signal Process. 2015, 64, 100–131.
41. Gong, B.; Shi, Y.; Sha, F.; Grauman, K. Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of the 2012 IEEE conference on computer vision and pattern recognition, Providence, RI, USA, 16–21 June 2012; pp. 2066–2073.
42. Wang, J.; Feng, W.; Chen, Y.; Yu, H.; Yu, P.S. Visual domain adaptation with manifold embedded distribution alignment. In Proceedings of the 26th ACM international conference on Multimedia, Seoul, Republic of Korea, 22–26 October 2018; pp. 402–410.
43. Wang, J.; Chen, Y.; Yu, H.; Huang, M.; Yang, Q. Easy transfer learning by exploiting intra-domain structures. In Proceedings of the 2019 IEEE international conference on multimedia and expo (ICME), Shanghai, China, 8–12 July 2019; pp. 1210–1215.
44. Long, M.; Wang, J.; Ding, G.; Sun, J.; Yu, P.S. Transfer feature learning with joint distribution adaptation. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2200–2207.
45. Zhang, J.; Yi, S.; Liang, G.; Gao, H.; Hong, X.; Song, H. A new bearing fault diagnosis method based on modified convolutional neural networks. Chin. J. Aeronaut. 2020, 33, 439–447.
46. Mohammad, M.; Ibryaeva, O.; Sinitsin, V.; Eremeeva, V. A Computationally Efficient Method for the Diagnosis of Defects in Rolling Bearings Based on Linear Predictive Coding. Algorithms 2025, 18, 58.
47. Chen, Y.; Zeng, X.; Huang, H. Fault Diagnosis of Rolling Bearings Based on Adaptive Denoising Residual Network. Processes 2025, 13, 151.
48. Zhang, W.; Peng, G.; Li, C.; Chen, Y.; Zhang, Z. A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals. Sensors 2017, 17, 425.

Figure 1. The WBDA algorithm flowchart.

Figure 2. Detailed description of the model proposed.

Figure 3. Ten types of bearing fault signal diagrams under different loads. (a) Description of three types of inner ring faults; (b) description of three types of outer ring faults; (c) description of three types of ball faults; and (d) description of normal bearing signals.

Figure 4. Experimental results of 12 different variable working conditions.

Figure 5. Ten repeated experimental results of twelve different variable working conditions. (a) 0HP data are used as the source domain, and the remaining data are sequentially used as the target domain; (b) 1HP data are used as the source domain, and the remaining data are sequentially used as the target domain; (c) 2HP data are used as the source domain, and the remaining data are sequentially used as the target domain; and (d) 3HP data are used as the source domain, and the remaining data are sequentially used as the target domain.

Figure 6. Difference in results of 10 repeated experiments under 12 different variable working conditions (%).

Figure 7. Comparison results of variable working condition feature transfer experiments. (a) Comparison results of feature transfer in 0HP-1HP, 0HP-2HP, and 0HP-3HP variable working condition experiments; (b) comparison results of feature transfer in 1HP-0HP, 1HP-2HP, and 1HP-3HP variable working condition experiments; (c) comparison results of feature transfer in 2HP-0HP, 2HP-1HP, and 2HP-3HP variable working condition experiments; and (d) comparison results of feature transfer in 3HP-0HP, 3HP-1HP, and 3HP-2HP variable working condition experiments.

Table 1. The calculation formula for frequency feature.

Number	Formula for Feature Calculation	Number	Formula for Feature Calculation
1	$F_{1} = \frac{1}{K} \sum_{k = 1}^{K} X (k)$	8	$F_{8} = \frac{\sum_{k = 1}^{K} {(X (k) - F_{1})}^{3}}{K \cdot \sqrt{F_{7}^{3}}}$
2	$F_{2} = \sqrt{\frac{\sum_{k = 1}^{K} (f_{k}^{2} \cdot X (k))}{\sum_{k = 1}^{K} X (k)}}$	9	$F_{9} = \frac{\sum_{k = 1}^{K} {(X (k) - F_{1})}^{4}}{K \cdot F_{7}^{2}}$
3	$F_{3} = \sqrt{\frac{\sum_{k = 1}^{K} (f_{k}^{4} \cdot X (k))}{\sum_{k = 1}^{K} (f_{k}^{2} \cdot X (k))}}$	10	$F_{10} = \frac{F_{6}}{F_{5}}$
4	$F_{4} = \frac{\sum_{k = 1}^{K} (f_{k}^{2} \cdot X (k))}{\sqrt{\sum_{k = 1}^{K} X (k) \cdot \sum_{k = 1}^{K} (f_{k}^{4} \cdot X (k))}}$	11	$F_{11} = \frac{\sum_{k = 1}^{K} ({(f_{k} - F_{5})}^{3} \cdot X (k))}{K \cdot F_{6}^{3}}$
5	$F_{5} = \frac{\sum_{k = 1}^{K} (f_{k} \cdot X (k))}{\sum_{k = 1}^{K} X (k)}$	12	$F_{12} = \frac{\sum_{k = 1}^{K} ({(f_{k} - F_{5})}^{4} \cdot X (k))}{K \cdot F_{6}^{4}}$
6	$F_{6} = \sqrt{\frac{1}{K} \sum_{k = 1}^{K} {(f_{k} - F_{5})}^{2} \cdot X (k)}$	13	$F_{13} = \frac{\sum_{k = 1}^{K} \sqrt{|f_{k} - F_{5}|} \cdot X (k)}{\sqrt{K \cdot F_{6}}}$
7	$F_{7} = \frac{1}{K} \sum_{k = 1}^{K} {(X (k) - F_{1})}^{2}$
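The formulas of Table 1 translate almost line by line into NumPy. The following is a minimal sketch under stated assumptions rather than the authors' implementation: X(k) is taken to be the one-sided amplitude spectrum of the segment, f_k the frequency of the k-th spectral line, K the number of spectral lines, and F_6 and F_7 are normalized by K. The function name `frequency_features`, the 12 kHz sampling rate, and the synthetic test signal are hypothetical.

```python
import numpy as np

def frequency_features(signal, fs):
    """Return F1..F13 of Table 1 for one vibration segment sampled at fs Hz."""
    x = np.asarray(signal, dtype=float)
    X = np.abs(np.fft.rfft(x))               # amplitude spectrum X(k) (assumption)
    f = np.fft.rfftfreq(x.size, d=1.0 / fs)  # frequency f_k of each spectral line
    K = X.size                               # number of spectral lines

    F1 = X.sum() / K
    F2 = np.sqrt(np.sum(f**2 * X) / X.sum())
    F3 = np.sqrt(np.sum(f**4 * X) / np.sum(f**2 * X))
    F4 = np.sum(f**2 * X) / np.sqrt(X.sum() * np.sum(f**4 * X))
    F5 = np.sum(f * X) / X.sum()
    F6 = np.sqrt(np.sum((f - F5) ** 2 * X) / K)
    F7 = np.sum((X - F1) ** 2) / K
    F8 = np.sum((X - F1) ** 3) / (K * np.sqrt(F7**3))
    F9 = np.sum((X - F1) ** 4) / (K * F7**2)
    F10 = F6 / F5
    F11 = np.sum((f - F5) ** 3 * X) / (K * F6**3)
    F12 = np.sum((f - F5) ** 4 * X) / (K * F6**4)
    F13 = np.sum(np.sqrt(np.abs(f - F5)) * X) / np.sqrt(K * F6)
    return np.array([F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13])

# Synthetic 1024-point segment: a 1 kHz tone plus noise (illustrative only).
fs = 12_000                                  # assumed sampling rate in Hz
t = np.arange(1024) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 1000 * t) + 0.1 * rng.standard_normal(t.size)
features = frequency_features(x, fs)
print(features.shape)
```

The resulting 13-element vector is the frequency block that the model concatenates with the entropy features.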

Table 2. Explanation of data used in this experiment.

Fault Location	Normal	Inner Race (mm)			Outer Race (mm)			Ball Fault (mm)
Damage degree	0	0.1778	0.3556	0.5334	0.1778	0.3556	0.5334	0.1778	0.3556	0.5334
Label	1	2	3	4	5	6	7	8	9	10
0HP	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10
1HP	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10
2HP	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10
3HP	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10	90/10
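The text states that each sample of the proposed model contains 1024 points, and Table 8 lists 100 samples per fault type. A minimal sketch of how one vibration record could be cut into such samples is given below. Non-overlapping windows taken from the start of the record are an assumption, since the text does not describe the segmentation, and the function name `segment` is hypothetical.

```python
import numpy as np

def segment(signal, length=1024, n_segments=100):
    """Cut a 1-D record into n_segments non-overlapping windows of `length` points."""
    x = np.asarray(signal, dtype=float)
    needed = length * n_segments
    if x.size < needed:
        raise ValueError(f"record has {x.size} points, but {needed} are needed")
    # Keep the first `needed` points and reshape them into one row per sample.
    return x[:needed].reshape(n_segments, length)

# Placeholder record: a ramp makes it easy to see where each window starts.
record = np.arange(120_000)
samples = segment(record)
print(samples.shape, samples[1, 0])
```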

Table 3. Experimental results to verify the effectiveness of the MFF-IWBDA model (%).

Source–Target Domain	IWBDA-KNN	WHO-VMD-IWBDA-KNN	WHO-VMD-MFF-WBDA-KNN	MFF-IWBDA
0HP-1HP	32.74	34.61	98.84	99.58
0HP-2HP	34.32	36.34	94.50	99.83
0HP-3HP	36.61	40.09	99.06	99.77
1HP-0HP	31.64	34.98	97.07	98.29
1HP-2HP	34.50	37.63	99.30	100.00
1HP-3HP	41.03	34.40	98.58	99.00
2HP-0HP	33.72	37.64	96.33	98.71
2HP-1HP	34.72	38.43	97.72	99.08
2HP-3HP	35.63	35.09	99.59	99.74
3HP-0HP	37.73	41.47	87.52	99.18
3HP-1HP	39.51	37.67	98.38	98.87
3HP-2HP	34.60	33.70	100.00	100.00
mean	35.56	36.84	97.24	99.34
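The contribution of the improved transfer step can be read from the last two columns of Table 3, which differ only in using WBDA or IWBDA. The sketch below computes the per-experiment gain in percentage points, locates the largest gain, and averages the gains. The values are copied from Table 3; the variable names are illustrative.

```python
experiments = ["0HP-1HP", "0HP-2HP", "0HP-3HP", "1HP-0HP", "1HP-2HP", "1HP-3HP",
               "2HP-0HP", "2HP-1HP", "2HP-3HP", "3HP-0HP", "3HP-1HP", "3HP-2HP"]

# Accuracy (%) of WHO-VMD-MFF-WBDA-KNN and MFF-IWBDA, copied from Table 3.
wbda = [98.84, 94.50, 99.06, 97.07, 99.30, 98.58, 96.33, 97.72, 99.59, 87.52, 98.38, 100.00]
iwbda = [99.58, 99.83, 99.77, 98.29, 100.00, 99.00, 98.71, 99.08, 99.74, 99.18, 98.87, 100.00]

# Gain of IWBDA over WBDA in each experiment, in percentage points.
gains = [round(new - old, 2) for old, new in zip(wbda, iwbda)]

# Experiment with the largest gain and the mean gain over all 12 experiments.
best = max(range(len(gains)), key=gains.__getitem__)
mean_gain = round(sum(gains) / len(gains), 2)

print("largest gain:", experiments[best], gains[best])
print("mean gain:", mean_gain)
```

IWBDA is never worse than WBDA in these 12 experiments, and the two tie only in 3HP-2HP, where both reach 100%.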

Table 4. Experimental results of 12 variable working conditions with the single feature (%).

Source–Target Domain	MDE	RCMDE	RCMFDE	RCmvMFE	Frequency	MFF-IWBDA
0HP-1HP	86.92	99.21	92.30	99.66	66.03	99.58
0HP-2HP	87.48	96.96	99.51	90.00	48.98	99.83
0HP-3HP	76.70	96.12	79.31	79.82	80.47	99.77
1HP-0HP	85.19	79.59	98.14	98.00	77.43	98.29
1HP-2HP	94.42	99.71	99.90	99.50	88.29	100.00
1HP-3HP	85.10	79.78	72.40	79.69	80.26	99.00
2HP-0HP	87.96	97.60	82.72	78.89	58.81	98.71
2HP-1HP	90.12	98.86	99.50	86.38	87.32	99.08
2HP-3HP	89.06	93.64	99.57	98.92	85.83	99.74
3HP-0HP	78.79	81.94	94.79	88.08	89.84	99.18
3HP-1HP	80.82	79.29	79.61	88.52	76.36	98.87
3HP-2HP	82.92	99.33	99.56	94.11	80.93	100.00
mean	85.46	91.84	91.44	90.13	76.71	99.34

Table 5. Experimental results of 12 variable working conditions with multiple feature fusion (%).

Source–Target Domain	RCMDE and RCMFDE	RCMDE, RCMFDE, and RCmvMFE	MDE, RCMDE, RCMFDE, and RCmvMFE	MFF-IWBDA
0HP-1HP	99.77	95.88	96.27	99.58
0HP-2HP	91.47	82.08	82.21	99.83
0HP-3HP	91.01	85.63	94.47	99.77
1HP-0HP	99.49	98.72	98.33	98.29
1HP-2HP	100.00	100.00	100.00	100.00
1HP-3HP	82.62	79.16	79.19	99.00
2HP-0HP	85.76	81.46	80.09	98.71
2HP-1HP	99.46	99.44	99.24	99.08
2HP-3HP	99.48	99.71	99.67	99.74
3HP-0HP	99.20	95.17	94.41	99.18
3HP-1HP	90.56	81.14	80.20	98.87
3HP-2HP	99.62	99.76	99.96	100.00
mean	94.87	91.51	92.00	99.34

Table 6. The standard deviation of 10 repeated experimental results under 12 different variable working conditions (%).

Experiment	0-1HP	0-2HP	0-3HP	1-0HP	1-2HP	1-3HP	2-0HP	2-1HP	2-3HP	3-0HP	3-1HP	3-2HP
standard deviation (%)	0.09	0.05	0.03	0.29	0	0.1	0.22	0.11	0.07	0.28	0.2	0

Table 7. Comparative experimental results of different feature transfer methods (%).

Source–Target Domain	KMM	CORAL	GFK	MEDA	Easy-TL	Ea-CORAL	Ea-PCA	TCA	JDA	BDA	IWBDA
0HP-1HP	20.00	27.82	86.04	90.89	85.11	85.92	84.97	82.48	93.61	98.84	99.58
0HP-2HP	15.03	28.14	75.88	79.86	81.07	82.11	80.97	76.53	74.38	94.5	99.83
0HP-3HP	20.00	62.89	76.4	80.34	78.51	76.41	79.18	69.46	86.51	99.06	99.77
1HP-0HP	20.00	45.47	86.27	81.98	81.43	81.72	81.42	87.42	95.51	97.07	98.29
1HP-2HP	19.90	30.51	94.43	91.72	89.5	90.14	89.01	91.01	99.5	99.3	100
1HP-3HP	17.53	42.81	89.64	93.74	84.96	84.22	85.79	81.19	98.5	98.58	99
2HP-0HP	11.90	53.09	79.13	77.63	81.94	82.13	81.91	80.41	71.28	96.33	98.71
2HP-1HP	14.13	23.99	95.16	96.84	88.97	89.38	89.06	89.74	97.32	97.72	99.08
2HP-3HP	20.00	48.98	89.6	97.66	92.69	92.7	92.49	78.78	89.41	99.59	99.74
3HP-0HP	19.91	46.96	75.98	79.84	81.44	81.16	81.6	67.28	84.46	87.52	99.18
3HP-1HP	12.19	35.72	74.54	90	85.02	87.68	85.08	76.67	97.79	98.38	98.87
3HP-2HP	18.37	25.38	81.3	80.97	94.37	95.62	94.27	85.2	99.1	100	100
mean	17.41	39.31	83.7	86.79	85.42	85.77	85.48	80.51	90.61	97.24	99.34
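All columns of Table 7 share the same classification step: after feature transfer, a K-nearest neighbors classifier trained on the labeled source features predicts the target labels, and the accuracy is the percentage of correct predictions. A minimal NumPy sketch of that step follows. The neighborhood size `k=1`, the Euclidean distance, and the toy data are assumptions made for illustration, since the text does not state the classifier settings.

```python
import numpy as np

def knn_predict(Xs, ys, Xt, k=1):
    """Predict target labels by majority vote among the k nearest source samples."""
    Xs, Xt, ys = np.asarray(Xs, dtype=float), np.asarray(Xt, dtype=float), np.asarray(ys)
    # Squared Euclidean distance between every target and every source sample.
    d2 = ((Xt[:, None, :] - Xs[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest source samples
    predictions = []
    for votes in ys[nearest]:                  # one row of k labels per target sample
        labels, counts = np.unique(votes, return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)

def accuracy(y_true, y_pred):
    """Percentage of target samples whose predicted label is correct."""
    return 100.0 * np.mean(np.asarray(y_true) == np.asarray(y_pred))

# Toy example: two fault classes in a two-dimensional feature space.
Xs = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
ys = np.array([1, 1, 2, 2])
Xt = np.array([[0.2, 0.1], [5.1, 5.4]])
yt = np.array([1, 2])
pred = knn_predict(Xs, ys, Xt, k=1)
print(pred, accuracy(yt, pred))
```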

Table 8. Comparison between the model and recent research (%).

Literature	Year	Number of Working Conditions	Experiment	Accuracy (%)	Average Accuracy (%)	Number of Samples for Each Type of Fault
[29]	2025	3	0-1	99.4	98.8	1000
			0-2	* 98.5
			0-3	* 98.5
[45]	2020	6	0-1,2 1-0,2 2-0,1	90.5	90.5	50
[46]	2025	12	0-1	97.82	93.89	/
			0-2	91.72
			0-3	88.54
			1-0	99.25
			1-2	98.49
			1-3	91.06
			2-0	93.79
			2-1	93.77
			2-3	96.95
			3-0	83.82
			3-1	93.37
			3-2	98.14
[47]	2025	6	1-2	* 99.3	96.99	500
			1-3	* 97.5
			2-1	* 94.1
			2-3	* 98.8
			3-1	* 94.4
			3-2	* 97.9
[48]	2017	6	1-2	99.4	95.9	685
			1-3	93.4
			2-1	97.5
			2-3	97.2
			3-1	88.3
			3-2	99.9
The proposed method	/	12	0-1	99.58	99.34	100
			0-2	99.83
			0-3	99.77
			1-0	98.29
			1-2	100
			1-3	99
			2-0	98.71
			2-1	99.08
			2-3	99.74
			3-0	99.18
			3-1	98.87
			3-2	100

“*” indicates that the data are approximate values obtained from the literature.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
