Article

Fault Diagnosis for Imbalanced Datasets Based on Deep Convolution Fuzzy System

1 Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China
2 Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, Hangzhou 310056, China
* Author to whom correspondence should be addressed.
Machines 2025, 13(4), 326; https://doi.org/10.3390/machines13040326
Submission received: 19 March 2025 / Revised: 10 April 2025 / Accepted: 13 April 2025 / Published: 17 April 2025
(This article belongs to the Section Machines Testing and Maintenance)

Abstract

To address the data imbalance issue in the process of collecting bearing fault data in industrial environments and to enhance the robustness and generalization ability of fault diagnosis, this paper proposes a bearing fault diagnosis method based on a Bidirectional Autoregressive Variational Autoencoder (BAVAE) and a Deep Convolutional Interval Type-2 Fuzzy System (DCIT2FS). First, the method extracts features from the imbalanced dataset using dual-tree complex wavelet transform (DTCWT), and then feeds the feature dataset into the proposed BAVAE for data augmentation. The BAVAE improves data generation capabilities by introducing autoregressive distributions to learn latent variables, iteratively obtaining complex high-order latent variables, and amplifying inter-class differences through the introduction of feature discrimination loss during training. Given that relying solely on data augmentation under imbalanced data conditions may lead to overfitting or underfitting, this paper combines the generalization approximation ability of Interval Type-2 (IT2) fuzzy systems with the feature extraction capability of deep convolutional networks, achieving a better balance between model complexity and feature transformation, thereby enhancing the stability and accuracy of the final diagnosis.

1. Introduction

Rolling bearings are essential components commonly used in mechanical systems, responsible for supporting rotating shafts and transmitting loads, thereby playing a critical role in the stable operation of these systems [1,2]. With the development of intelligent manufacturing and industrial automation, the demand for bearing reliability and service life has increased, and industrial standards have imposed strict requirements on bearing condition monitoring. Accurately diagnosing bearing health under complex working conditions and preventing sudden failures has become a pressing issue for enterprises.
Although deep learning-based fault diagnosis methods have shown excellent performance, sufficient bearing fault data for model training are often unavailable in real industrial environments. Therefore, how to perform effective fault diagnosis under limited or imbalanced data remains a critical problem in current research. Data augmentation is a typical approach for addressing imbalanced datasets in fault diagnosis, and many augmentation methods have proven effective [3,4,5]. Owing to the powerful generation capabilities of generative models, which can produce realistic samples to augment the dataset and improve model generalization, methods like Generative Adversarial Networks (GAN) [6] and Variational Autoencoders (VAE) [7] have been widely applied to fault diagnosis [8]. While data augmentation based on GAN and VAE has shown effectiveness on various imbalanced fault diagnosis datasets, most methods generate data directly from raw vibration data [9,10], spectra [11,12,13], or learned features [14]. Raw vibration data and spectra have complex distributions, making generation more difficult, and the lack of a feature extraction phase before classification leads to lower diagnostic accuracy. Rezazadeh et al. [15] proposed a novel transfer learning framework combining wavelet transforms, multilayer perceptrons (MLP), and transformer encoders with sequential domain adaptation. Their approach requires only a limited number of labeled samples from the target domain and achieves significant performance even under strong domain shifts. Although deep learning feature extraction can improve generation performance, it is difficult to learn discriminative features with limited samples, often resulting in the loss of fault information.
Existing bearing fault diagnosis methods are primarily based on machine learning algorithms, which model vibration signals or sensor data to predict bearing health. For example, Artificial Neural Networks (ANN) [16,17,18], Support Vector Machines (SVM) [19,20], and Random Forests (RF) [21,22] are widely applied in bearing fault detection. Reference [17] proposes using mutual information to select features and builds a fault diagnosis model using deep neural networks. Reference [21] combines Random Forest algorithms for feature importance analysis to identify key features related to bearing faults and constructs efficient diagnostic models. While these methods improve diagnostic accuracy, they typically require a large amount of labeled data and long training times. The robustness and adaptability of these models are still limited when dealing with uncertainty and variable working conditions.
Fuzzy systems based on rules and linguistic concepts are insensitive to feature variables, requiring no complex feature adjustment or training process, making them better suited to adapt to existing data in cases of scarce or incomplete labeled data, thereby effectively achieving fault diagnosis. Reference [22] proposes an adaptive feature selection Interval Type-2 (IT2) fuzzy system, which automatically selects key input variables. The IT2 fuzzy system shows higher accuracy than Type-1 fuzzy systems in fault prediction. Studies have shown that IT2 fuzzy systems, with their Footprint of Uncertainty (FOU), have a stronger ability to describe uncertainty [23,24]. However, IT2 fuzzy systems are complex in terms of feature variable selection and type reduction, resulting in longer construction and computation times. In recent years, research combining deep learning and fuzzy systems has gradually gained momentum. Wang et al. [25] designed a deep convolution fuzzy system (DCFS) for high-dimensional input spaces using low-dimensional fuzzy systems, where each layer used a Type-1 Wang–Mendel fuzzy system, and applied it to chaotic time series prediction. However, the advantages of IT2 fuzzy systems were not considered in this research, and the issue of information loss in deep computation was not addressed.
In summary, to enhance the fault diagnosis accuracy of rolling bearings on imbalanced datasets, this paper presents a diagnostic method based on a Deep Convolutional Interval Type-2 Fuzzy System. First, multi-scale features of both minority and majority data in each class are extracted as input using DTCWT. Then, a multi-scale feature enhancement method based on a Bidirectional Autoregressive VAE is proposed. Unlike previous methods, this paper improves the VAE's generation capability by introducing BAVAE, iteratively learning lower-level latent variables to obtain complex higher-level latent variable distributions. Additionally, a feature discrimination loss is added to the training loss to expand inter-class differences, and the generated dataset is combined with the original dataset for augmentation. Finally, an IT2 model based on irregular Gaussian functions is incorporated into the deep convolution network to construct the Deep Convolutional Interval Type-2 Fuzzy System. This model combines the approximation capabilities of the IT2 fuzzy model and the hierarchical structure of deep learning, achieving a better balance between model complexity and feature transformation, thus improving the accuracy of bearing fault diagnosis.

2. Data Description and Related Theories

2.1. Data Description

The dataset used in this section was obtained from the Bearing Data Center of Case Western Reserve University (CWRU), one of the most widely analyzed public datasets in the research community. The experimental setup consists of a torque transducer/encoder, a motor, a dynamometer, and control electronics, as depicted in Figure 1. The bearings tested were deep groove ball bearings from Svenska Kullager-Fabriken (SKF, Gothenburg, Sweden) and NTN, which support the motor shaft. Vibration data were collected using a 16-channel DAT recorder at sampling rates of 12 kHz and 48 kHz. More detailed information can be found in [26].
The bearing model used in this study is SKF6202, and the bearing faults were artificially induced using electrical discharge machining (EDM). In this article, we use vibration signals from faulty bearings collected at the drive end with a sampling frequency of 12 kHz. The dataset records motor operation data at four different rotational speeds. It includes faults occurring at three distinct locations: the rolling element, the inner race, and the outer race at the six o'clock position, with damage diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm. Based on the bearing damage conditions, the dataset is categorized into ten states, including the normal state and different fault severity levels. The corresponding labels for both normal and faulty conditions are shown in Table 1.

2.2. Interval Type-2 Fuzzy System

The fundamental concept of a fuzzy system is to represent an expert's knowledge, understanding, or strategies for a specific object or process through a series of “IF (condition) THEN (action)” rules, and to drive the action set via fuzzy inference. The structure of a fuzzy system is shown in Figure 2. A Type-1 fuzzy system consists of a fuzzifier, a rule base, an inference engine, and a defuzzifier. Compared with a Type-1 fuzzy system, the IT2 fuzzy system includes an additional type-reducer, which converts the Interval Type-2 fuzzy set output by the inference engine into a Type-1 fuzzy set before defuzzification.
Consider an IT2 fuzzy system with C rules, using a zero-order Takagi–Sugeno-type, where each rule has the following form:
$$\text{Rule } j: \text{ IF } x_1^p \text{ is } A_{1,j} \text{ and } \cdots \text{ and } x_n^p \text{ is } A_{n,j}, \text{ THEN } y^p \text{ is } \hat{y}_j, \quad j = 1, 2, \ldots, C \tag{1}$$
where $A_{i,1}, \ldots, A_{i,C}$, $i = 1, \ldots, n$, are IT2 fuzzy sets and $n$ is the dimension of the input vector; $\tilde{y}_j$ is generally an interval $[\underline{y}_j, \bar{y}_j]$, which can be understood as the centroid of the rule-consequent fuzzy set. We let $\hat{y}_j = \bar{y}_j$ so that it degenerates into a constant, simplifying subsequent calculations. Given a multi-input single-output dataset $Z = \{z^p, p = 1, \ldots, N\}$, where a sample is denoted $z^p = (x_1^p, \ldots, x_n^p; y^p)$, the general steps for computing the IT2 fuzzy system output are as follows:
(1) Calculate the membership degrees $[\underline{u}_{i,j}, \bar{u}_{i,j}]$ for the input variables $x_i^p$, where $i = 1, \ldots, n$ and $j = 1, \ldots, C$. The membership degree of an Interval Type-2 fuzzy set constitutes an interval range, known as the Footprint of Uncertainty, which is bounded by upper and lower membership functions, as illustrated in Figure 3.
(2) Calculate the firing strength interval $F_j$ for the $j$-th rule, as shown in Equation (2), where $\bar{f}_j$ and $\underline{f}_j$ represent the upper and lower bounds of the firing strength interval.
$$F_j(x^p) = \left[\,\underline{\mu}_{A_{1,j}}(x_1^p) \cdots \underline{\mu}_{A_{n,j}}(x_n^p),\; \bar{\mu}_{A_{1,j}}(x_1^p) \cdots \bar{\mu}_{A_{n,j}}(x_n^p)\,\right] = [\underline{f}_j, \bar{f}_j], \quad j = 1, \ldots, C \tag{2}$$
(3) Perform type-reduction to integrate F j ( x p ) with the corresponding rule consequents. For the most common center-of-sets type reducer, the calculation proceeds as follows:
$$Y_{\cos}(x^p) = \bigcup_{f_j \in F_j} \frac{\sum_{j=1}^{C} f_j \tilde{y}_j}{\sum_{j=1}^{C} f_j} = [y_l, y_r] \tag{3}$$
$$y_l = \frac{\sum_{j=1}^{L} \bar{f}_j \tilde{y}_j + \sum_{j=L+1}^{C} \underline{f}_j \tilde{y}_j}{\sum_{j=1}^{L} \bar{f}_j + \sum_{j=L+1}^{C} \underline{f}_j} \tag{4}$$
$$y_r = \frac{\sum_{j=1}^{R} \underline{f}_j \tilde{y}_j + \sum_{j=R+1}^{C} \bar{f}_j \tilde{y}_j}{\sum_{j=1}^{R} \underline{f}_j + \sum_{j=R+1}^{C} \bar{f}_j} \tag{5}$$
$$y = \frac{y_l + y_r}{2} \tag{6}$$
where $\tilde{y}_j$ ($j = 1, \ldots, C$) denotes the fuzzy rule outputs sorted in ascending order ($\tilde{y}_1 \le \tilde{y}_2 \le \cdots \le \tilde{y}_C$), with the corresponding $F_j$ reordered to align with the sorted $\tilde{y}_j$. $L$ and $R$ are switching points determined via the Enhanced Karnik–Mendel algorithm, and $y$ is the final predicted output.
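As an illustration, steps (1)–(3) can be sketched in Python. This is a minimal sketch assuming crisp rule consequents (Equation (1)) and precomputed firing-strength bounds; the switching points are found by exhaustive search rather than the Enhanced Karnik–Mendel iteration, which yields the same result for small rule bases:

```python
import numpy as np

def it2_predict(f_lower, f_upper, y_tilde):
    """Interval Type-2 inference with center-of-sets type reduction.

    f_lower, f_upper : lower/upper firing strengths of the C rules.
    y_tilde          : crisp rule consequents (Eq. (1) with y^_j a constant).
    """
    order = np.argsort(y_tilde)              # sort consequents ascending
    fl, fu, y = f_lower[order], f_upper[order], y_tilde[order]
    C = len(y)

    # y_l: upper strengths for j <= L, lower strengths afterwards (Eq. (4))
    y_l = min(
        (np.sum(fu[:L] * y[:L]) + np.sum(fl[L:] * y[L:])) /
        (np.sum(fu[:L]) + np.sum(fl[L:]))
        for L in range(1, C)
    )
    # y_r: lower strengths for j <= R, upper strengths afterwards (Eq. (5))
    y_r = max(
        (np.sum(fl[:R] * y[:R]) + np.sum(fu[R:] * y[R:])) /
        (np.sum(fl[:R]) + np.sum(fu[R:]))
        for R in range(1, C)
    )
    return (y_l + y_r) / 2                   # defuzzified output (Eq. (6))
```

When the upper and lower strengths coincide, the output reduces to the ordinary Type-1 weighted average, which is a useful sanity check.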

2.3. Variational Autoencoder Fundamentals

The Variational Autoencoder has emerged as an effective generative model in deep learning due to its traceable sampling process and powerful generative capabilities. As illustrated in Figure 4, the VAE framework resembles autoencoder architectures: input data are encoded into latent variables via an encoder model, constrained by a loss function, to match a predefined prior distribution. Sampled latent variables from this distribution are then decoded to produce generated samples. In VAEs, the encoder and decoder networks represent variational inference and generative processes, respectively, both implemented via neural networks.
The training objective of VAE is to maximize the log marginal likelihood of observed samples:
$$\log P(x) = D_{\mathrm{KL}}\left(q_\phi(z|x) \,\|\, p_\theta(z|x)\right) + \mathcal{L}(\theta, \phi; x) \tag{7}$$
where x denotes real data samples, p θ ( z | x ) represents the posterior distribution conditioned on observations, and q ϕ ( z | x ) is the approximate posterior. VAEs employ deep neural networks to fit the true posterior using the approximate distribution, regularized by the KL divergence. Simplifying Equation (7) yields the VAE loss function:
$$\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{\mathrm{KL}}\left(q_\phi(z|x) \,\|\, p_\theta(z)\right) \tag{8}$$
where the first term corresponds to the reconstruction error between real data and generated samples, while the second term quantifies the divergence between the latent prior p θ ( z ) (typically a standard Gaussian for tractable sampling) and the approximate posterior. Sampling is achieved via the reparameterization trick [5].
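For a Gaussian encoder and a standard-normal prior, the two terms of Equation (8) have simple closed forms. The sketch below assumes a Gaussian decoder, so the expected log-likelihood reduces (up to constants) to a squared-error reconstruction term, and shows the reparameterization trick used for sampling:

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var):
    """Negative ELBO of Eq. (8) for q_phi(z|x) = N(mu, diag(exp(log_var)))
    and prior p_theta(z) = N(0, I). Closed-form KL for diagonal Gaussians:
    0.5 * sum(exp(log_var) + mu^2 - 1 - log_var)."""
    recon = np.sum((x - x_hat) ** 2)                       # reconstruction error
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return recon + kl

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I),
    which keeps the sampling step differentiable w.r.t. mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```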
Conventional VAEs sample latent variables from isotropic Gaussian distributions, yet decoders struggle to map such simplistic priors to complex real-world data distributions. This limitation constrains model expressiveness and degrades generalization capability in imbalanced fault diagnosis scenarios.

3. Proposed Fault Diagnosis Method and Model Structure

3.1. Dual-Tree Complex Wavelet Transform

Wavelet transform has shown incomparable capabilities in non-stationary signal analysis and the decomposition of complex compound signals. Dual-tree complex wavelet transform (DTCWT) [28], one of the practical implementations of the complex wavelet transform (CWT), has several promising advantages over real wavelet transforms.
Inheriting the idea of the Fourier transform, DTCWT exploits the imaginary unit $j$ to encode the phase information of signals. The forward and inverse transforms of DTCWT employ two groups of filter banks (FBs), shown in Figure 5. DTCWT consists of a two-channel discrete wavelet transform (DWT) with two different real wavelet functions $\psi_h(t)$ and $\psi_g(t)$; thus, the composed complex wavelet function can be represented as follows:
$$\psi_c(t) = \psi_h(t) + j\,\psi_g(t) \tag{9}$$
To achieve better analysis capability, DTCWT requires $\psi_c(t)$ to be approximately analytic, so that $\psi_h(t)$ and $\psi_g(t)$ form a Hilbert transform pair [29]:
$$\psi_h(t) = \mathcal{H}\left[\psi_g(t)\right] \tag{10}$$
where $\mathcal{H}[\cdot]$ denotes the Hilbert transform. However, an exactly analytic function is not a valid wavelet function, as it lacks finite support and fast decay. To bridge the gap between a valid wavelet function and better analysis capability, $\psi_h(t)$ should be as close as possible to the Hilbert transform of $\psi_g(t)$ while $\psi_c(t)$ remains a valid wavelet function.
The technique to design two parallel FBs requires that the low-pass filters of both real and imaginary trees are approximately a half-sample shift to each other [29]:
$$g_0(n) \approx h_0(n - 0.5) \tag{11}$$
Based on the above conditions, the decomposition and reconstruction algorithms can be concluded as follows:
$$d_j^{Re}(n) = 2^{j/2} \int_{-\infty}^{+\infty} x(t)\, \psi_h(2^j t - n)\, dt, \quad j = 1, \ldots, J, \qquad c_J^{Re}(n) = 2^{J/2} \int_{-\infty}^{+\infty} x(t)\, \phi_h(2^J t - n)\, dt \tag{12}$$
where $j$ is the decomposition level and $J$ is the maximum level. $d_j^{Re}(n)$ are the high-frequency coefficients at level $j$, and $c_J^{Re}(n)$ are the low-frequency coefficients at the final level $J$ of the real tree. Similarly, the imaginary tree is decomposed under $\psi_g(t)$. Combining the real and imaginary parts, the complex coefficients of DTCWT at each level can be derived as follows:
$$d_j^{C}(n) = d_j^{Re}(n) + j\, d_j^{Im}(n), \qquad c_J^{C}(n) = c_J^{Re}(n) + j\, c_J^{Im}(n) \tag{13}$$
where the superscript $Im$ denotes the corresponding coefficients of the imaginary tree. The reconstructed real signal at each level can be obtained using the following equations:
$$d_j(t) = 2^{(j-1)/2} \left[ \sum_k d_j^{Re}(k)\, \psi_h(2^j t - k) + \sum_k d_j^{Im}(k)\, \psi_g(2^j t - k) \right], \quad j = 1, \ldots, J$$
$$a_J(t) = 2^{(J-1)/2} \left[ \sum_k c_J^{Re}(k)\, \phi_h(2^J t - k) + \sum_k c_J^{Im}(k)\, \phi_g(2^J t - k) \right] \tag{14}$$
DTCWT thus extends the DWT to the complex domain. Unlike real wavelet transforms, complex wavelet features exhibit smoothness and regularity, making them easier to model and generate.
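The two-tree decomposition of Equations (12) and (13) can be sketched as follows. The filter banks here are illustrative placeholders (Haar and a shifted variant), not a properly designed half-sample-delay pair satisfying Equation (11); a real implementation would use q-shift or similar filters:

```python
import numpy as np

def dwt_level(x, h0, h1):
    """One analysis level of a real DWT: filter with low-pass h0 / high-pass h1,
    then downsample by 2 (periodic extension for simplicity)."""
    n = len(x)
    conv = lambda f: np.array([sum(f[k] * x[(i - k) % n] for k in range(len(f)))
                               for i in range(n)])
    return conv(h0)[::2], conv(h1)[::2]   # (approximation, detail)

def dtcwt_level(x, real_fb, imag_fb):
    """One DTCWT level: run the real tree under psi_h and the imaginary tree
    under psi_g, then combine the details as d^C = d^Re + j * d^Im (Eq. (13))."""
    c_re, d_re = dwt_level(x, *real_fb)
    c_im, d_im = dwt_level(x, *imag_fb)
    return (c_re, c_im), d_re + 1j * d_im

# Illustrative filter banks: Haar for the real tree, a shifted variant for the
# imaginary tree (placeholders only, not a true Hilbert pair).
s = 1 / np.sqrt(2)
real_fb = ([s, s], [s, -s])
imag_fb = ([0.0, s, s], [0.0, s, -s])
```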

3.2. Bidirectional Autoregressive VAE

Traditional VAEs apply simple Gaussian priors on latent variables for ease of training. However, these oversimplified distributions limit the ability to generate complex industrial patterns. To address this limitation, we propose a Bidirectional Autoregressive VAE that enhances latent expressiveness through autoregressive modeling. As depicted in Figure 6, BAVAE integrates residual networks into its feature extraction module, structurally improving the decoder’s ability to synthesize high-fidelity industrial signals.
In the Bidirectional Autoregressive VAE, latent variables are partitioned into multiple mutually independent groups:
{ z 1 , z 2 , , z L }
where $L$ denotes the number of groups. Starting from the first-layer latent variable $z_1$, each higher-layer group is hierarchically conditioned on the preceding groups through autoregressive modeling, and the higher-layer latent distributions are learned via neural networks. The prior and approximate posterior distributions in BAVAE are formulated as follows:
$$p_\theta(z) = p_\theta(z_1)\, p_\theta(z_2|z_1) \cdots p_\theta(z_L|z_{<L}), \qquad q_\phi(z|x) = q_\phi(z_1|x)\, q_\phi(z_2|z_1, x) \cdots q_\phi(z_L|z_{<L}, x) \tag{16}$$
where the first-layer prior is defined as a standard Gaussian, while subsequent priors follow factorized Gaussian distributions with learnable mean and variance parameters. These parameters are adaptively derived from preceding layers via neural networks, regularized by KL divergence to constrain deviations from the prior. The highest-layer latent distribution is conditioned on all preceding variables and decoded into generated signals, governed by KL divergence and reconstruction loss. The total KL divergence for BAVAE decomposes as follows:
$$D_{\mathrm{KL}}\left(q_\phi(z|x) \,\|\, p_\theta(z)\right) = D_{\mathrm{KL}}\left(q_\phi(z_1|x) \,\|\, p_\theta(z_1)\right) + \sum_{l=2}^{L} \mathbb{E}_{q_\phi(z_{<l}|x)}\left[ D_{\mathrm{KL}}\left(q_\phi(z_l|x, z_{<l}) \,\|\, p_\theta(z_l|z_{<l})\right) \right] \tag{17}$$
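The layer-wise KL decomposition of Equation (17) can be sketched with diagonal Gaussians, using the closed-form KL between two Gaussians. Here the conditional priors are passed in directly rather than produced by the top-down network, giving a single-sample Monte Carlo estimate of the expectation:

```python
import numpy as np

def kl_diag_gauss(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, e^logvar_q) || N(mu_p, e^logvar_p) ), diagonal Gaussians."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def bavae_kl(posteriors, priors):
    """Total KL of Eq. (17): the first latent group is matched against N(0, I),
    each higher group against the learned conditional prior from the preceding
    groups. `posteriors` has L (mu, logvar) pairs; `priors` has L - 1 pairs for
    groups 2..L. In the full model the priors come from the top-down network."""
    mu1, lv1 = posteriors[0]
    total = kl_diag_gauss(mu1, lv1, np.zeros_like(mu1), np.zeros_like(lv1))
    for (mu_q, lv_q), (mu_p, lv_p) in zip(posteriors[1:], priors):
        total += kl_diag_gauss(mu_q, lv_q, mu_p, lv_p)
    return total
```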
As shown in Figure 6, the architecture of the bidirectional network integrates the encoder and decoder through latent variables. The network on the left, from bottom to top, represents the encoding process. After obtaining the latent variables at each layer, the higher-level latent variables are derived from the encoded information in the preceding layers, moving from top to bottom. Simultaneously, the prior distributions of different latent variables are learned, ultimately decoding into the generated signal.
This autoregressive hierarchy enhances latent expressiveness for complex industrial data generation. The bidirectional design streamlines training by eliminating redundant connections and reducing computational overhead.
Although the proposed BAVAE draws inspiration from hierarchical latent variable models such as Ladder VAE [30] and PixelVAE [31], it differs significantly in both structure and application. In contrast to Ladder VAE’s skip-connections and auxiliary variables, BAVAE adopts autoregressive modeling between latent variable groups in a bidirectional manner. Compared with PixelVAE’s spatial pixel-level dependency for images, BAVAE is tailored to vibration feature sequences and incorporates residual networks for industrial signal generation. Moreover, a feature discrimination loss is introduced to guide the latent representation to be class-sensitive, which is crucial for handling class imbalance in industrial fault diagnosis.

3.3. Enhanced Loss Function

The conventional VAE loss, as shown in Equation (8), jointly optimizes reconstruction fidelity and latent regularization; its reconstruction term is computed as the squared error between real and reconstructed samples:
$$\mathcal{L}_{\mathrm{recon}} = \sum_{n=1}^{N} \left\| x_n - \hat{x}_n \right\|^2 \tag{18}$$
To enhance the discriminability of data generated on imbalanced datasets, this paper introduces a feature discrimination loss function based on the existing two loss functions. Specifically, an additional mapping network is constructed on the basis of the last latent variable layer, comprising global pooling and fully connected layers, with the output being the predicted labels of the latent variables. By widening the gap between samples of different classes during the training process of the Variational Autoencoder, the loss function can be formulated as follows:
$$\mathcal{L}_{\mathrm{dis}} = -\sum_{n=1}^{N} \sum_{k=1}^{K} \left[ y_{nk} \log\left([f(z_n)]_k\right) + (1 - y_{nk}) \log\left(1 - [f(z_n)]_k\right) \right] \tag{19}$$
where f ( · ) denotes the latent mapping function. The composite loss is calculated as follows:
$$\mathcal{L}(\theta, \phi; x) = \mathcal{L}_{\mathrm{recon}} + \beta \mathcal{L}_{\mathrm{KL}} + \alpha \mathcal{L}_{\mathrm{dis}} \tag{20}$$
Here, $\alpha$ and $\beta$ are tunable weights, and the KL term is the divergence given in Equation (17). By incorporating $\mathcal{L}_{\mathrm{dis}}$, latent representations become class-sensitive during training, enlarging inter-class margins. This facilitates fault classification by generating label-aware synthetic samples, ultimately boosting diagnostic accuracy.
During training, the reconstruction loss ensures the generated samples closely resemble the real samples in the feature space, promoting fidelity. Meanwhile, the feature discrimination loss introduces an additional constraint on the latent space to make the representations more class-discriminative. These two loss components are complementary: while the reconstruction loss focuses on input–output consistency, the discrimination loss pulls latent features apart across classes. The joint optimization drives the BAVAE to generate not only realistic but also class-aware samples, which is crucial for learning under data imbalance.
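A minimal sketch of the composite objective follows, assuming the mapping network $f(\cdot)$ ends in a softmax so the discrimination term reduces to cross-entropy over one-hot labels (the softmax head and the default weights are our assumptions; the text leaves them unspecified):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

def discrimination_loss(logits, labels):
    """Feature-discrimination loss of Eq. (19): cross-entropy between the
    mapping network's class predictions f(z) and the integer class labels."""
    probs = softmax(logits)
    n = len(labels)
    return -np.sum(np.log(probs[np.arange(n), labels] + 1e-12))

def composite_loss(x, x_hat, kl, logits, labels, alpha=0.1, beta=1.0):
    """Composite objective of Eq. (20): reconstruction + beta * KL +
    alpha * discrimination. alpha and beta are the tunable weights."""
    recon = np.sum((x - x_hat) ** 2)
    return recon + beta * kl + alpha * discrimination_loss(logits, labels)
```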

3.4. Deep Convolutional Interval Type-2 Fuzzy System

While data augmentation increases sample quantity, it may fail to address inherent class imbalance in original datasets. Augmented samples often deviate from the true feature distribution of real-world data, leading to biased learning of class distributions. To establish a robust bearing fault diagnosis model, we integrate Interval Type-2 fuzzy systems with deep convolutional networks via a bottom-up hierarchical architecture, as illustrated in Figure 7.
The internal structure of the $l$-th layer ($l = 1, \ldots, L$) fuzzy subsystem $FS_i^l$ ($i = 1, \ldots, n_l$) in the Deep Convolutional Fuzzy System is illustrated in Figure 8. Its input vector is $I_i^l = (x_i^{l-1}, \ldots, x_{m+i-1}^{l-1})$, where $m$ is a small positive integer. Each input variable is associated with $q$ ($q \ge 2$) fuzzy sets defined by triangular membership functions:
$$A(x_*^{l-1}) = \begin{cases} \dfrac{\sigma - \left| x_*^{l-1} - x_{*c}^{l-1} \right|}{\sigma}, & x_*^{l-1} \in \left[x_{*c}^{l-1} - \sigma,\; x_{*c}^{l-1} + \sigma\right] \\[4pt] 0, & x_*^{l-1} \in \left[\min x_*^{l-1},\; x_{*c}^{l-1} - \sigma\right] \cup \left[x_{*c}^{l-1} + \sigma,\; \max x_*^{l-1}\right] \end{cases} \tag{21}$$
where $x_{*c}^{l-1}$ ($*$ denoting any input variable) is the center of the membership function, and $\sigma$ represents the distance from the center to its endpoints. The global minimum $\min x_*^{l-1}$ and maximum $\max x_*^{l-1}$ are determined from the training data. The $FS_i^l$ subsystem is governed by $q^m$ fuzzy rules:
$$R^*: \text{IF } x_i^{l-1} \text{ is } A^{j_1} \text{ and } \cdots \text{ and } x_{m+i-1}^{l-1} \text{ is } A^{j_m}, \text{ THEN } x_i^l \text{ is } B^{j_1, \ldots, j_m} \tag{22}$$
where R * denotes a rule index, j 1 , , j m are fuzzy set indices for each of the m input variables, ranging from 1 to q, and B j 1 , , j m is the output fuzzy set. Defuzzification follows the center-of-sets (COSD) method. The input–output mapping of FS i l is as follows:
$$x_i^l = FS_i^l\left(x_i^{l-1}, \ldots, x_{m+i-1}^{l-1}\right) = \frac{\sum_{j_1=1}^{q} \cdots \sum_{j_m=1}^{q} c^{j_1, \ldots, j_m}\, A^{j_1}(x_i^{l-1}) \cdots A^{j_m}(x_{m+i-1}^{l-1})}{\sum_{j_1=1}^{q} \cdots \sum_{j_m=1}^{q} A^{j_1}(x_i^{l-1}) \cdots A^{j_m}(x_{m+i-1}^{l-1})} \tag{23}$$
Here, $c^{j_1, \ldots, j_m}$ represents the center value of the corresponding output fuzzy set $B^{j_1, \ldots, j_m}$, which is also the trainable parameter of $FS_i^l$. The product $A^{j_1}(x_i^{l-1}) \cdots A^{j_m}(x_{m+i-1}^{l-1})$ is the firing strength of the fuzzy rule in Equation (22), denoted $w^{j_1, \ldots, j_m} = A^{j_1}(x_i^{l-1}) \cdots A^{j_m}(x_{m+i-1}^{l-1})$.
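One fuzzy subsystem $FS_i^l$ can be sketched as follows, with triangular memberships per Equation (21) and a center-of-sets weighted average over all $q^m$ rules; the dictionary of consequent centers stands in for the trainable parameters:

```python
import numpy as np
from itertools import product

def tri_membership(x, center, sigma):
    """Triangular membership of Eq. (21): peaks at `center`, support 2*sigma."""
    return np.maximum(0.0, (sigma - np.abs(x - center)) / sigma)

def fuzzy_subsystem(inputs, centers, sigma, c):
    """One FS_i^l block: m inputs, q fuzzy sets per input, q^m rules (Eq. (22)).
    `centers` has shape (m, q); `c` maps each rule index tuple to its consequent
    center c^{j1..jm}. Output is the COS-defuzzified weighted average (Eq. (23))."""
    m, q = centers.shape
    num = den = 0.0
    for rule in product(range(q), repeat=m):        # all q^m rule combinations
        w = np.prod([tri_membership(inputs[i], centers[i, j], sigma)
                     for i, j in enumerate(rule)])  # firing strength w^{j1..jm}
        num += c[rule] * w
        den += w
    return num / den if den > 0 else 0.0
```

Stacking such blocks with a sliding window of $m$ inputs per subsystem gives the convolutional layer structure of Figure 8.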
In the DCIT2FS model, each fuzzy subsystem ($l = 1, 2, \ldots, L$; $s = 1, 2, \ldots, n$) is an Interval Type-2 fuzzy system, whose fuzzy membership functions are represented as follows:
$$A_1(x(k)) = 1 - \frac{1}{1 + e^{-\beta_1 (x(k) - \alpha_1)}}, \qquad A_3(x(k)) = \frac{1}{1 + e^{-\beta_3 (x(k) - \alpha_3)}}, \qquad A_2(x(k)) = 1 - A_1(x(k)) - A_3(x(k)) \tag{24}$$
In the equation, $\alpha$ and $\beta$ are the two parameters of the asymmetric sigmoid function, with $\alpha \in [\alpha - \delta, \alpha + \delta]$ ($\delta$ is a user-defined constant), which produces the Footprint of Uncertainty. Equation (1) illustrates the general Interval Type-2 fuzzy rule. This structure forms the foundation for constructing the predictive model DCIT2FS.
$$R^*: \text{IF } x_s^{l-1} \text{ is } A_1^* \text{ and } x_{s+1}^{l-1} \text{ is } A_2^* \text{ and } \cdots \text{ and } x_{s+m-1}^{l-1} \text{ is } A_m^*, \text{ THEN } \tilde{y}^* = p_0^* + p_1^* x_1 + \cdots + p_m^* x_m \tag{25}$$
In the equation, $A$ is a Type-2 fuzzy set, while the polynomial coefficients $p$ of the consequent are described by Type-1 fuzzy sets with predefined upper and lower bounds. The output interval of each rule is calculated as follows ($i = 1, 2, \ldots, c$):
$$\tilde{y}_i \in \left[y_i^l, y_i^h\right] = \left[p_{i0}^l, p_{i0}^h\right] + \left[p_{i1}^l, p_{i1}^h\right] x_1 + \cdots + \left[p_{im}^l, p_{im}^h\right] x_m \tag{26}$$
The type-reduced set of the model is an Interval Type-1 fuzzy set:
$$\tilde{y} \in \left[y^l, y^h\right] = \left[ \frac{\sum_{i=1}^{c} \mu_i^l y_i^l}{\sum_{i=1}^{c} \mu_i^l},\; \frac{\sum_{i=1}^{c} \mu_i^h y_i^h}{\sum_{i=1}^{c} \mu_i^h} \right] \tag{27}$$
The precise output of the model is obtained using the centroid method:
$$\hat{y} = \frac{y^l + y^h}{2} \tag{28}$$
Each first-layer fuzzy system is regarded as a weak evaluator that outputs based on only a small subset of input variables. The fuzzy system adopts an IT2 fuzzy model based on an uncertain Gaussian function. The antecedent parameters of this model are optimized using the PSO algorithm.
As the number of features or fuzzy rules grows, scalability becomes an important consideration. To address this issue and maintain computational efficiency, several techniques are incorporated in the proposed DCIT2FS model. When the number of fuzzy rules increases, rule reduction methods such as K-means or fuzzy C-means clustering can be used to reduce the rule count and thus the computational cost. To handle growth in feature dimensionality, feature selection techniques like Principal Component Analysis (PCA) and Mutual Information (MI) identify the most relevant features and reduce dimensionality, alleviating the computational burden. Furthermore, GPU acceleration of the fuzzy inference process significantly enhances processing speed, making the model feasible for real-time applications. Together, these strategies keep the DCIT2FS model scalable and efficient as the number of features or fuzzy rules increases, maintaining its applicability in real-time scenarios.

3.5. Proposed Fault Diagnosis Framework

The proposed bearing fault diagnosis framework for imbalanced datasets integrates Deep Convolutional fuzzy systems with Bidirectional Autoregressive VAEs, as illustrated in Figure 9. The key steps are as follows:
(1) Data preprocessing: The raw data were subjected to outlier detection and denoising. For imbalanced datasets, DTCWT was employed to extract feature values. The extracted features were then concatenated to construct a multi-scale feature training dataset.
(2) Data augmentation: A BAVAE network model was developed, and hyperparameters were selected accordingly. The network was trained using the multi-scale feature dataset. Noise was sampled from a standard Gaussian distribution to generate synthetic data, which was then combined with real data to train a classification model.
(3) Model training and fault diagnosis: The augmented dataset was used to train a DCIT2FS. After training, the test dataset was preprocessed and fed into the diagnostic model to complete the fault diagnosis algorithm.

4. Experimental Validation

4.1. Experimental Setup

Experiments were performed using the CWRU bearing dataset to validate the proposed fault diagnosis method. To simulate imbalanced data scenarios, 30% and 50% of samples from each fault class were randomly selected for training, while all normal-condition samples were retained. The dual-tree complex wavelet transform was then applied to the imbalanced dataset for multi-scale feature extraction, generating an imbalanced feature set.
To better quantify the performance of the proposed algorithm, this paper employs the accuracy metric to provide a comprehensive assessment of its effectiveness. This evaluation metric can be calculated using Equation (29) based on the confusion matrix in Table 2.
The accuracy index is calculated as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{29}$$
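The metric follows directly from the confusion matrix entries of Table 2:

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy of Equation (29): correct predictions (true positives plus
    true negatives) over all predictions in the confusion matrix."""
    return (tp + tn) / (tp + tn + fp + fn)
```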

4.2. Synthetic Data Quality Assessment

This section demonstrates the generative capability of the proposed BAVAE by comparing the generated features with the true data features. When the data imbalance ratio is 50%, without loss of generality, the data samples of state 8 (fault in the rolling element with a damage diameter of 0.5334 mm) are selected for analysis. Figure 10a,b show the multi-scale features of the true samples and those generated by the proposed method, respectively. Although there are some differences in detail between the generated multi-scale features and the true sample features, the overall generation effect is satisfactory, demonstrating that the proposed BAVAE can effectively generate realistic feature samples. The generated multi-scale features are summed to reconstruct the corresponding vibration signals.
To more accurately quantify the authenticity of the generated data, this paper employs the inverse of the Fréchet distance as a similarity metric, in combination with a CNN as the classifier. Several widely used methods were employed for comparison, including the resampling-based SMOTE [3] and ADASYN [4] algorithms, the Conditional Variational Autoencoder (CVAE), and GAN, as well as BAVAE without the enhanced loss function $\mathcal{L}_{\mathrm{dis}}$ and BAVAE with the enhanced loss function $\mathcal{L}_{\mathrm{dis}}$. Both CVAE and GAN were constructed using fully connected layers. States 1, 2, 3, and 4 were selected for the comparative experiments. The similarity between the data generated by different methods and the true data is compared in Table 3. The results indicate that the proposed method achieves the highest similarity in generating data across various categories, further validating the effectiveness of the proposed approach in data generation.
The results indicate that the proposed algorithm achieved the best fault diagnosis performance across different imbalance ratios and classifiers, demonstrating its effectiveness. The SMOTE and ADASYN algorithms, based on resampling, can expand the dataset through direct sampling. However, these methods struggle to generate diverse samples within the original data distribution and are highly susceptible to noise interference, resulting in suboptimal data augmentation performance. The CVAE and GAN, built using fully connected layers, have a large number of parameters and overly simplistic structures, making it difficult to generate data that closely resemble real samples. Consequently, their data augmentation effects are limited. The comparative experiments between BAVAE without L d i s and BAVAE with L d i s further validate the effectiveness of the enhanced loss function L d i s .

4.3. Case 1: CWRU Dataset

To validate the superiority of the proposed algorithm in fault diagnosis on imbalanced datasets, five diagnostic models were selected to perform fault diagnosis on the augmented dataset: SVM, RF, a Convolutional Neural Network (CNN), a Deep Convolutional Type-1 Fuzzy System (DCFS), and the proposed DCIT2FS. The fault diagnosis results for datasets with imbalance ratios of 30% and 50% are shown in Table 4 and Table 5, respectively. Default parameter settings were used for the SVM and RF models, and the CNN was constructed using only conventional convolutional and pooling layers.
From the data in Tables 4 and 5, it is evident that the BAVAE method outperforms the other methods in classification accuracy under both imbalance ratios (30% and 50%), particularly with the DCFS and DCIT2FS classifiers. At a 30% imbalance ratio, BAVAE achieves higher accuracy across all classifiers than the other methods, with DCIT2FS reaching 97.00% and DCFS reaching 96.80%. When the imbalance ratio increases to 50%, BAVAE's overall accuracy improves further, with DCIT2FS achieving 98.67%, demonstrating stronger generalization ability and stability. In contrast, GAN and CVAE perform well with certain classifiers (GAN in particular with RF and CNN) but still fall short of BAVAE. Traditional oversampling methods such as SMOTE and ADASYN have lower overall accuracy; although they show some improvement with the CNN classifier, their performance with the other classifiers lags significantly behind.

Overall, the proposed BAVAE-DCIT2FS method achieves the best classification results in all experimental settings, proving its superiority in handling imbalanced datasets; combined with the DCIT2FS classifier, it attains the highest classification accuracy, showcasing strong data generation and adaptation capabilities. Figures 11 and 12 present the confusion matrices for the five algorithms combined with BAVAE on a 50% imbalanced dataset. The results indicate that SVM and RF perform relatively poorly, while DCFS and DCIT2FS perform better. Notably, introducing the Interval Type-2 fuzzy system into the DCFS, yielding the DCIT2FS, leads to superior diagnostic performance, further validating the effectiveness of the proposed DCIT2FS diagnostic model.
Figure 12 indicates that the proposed method provides accurate diagnostic results for most categories, with only a small number of diagnostic errors in a few categories. The above experiments validate the data generation capability of the proposed BAVAE, as well as the fault diagnosis capability of the proposed BAVAE-DCIT2FS method on imbalanced datasets.
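The confusion matrices discussed above follow the layout defined in Table 2 (rows are true classes, columns are predicted classes). A minimal sketch of how per-class counts aggregate into the overall accuracies reported in Tables 4 and 5 (the toy labels below are illustrative only):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes (layout of Table 2)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def overall_accuracy(cm):
    """Overall accuracy = trace / total, i.e. (TP + TN) / all in the 2-class case."""
    return np.trace(cm) / cm.sum()

# Toy example with 3 classes and one misclassification:
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
acc = overall_accuracy(cm)  # 5 of 6 samples correct
```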

4.4. Case 2: Wind Farm Dataset

Although testing on the CWRU dataset has already demonstrated the superiority of our proposed method, that dataset is quite simple and not affected by substantial noise. In this section, vibration data collected from a real wind farm are used to evaluate the proposed method. The data were provided by a company for the study and design of a customized wind turbine prognosis and health management system aimed at lowering maintenance costs. They were acquired from wind turbine (WT) gearboxes using a dedicated condition monitoring system (CMS) and data acquisition station, as shown in Figure 13. According to the company's long-term records, three parts of the gearbox are more vulnerable than other components: high-speed shaft bearings, middle-speed shaft bearings, and planetary gears. Since wind power generators are often built in remote districts far from each other, on-site inspections involve high financial and labor costs; an intelligent system for monitoring health conditions and predicting impending gearbox failures is therefore urgently needed.
The data were collected from different WT gearboxes and then segmented and labeled by expert inspectors into the following classes: Normal Condition (NC), High-speed Shaft Bearing Damage (HSBD), Middle-speed Shaft Bearing Damage (MSBD), and Planetary Gear Damage (PGD). As in Case 1, 540 samples of each class were selected, each with 1020 data points, and imbalance ratios of 30% and 50% per class were used in the training set. The initialization and parameters of the experiment are the same as in Case 1. The experimental results are shown in Table 6 and Table 7, respectively.
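The dataset construction described above can be sketched as follows. The authors do not specify how samples are subsampled to reach a given imbalance ratio, so the "keep the first fraction of each fault class" rule below is an assumption, and both helper functions are hypothetical names for illustration.

```python
import numpy as np

def segment(record, sample_len=1020, n_samples=540):
    """Slice a long vibration record into fixed-length samples (as in Case 2)."""
    samples = [record[i * sample_len:(i + 1) * sample_len] for i in range(n_samples)]
    return np.stack(samples)

def make_imbalanced(samples_by_class, majority_class, ratio):
    """Keep all majority-class samples and only `ratio` of each fault class.

    `ratio` = 0.3 or 0.5 reproduces the 30%/50% settings; the exact
    subsampling scheme is not given in the paper, so this is a sketch.
    """
    X, y = [], []
    for label, samples in samples_by_class.items():
        n = len(samples) if label == majority_class else int(len(samples) * ratio)
        X.append(samples[:n])
        y.extend([label] * n)
    return np.concatenate(X), np.array(y)
```

For example, with four classes (NC, HSBD, MSBD, PGD) of 540 samples each and a 30% ratio, the training set keeps all 540 NC samples but only 162 samples per damage class.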
Since wind is unpredictable, the data collected from wind turbine gearboxes are highly nonstationary and contain irregular noise. Nevertheless, compared with the other methods, our method still achieves reasonable performance on these complex data, which verifies its feasibility in an actual wind farm.

5. Conclusions

This paper proposes a novel fault diagnosis framework based on a BAVAE and a DCIT2FS to address the challenges associated with imbalanced bearing fault datasets. In the first stage, multi-scale features of each class are extracted and concatenated through the DTCWT as input data. The BAVAE improves the diversity and quality of generated samples by iteratively learning complex latent distributions and introducing a feature discrimination loss, which effectively enhances inter-class feature distinctions. Combined with the original dataset, this augmentation strategy significantly boosts the robustness of the diagnostic model. Additionally, the DCIT2FS capitalizes on the powerful approximation capabilities of Interval Type-2 fuzzy systems and the hierarchical feature extraction provided by deep convolutional networks, achieving a favorable trade-off between model complexity and feature representation and thereby improving diagnostic performance under uncertain and imbalanced conditions. Experimental results on the CWRU dataset demonstrate that the proposed framework outperforms several state-of-the-art methods in diagnostic accuracy and stability, and its generalization capability was further validated on a wind farm dataset.
However, there are still areas that warrant further investigation. Future research will focus on developing more advanced feature fusion mechanisms to fully integrate multi-scale and multi-modal data. Additionally, we aim to enhance the generative capacity of BAVAE in extremely imbalanced and noisy industrial environments. Furthermore, combining advanced deep learning models with traditional signal processing techniques to address the complexities of industrial big data will remain a key direction for both theoretical research and practical applications.

Author Contributions

Conceptualization, J.Z. and L.Z.; methodology, J.Z.; software, L.Z.; validation, J.Z. and L.Z.; formal analysis, J.Z.; investigation, L.Z.; resources, L.Z.; data curation, L.Z.; writing—original draft preparation, L.Z.; writing—review and editing, J.Z.; visualization, J.Z.; supervision, J.Z.; project administration, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by 2022 Huiyan Action: C2AE0A5C, National Natural Science Foundation of China: 61803334, Zhejiang Provincial Natural Science Foundation of China: LZ21F030004, Key Research and Development Program of Hangzhou: 2024SZD1A11, Key Research and Development Program of Zhejiang Province: 2025C01055.

Data Availability Statement

The original data presented in this study are openly available in the Case Western Reserve University Bearing Data Center at https://engineering.case.edu/bearingdatacenter.

Acknowledgments

The authors express their gratitude to the National Natural Science Foundation of China.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yuan, H.; Wu, N.; Chen, X.; Wang, Y. Fault Diagnosis of Rolling Bearing Based on Shift Invariant Sparse Feature and Optimized Support Vector Machine. Machines 2021, 9, 98. [Google Scholar] [CrossRef]
  2. Nguyen, V.C.; Hoang, D.T.; Tran, X.T.; Van, M.; Kang, H.J. A Bearing Fault Diagnosis Method Using Multi-Branch Deep Neural Network. Machines 2021, 9, 345. [Google Scholar] [CrossRef]
  3. Irfan, M.; Mushtaq, Z.; Khan, N.A.; Mursal, S.N.F.; Rahman, S.; Magzoub, M.A.; Latif, M.A.; Althobiani, F.; Khan, I.; Abbas, G. A Scalo Gram-Based CNN Ensemble Method with Density-Aware SMOTE Oversampling for Improving Bearing Fault Diagnosis. IEEE Access 2023, 11, 127783–127799. [Google Scholar] [CrossRef]
  4. Guan, S.; Yang, H.Q.; Wu, T.Y. Transformer Fault Diagnosis Method Based on TLR-ADASYN Balanced Dataset. Sci. Rep. 2023, 13, 23010. [Google Scholar] [CrossRef]
  5. Rezende, D.J.; Mohamed, S.; Wierstra, D. Stochastic Backpropagation and Approximate Inference in Deep Generative Models. In Proceedings of the 31st International Conference on Machine Learning, PMLR, Beijing, China, 21–26 June 2014; 32, pp. 1278–1286. [Google Scholar]
  6. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative Adversarial Networks: An Overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
  7. Zhang, S.; Ye, F.; Wang, B.N.; Habetler, T.G. Semi-Supervised Bearing Fault Diagnosis and Classification Using Variational Autoencoder-Based Deep Generative Models. IEEE Sens. J. 2021, 21, 6476–6486. [Google Scholar] [CrossRef]
  8. Shao, S.Y.; Wang, P.; Yan, R.Q. Generative Adversarial Networks for Data Augmentation in Machine Fault Diagnosis. Comput. Ind. 2019, 106, 85–93. [Google Scholar] [CrossRef]
  9. Wang, Z.; Wang, J.; Wang, Y.R. An Intelligent Diagnosis Scheme Based on Generative Adversarial Learning Deep Neural Networks and Its Application to Planetary Gearbox Fault Pattern Recognition. Neurocomputing 2018, 310, 213–222. [Google Scholar] [CrossRef]
  10. Wang, Y.R.; Sun, G.D.; Jin, Q. Imbalanced Sample Fault Diagnosis of Rotating Machinery Using Conditional Variational Auto-Encoder Generative Adversarial Network. Appl. Soft Comput. 2020, 92, 106333. [Google Scholar] [CrossRef]
  11. Zhou, F.N.; Yang, S.; Fujita, H.; Chen, D.M.; Wen, C.L. Deep Learning Fault Diagnosis Method Based on Global Optimization GAN for Unbalanced Data. Knowl. Based Syst. 2020, 187, 104837. [Google Scholar] [CrossRef]
  12. Dixit, S.; Verma, N.K. Intelligent Condition-Based Monitoring of Rotary Machines With Few Samples. IEEE Sens. J. 2020, 20, 14337–14346. [Google Scholar] [CrossRef]
  13. Wang, L.; Liu, Z.W.; Cao, H.R.; Zhang, X. Subband Averaging Kurtogram with Dual-Tree Complex Wavelet Packet Transform for Rotating Machinery Fault Diagnosis. Mech. Syst. Signal Process. 2020, 142, 106755. [Google Scholar] [CrossRef]
  14. Zhao, D.F.; Liu, S.L.; Gu, D.; Sun, X.; Wang, L.; Wei, Y.; Zhang, H.L. Enhanced Data-Driven Fault Diagnosis for Machines with Small and Unbalanced Data Based on Variational Auto-Encoder. Meas. Sci. Technol. 2019, 31, 035004. [Google Scholar] [CrossRef]
  15. Rezazadeh, N.; Perfetto, D.; de Oliveira, M.; De Luca, A.; Lamanna, G. A Fine-Tuning Deep Learning Framework to Palliate Data Distribution Shift Effects in Rotary Machine Fault Detection. Struct. Health Monit. 2024, 30, 14759217241295951. [Google Scholar] [CrossRef]
  16. Zhao, Z.Q.; Gao, X.L.; Wang, H.H.; Tian, J. Rolling Bearing Fault Diagnosis with Time-Frequency Image Based on Deep Learning. In Proceedings of the 2023 Global Reliability and Prognostics and Health Management Conference, PHM, Hangzhou, China, 12–15 October 2023; pp. 1–6. [Google Scholar]
  17. Wang, W.P.; Xue, S.B. Fault Prediction of Bearing Based on Dual Dimensional Perception and Composite Gated Recurrent Network. IEEE Access 2024, 12, 181509–181520. [Google Scholar]
  18. Tang, S.J.; Zhou, F.N.; Liu, W. Semi-Supervised Bearing Fault Diagnosis Based on Deep Neural Network Joint Optimization. In Proceedings of the 2021 China Automation Congress (CAC), Beijing, China, 22–24 October 2021; pp. 6508–6513. [Google Scholar]
  19. Wang, J.H.; Kang, T.T. Rolling Bearing Fault Diagnosis and Prediction Method Based on Gray Support Vector Machine Model. In Proceedings of the 2015 International Conference on Computer Science and Mechanical Automation (CSMA), Hangzhou, China, 23–25 October 2015; pp. 313–317. [Google Scholar]
  20. Bu, Y.X.; Wu, J.D.; Ma, J.; Wang, X.D.; Fan, Y.G. The Rolling Bearing Fault Diagnosis Based on LMD and LS-SVM. In Proceedings of the 26th Chinese Control and Decision Conference (2014 CCDC), Changsha, China, 31 May–2 June 2014; pp. 3797–3801. [Google Scholar]
  21. Zhu, H.N.; Li, X.Y.; Liu, H.M. Fault Diagnosis of Rolling Bearing Based on WT-VMD and Random Forest. In Proceedings of the 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 22–24 August 2020; pp. 2130–2135. [Google Scholar]
  22. Ren, Y.X.; Wen, Y.T.; Liu, F.C.; Zhang, Y.Y.; Zhang, Z.W. Deep Convolution IT2 Fuzzy System with Adaptive Variable Selection Method for Ultra-Short-Term Wind Speed Prediction. Energy Convers. Manag. 2024, 309, 118420. [Google Scholar] [CrossRef]
  23. Wu, D.R.; Zeng, G.Z.; Mo, H.; Wang, F.Y. Interval Type-2 Fuzzy Sets and Systems: Overview and Outlook. ACTA Autom. Sin. 2020, 46, 1539–1556. [Google Scholar]
  24. Wu, D.; Mendel, J.M. Enhanced Karnik–Mendel Algorithms. IEEE Trans. Fuzzy Syst. 2009, 17, 923–934. [Google Scholar]
  25. Wang, L.X. Fast Training Algorithms for Deep Convolutional Fuzzy Systems with Application to Stock Index Prediction. IEEE Trans. Fuzzy Syst. 2020, 28, 1301–1314. [Google Scholar] [CrossRef]
  26. Smith, W.A.; Randall, R.B. Rolling Element Bearing Diagnostics Using the Case Western Reserve University Data: A Benchmark Study. Mech. Syst. Signal Process. 2015, 64–65, 100–131. [Google Scholar] [CrossRef]
  27. Hao, S.J.; Ge, F.X.; Li, Y.M.; Jiang, J. Multisensor Bearing Fault Diagnosis Based on One-Dimensional Convolutional Long Short-Term Memory Networks. Measurement 2020, 159, 107802. [Google Scholar] [CrossRef]
  28. Selesnick, I.W.; Baraniuk, R.G.; Kingsbury, N.C. The Dual-Tree Complex Wavelet Transform. IEEE Signal Process. Mag. 2005, 22, 123–151. [Google Scholar] [CrossRef]
  29. Selesnick, I.W. Hilbert Transform Pairs of Wavelet Bases. IEEE Signal Process. Lett. 2002, 8, 170–173. [Google Scholar] [CrossRef]
  30. Sønderby, C.K.; Raiko, T.; Maaløe, L.; Sønderby, S.K.; Winther, O. Ladder Variational Autoencoders. Adv. Neural Inf. Process. Syst. 2016, 29. [Google Scholar]
  31. Gulrajani, I.; Kumar, K.; Ahmed, F.; Taiga, A.A.; Visin, F.; Vazquez, D.; Courville, A. PixelVAE: A Latent Variable Model for Natural Images. arXiv 2016, arXiv:1611.05013. [Google Scholar]
Figure 1. The fault simulation test bench for CWRU rolling bearings described in [27].
Figure 2. (a) The structure diagram of the T1 fuzzy system. (b) The structure diagram of the IT2 fuzzy system.
Figure 3. Interval Type-2 fuzzy set diagram.
Figure 4. Basis framework of VAE.
Figure 5. Two-stage DTCWT process.
Figure 6. The network structure of the BAVAE.
Figure 7. DCFS structure diagram.
Figure 8. The hierarchical structure of the fuzzy subsystem FS i l .
Figure 9. Flow chart of proposed fault diagnosis method.
Figure 10. (a) Multi-scale features of real data. (b) Multi-scale features of generated data.
Figure 11. Confusion matrix of BAVAE combined with four classifiers with imbalance ratio of 50%. (a) SVM. (b) RF. (c) CNN. (d) DCFS.
Figure 12. (a) Confusion matrix of the proposed method. (b) Comparison of the evaluated and actual states.
Figure 13. Wind farm data collection station.
Table 1. Corresponding table of damage diameter position and status.
| State | Damage Diameter (mm) | Damage Position |
|---|---|---|
| 1 | / | / |
| 2 | 0.1778 | rolling element |
| 3 | 0.1778 | inner ring |
| 4 | 0.1778 | six o'clock on the outer circle |
| 5 | 0.3556 | rolling element |
| 6 | 0.3556 | inner ring |
| 7 | 0.3556 | six o'clock on the outer circle |
| 8 | 0.5334 | rolling element |
| 9 | 0.5334 | inner ring |
| 10 | 0.5334 | six o'clock on the outer circle |
Table 2. Confusion matrix.
| True Tag | Predicted Positive Example | Predicted Negative Example |
|---|---|---|
| Positive example | TP (true positive example) | FN (false negative example) |
| Negative example | FP (false positive example) | TN (true negative example) |
Table 3. Comparison of similarities between real and generated data.
| Method | State 1 | State 2 | State 3 | State 4 |
|---|---|---|---|---|
| SMOTE | 0.72 | 0.79 | 0.77 | 0.85 |
| ADASYN | 0.75 | 0.70 | 0.71 | 0.88 |
| CVAE | 0.88 | 0.85 | 0.89 | 0.92 |
| GAN | 0.78 | 0.82 | 0.65 | 0.87 |
| BAVAE without enhanced L_dis | 0.91 | 0.99 | 0.93 | 1.12 |
| BAVAE with enhanced L_dis | 0.94 | 1.10 | 0.99 | 1.19 |
Table 4. Classification accuracy using a 30% imbalance ratio with the CWRU dataset.
| Method | SVM | RF | CNN | DCFS | DCIT2FS |
|---|---|---|---|---|---|
| SMOTE | 52.34 | 45.79 | 91.05 | 92.10 | 93.20 |
| ADASYN | 54.12 | 46.52 | 92.15 | 93.25 | 94.50 |
| CVAE | 58.76 | 75.34 | 90.10 | 94.50 | 95.30 |
| GAN | 63.21 | 73.62 | 88.89 | 95.60 | 96.05 |
| BAVAE | 79.50 | 72.10 | 89.80 | 96.80 | 97.00 |
Table 5. Classification accuracy using a 50% imbalance ratio with the CWRU dataset.
| Method | SVM | RF | CNN | DCFS | DCIT2FS |
|---|---|---|---|---|---|
| SMOTE | 72.13 | 75.28 | 92.17 | 93.52 | 95.08 |
| ADASYN | 72.85 | 74.14 | 92.39 | 93.79 | 95.37 |
| CVAE | 80.96 | 76.18 | 93.07 | 96.02 | 97.03 |
| GAN | 80.99 | 81.03 | 93.51 | 96.38 | 97.46 |
| BAVAE | 83.57 | 84.00 | 94.67 | 97.33 | 98.67 |
Table 6. Classification accuracy using a 30% imbalance ratio with a wind farm dataset.
| Method | SVM | RF | CNN | DCFS | DCIT2FS |
|---|---|---|---|---|---|
| SMOTE | 50.12 | 43.85 | 89.34 | 90.45 | 91.80 |
| ADASYN | 52.67 | 45.41 | 90.12 | 91.25 | 92.40 |
| CVAE | 56.03 | 70.11 | 88.45 | 92.60 | 93.15 |
| GAN | 60.24 | 70.25 | 87.90 | 93.10 | 94.05 |
| BAVAE | 75.38 | 68.90 | 87.65 | 94.45 | 95.10 |
Table 7. Classification accuracy using a 50% imbalance ratio with a wind farm dataset.
| Method | SVM | RF | CNN | DCFS | DCIT2FS |
|---|---|---|---|---|---|
| SMOTE | 70.12 | 72.45 | 90.34 | 91.90 | 93.60 |
| ADASYN | 71.05 | 71.30 | 90.85 | 92.20 | 94.10 |
| CVAE | 78.24 | 73.02 | 91.40 | 94.20 | 95.50 |
| GAN | 78.80 | 78.25 | 92.15 | 95.10 | 96.80 |
| BAVAE | 81.35 | 80.60 | 93.02 | 96.55 | 97.40 |