1. Introduction
Complex systems are characterized by high dimensionality, strong coupling, and significant uncertainty, which makes their operational states difficult to predict [1]. Once a local fault occurs, system-wide risks are likely to be triggered [2]. Therefore, accurate and effective fault diagnosis is essential for ensuring the stability and safety of such systems.
Recent advances in signal processing have introduced techniques such as the generalized synchroextracting transform, which improves feature resolution and robustness in non-stationary vibration analysis [3]. For rolling element bearings, blind deconvolution methods such as the CFFsBD algorithm have demonstrated effectiveness in enhancing weak fault features [4]. Beyond mechanical systems, fault diagnosis approaches have also been extended to electrical domains, as shown by the identification of anomalies in co-phase power supply systems [5]. These studies reflect both the depth and breadth of current research and highlight the need for diagnostic methods that are not only accurate and robust but also generalizable.
Building on this context, fault diagnosis techniques are commonly categorized into four types: physics-based, data-driven, knowledge-based, and semi-quantitative information-based methods [5,6].
(1) The physical modeling approach reveals the operating mechanism and fault characteristics of a system by constructing a mathematical model of the system [7]. The method exhibits high diagnostic accuracy [8]. For example, Huang et al. [9] analyzed various fault mechanisms in motor drive systems and proposed a fault classification method based on an improved hidden Markov model (HMM), which achieved satisfactory results. Jafari et al. [10] proposed a simple yet effective method for detecting inter-turn faults, which is based on one modal current and four different simple indicators. Although physics-based methods provide accurate fault identification, they depend on detailed modeling and expert knowledge, which limits their adaptability to complex or variable operating conditions [11].
(2) Data-driven methods rely on historical operational data and apply machine learning or deep learning techniques to construct diagnostic models without requiring explicit system modeling [12]. These methods perform well in handling complex systems and extracting fault characteristics. For example, Zhang et al. [13] proposed a bearing fault identification method based on convolutional neural networks (CNNs), which significantly improves classification accuracy. Pule et al. [14] constructed a multi-fault identification model under complex operating conditions by combining support vector machines (SVMs) with principal component analysis (PCA). However, data-driven methods rely heavily on the quality and quantity of data. When samples are insufficient or noisy, the diagnostic performance tends to degrade [15].
To overcome the limitations of purely model-based or purely data-driven approaches, researchers have increasingly explored hybrid methods that integrate the strengths of both paradigms. For example, Xia et al. [16] introduced a digital twin-driven gearbox diagnosis framework, and Meléndez-Useros et al. [17] presented an active steering fault diagnosis method integrating LSTM-based sensor detection with robust actuator fault estimation. Nevertheless, hybrid methods often involve more complex architectures and higher computational costs, and their performance may be sensitive to model integration strategies, which poses challenges for large-scale or real-time applications.
(3) Knowledge-driven approaches rely on expert experience, domain knowledge, or a priori rules to build fault diagnosis systems [18]. This category of methods effectively incorporates human knowledge into the model’s reasoning process, thereby enhancing interpretability and reliability. For example, Chen et al. [19] proposed a modular fault tree approach that effectively reduces analysis complexity and enhances efficiency. Chi et al. [20] used a knowledge-based fault diagnosis approach in the industrial internet of things (IIoT) to enhance interoperability through ontologies and effectively describe system faults. However, knowledge-based methods rely on manually designed rules and often show limited adaptability to dynamic operating conditions [21].
(4) Semi-quantitative information methods combine qualitative knowledge with quantitative analysis and are suitable for complex systems that cannot be fully quantified [22]. Cheng et al. [23] investigated the relationship between valid data and expert knowledge and conducted a detailed analysis of the transmission state of a high-speed train. Yuan et al. [24] introduced a hybrid knowledge-based method, in which multiple expert knowledge systems are constructed and applied according to the type of available information. These methods integrate multi-source information, capture the uncertainty of fault features, and improve diagnostic accuracy [25].
Existing physics-based, data-driven, and knowledge-based methods therefore exhibit limitations in diagnostic accuracy, data dependency, and adaptability. In contrast, semi-quantitative information methods are more effective in handling uncertainty and integrating multi-source information, making them better suited to fault diagnosis in complex systems.
The belief rule base (BRB) is a representative semi-quantitative method that integrates expert knowledge with numerical information, making it well suited for fault diagnosis in complex and uncertain environments. However, most existing BRB methods rely heavily on labeled data, while the effective use of large volumes of unlabeled data remains limited. In real-world applications, the high cost and inefficiency of label acquisition further restrict model performance and reduce the reliability of diagnostic results.
To overcome the limitation of scarce labeled data, many approaches have introduced physical biases into the fault classification process. For example, simulation-driven machine learning methods combine simulated data with learning algorithms to improve classification accuracy [26], while zero-fault learning integrates physical modeling with data-driven techniques to enhance diagnostic performance [27]. Nevertheless, these methods typically require large amounts of high-quality simulation data, which may not always be available and can limit their applicability.
To address these challenges, this paper proposes a perturbation-based self-training method to enhance the BRB, termed PS-BRB. The proposed PS-BRB retains the interpretability and uncertainty-handling capability of the original BRB while introducing a self-training mechanism to exploit unlabeled data. Through perturbation and filtering strategies, the rules and parameters are iteratively optimized, thereby improving the model structure, representational capacity, diagnostic accuracy, and generalization under complex conditions.
Unlike existing physical bias-based methods, PS-BRB does not depend on simulation data. Instead, it leverages perturbation consistency and Jensen–Shannon (JS) divergence filtering to effectively utilize unlabeled data, maintaining high diagnostic accuracy even with limited labeled samples. This design not only avoids the dependency on simulation data but also demonstrates stronger robustness and adaptability in complex and uncertain environments, providing a new solution to the limitations of current BRB-based and physical bias-driven methods.
In summary, although existing semi-supervised pseudo-labeling methods have achieved promising results in general learning tasks, they do not fully account for the unique characteristics of BRB, particularly its rule-based structure and ability to represent uncertainty. Directly applying such methods to BRB often results in high sensitivity to noisy pseudo-labels and unstable rule updates. Therefore, the proposed PS-BRB is not a direct transplantation but a tailored redesign for the BRB framework. By incorporating input perturbation for consistency checking, JS divergence for distributional filtering, and a self-training mechanism with rule and parameter optimization, PS-BRB effectively mitigates pseudo-label error propagation and data dependency while preserving BRB’s interpretability and uncertainty-handling capability.
The main contributions of this paper are as follows:
(1) A high-quality pseudo-label filtering mechanism is proposed, in which Gaussian noise is applied to inputs corresponding to pseudo-labels generated by the initial BRB model. Label consistency is verified and JS divergence is measured with a threshold determined at the 90th percentile. This dual-constraint strategy improves the reliability of pseudo-labels.
(2) A perturbation-based self-training method is developed, where the filtering mechanism is integrated into the self-training framework. Perturbation and filtering strategies guide the BRB to update its rules and parameters with high-quality pseudo-labels, thereby enhancing its representational ability and improving fault diagnosis performance under complex conditions.
Overall, PS-BRB goes beyond simply combining existing semi-supervised learning techniques. Through tailored adaptation to BRB’s reasoning mechanism, it achieves more stable pseudo-label utilization and demonstrates greater robustness and adaptability in noisy and uncertain environments.
The remainder of this paper is organized as follows. In Section 2, relevant preliminaries are introduced and the problem description of the PS-BRB method is presented. In Section 3, a PS-BRB-based fault diagnosis approach for complex systems is proposed. Section 4 presents two case studies through which the effectiveness of the proposed method is validated. The limitations and future work are discussed in Section 5. Finally, conclusions are drawn in Section 6.
3. A PS-BRB-Based Fault Diagnosis Method for Complex Systems
This section proposes a perturbation-based self-training PS-BRB fault diagnosis method, which consists of three key modules. Section 3.1 introduces the overall structure and workflow of the PS-BRB model. Section 3.2 presents the theoretical foundations of the proposed method, providing the mathematical derivations and justifications that support its design. Section 3.3 describes the filtering mechanism for high-quality pseudo-labels. Section 3.4 focuses on model optimization, where high-quality pseudo-labels and optimization algorithms are used to update and enhance the BRB parameters in a data-driven manner. Finally, Section 3.5 analyzes the computational complexity and scalability of the PS-BRB framework.
3.1. Description of the PS-BRB
The PS-BRB is a hybrid fault diagnosis framework that integrates expert knowledge, labeled data, and unlabeled data under a self-training mechanism. As illustrated in Figure 1, the PS-BRB consists of three core components: pseudo-label generation, high-quality pseudo-label filtering, and BRB model enhancement.
Initially, a BRB model is constructed based on expert knowledge and limited labeled samples. This initial model is used to infer pseudo-labels for unlabeled data, generating belief distributions and inference results. To evaluate the robustness of these pseudo-labels, a perturbation strategy is applied by adding Gaussian noise to the inputs, and the perturbed samples are re-inferred via the same BRB model. The belief distributions before and after perturbation are then compared.
A class consistency assessment is performed to ensure that inferred classes remain stable under perturbation. If consistency is met, the JS divergence between the belief distributions is calculated. Samples with low divergence, below the 90th percentile threshold, are considered reliable and are selected as high-quality pseudo-labels.
These filtered high-quality pseudo-labels are used, along with an optimization algorithm, to update the rule weights, attribute weights, and belief degrees of the BRB. The enhanced BRB model is then used for evidential reasoning (ER) and final fault diagnosis. This iterative process allows the PS-BRB to leverage unlabeled data effectively, improving both accuracy and generalization capability in complex system diagnostics.
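The overall workflow can be summarized by the schematic loop below. It is a structural sketch only, written under the assumption of two hypothetical helpers: train_brb (which fits or optimizes a BRB on a labeled set and returns a model exposing an infer method) and filter_pseudo (the perturbation-plus-JS screening detailed in Section 3.3).

```python
import numpy as np

def ps_brb_self_training(X_lab, y_lab, X_unlab, train_brb, filter_pseudo, n_rounds=3):
    """Schematic PS-BRB loop: pseudo-label generation, dual filtering, BRB enhancement.

    train_brb     -- placeholder: fits/optimizes a BRB on (X, y), returns a model with .infer(X)
    filter_pseudo -- placeholder: perturbation + JS screening of Section 3.3
    """
    model = train_brb(X_lab, y_lab)                        # initial BRB from expert rules + labels
    for _ in range(n_rounds):
        X_sel, y_sel = filter_pseudo(X_unlab, model.infer)  # high-quality pseudo-labels
        X_aug = np.vstack([X_lab, X_sel])                   # labeled + pseudo-labeled training set
        y_aug = np.concatenate([y_lab, y_sel])
        model = train_brb(X_aug, y_aug)                     # rule/parameter update (P-CMA-ES)
    return model
```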
3.2. Theoretical Foundations of PS-BRB
The effectiveness of the PS-BRB method is established through three key aspects: consistency-based smoothness, distributional robustness, and a reduced self-training generalization error bound. Together, these properties indicate that perturbation-filtered pseudo-labels improve the BRB model’s stability and generalization performance.
Consistency regularization requires that model outputs remain smooth and consistent under small input perturbations. Within the BRB framework, Gaussian noise is applied to unlabeled samples, and inference is repeated so that the local smoothness of pseudo-labels can be assessed: only those pseudo-labels for which the hard label remains unchanged before and after perturbation are retained. This strategy is equivalent to imposing a perturbation consistency constraint during model training, which serves to eliminate predictions that are highly sensitive to slight input variations and thereby improves the reliability and accuracy of the pseudo-labels [32].
After consistency filtering, the JS divergence between the belief distributions obtained before and after perturbation is computed to quantify their difference. As a symmetric measure of similarity between two probability distributions, JS divergence characterizes the model’s robustness to uncertain inputs. The divergence is calculated for all samples with consistent class labels, and the 90th percentile of these values is used as a threshold. Only those pseudo-labels with divergence below this threshold are retained, ensuring that the selected samples not only have stable class assignments but also exhibit high consistency in their confidence distributions after perturbation, thereby further filtering out pseudo-labels with high uncertainty or unstable predictions [33].
The high-quality pseudo-labels are incorporated together with the labeled samples into the BRB parameter optimization, which is equivalent to adding a pseudo-label loss term to the original supervised loss. According to Vapnik–Chervonenkis (VC) dimension theory, introducing pseudo-samples with high accuracy reduces the model’s generalization error bound from $O\!\left(\sqrt{h/m}\right)$ to $O\!\left(\sqrt{h/(m + m_p)}\right)$, where $h$ is the model’s VC dimension, $m$ is the number of labeled samples, and $m_p$ is the number of high-quality pseudo-labels. This demonstrates that, provided the pseudo-labels are sufficiently accurate, the self-training mechanism can theoretically enhance the generalization capability and robustness of the BRB model [34].
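For reference, one classical form of the VC generalization bound states that, with probability at least $1 - \delta$,
$$R(f) \;\le\; \hat{R}_{m}(f) + \sqrt{\frac{h\left(\ln\frac{2m}{h} + 1\right) + \ln\frac{4}{\delta}}{m}},$$
where $R(f)$ is the expected risk and $\hat{R}_{m}(f)$ the empirical risk over $m$ samples. Under the working assumption that the filtered pseudo-labels behave approximately like correctly labeled samples, they simply enlarge the effective sample size from $m$ to $m + m_p$, which tightens the complexity term to the order $\sqrt{h/(m + m_p)}$ quoted above; this should be read as a sketch of the argument rather than a formal guarantee for BRB models.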
3.3. High-Quality Pseudo-Label Filtering Mechanism
To ensure the quality of the pseudo-labels introduced during the self-training process and reduce the negative impact of noisy labels on model performance, this paper proposes a high-quality pseudo-label filtering mechanism based on perturbation self-training, which can be divided into the following four steps:
Step 1: Pseudo-label generation.
A single inference is performed on the unlabeled sample set using the current BRB model (parameters $\Theta$). For each unlabeled sample $x_i$, its utility value $u_i$ and its belief distribution $\boldsymbol{\beta}_i$ are obtained. The utility value is then discretized over the predefined reference values to yield the class label $\hat{y}_i$.
Step 2: Consistency screening.
Gaussian noise $\epsilon \sim \mathcal{N}(0, \sigma^{2})$ is added to each sample $x_i$, producing the perturbed sample $\tilde{x}_i = x_i + \epsilon$. A second inference is conducted to obtain the perturbed utility value $\tilde{u}_i$ and belief distribution $\tilde{\boldsymbol{\beta}}_i$. The perturbed utility is discretized to yield the label $\tilde{y}_i$. Samples for which $\hat{y}_i = \tilde{y}_i$ are retained, and labels that fail this condition are discarded.
Step 3: Distributional robustness assessment.
For all samples that passed consistency screening, the JS divergence between the original and perturbed belief distributions is computed:
$$D_{\mathrm{JS}}\left(\boldsymbol{\beta}_i \parallel \tilde{\boldsymbol{\beta}}_i\right) = \frac{1}{2} D_{\mathrm{KL}}\left(\boldsymbol{\beta}_i \parallel \mathbf{M}_i\right) + \frac{1}{2} D_{\mathrm{KL}}\left(\tilde{\boldsymbol{\beta}}_i \parallel \mathbf{M}_i\right), \qquad \mathbf{M}_i = \frac{1}{2}\left(\boldsymbol{\beta}_i + \tilde{\boldsymbol{\beta}}_i\right).$$
The 90th percentile of the divergence values is selected as the threshold $\tau$. Only those samples for which $D_{\mathrm{JS}}\left(\boldsymbol{\beta}_i \parallel \tilde{\boldsymbol{\beta}}_i\right) \le \tau$ are kept, so that belief distributions before and after perturbation remain highly consistent.
Step 4: High-quality pseudo-label set construction.
The samples that satisfy both the consistency and divergence criteria, along with their labels $\hat{y}_i$, form the high-quality pseudo-label set:
$$\mathcal{D}_{p} = \left\{ \left(x_i, \hat{y}_i\right) \;\middle|\; \hat{y}_i = \tilde{y}_i,\; D_{\mathrm{JS}}\left(\boldsymbol{\beta}_i \parallel \tilde{\boldsymbol{\beta}}_i\right) \le \tau \right\}.$$
This set is combined with labeled data for joint optimization of the BRB parameters, which improves the model’s robustness and generalizability.
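A minimal Python sketch of Steps 1–4 is given below. It assumes a hypothetical brb_infer function that maps a batch of inputs to belief distributions over the fault classes; for brevity the class label is taken as the arg-max of the belief distribution, whereas the full model discretizes the expected utility.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def filter_pseudo_labels(X_unlab, brb_infer, sigma=0.005, percentile=90, seed=0):
    """Dual-constraint pseudo-label filtering (class consistency + JS divergence).

    X_unlab   -- (n, d) unlabeled feature matrix
    brb_infer -- placeholder: maps an (n, d) array to (n, C) belief distributions
    sigma     -- std. dev. of the Gaussian input perturbation (scalar or per-feature array)
    """
    rng = np.random.default_rng(seed)

    # Step 1: pseudo-label generation with the current BRB model
    beliefs = brb_infer(X_unlab)                       # (n, C)
    labels = beliefs.argmax(axis=1)

    # Step 2: consistency screening under Gaussian perturbation
    X_pert = X_unlab + rng.normal(0.0, sigma, size=X_unlab.shape)
    beliefs_pert = brb_infer(X_pert)
    consistent = labels == beliefs_pert.argmax(axis=1)

    # Step 3: distributional robustness via JS divergence
    # scipy's jensenshannon returns the JS distance, i.e. sqrt(divergence), so square it
    js = np.array([jensenshannon(p, q) ** 2 for p, q in zip(beliefs, beliefs_pert)])
    tau = np.percentile(js[consistent], percentile)    # threshold from consistent samples only
    keep = consistent & (js <= tau)

    # Step 4: high-quality pseudo-label set
    return X_unlab[keep], labels[keep]
```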
3.4. Model Optimization
To enhance the diagnostic performance of the BRB model in complex systems, high-quality pseudo-labels are introduced into the training set under the framework of the projection covariance matrix adaptation evolution strategy (P-CMA-ES) [35]. These pseudo-labels are inferred by the initial BRB model and are further processed through Gaussian perturbation. Samples that satisfy category consistency and exhibit low JS divergence are selected to ensure label stability and reliability. Ultimately, both pseudo-labeled and labeled samples are utilized for model training.
As illustrated in Figure 2, the execution flow of the P-CMA-ES is presented, and the detailed optimization process is described as follows.
First, the objective function for constructing the PS-BRB is as follows:
$$\min_{\Theta} \; \xi(\Theta) \quad \text{s.t.} \quad 0 \le \theta_k \le 1,\; 0 \le \delta_i \le 1,\; 0 \le \beta_{n,k} \le 1,\; \sum_{n=1}^{N} \beta_{n,k} = 1,$$
where $\xi(\Theta)$ denotes the loss function for model fault diagnosis with the following expression:
$$\xi(\Theta) = \underbrace{\frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^{2}}_{\xi_{l}(\Theta)} + \underbrace{\frac{1}{T_{p}}\sum_{j=1}^{T_{p}}\left(\tilde{y}_j - \hat{y}_j\right)^{2}}_{\xi_{p}(\Theta)},$$
where $\xi_{l}(\Theta)$ denotes the prediction error for labeled samples, and $\xi_{p}(\Theta)$ denotes the prediction error for pseudo-labeled samples. $y_t$ represents the ground truth of the $t$-th labeled sample, $\tilde{y}_j$ denotes the soft pseudo-label of the $j$-th pseudo-labeled sample (inferred by the initial BRB model), and $\hat{y}_t$ (resp. $\hat{y}_j$) is the current output of the BRB model for the corresponding sample. $T$ and $T_{p}$ indicate the numbers of labeled and pseudo-labeled samples, respectively. Here $\Theta$ collects the rule weights $\theta_k$, attribute weights $\delta_i$, and belief degrees $\beta_{n,k}$ of the BRB.
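A minimal sketch of this combined loss is shown below, assuming mean-squared error over expected-utility outputs; brb_predict_utility is a placeholder for the ER inference of the BRB under the candidate parameter vector.

```python
import numpy as np

def ps_brb_loss(theta, X_lab, y_lab, X_pse, y_pse, brb_predict_utility):
    """Combined labeled + pseudo-labeled loss xi(Theta) minimized by P-CMA-ES.

    brb_predict_utility -- placeholder: runs ER inference with parameters theta and
                           returns the expected-utility output for each input sample
    """
    xi_l = np.mean((y_lab - brb_predict_utility(theta, X_lab)) ** 2)  # labeled prediction error
    xi_p = np.mean((y_pse - brb_predict_utility(theta, X_pse)) ** 2)  # pseudo-label prediction error
    return xi_l + xi_p
```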
Step 1: Initialization.
The parameter vector is initialized as $x^{0}$, the maximum number of iterations is set to $G$, the initial step size is $\sigma^{0}$, the covariance matrix is initialized as $C^{0} = I$, and the population size is $\lambda$. This initialization not only specifies the search starting point but also determines the scale and direction of the initial exploration through $\sigma^{0}$ and $C^{0}$.
Step 2: Sampling operation.
$$x_i^{t+1} \sim m^{t} + \sigma^{t}\, \mathcal{N}\!\left(0, C^{t}\right), \quad i = 1, 2, \ldots, \lambda,$$
where $x_i^{t+1}$ is the $i$-th solution in generation $t+1$, $m^{t}$ is the mean value of the population, $\sigma^{t}$ represents the step size, $\mathcal{N}(\cdot)$ denotes the normal distribution, and $C^{t}$ denotes the covariance matrix of the population in generation $t$. In this way, new solutions are generated around the current mean while incorporating random perturbations, balancing exploitation of the current search region with global exploration.
Step 3: Projection.
The projection operation is described as follows:
$$x_i^{t+1}\!\left(1 + n_e \times (j-1) : n_e \times j\right) = x_i^{t+1}\!\left(1 + n_e \times (j-1) : n_e \times j\right) - A_e^{T}\left(A_e A_e^{T}\right)^{-1} A_e\, x_i^{t+1}\!\left(1 + n_e \times (j-1) : n_e \times j\right),$$
where $n_e$ denotes the number of variables in each equality constraint of $x_i^{t+1}$; $j$ indexes the equality constraints in the solution $x_i^{t+1}$; and $A_e = [1, 1, \ldots, 1]_{1 \times n_e}$ represents the parameter vector used in the sampling operation. This operation ensures that candidate solutions remain valid and satisfy model constraints, thereby avoiding infeasible parameter combinations.
Step 4: Mean update.
The population mean is updated according to the weighted average of the top $\mu$ solutions:
$$m^{t+1} = \sum_{i=1}^{\mu} w_i\, x_{i:\lambda}^{t+1},$$
where $x_{i:\lambda}^{t+1}$ is the $i$-th best solution among the $\lambda$ solutions in generation $t+1$, and $w_i$ is the corresponding recombination weight. This step shifts the search center toward higher-quality solutions, which reflects the principle of survival of the fittest and gradually improves the search direction.
Step 5: Covariance matrix adaptation.
The covariance matrix is updated as shown below:
$$C^{t+1} = \left(1 - c_1 - c_\mu\right) C^{t} + c_1\, p_c^{t+1} \left(p_c^{t+1}\right)^{T} + c_\mu \sum_{i=1}^{\mu} w_i \left(\frac{x_{i:\lambda}^{t+1} - m^{t}}{\sigma^{t}}\right)\left(\frac{x_{i:\lambda}^{t+1} - m^{t}}{\sigma^{t}}\right)^{T},$$
where $c_1$ and $c_\mu$ are the learning rates of the adaptation mechanism, and $p_c^{t+1}$ is the evolutionary path of the covariance. The evolution path is updated as follows:
$$p_c^{t+1} = \left(1 - c_c\right) p_c^{t} + \sqrt{c_c\left(2 - c_c\right)\mu_{\mathrm{eff}}}\; \frac{m^{t+1} - m^{t}}{\sigma^{t}},$$
where $c_c$ is the accumulation rate and $\mu_{\mathrm{eff}}$ is the variance-effective selection mass. By accumulating successful steps, the covariance matrix learns the correlations between parameters and adapts its shape to the problem landscape, which is similar to approximating second-order information of the objective function.
Step 6: Step-size adaptation.
Update the step size as follows:
$$\sigma^{t+1} = \sigma^{t} \exp\!\left(\frac{c_\sigma}{d_\sigma}\left(\frac{\left\| p_\sigma^{t+1} \right\|}{E\left\| \mathcal{N}(0, I) \right\|} - 1\right)\right), \qquad p_\sigma^{t+1} = \left(1 - c_\sigma\right) p_\sigma^{t} + \sqrt{c_\sigma\left(2 - c_\sigma\right)\mu_{\mathrm{eff}}}\; \left(C^{t}\right)^{-\frac{1}{2}} \frac{m^{t+1} - m^{t}}{\sigma^{t}},$$
where $p_\sigma^{t+1}$ is the evolutionary path along the backward time axis, $c_\sigma$ denotes the accumulation rate of the evolutionary path, $d_\sigma$ denotes the decay coefficient, $E\left\| \mathcal{N}(0, I) \right\|$ represents the expected length of $\mathcal{N}(0, I)$, and $E(\cdot)$ denotes the mathematical expectation.
This mechanism adjusts the step size automatically: it enlarges when the search progresses consistently in one direction, enabling broader exploration, and shrinks when oscillations occur, enhancing local exploitation.
Step 7: Termination.
After G iterations, the optimization terminates and outputs the best parameter vector. This criterion ensures computational feasibility while providing sufficient iterations to converge to a robust solution.
The design of P-CMA-ES is supported by evolutionary computation and stochastic optimization theory. The mean update is regarded as an approximation of a gradient-descent direction, the covariance matrix adaptation is interpreted as learning second-order information of the objective function, and the step-size control based on path length is applied to maintain a balance between exploration and exploitation. In addition, the perturbation mechanism is introduced to preserve population diversity and improve robustness. Building on the established convergence properties of CMA-ES, the applicability of P-CMA-ES to constrained and complex optimization problems is theoretically ensured.
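A simplified, self-contained sketch of the optimization loop is given below. It keeps the sampling, projection, and weighted mean-update steps but replaces full covariance-matrix and path-based step-size adaptation with an isotropic covariance and a fixed decay, and it assumes, for brevity, that the decision vector contains only the belief degrees. It should therefore be read as an illustration of the projection idea rather than the exact P-CMA-ES implementation.

```python
import numpy as np

def project_beliefs(x, n_rules, n_classes):
    """Clip to [0, 1] and renormalize each rule's belief block so it sums to 1."""
    x = np.clip(x, 0.0, 1.0)
    for j in range(n_rules):
        block = x[j * n_classes:(j + 1) * n_classes]
        s = block.sum()
        if s > 0:
            x[j * n_classes:(j + 1) * n_classes] = block / s
    return x

def projected_es(loss_fn, x0, n_rules, n_classes,
                 sigma=0.3, pop_size=20, n_gen=100, seed=0):
    """Simplified projection-based evolution strategy for BRB belief degrees."""
    rng = np.random.default_rng(seed)
    mean = project_beliefs(np.asarray(x0, dtype=float).copy(), n_rules, n_classes)
    mu = pop_size // 2
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))   # log-decreasing recombination weights
    weights /= weights.sum()
    for _ in range(n_gen):
        # Step 2: sample candidates around the current mean (isotropic covariance)
        cand = mean + sigma * rng.standard_normal((pop_size, mean.size))
        # Step 3: project candidates back onto the feasible region
        cand = np.array([project_beliefs(c, n_rules, n_classes) for c in cand])
        # Step 4: weighted mean update from the best mu candidates
        fitness = np.array([loss_fn(c) for c in cand])
        best = cand[np.argsort(fitness)[:mu]]
        mean = project_beliefs(weights @ best, n_rules, n_classes)
        sigma *= 0.98   # crude stand-in for path-based step-size adaptation (Step 6)
    return mean
```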
3.5. Computational Complexity and Scalability Analysis
The proposed PS-BRB framework inevitably introduces additional computational overhead due to perturbation, dual screening, and iterative optimization. To systematically evaluate the practicality of the method, this subsection analyzes its computational cost, space complexity, and scalability for large-scale applications.
(1) Perturbation.
Perturbation generation requires creating augmented representations of the input space. For a dataset with $n$ samples and $d$ features, the cost is approximately $O(n \cdot d)$. The memory consumption grows linearly with both sample size and feature dimensionality.
(2) Dual screening.
Dual screening is used to filter out unreliable pseudo-label candidates. The dominant cost arises from ranking and comparison operations (e.g., sorting the divergence values to obtain the percentile threshold), which have a complexity of $O(n \log n)$. As the operation is executed on intermediate candidate sets, the associated memory requirement remains moderate.
(3) Iterative optimization.
Parameter learning in PS-BRB is realized via iterative optimization of the belief rules. Each iteration involves evaluating rule activation and updating parameters with a complexity of $O(n \cdot L)$, where $L$ denotes the number of belief rules. Over $k$ iterations, the total cost is $O(k \cdot n \cdot L)$. The space requirement is mainly determined by storing the rule base and intermediate parameters, which is also $O(n \cdot L)$.
(4) Overall computational burden.
The overall complexity of the framework is polynomial and dominated by the iterative optimization stage. While PS-BRB requires more resources than conventional BRB, the operations are highly parallelizable. Perturbation and dual screening can be executed in batch mode, and iterative optimization can be accelerated via GPU or distributed computing.
(5) Scalability and feasibility.
Despite the additional overhead, the framework remains scalable to medium- and large-scale datasets. The linear space complexity ensures that memory usage is manageable in practice. Moreover, the modular design of perturbation, screening, and optimization enables efficient implementation with parallel computing, making the approach applicable to real-world engineering systems. Future work will further explore lightweight strategies and distributed deployment to enhance large-scale applicability.
4. Case Study
In Section 4.1, the effectiveness of the proposed PS-BRB is fully validated through a case study on bearing fault diagnosis. In Section 4.2, the applicability of the model is further demonstrated through a second bearing fault diagnosis case. A generalization analysis is conducted in Section 4.3. A comprehensive analysis of the experimental results is provided in Section 4.4.
The bearing is regarded as a critical transmission component in complex mechanical systems, and its operating condition directly affects the overall performance and safety of the system [36]. Once a bearing fault occurs, energy transmission can be disrupted, equipment may be shut down, and broader systemic damage is likely to follow [37]. Therefore, timely and accurate fault diagnosis is essential for ensuring stable operation, extending the service life, and reducing the maintenance costs of mechanical equipment.
To verify the adaptability and effectiveness of the proposed method under different operating conditions and data sources, this section conducts an experimental analysis using the 30 Hz-2 V bearing dataset from Southeast University (SEU 30 Hz-2 V) and the 65 Hz bearing dataset from Huazhong University of Science and Technology (HUST 65 Hz). To avoid train–test leakage, all datasets were partitioned into training and testing subsets before feature extraction, normalization, and model optimization, ensuring that the testing data remained completely independent throughout the experiments.
4.1. Case 1: Bearing Fault Diagnosis Based on the PS-BRB (SEU 30 Hz-2 V)
In Section 4.1.1, the experimental background is introduced and the parameters are configured. In Section 4.1.2, the experimental results are analyzed.
4.1.1. Background Description and Experimental Parameter Settings
The bearing dataset provided by Southeast University is employed in this study for experimental validation [38]. This dataset is collected from a dynamic drive system (DDS) test bench under operating conditions of a 30 Hz rotational speed and 2 V load [39]. As shown in Table 1, the SEU bearing dataset comprises eight channels, covering motor vibration, gearbox vibration in three directions, and motor torque signals.
Channel 1 motor vibration signals are selected as the subject of analysis to validate the effectiveness of the proposed method. Five typical operating states of the bearing are covered: ball fault (Ball), inner ring fault (Inner), outer ring fault (Outer), combination fault on both the inner ring and outer ring (Combination), and healthy working state (Healthy). Time-domain features are extracted from the vibration signals corresponding to these states. A total of 1000 samples are collected for each state, each comprising 1024 data points.
To enhance the representativeness of the input features, the out-of-bag predictor importance (OOBPredictorImportance) method is employed to rank the extracted time-domain features [40]. As shown in Figure 3, the standard deviation (Std) and root mean square (RMS) achieve the highest importance scores, indicating their significant contributions to fault classification. Therefore, they are selected as key features and retained for subsequent model development.
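The feature construction described above can be reproduced with a short script such as the one below. It computes common time-domain statistics over 1024-point segments and ranks them with a random forest's impurity-based importance, used here only as a convenient stand-in for MATLAB's OOBPredictorImportance, so the exact importance values will differ.

```python
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.ensemble import RandomForestClassifier

def time_domain_features(segments):
    """segments: (n_samples, 1024) vibration windows -> (feature matrix, feature names)."""
    rms = np.sqrt(np.mean(segments ** 2, axis=1))
    feats = {
        "Std": segments.std(axis=1),
        "RMS": rms,
        "Mean": segments.mean(axis=1),
        "Peak": np.abs(segments).max(axis=1),
        "Kurtosis": kurtosis(segments, axis=1),
        "Skewness": skew(segments, axis=1),
        "CrestFactor": np.abs(segments).max(axis=1) / rms,
    }
    return np.column_stack(list(feats.values())), list(feats.keys())

# X_seg: (n, 1024) segments and y: fault labels (0-4) are assumed to be prepared beforehand.
# X, names = time_domain_features(X_seg)
# rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0).fit(X, y)
# ranking = sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1])
```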
A total of 5000 samples covering five fault types are used with 1000 samples for each type. Among them, 1000 samples (200 per type) are randomly selected as the test set. From the remaining 4000 samples, 100 samples from each type (500 in total) are selected as labeled data, while the remaining 3500 samples are used as unlabeled data.
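The stratified split used here (200 test and 100 labeled samples per class, the remainder unlabeled) can be reproduced along the following lines; the random seed is arbitrary.

```python
import numpy as np

def stratified_split(y, n_test=200, n_labeled=100, seed=0):
    """Per class: n_test samples for testing, n_labeled labeled, the rest unlabeled."""
    rng = np.random.default_rng(seed)
    test_idx, labeled_idx, unlabeled_idx = [], [], []
    for c in np.unique(y):
        idx = rng.permutation(np.where(y == c)[0])
        test_idx.extend(idx[:n_test])
        labeled_idx.extend(idx[n_test:n_test + n_labeled])
        unlabeled_idx.extend(idx[n_test + n_labeled:])
    return np.array(test_idx), np.array(labeled_idx), np.array(unlabeled_idx)
```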
Std is divided into four semantic values: small (VS), medium (VM), large (VL), and extreme (VZ). RMS is assigned four semantic values: low (TL), medium (TM), high (TH), and extreme (TE). The reference values of these attributes are provided in Table 2 and Table 3. The corresponding result reference values are listed in Table 4. The initial belief distributions are presented in Table A1.
4.1.2. Experimental Results Analysis
The reference values of Std updated through PS-BRB are VS, VM, A1, VL, and VZ, as shown in Table 5. The updated reference values of RMS are TL, TM, TH, and TE, as shown in Table 6. The updated belief distribution table is provided in Table A2.
a. Comparative analysis of rule weights before and after the update on the SEU 30 Hz-2 V dataset.
Figure 4 presents a comparison of the rule weights before enhancement (original BRB) and after enhancement via the PS-BRB. The original model contains 16 rules, whereas the enhanced model retains these original rules and adds four new rules (R17–R20). It can be observed from the figure that PS-BRB induces significant adjustments in the rule weight distribution: the weights of some original rules (e.g., R2, R8) decrease markedly, whereas the weights of others (e.g., R10, R11, R16) increase substantially. This change indicates that, driven by the expansion of the training set with high-quality pseudo-labels, the evaluation of rule importance is restructured. The contribution of certain originally high-weight rules is partially shared by the newly added rules, whereas some rules with originally lower weights receive increased importance owing to their outstanding performance on the new data.
b. Comparison of PS-BRB with other BRBs on the SEU 30 Hz-2 V dataset.
Figure 5 shows the fault type predictions on the test set for three methods: PS-BRB (after perturbation self-training enhancement), BRB1 (baseline using only limited labeled data), and BRB2 (the “upper-bound” model trained with the full experimental labels). The blue line represents the true fault types, and the vertical axis denotes the fault category indices.
As shown in Figure 5, PS-BRB achieves the closest fit to the true trajectory. Within each stable interval, its predictions nearly coincide with the step plateaus; at transition points, overshoot and oscillation are significantly smaller than those of BRB1 and comparable to BRB2. In the black dashed regions, BRB1 exhibits numerous high-magnitude spikes and drops, reflecting unstable rule matching with limited labeled samples, whereas PS-BRB shows the lowest fluctuation amplitude and frequency, indicating that the expanded rule set with high-quality pseudo-labels better covers the feature distribution. In the pink dashed regions, BRB1 generates frequent random spikes and BRB2 occasionally produces outliers, while PS-BRB demonstrates the smallest overshoot and fastest convergence, highlighting stronger robustness to distributional shifts.
From the quantitative results in Table 7, PS-BRB achieves an accuracy of 98.4%, representing an improvement of 11.6 percentage points over BRB1 (86.8%) and only 0.3 percentage points lower than BRB2 (98.7%). For MSE, PS-BRB (0.1235) reduces the error by 25.7% compared with BRB1 (0.1661), though it remains higher than BRB2 (0.0706) due to a few extreme errors. In terms of F1 and Recall, PS-BRB is also markedly superior to BRB1 and very close to BRB2.
Overall, PS-BRB substantially improves diagnostic performance and stability under limited labels, significantly outperforming BRB1 and approaching the performance of BRB2. Although the additional pseudo-label filtering and iterative optimization lead to increased computational cost, the diagnostic gains justify the trade-off, as PS-BRB achieves accuracy comparable to the fully supervised model.
Figure 6 presents the five-class confusion matrix of PS-BRB on the test set. The results show a unidirectional sparse distribution without obvious symmetry, indicating that the rules learned by PS-BRB provide good discriminability and stability across class decision boundaries. Combined with the overall metrics in Table 7, PS-BRB achieves class-level consistency and robustness comparable to those of the full-label training model (BRB2) without relying on the complete set of ground-truth labels.
c. Ablation study: role of class consistency and JS threshold in pseudo-label filtering.
Table 8 reports the ablation results of four pseudo-label filtering strategies. The combined class consistency + JS threshold strategy achieves the best overall performance with the highest accuracy (0.9934), Macro-F1 (0.9912), Weighted-F1 (0.9934), and Macro-Recall (0.9887). In comparison, using class consistency only yields moderate improvement (accuracy 0.9452), while relying on JS divergence only leads to poorer results (accuracy 0.8796) and lower recall. The no-filtering strategy performs better than JS divergence only but remains inferior to either class consistency alone or the combined method. These results demonstrate that dual constraints from both class consistency and JS divergence provide complementary benefits, effectively suppressing noisy pseudo-labels and producing the most reliable performance.
Differences are further revealed by the confusion matrices in Figure 7 compared with the ground-truth labels. Obvious cross-class misclassifications across multiple categories are exhibited by the no-filtering strategy (Figure 7a). The accuracy of classes 1 and 3 is improved by the class-consistency strategy (Figure 7b), though a high misclassification rate persists for class 4. The recall for classes 0 and 4 is reduced by JS divergence alone (Figure 7c), indicating that numerous valid samples matching the ground-truth labels are removed while noise is suppressed. In contrast, a near-diagonal confusion matrix is produced by the combined class-consistency and JS-divergence threshold strategy (Figure 7d), with precision and recall maintained above 97% for nearly all classes.
Overall, the dual-filtering strategy ensures class correctness while improving prediction stability, thereby providing PS-BRB with higher-quality pseudo-labels and laying a solid foundation for subsequent training.
d. Sensitivity analysis of pseudo-label filtering hyperparameters.
To validate the rationality of the pseudo-label filtering hyperparameter settings, the influence of the JS divergence threshold and Gaussian perturbation magnitude on model performance is further investigated.
With the perturbation magnitude fixed, the JS divergence threshold is varied. Under identical perturbation conditions, the 70th, 80th, and 90th percentiles are selected as thresholds, and the quality of the filtered pseudo-labels is evaluated. As shown in Table 9, increasing the threshold leads to a larger number of pseudo-labels, and the performance in terms of Accuracy, Macro-F1, and Macro-Recall gradually improves, with the best results achieved at the 90th percentile threshold.
With the JS divergence threshold fixed at the 90th percentile, different combinations of Gaussian perturbation magnitudes are further evaluated. As shown in Table 10, excessively small or large perturbations result in performance degradation, whereas the best performance is achieved with the perturbation magnitude pair (0.005, 0.01), indicating that moderate perturbation magnitudes contribute to improving the stability of pseudo-label filtering. In this work, the pair (0.005, 0.01) is therefore adopted as the perturbation strength configuration.
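This sensitivity study can be organized as a simple grid over the percentile threshold and the per-feature noise levels, reusing the filtering routine sketched in Section 3.3. In the snippet below, X_unlab, brb_infer, and evaluate_pseudo_labels (which scores retained pseudo-labels against held-out ground truth) are placeholders, and all noise pairs other than (0.005, 0.01) are purely illustrative.

```python
import itertools
import numpy as np

percentiles = [70, 80, 90]
noise_pairs = [(0.001, 0.005), (0.005, 0.01), (0.01, 0.05)]   # illustrative values only

results = {}
for p, (s1, s2) in itertools.product(percentiles, noise_pairs):
    # per-feature Gaussian noise levels are passed as an array to the filter
    X_sel, y_sel = filter_pseudo_labels(X_unlab, brb_infer,
                                        sigma=np.array([s1, s2]), percentile=p)
    results[(p, s1, s2)] = evaluate_pseudo_labels(X_sel, y_sel)
```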
In summary, the sensitivity experiments verify that the combination of the 90% threshold with moderate perturbations yields the best performance, indicating that this strategy balances pseudo-label reliability and quantity, thereby enhancing the overall diagnostic capability of the model.
4.2. Case 2: Fault Diagnosis of a Bearing Based on the PS-BRB (HUST 65 Hz)
In Section 4.2.1, the experimental background is introduced and the parameters are configured. In Section 4.2.2, the experimental results are analyzed.
4.2.1. Background Description and Experimental Parameter Settings
In this experiment, the bearing dataset collected by the health perception laboratory of Huazhong University of Science and Technology is employed to validate the effectiveness of the proposed method [41]. The data are acquired using a fault diagnosis test bench for power transmission systems [42]. During the data acquisition process, bearing vibration signals along the X, Y, and Z axes are recorded by a triaxial accelerometer. The sampling frequency is set at 25.6 kHz, and each acquisition is conducted over a duration of 10.2 s.
In this work, data under the operating condition of a 65 Hz rotational speed are selected for analysis. The raw data file contains five channels: time step (Time), rotational speed (Speed), and vibration acceleration in the X, Y, and Z directions. To ensure the consistency of the input data and the controllability of the experiment, the vibration signal in the X-axis direction is selected for analysis. A sliding window approach is applied for time-domain feature extraction with a window length of 2048 and a step size of 256. Subsequently, based on the results of feature importance ranking, the two most representative time-domain features, namely Kurtosis and RMS, are selected as inputs to the model, as shown in Figure 8.
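The sliding-window segmentation and the two retained features can be computed as follows (window length 2048, step 256); accel_x is assumed to be the raw 1-D X-axis acceleration array.

```python
import numpy as np
from scipy.stats import kurtosis

def sliding_window_features(signal, win=2048, step=256):
    """Segment a 1-D vibration signal and return per-window Kurtosis and RMS."""
    n_windows = (len(signal) - win) // step + 1
    idx = np.arange(win)[None, :] + step * np.arange(n_windows)[:, None]
    windows = signal[idx]                                  # (n_windows, 2048)
    rms = np.sqrt(np.mean(windows ** 2, axis=1))
    kurt = kurtosis(windows, axis=1)
    return np.column_stack([kurt, rms])

# features = sliding_window_features(accel_x)   # accel_x: raw X-axis acceleration signal
```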
The experiment covers five typical operating conditions: ball fault (Ball), inner race fault (Inner), outer race fault (Outer), compound fault (Comb, referring to the simultaneous presence of multiple fault types), and healthy condition (Health). Vibration data under each condition are used to construct training and testing samples. Fault diagnosis modeling is then performed to systematically evaluate the performance and adaptability of the proposed PS-BRB method in bearing fault diagnosis tasks.
A total of 5086 sample data points are obtained in this experiment, covering five diagnostic conditions with an equal number of samples for each class. The dataset is divided according to a 3:7 ratio with 30% (a total of 1525 samples) used as the testing set and the remaining 70% (a total of 3561 samples) used for model training. Within the training set, 712 samples are further selected as labeled data, while the remaining 2849 samples are treated as unlabeled data to simulate a diagnostic scenario with limited labeled information, which commonly occurs in real-world conditions.
In this work, five reference values are selected for Kurtosis and five reference values are selected for RMS, and the corresponding diagnostic result reference values are defined accordingly. The specific attribute reference values are provided in Table 11 and Table 12, with the corresponding result reference values listed in Table 13. The initial belief distributions are given in Table A3.
4.2.2. Experimental Results Analysis
The attribute reference values updated through PS-BRB are shown in Table 14 and Table 15, and the updated belief distribution is provided in Table A4.
a. Comparative analysis of rule weights before and after the update on the HUST-65 Hz dataset.
Figure 9a,b present the distributions of rule weights at two stages: before and after the update. The left plot illustrates the original rule weight distribution of the 25 rules prior to enhancement, whereas the right plot shows the updated rule weight distribution across all 36 rules with newly added rules (R2, R7–R12, R14, R20, R26, and R32) highlighted in red.
Overall, the rule system is significantly restructured during the self-enhancement process of the PS-BRB. Several rules with initially high weights (e.g., R6, R15, and R17) are substantially down-weighted, whereas some rules with relatively low initial weights (e.g., R9, R11, and R14) are assigned considerably higher weights after the update. This indicates that the model re-evaluates the contribution of each rule under the influence of high-quality pseudo-labels. Among the newly introduced rules, several (e.g., R9, R11, R12, R14, R20, and R26) receive high weights, suggesting strong discriminative capabilities and confirming the effectiveness of the perturbation-based self-training mechanism in rule expansion and optimization.
In addition, the updated rule weight distribution becomes more balanced, reducing over-reliance on a few dominant rules, which helps improve the model’s robustness and generalizability. The radar plots further provide a clear visual representation of the differences in rule weights and structural changes, offering valuable support for evaluating and analyzing the rule system.
b. Comparison of PS-BRB with other BRBs on the HUST-65 Hz dataset.
Figure 10 shows the comparison of fault diagnosis type prediction results on the test set for three methods: the baseline model (BRB1) trained with limited labeled data, the “upper-bound” model (BRB2) trained with fully labeled data, and the proposed PS-BRB method, which integrates perturbation-based self-training and high-quality pseudo-label filtering. The blue line represents the true fault types.
Figure 10 illustrates that the prediction trajectory of PS-BRB on the test set closely aligns with the true fault types, with the predictions in stable intervals nearly coinciding with the true values. The overshoot and oscillation amplitudes near the transition points are significantly lower than those of BRB1 and approach those of BRB2, which is trained with fully labeled data. Notably, in the segment marked by the purple dashed circle, BRB1 exhibits numerous high-amplitude spikes and drops, indicating unstable rule matching under limited labeled conditions. Although BRB2 generally maintains stability, occasional large deviations are observed. In contrast, PS-BRB displays the lowest fluctuation amplitude and frequency with the fastest convergence speed after transitions, reflecting superior robustness to distribution drift.
The quantitative results in Table 16 further corroborate these findings. The accuracy of PS-BRB reaches 93.44%, representing an improvement of 14.358 percentage points (approximately 18.15% relative improvement) over BRB1 (79.082%), which is trained with limited labeled data, and it is only 0.52 percentage points lower than BRB2 (93.96%), which is trained with fully labeled data, nearly matching its performance. In terms of MSE, PS-BRB achieves a value of 0.2507, which is a reduction of approximately 37.04% compared with BRB1 (0.3982). Although slightly greater than BRB2 (0.2333), the MSE difference is inferred to be driven primarily by a small number of outlier samples in the high-category range in the later segments, as observed in Figure 10.
In summary, PS-BRB achieves an accuracy nearly comparable to that of fully labeled training without requiring complete true labels while significantly reducing prediction fluctuations and the mean squared error. This demonstrates that the proposed high-quality pseudo-label filtering and perturbation-based self-training mechanisms effectively enhance rule coverage and model robustness, providing an efficient self-augmentation pathway for BRB-based fault diagnosis.
c. Ablation study: role of class consistency and JS threshold in pseudo-label filtering.
Table 17 presents the accuracy results of the four pseudo-label filtering strategies compared with the ground-truth labeled data. The unfiltered strategy yields the lowest accuracy (0.8265), as noisy samples are retained without constraints, leading to poor label quality.
The class-consistency-only strategy achieves a higher accuracy (0.9289) by effectively eliminating some misclassified samples. In contrast, the JS-divergence-only strategy, with an accuracy of 0.8876, retains many erroneous samples due to the absence of class-consistency constraints.
The combined class-consistency and JS-divergence threshold strategy attains the highest accuracy (0.9388). Dual filtering effectively removes samples with significant confidence distribution discrepancies while ensuring class consistency, thus minimizing noise and maximizing valid pseudo-label retention.
These results confirm that the dual filtering strategy significantly enhances pseudo-label quality, providing a robust foundation for optimizing PS-BRB performance.
4.3. Generalization Analysis
To validate the generalization capability of the proposed model, experiments were conducted on three datasets from different sources, namely a diesel engine dataset, the Southeast University gearbox dataset, and a spacecraft flywheel system dataset.
Table 18 presents the performance of the enhanced BRB method on fault diagnosis tasks, while Table 19 demonstrates the improvement in pseudo-label quality achieved by the proposed “dual selection filtering” under ablation study conditions.
From the results presented in Table 18 and Table 19, PS-BRB was demonstrated to exhibit strong generalization capability across multiple datasets. Compared with the traditional BRB method, PS-BRB was able to effectively exploit unlabeled samples under limited annotation conditions, and its performance was further enhanced through high-quality pseudo-label filtering combined with a self-training mechanism. In terms of key metrics such as accuracy, recall, and F1-score, the overall performance of PS-BRB was shown to be close to, and in some cases nearly matching, that of the fully supervised model. These findings indicate that the proposed method can overcome the limitation of label scarcity while maintaining considerable adaptability across different datasets.
Moreover, the ablation study highlighted the critical role of the dual-constraint filtering mechanism in improving the quality of pseudo-labels, thereby ensuring both the stability and robustness of the model. Taken together, the results suggest that the proposed PS-BRB method possesses substantial potential for application in complex system fault diagnosis tasks and holds considerable value for broader practical deployment in real-world engineering scenarios.
4.4. Summary of Experiments
The perturbation self-training-based BRB enhancement method proposed in this paper is fully validated via fault diagnosis experiments on two bearing datasets. The experimental results show that the PS-BRB method has significant advantages in terms of generalizability, accuracy, robustness, interpretability, and anti-interference capability.
(1) Generalization.
PS-BRB is able to effectively utilize unlabeled data through a self-training mechanism with limited labels, thus significantly improving the generalization ability of the model. The experimental results show that the accuracy of PS-BRB is close to that of a fully labeled training model when only a small number of labels are used, demonstrating a strong adaptive ability.
(2) Accuracy.
In experiments conducted on the SEU 30 Hz-2 V and HUST 65 Hz datasets, PS-BRB demonstrated a significant improvement in accuracy. The accuracy rates reached 98.4% and 93.44%, respectively, representing improvements of 11.6 and 14.36 percentage points over the corresponding baseline models. This validates the effectiveness of the PS-BRB in enhancing fault diagnosis precision.
(3) Robustness and anti-interference capability.
Through perturbation-based self-training, the PS-BRB enhances the model’s robustness to data perturbations. After perturbation, the PS-BRB effectively reduces the impact of interference on the prediction outcomes, ensuring the model’s stability and accuracy in unstable environments.
(4) Interpretability.
The interpretability inherent in the BRB model is preserved by PS-BRB, while the optimization of rules and high-quality pseudo-label filtering further enhance the transparency and traceability of the reasoning process. Consequently, the decision-making process of the model is rendered clearer and more comprehensible.
In summary, PS-BRB has significant advantages in complex fault diagnosis tasks with limited labels. By introducing a high-quality pseudo-label filtering mechanism and a perturbation-based self-training strategy, PS-BRB effectively utilizes unlabeled data to improve model performance under limited label conditions while enhancing the model’s robustness to data perturbations and noise. This makes PS-BRB highly adaptable and reliable in practical applications—particularly in fault diagnosis scenarios with insufficient labels or high levels of noise.