1. Introduction
The vehicle power supply is characterized by rapid mobility, low-noise operation, simple handling, and all-weather capability, and has been widely used in army camp lighting, command, control, and communication, artillery systems, missile systems, air force airfields, etc. [1]. However, the working environment of the vehicle power supply is complex and changeable, resulting in many kinds of faults with high randomness. During operation, when incipient faults evolve into serious faults, the entire weapon system may be paralyzed, degrading combat effectiveness; in severe cases, serious accidents may endanger the safety of life and property. Therefore, if the early signs of common system faults can be detected in time, the types of incipient faults accurately identified, and preventive maintenance carried out in advance, serious faults can be reduced or avoided and the safety level of the vehicle power supply improved.
Incipient faults are characterized by subtle manifestations, susceptibility to being masked by noise and disturbance, and difficulty in accurate differentiation from one another; therefore, more powerful fault diagnosis methods are urgently needed. The vehicle power supply is itself a complex electromechanical system. Because it is difficult to establish a precise mathematical model and prior knowledge is limited, traditional diagnosis methods based on analytical models and expert systems are not suitable for its incipient fault diagnosis [2]. With the rapid development of sensors and information technology, complex systems can now collect massive, multi-source, high-dimensional monitoring data [3]. Data-driven intelligent fault diagnosis methods have therefore become an important way to address the incipient fault diagnosis of complex systems. As a typical representative of data-driven models, deep learning (DL) can, owing to its multi-hidden-layer structure, effectively extract high-level features from monitoring data and represent the original data more completely [4], opening a new direction for incipient fault diagnosis research. Li [5] proposed a fault diagnosis method that integrates feature selection with a deep SAE network and achieved satisfactory results in diagnosing incipient faults in vehicle power supply. Husari and Seshadrinath [6] used an HCNN deep network to extract fault features, combined with a two-level SVM, to classify interturn faults of an electric drive system and evaluate their severity. Peng [7] proposed an Adversarial Domain Adaptation Network with MixMatch (ADANM) for diagnosing incipient faults in permanent-magnet synchronous motors under multiple working conditions. Article [8] proposes a deep learning framework that leverages multi-sensor data to diagnose seven types of incipient interturn short-circuit faults in wind turbine generators.
The literature mentioned above applies deep networks to incipient fault diagnosis on balanced datasets and achieves good diagnostic results. However, a single deep network still generalizes poorly on an imbalanced dataset, resulting in low fault recognition accuracy [9]. At present, the problem of imbalanced samples is mainly approached from two directions: data sampling methods and improved model algorithms [10], among which under-sampling, oversampling [11,12,13], and ensemble learning methods [14,15] are widely used. Under-sampling reduces the number of majority-class samples to eliminate the imbalance, which may discard information and degrade classification. Oversampling expands the number of minority-class samples, which tends to over-emphasize certain sample features and cause the classifier to overfit. In 1995, Freund and Schapire [16] pioneered the AdaBoost algorithm based on the classic boosting ensemble framework, which forces weak classifiers to focus on difficult samples and combines multiple weighted weak classifiers into a strong classifier. AdaBoost can significantly improve model learning accuracy and generalization ability, giving it clear advantages in the classification of imbalanced datasets.
Considering the characteristics of ensemble algorithms and deep learning, this paper proposes an ensemble learning method based on AdaBoost-enhanced SAE network. This approach integrates the sequential weighting mechanism of AdaBoost with the powerful nonlinear feature extraction capacity of SAE, leveraging their complementary advantages to enhance diagnostic performance under imbalanced data conditions. Through multiple rounds of iterative weighting, it actively focuses on a few difficult-to-identify fault samples, effectively alleviating classification bias caused by data imbalance. Building upon the retained advantage of the original deep network in extracting subtle fault features, the proposed model significantly enhances its capability to handle imbalanced data distributions. It is expected to achieve superior performance in diagnosing incipient faults of vehicle power supplies under imbalanced sample conditions.
2. Analysis of Incipient Faults in Vehicle Power Supply
Vehicle power supply is a typical complex electromechanical system. An analysis of its primary structure (see the schematic diagram in Figure 1) indicates that the key component determining its performance metrics and power quality is the diesel generator set. This core unit is primarily composed of a diesel engine with its electronic governor system and a synchronous generator with its excitation control system.
Due to the strong coupling between modules, the fault mechanisms are relatively complex, and the fault types include both electrical and mechanical faults. During long-term operation, owing to factors such as load changes, component degradation, wear, and harsh environments, the system will gradually develop failures such as loss of excitation of the synchronous generator, unbalanced three-phase generator voltages, interturn short-circuits, blockage of diesel fuel injection nozzles, and electronic governor failure. Once such faults occur, the vehicle power supply cannot work normally. An incipient fault, however, shows only weak symptoms in the measured signals and is easily buried in noise; when one occurs, the vehicle power supply can continue to operate in a degraded state. Over time, the incipient fault gradually evolves into a serious fault, which is essentially a process in which system parameters go from “quantitative change” to “qualitative change” while the failure mechanism remains the same. Therefore, this study focuses on the early stages of the common failures above, which can be defined according to the four types of power station indicators in “Military Alternating Current Mobile Electric Power Plant, General Specification For” [17].
Failure mechanisms and characteristic data are the basis for incipient fault diagnosis research. Considering the lack of historical monitoring data for vehicle power supply and the high cost of destructive tests, virtual simulation of complex systems is not only feasible but also a future development trend. Therefore, relying on the “Vehicle Power Supply Simulation System” developed by our team for military users, and based on an analysis of the vehicle power supply failure mechanisms, different fault conditions were simulated by connecting auxiliary components or adjusting the parameters of related modules on this simulation system, and the corresponding monitoring data were collected. Then, according to the main electrical performance indicators for power stations in GJB 235A-97, combined with the experience of industry experts and historical data, four categories of incipient faults of the vehicle power supply were finally determined. (This simulation system strictly follows the “Technical Agreement for Vehicle Power Supply Model Simulation Software” and is designed for high-fidelity modeling and simulation of 75 kW/120 kW low-noise vehicle power supplies. As a core tool, the system has been used and tested by the military for many years, and its reliability, accuracy, and effectiveness have been fully verified by users.)
Figure 2 presents a comparative analysis of steady-state performance indicators between simulation experiments and physical experiments for a 120 kW vehicle power supply.
Table 1 displays four categories of incipient faults in the vehicle power supply system, describing their characteristic manifestations and highlighting the associated risk consequences.
3. Incipient Faults Diagnosis Model of Vehicle Power Supply
3.1. Proposed Model: AdaBoost-SAE
Through the deep network structures built by deep learning, more abstract and detailed high-order information can be mined from the input data, giving such models the ability to automatically learn the essential features of the original samples; these high-order features are undoubtedly key to improving incipient fault diagnosis performance [32]. As a classic deep learning model, SAE extracts high-order features layer by layer by “copying” the input data while reducing its dimensionality, converting complex input data into more representative features. It is therefore feasible to use SAE to extract incipient fault features of the vehicle power supply and add a Softmax classifier to diagnose its common incipient faults.
However, the actual operation of the vehicle power supply is mostly under normal working conditions, and the fault occurs only occasionally as a special working condition. Therefore, the number of samples under normal working conditions is much higher than that under fault conditions, resulting in a severe imbalance in the sample datasets. This imbalance will cause the information of the majority samples to cover up the information of the minority samples and then make the network parameter update more inclined to the majority samples when it is used for deep network training. Especially for the incipient faults whose symptoms are not obvious and lack data, the ability of deep network to extract its features is restricted, and the fault diagnosis effect is further affected.
The AdaBoost algorithm, a classic representative of boosting ensemble methods, combines multiple homogeneous weak classifiers into a strong classifier. During the round-by-round training of the weak classifiers, it automatically increases the weights of the samples misclassified in the previous round, so that subsequent weak classifiers are progressively forced to learn these hard samples. It is precisely this mechanism that allows AdaBoost to adapt to the error rates of the weak classifiers: it focuses the weak classifiers on difficult samples, improves their generalization ability, and makes the ensemble of multiple weak classifiers exhibit better classification performance than any single classifier.
To sum up, in order to obtain the deep, intrinsic characteristics of incipient faults of vehicle power supply and improve the identification ability of incipient faults in deep networks under imbalanced samples, this paper intends to combine multiple weak classifiers SAE into a strong classifier through the AdaBoost ensemble algorithm. A fault diagnosis method for vehicle power supply based on the AdaBoost-SAE ensemble deep learning model is proposed to address the challenges of difficult extraction of incipient fault features and imbalanced samples distribution.
3.2. Model Description
For incipient fault diagnosis of vehicle power supply under imbalanced data conditions, the overall framework of the AdaBoost-SAE-based model is illustrated in Figure 3. The construction process of the diagnostic model includes three major phases: data acquisition and preprocessing, construction of the AdaBoost-SAE diagnosis model, and model validation.
3.2.1. Data Collection and Preprocessing
The corresponding incipient faults in Table 1 were simulated in advance on the vehicle power supply simulation platform developed by the team, the corresponding monitoring data were collected, and a feature vector set of incipient faults of the vehicle power supply was established. It is composed of 15 relevant variables from the vehicle power supply monitoring data, specifically: active power ($P$), reactive power ($Q$), power factor ($\cos\varphi$), frequency ($f$), rotor speed ($n_r$), electromagnetic torque ($T_e$), stator voltage ($U_s$), stator current ($I_s$), excitation voltage ($U_f$), three-phase voltages ($U_a$, $U_b$, $U_c$), and three-phase currents ($I_a$, $I_b$, $I_c$). The feature vector set of incipient faults of the vehicle power supply is expressed as

$$X = \{x_1, x_2, \ldots, x_N\} \in \mathbb{R}^{N \times n} \tag{1}$$

Incipient fault category labels use One-Hot Encoding, and the dataset is represented as

$$D = \{(x_i, y_i)\}_{i=1}^{N}, \quad y_i \in \{1, 2, \ldots, k\} \tag{2}$$

where $N$ indicates the number of samples in the feature set and $n$ indicates the number of variables in the feature set, namely $n = 15$; $k$ is the number of fault categories, totaling the four common incipient fault types from Table 1 plus the normal state, i.e., $k = 5$.
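The One-Hot label encoding described above can be sketched as follows. This is a minimal illustration, assuming integer class labels 0 (normal) through 4 (the four incipient fault types); the label-to-class mapping is hypothetical, not taken from the paper:

```python
import numpy as np

def one_hot(labels, k):
    """Encode integer class labels 0..k-1 as one-hot row vectors."""
    eye = np.eye(k)
    return eye[np.asarray(labels)]

# Five classes: 0 = normal, 1-4 = the four incipient fault types (Table 1).
y = one_hot([0, 2, 4], k=5)
```

Each row then contains a single 1 in the column of its class, which is the form expected by a Softmax output layer.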
According to the ratio of 10:1 between the normal working conditions and fault conditions, imbalanced datasets were generated from the feature vector set. The datasets were normalized and preprocessed to eliminate the dimensional influence of different features.
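The normalization step can be sketched as a per-feature min–max scaling; the paper does not specify the exact scheme, so this is an assumed [0, 1] scaling, with hypothetical example values:

```python
import numpy as np

def min_max_normalize(X, eps=1e-12):
    """Scale each feature (column) to [0, 1] to remove dimensional effects."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo + eps)

# Hypothetical samples: columns could be e.g. active power (kW) and frequency (Hz)
X = np.array([[400.0, 50.0],
              [ 80.0, 49.5],
              [240.0, 50.5]])
Xn = min_max_normalize(X)
```

After scaling, features with very different physical units (kilowatts vs. hertz) contribute comparably to the network's reconstruction error.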
3.2.2. The Principle of the Proposed AdaBoost-SAE Algorithm
The AdaBoost-SAE model obtains a better and more comprehensive supervised learning model by combining multiple weak classifiers (SAE). The specific steps are as follows:
Step 1. Initialize the weight vector of the training samples in the imbalanced dataset of the vehicle power supply:

$$D_1 = (w_{1,1}, \ldots, w_{1,i}, \ldots, w_{1,N}), \quad w_{1,i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N \tag{3}$$

where $N$ denotes the number of training samples, $w_{1,i}$ is the weight of the $i$-th sample in the first iteration, and $D_1$ denotes the initial weight distribution of the samples.
Step 2. Train the weak classifiers (SAE) one by one, updating for each iteration $m = 1, \ldots, M$ (where $M$ is the number of weak classifiers); the algorithm loops over (a)–(d).
- (a)
Assign the corresponding weights $w_{m,i}$ to all training samples, train the $m$-th weak classifier under this sample distribution, and obtain the error rate of the weak classifier in this round of iterative training; its value is the sum of the weights of the misclassified samples:

$$e_m = \sum_{i=1}^{N} w_{m,i} \, I\left(G_m(x_i) \neq y_i\right) \tag{4}$$

where $w_{m,i}$ is the weight of the $i$-th training sample in the $m$-th iteration; $G_m(x_i)$ is the prediction result of the $m$-th weak classifier, $G_m(x_i) \in \{1, 2, \ldots, k\}$, and $y_i$ is the true label of the incipient fault, $y_i \in \{1, 2, \ldots, k\}$. $I(\cdot)$ denotes the indicator function: when the condition $G_m(x_i) \neq y_i$ is true, its value is 1; otherwise, its value is 0:

$$I\left(G_m(x_i) \neq y_i\right) = \begin{cases} 1, & G_m(x_i) \neq y_i \\ 0, & \text{otherwise} \end{cases} \tag{5}$$
- (b)
Calculate the weight of the $m$-th SAE in this round:

$$\alpha_m = \ln\frac{1 - e_m}{e_m} + \ln(k - 1) \tag{6}$$

Among them, $k$ is the number of classes described in Formula (2).
- (c)
Update the weight distribution of the training dataset samples:

$$w_{m+1,i} = \frac{w_{m,i}}{Z_m} \exp\left(\alpha_m \, I\left(G_m(x_i) \neq y_i\right)\right) \tag{7}$$

In Formula (7), $Z_m$ is a normalization factor whose function is to make the training sample weights sum to 1; it is defined as

$$Z_m = \sum_{i=1}^{N} w_{m,i} \exp\left(\alpha_m \, I\left(G_m(x_i) \neq y_i\right)\right) \tag{8}$$
- (d)
Perform a linear weighted fusion of all SAEs according to Formula (9) to obtain the final strong classifier:

$$G(x) = \arg\max_{c \in \{1, \ldots, k\}} \sum_{m=1}^{M} \alpha_m \, I\left(G_m(x) = c\right) \tag{9}$$
Step 3. Model validation:
The test samples from the imbalanced dataset were input into the strong classifier, and the class of each incipient fault of the vehicle power supply was identified. The diagnostic effect of the AdaBoost-SAE model was verified using different evaluation indexes.
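The boosting loop of Steps 1–2 can be sketched as follows. This is a minimal illustration of the multiclass (SAMME-style) weighting scheme, substituting a simple weighted nearest-class-mean learner for the SAE weak classifier and assuming integer labels 0…k-1; it is not the authors' implementation:

```python
import numpy as np

class WeightedNearestMean:
    """Stand-in weak learner (the paper uses an SAE): weighted class means."""
    def fit(self, X, y, w):
        self.classes_ = np.unique(y)
        self.means_ = np.array([
            np.average(X[y == c], axis=0, weights=w[y == c]) for c in self.classes_
        ])
        return self
    def predict(self, X):
        d = ((X[:, None, :] - self.means_[None]) ** 2).sum(-1)
        return self.classes_[d.argmin(1)]

def adaboost_samme(X, y, k, M=10, make_weak=WeightedNearestMean):
    N = len(y)
    w = np.full(N, 1.0 / N)                    # Step 1: uniform initial weights
    learners, alphas = [], []
    for _ in range(M):                         # Step 2
        g = make_weak().fit(X, y, w)           # (a) train under current weights
        miss = (g.predict(X) != y)
        e = w[miss].sum()                      # weighted error rate
        if e >= 1 - 1.0 / k:                   # no better than random: stop
            break
        e = max(e, 1e-12)
        alpha = np.log((1 - e) / e) + np.log(k - 1)   # (b) classifier weight
        w *= np.exp(alpha * miss)              # (c) boost misclassified samples
        w /= w.sum()                           # normalize (the Z_m factor)
        learners.append(g); alphas.append(alpha)
    def strong(Xq):                            # (d) weighted vote over learners
        votes = np.zeros((len(Xq), k))
        for g, a in zip(learners, alphas):
            votes[np.arange(len(Xq)), g.predict(Xq)] += a
        return votes.argmax(1)
    return strong

# Toy usage on two separable clusters (labels 0 and 1):
X = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
y = np.array([0, 0, 1, 1])
strong = adaboost_samme(X, y, k=2, M=3)
```

In the paper's model, `make_weak` would construct and train an SAE with Softmax output under the current sample-weight distribution.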
3.3. Training of the Weak Classifier (SAE)
The weak classifier SAE is stacked from multiple Auto-Encoders (AE); it extracts high-order features of vehicle power supply incipient faults and feeds the extracted feature vectors into a Softmax layer to predict the fault categories. In each iterative training round of the AdaBoost-SAE model, the SAE deep network trains its parameters according to the updated sample weight distribution, in a process comprising unsupervised pre-training and supervised fine-tuning.
Greedy layer-by-layer training is performed on the SAE: the weights and biases of the first AE are randomly initialized, and the network is trained with the gradient descent algorithm to minimize the reconstruction error and optimize the network parameters. Only the encoding part is retained after the first AE is trained. The high-order features extracted by the first AE are then used as the input of the second AE, which is trained in the same way. Proceeding analogously through the whole network, the high-order features output by the last layer serve as the input of the Softmax classification layer.
After the forward unsupervised training, the weights and biases of each layer are taken as the initial weights and biases of the SAE; the error between the actual sample label and the value predicted by the Softmax layer is propagated back through the network, and the weights and biases of each layer are fine-tuned using gradient descent. After fine-tuning, all parameters of the SAE network sit in a near-optimal position and the network training is complete. In the deep learning ensemble model, the structure of the weak classifier SAE network is shown in Figure 4. It employs a combination of unsupervised pre-training (forward) and supervised fine-tuning (backward) for feature learning.
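The greedy layer-wise pre-training described above can be sketched as follows. This is a simplified NumPy illustration with sigmoid activations and MSE reconstruction loss, as in the paper's setup; the supervised fine-tuning stage and hyperparameters are omitted or assumed, so it is a sketch rather than the authors' exact implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class AE:
    """One auto-encoder layer trained by gradient descent on reconstruction MSE."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden)); self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_in)); self.b2 = np.zeros(n_in)
    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)
    def fit(self, X, epochs=200, lr=0.5):
        for _ in range(epochs):
            H = self.encode(X)
            R = sigmoid(H @ self.W2 + self.b2)      # reconstruction of the input
            dR = (R - X) * R * (1 - R) / len(X)     # MSE gradient through sigmoid
            dH = (dR @ self.W2.T) * H * (1 - H)
            self.W2 -= lr * H.T @ dR; self.b2 -= lr * dR.sum(0)
            self.W1 -= lr * X.T @ dH; self.b1 -= lr * dH.sum(0)
        return self

def pretrain_sae(X, hidden_sizes=(10, 7)):
    """Greedy layer-wise pre-training: each AE encodes the previous layer's output."""
    layers, H = [], X
    for n_h in hidden_sizes:
        ae = AE(H.shape[1], n_h).fit(H)
        layers.append(ae)
        H = ae.encode(H)            # keep only the encoder, discard the decoder
    return layers, H                # H would feed the Softmax layer

# Toy usage: 20 samples of the 15 monitoring variables, scaled to [0, 1]
rng = np.random.default_rng(1)
X = rng.random((20, 15))
layers, H = pretrain_sae(X)
```

The (10, 7) hidden sizes mirror the structure later selected in Section 4.2; fine-tuning would continue from these pre-trained weights via backpropagation of the Softmax error.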
4. Experimental Results and Analysis
4.1. Imbalanced Datasets Description and Evaluation Indicators
As described in Table 1, five working conditions were simulated on the vehicle power supply simulation platform: normal operation, partial loss of excitation of the synchronous generator, three-phase voltage unbalance of 1–3% in the synchronous generator, slight blockage of the diesel fuel injector, and electronic governor degradation. Operating data were collected and an imbalanced dataset was generated. In the dataset, the number of samples under normal working conditions was 5000, and each of the four incipient fault conditions had 500 samples; that is, the majority and minority samples had a 10:1:1:1:1 distribution.
Table 2 describes the sample quantities and proportions under the five working conditions of the vehicle power supply, along with their corresponding labels. In the imbalanced dataset, 60% of the samples were randomly selected as the training set (4200 samples), 20% as the validation set (1400 samples), and 20% as the test set (1400 samples). The simulation test hardware was a notebook computer with 16 GB memory and an i7-8750H processor.
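The 60/20/20 split of the imbalanced dataset can be sketched as a per-class (stratified) partition, which preserves the 10:1:1:1:1 ratio in each subset; the paper does not state whether its random split was stratified, so this is an assumption:

```python
import numpy as np

def stratified_split(y, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle each class's indices and split them into train/val/test parts."""
    rng = np.random.default_rng(seed)
    parts = [[], [], []]
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        n1 = int(len(idx) * fractions[0])
        n2 = n1 + int(len(idx) * fractions[1])
        for p, s in zip(parts, (idx[:n1], idx[n1:n2], idx[n2:])):
            p.extend(s.tolist())
    return [np.array(p) for p in parts]

# 5000 normal samples + 4 x 500 fault samples (10:1:1:1:1 distribution)
y = np.repeat([0, 1, 2, 3, 4], [5000, 500, 500, 500, 500])
train, val, test = stratified_split(y)
```

With these counts the split yields 4200/1400/1400 samples, matching the subset sizes reported above.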
To objectively and quantitatively evaluate the effectiveness of the deep learning ensemble model in diagnosing common incipient faults of vehicle power supply, and considering that the task is in essence a multiclass classification problem, accuracy and the F1-score are used to evaluate the diagnostic performance of the AdaBoost-SAE model. The F1-score is the weighted harmonic mean of precision and recall; the higher the score, the stronger the fault diagnosis ability of the model. When calculating the evaluation indexes for the $i$-th class, the $i$-th class is regarded as the positive class and all other classes as the negative class. The evaluation indexes are calculated as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{11}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{12}$$

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{13}$$

where $TP$ represents a positive sample classified as positive by the model, $TN$ a negative sample classified as negative, $FN$ a positive sample classified as negative, and $FP$ a negative sample classified as positive.
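The one-vs-rest evaluation described above can be sketched directly from predicted and true labels; this computes per-class precision, recall, and F1, plus overall accuracy and Macro-F1 (the unweighted mean of per-class F1 scores used later in Table 5):

```python
import numpy as np

def per_class_f1(y_true, y_pred, k):
    """Per-class F1 (one-vs-rest), overall accuracy, and Macro-F1."""
    f1 = []
    for c in range(k):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    acc = np.mean(y_true == y_pred)
    return acc, f1, float(np.mean(f1))   # accuracy, per-class F1, Macro-F1

# Tiny usage example with two classes:
acc, f1, macro = per_class_f1(np.array([0, 0, 1, 1]),
                              np.array([0, 1, 1, 1]), k=2)
```

On imbalanced data, Macro-F1 is the more informative summary, since every class contributes equally regardless of its sample count.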
4.2. Optimization of the Main Parameters of the AdaBoost-SAE Model
The network structure and parameters of the weak classifier SAE have an essential influence on the feature extraction and classification of incipient faults [33]; the number of integrated weak classifiers (i.e., the number of iterations of the deep ensemble model) also affects the final diagnostic effect of the AdaBoost-SAE model. Therefore, it is necessary to first analyze the main parameters of the AdaBoost-SAE model and then select the optimal settings so that the model performs best in diagnosing incipient faults on imbalanced datasets.
- (1)
Determining the network structure of the weak classifier (SAE)
For the parameters of an SAE deep network there is still no practical and effective analytical method, so traversal optimization is adopted. Based on extensive earlier simulation experience, test ranges were set for the number of SAE hidden layers and the number of hidden-layer neurons: the number of hidden layers ranged from 2 to 4, and the number of neurons per hidden layer from 5 to 12. The sigmoid function was selected as the hidden-layer activation function, and MSE (mean-square error) was used as the loss function. The network was trained with the gradient descent algorithm to minimize the reconstruction error. The number of iterations for each AE was 2000, and the number of iterations for overall fine-tuning was 5000. The number of cyclic training iterations of the AdaBoost-SAE model was uniformly set to 10; that is, there were 10 weak classifiers. Meanwhile, to prevent overfitting, dropout layers and L2 weight decay were introduced in each weak classifier (SAE), and training was stopped as soon as model performance no longer improved over consecutive iterations, limiting the number of weak classifiers and preventing performance saturation and computational redundancy. Taking the incipient fault diagnosis accuracy on the imbalanced dataset as the evaluation index of the optimization test, the impact of the number of hidden layers and hidden-layer neurons on the fault diagnosis accuracy of the deep ensemble model is shown in Table 3, together with the accuracy of a single SAE under the different network structures.
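The traversal optimization over the stated ranges can be sketched as an exhaustive search; the `evaluate` callback stands in for training the AdaBoost-SAE model on a candidate structure and measuring its validation accuracy, and is a hypothetical placeholder here:

```python
import itertools

def traverse_structures(evaluate, depths=(2, 3, 4), sizes=range(5, 13)):
    """Score every hidden-layer structure in the test range; return the best.

    `evaluate(structure)` is a user-supplied function; in the paper's setting
    it would train the ensemble with that SAE structure and return accuracy.
    """
    best, best_acc = None, -1.0
    for d in depths:                                   # 2 to 4 hidden layers
        for combo in itertools.product(sizes, repeat=d):  # 5 to 12 neurons each
            acc = evaluate(combo)
            if acc > best_acc:
                best, best_acc = combo, acc
    return best, best_acc

# Toy stand-in evaluator that prefers the (10, 7) structure:
best, acc = traverse_structures(lambda s: 1.0 if s == (10, 7) else 0.5)
```

Note the search space grows quickly with depth (8 + 64 + 512 structures per extra layer beyond two at 8 candidate sizes), which is why the paper restricts the ranges based on prior simulation experience.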
As shown in Table 3, under the premise of limiting the number of training iterations, the two-hidden-layer combination (10-7) achieves the highest accuracy in the AdaBoost-SAE model, at 94.9%, whereas several three- and four-hidden-layer combinations yield relatively low accuracy. This shows that a weak-classifier SAE with the (10-7) hidden-layer structure helps the AdaBoost-SAE model learn the characteristics of incipient vehicle power supply faults under imbalanced data, and its diagnostic effect is better than that of the other structural combinations. Comparing the fault diagnosis results of a single SAE under different network structures shows that integrating SAE networks improves diagnostic accuracy. However, the optimal network structures of the two models differ, indicating that the best structure for a single SAE does not carry over to the AdaBoost-SAE model.
- (2)
Determining the number of iterations of the AdaBoost-SAE model
To select a more appropriate number of weak classifiers (SAE) for the AdaBoost-SAE model, an optimization experiment was performed on the number of weak classifiers. In the experiment, the SAE hidden-layer structure used the (10-7) combination from Table 3. The initial number of weak classifiers was 1; each round of iterative training added one weak classifier and recorded the diagnostic accuracy, and the model stopped iterating when the accuracy was stable and no longer increased.
Figure 5 illustrates the variation in fault diagnosis accuracy of the AdaBoost-SAE model as a function of the number of weak classifiers.
As Figure 5 shows intuitively, as the number of weak classifiers (SAE) increases one by one from the initial one, the overall fault diagnosis accuracy also shows an increasing trend. When the number of weak classifiers reaches 18, the accuracy stabilizes at 96.6%; adding further base classifiers leaves the accuracy unchanged, indicating that the AdaBoost-SAE model achieves its best fault diagnosis results with 18 weak classifiers.
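The stopping rule used in this experiment — stop adding weak classifiers once accuracy no longer improves — can be sketched as a simple plateau detector; the `patience` and `tol` parameters are assumptions, not values stated in the paper:

```python
def stop_at_plateau(accuracies, patience=3, tol=1e-4):
    """Return the ensemble size (1-based) after which accuracy stops improving
    by more than `tol` for `patience` consecutive rounds, and the best value."""
    best, best_i, stale = -1.0, 0, 0
    for i, a in enumerate(accuracies, start=1):
        if a > best + tol:
            best, best_i, stale = a, i, 0   # new best: reset the stale counter
        else:
            stale += 1
            if stale >= patience:
                break                        # accuracy has plateaued
    return best_i, best

# Hypothetical per-round accuracies resembling the trend in Figure 5:
accs = [0.80, 0.85, 0.90, 0.93, 0.95, 0.96, 0.966, 0.966, 0.966, 0.966]
n_weak, best_acc = stop_at_plateau(accs)
```

Such a rule bounds the ensemble size automatically instead of fixing it in advance, trading a few extra training rounds for confirmation that the accuracy has saturated.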
Based on the above test results, in the subsequent fault diagnosis with the AdaBoost-SAE model, the key parameters are set as follows: the number of weak classifiers is 18, and each SAE contains two stacked AEs, i.e., two hidden layers with (10-7) neurons.
4.3. Analysis of Incipient Fault Diagnosis Results
The confusion matrix directly and comprehensively reflects the diagnostic precision and recall for the different incipient faults of the vehicle power supply, as well as the numbers of misdiagnosed and missed samples of each fault class. Therefore, the incipient fault diagnosis results of the AdaBoost-SAE model under imbalanced samples are visualized with confusion matrices, as shown in Figure 6 (covering the training, validation, and test sets). Here, only the confusion matrix of the test set is analyzed. The green cells along the main diagonal represent the correctly diagnosed samples and their proportions, while the gray cell in the bottom-right corner indicates an overall fault diagnosis accuracy of 96.6%. The first row of the matrix indicates that, for the first working condition, 991 samples were correctly classified while 37 samples were missed, giving a recall of 96.4% and a miss rate of 3.6%. The first column shows that 11 samples were misclassified as the first working condition, yielding a diagnostic precision of 98.9%.
To validate the performance advantages of the AdaBoost-SAE model in diagnosing incipient faults of vehicle power supply under imbalanced sample conditions, four reference models were established in the simulation experiments: a shallow BP neural network, the traditional machine learning algorithm SVM, an AdaBoost-BP ensemble model (which achieved its optimal accuracy after 20 rounds of iterative training), and a single SAE network.
Figure 7 comparatively presents, through confusion matrices, the differences in diagnostic results between the reference groups and the proposed model. Figure 7a–d show the incipient fault diagnosis results of SVM, BP, SAE, and AdaBoost-BP; their fault diagnosis accuracies under the imbalanced dataset are 83.6%, 80.2%, 92.4%, and 88.8%, respectively, all lower than the 96.6% of the AdaBoost-SAE model in Figure 6. In the results of these four models, owing to the sample imbalance, large numbers of minority-class samples are misclassified (e.g., the 2nd and 3rd working conditions for BP, SVM, and AdaBoost-BP, and the 2nd working condition for SAE); yet, because the minority classes account for a small proportion of the total samples, the overall diagnostic accuracy still appears high. Therefore, using accuracy alone to evaluate fault diagnosis is neither objective nor scientific. To make the evaluation more convincing and account for contingency and random factors in training, we performed 10 fault diagnosis tests on each model and computed statistics of the results, obtaining the averages of accuracy, training time, testing time, and F1-score to evaluate the diagnostic performance of each model comprehensively.
Table 4 summarizes the fault diagnosis accuracy, model training time, and testing time of the four reference group models, along with the deep learning ensemble models AdaBoost-SAE(5), AdaBoost-SAE(10), and AdaBoost-SAE.
Table 5 presents the F1-score and Macro-F1 of the four reference group models compared with the AdaBoost-SAE model.
Comparing the diagnostic effects of the different models in Table 4 and Table 5 shows that:
- (1)
The average diagnostic accuracy of the AdaBoost-SAE model under the imbalanced dataset was 96.6%, an increase of 4.3% over the single SAE and of 7.8% over the AdaBoost-BP ensemble model. The SVM and BP models performed poorly, with average diagnostic accuracies of 82.5% and 79.8%, respectively.
- (2)
Among the deep learning ensemble models, AdaBoost-SAE(5), which incorporates the smallest number of weak classifiers (five), required a training time of 136.79 s. In contrast, the AdaBoost-SAE model, integrated with 18 weak classifiers, exhibited the longest training time, reaching 485.74 s. These results indicate substantial variation in training times across the deep learning ensemble models, with a clear trend toward longer durations as the number of weak classifiers increases. Compared to the control models (SVM, BP, SAE, and AdaBoost-BP), the deep learning ensemble models demand considerably more training time due to their higher structural complexity. Nevertheless, the testing time of the AdaBoost-SAE model was 0.0239 s—only marginally longer than those of the four control models, with differences remaining within the millisecond range.
- (3)
SVM and BP showed poor classification results for the 2nd, 3rd, and 4th working conditions, and the SAE and AdaBoost-BP models also struggled to classify the 2nd working condition. The AdaBoost-SAE model significantly improves the F1-score of each working condition while also improving the overall fault diagnosis accuracy. For example, compared with SVM, BP, SAE, and AdaBoost-BP, the F1-score of the hardest-to-diagnose 2nd working condition increased by 73.85%, 73.81%, 70.5%, and 74.27%, respectively.
- (4)
Compared to the reference group models, the proposed AdaBoost-SAE model achieved a Macro-F1 score of 93.84%, significantly outperforming all others and highlighting its superior performance in fault diagnosis on imbalanced datasets.
The above results show that, compared with shallow neural networks and traditional machine learning, deep neural networks have obvious advantages in extracting incipient fault characteristics of vehicle power supply. AdaBoost algorithm can significantly reduce the impact of imbalanced training samples, thereby improving the diagnostic performance of the model. The AdaBoost-SAE model combines the advantages of SAE networks and AdaBoost algorithm. When the symptoms of incipient faults in the vehicle power supply are not obvious, and the data samples are imbalanced, it can effectively extract the features of incipient faults, better recognize some difficult-to-classify samples, and obtain stable diagnosis results for the same datasets, showing strong generalization ability. However, as an ensemble model integrating multiple deep neural networks, AdaBoost-SAE required 18 iterative training rounds to converge, resulting in a time-consuming training process. Nevertheless, given its application mode of “offline training and online testing”, the millisecond-level difference in testing time compared to the reference group models does not constitute a significant impact in practical engineering applications.
5. Conclusions
To enhance the diagnostic performance for incipient faults in vehicle power supplies under imbalanced datasets, this paper proposes a fault diagnosis model based on AdaBoost-SAE deep ensemble learning. The main contribution of this study lies in constructing an ensemble learning framework capable of effectively capturing subtle fault features and addressing class imbalance issues. Experimental results demonstrate that the proposed model significantly improves the overall accuracy of incipient fault diagnosis across multiple working conditions, particularly showing superior performance in identifying challenging samples. This model provides a technically practical solution for the reliable diagnosis of incipient faults in vehicle power systems.
However, this study has several limitations. First, although the AdaBoost-SAE model demonstrates excellent diagnostic performance, its iterative training process is more time-consuming compared to traditional fault diagnosis models. Second, the number of weak classifiers and the network’s hyperparameters still heavily rely on extensive manual experimentation rather than automatic configuration. Finally, due to the lack of validation by real data, the model’s practical applicability remains constrained. To address these limitations, future work will focus on the following aspects: first, optimizing the model architecture and training strategy, for instance, by introducing model pruning or distributed training to significantly reduce training time. Second, investigating adaptive parameter optimization algorithms to minimize the reliance on manual experimentation. Third, exploring transfer learning techniques to enhance the model’s generalization capability.