Article

Data-Driven Fault Early Warning Model of Automobile Engines Based on Soft Classification

School of Automotive Studies, Tongji University, Shanghai 200092, China
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(3), 511; https://doi.org/10.3390/electronics12030511
Submission received: 21 December 2022 / Revised: 11 January 2023 / Accepted: 16 January 2023 / Published: 18 January 2023
(This article belongs to the Special Issue Feature Papers in Electrical and Autonomous Vehicles)

Abstract

Since automobile engine faults are the main factor leading to vehicle breakdowns, engine fault diagnosis has attracted considerable attention. Fault diagnosis identifies fault types to facilitate maintenance. However, warning of a fault before it occurs is more attractive to users and more challenging. Therefore, this study explores the feasibility of implementing automobile engine fault early warning based on a fault diagnosis model. First, the theoretical framework of the fault domain is established, in which the state of the engine is regarded as a point in n-dimensional space and the normal and fault states of the engine correspond to different state domains in this space. Second, to diagnose multiple fault types at the same time, an ensemble model based on multiple machine learning methods is established; the probabilities output by the ensemble model measure the distance between the point and each fault domain in the space. Finally, considering the temporal factor, an early warning threshold is established based on these probabilities, and a fault warning model is built using a dual probability structure. Comparative experiments show that the proposed method greatly reduces the calculation time while ensuring early warning accuracy and is suitable for real-time early warning of multiple faults.

1. Introduction

Fault diagnosis and early warning are crucial for power machinery such as aero-engines [1], gas turbines [2], and automobile engines [3], whose continued normal operation is related to the safety of people and large amounts of property. As the power core of the automobile, the engine has a high failure rate; a reliable fault diagnosis and early warning system can provide timely guidance for engine maintenance, which is very important in the field of automobile manufacturing and maintenance [4]. Fault diagnosis, also known as fault detection or fault identification, is the detection of the operating status of the equipment [5]. When the state of the equipment is normal, fault diagnosis evaluates and warns of its future state; when the state of the equipment is abnormal, fault diagnosis analyzes the cause and degree of the abnormality and proposes a maintenance plan [6]. Fault warning, also known as fault prediction, models the trend of the state characteristics of the equipment parameters based on the operating laws of the equipment. It can estimate the time of equipment failure and is used to prompt users to inspect and repair in advance, avoiding the greater losses caused by unexpected equipment failure [7]. Usually, fault warning methods are built on top of fault diagnosis.
The methods of fault diagnosis can be divided into three kinds: (1) methods based on mathematical models [8]. These methods establish mathematical formulas for each kind of engine fault and provide accurate results and interpretability at the same time. However, it is difficult to establish an accurate mathematical representation for each kind of fault because there are many kinds of faults and they occur at different frequencies. (2) Methods based on signal processing [9]. Many mechanical faults can be effectively detected from vibration, sound, and other signals. However, this approach is only suitable for a single fault that can be identified from the signal and cannot handle signals produced by multiple faults. (3) Methods based on machine learning [10]. These do not need a precise mathematical description of the research object; rather, they learn from the monitoring data reflecting the operating status of the object to achieve automatic identification of unknown states. Common intelligent diagnosis methods include neural networks, support vector machines, evolutionary intelligence, expert systems, fuzzy diagnosis, and information fusion diagnosis methods.
With the development of automotive sensing and electronic software technology, early fault warning has attracted increasing attention in the vehicle design process. Traditional early fault warning technology iterates slowly because it relies on the accumulation of industry knowledge and experience, so developing a new early fault warning technology with traditional methods takes too long. In the era of big data, data-driven fault warning has become an effective solution to this problem. Machine learning provides a more convenient and lower-cost way to build a fault early warning system and enables better performance than traditional methods. The machine learning methods mentioned above for fault diagnosis can also be used for fault warning once time series information is incorporated. Many machine learning methods are applied to automobile fault diagnosis systems, such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Networks (CNN), Light Gradient Boosting Machine (LGBM), eXtreme Gradient Boosting (XGB) [11], Random Forest (RF), etc. [12].
Research on engine fault warning faces several challenges. When an engine fails, there may be multiple causes, and it is difficult to distinguish which one is responsible. Similarly, even if an engine is operating normally, the probabilities of the different failures that may occur in the future vary. The first challenge of engine early fault warning is therefore to build a common method to identify many different types of faults, especially to accurately identify faults that occur frequently. In addition, the data distribution in fault identification and early warning problems is typically unbalanced [13]: most of the collected data have normal labels and very few have fault labels, which brings the second challenge, namely how to deal with unbalanced data. Finally, users expect possible failures to be warned of as early as possible while a high degree of accuracy is maintained, which is the third challenge.
To ensure the accuracy and efficiency of simultaneous warning of multiple engine faults, an engine fault early warning model based on soft classification is built. First, an ensemble learning model that identifies most engine faults is built by fusing multiple machine learning models. Then, a dual-probability early warning threshold model is built based on the concept of the fault domain; this structure can deal with the problem of data imbalance. Finally, by adjusting the warning thresholds, a fault warning model that guarantees both warning accuracy and real-time calculation is obtained.
The major contributions of this research are as follows:
(1)
A fault domain theory and a multiple-fault diagnosis model based on it are proposed;
(2)
A fault warning model based on a dual probability structure and warning thresholds is developed, and a method to improve the warning advance time based on adjusting the warning thresholds is proposed;
(3)
The prediction accuracy and computational efficiency of the fault warning model are discussed, and it is demonstrated that the proposed dual probability warning model can guarantee both.
The method proposed in this study has been applied in engine fault detection at Bosch. Although this study focuses on the early fault warning of automobile engines, the proposed method can also be applied to other power machinery, such as the electric motors of electric vehicles and gas turbines. The failure data imbalance problem encountered in this study is similar to problems in the field of autonomous driving, and the dual probability warning structure could also be used in that field.
The remainder of this paper is organized as follows: Section 2 gives the literature review on the related works. Section 3 introduces the models proposed in detail. Section 4 presents the experimental results and discussion. Finally, the conclusions are summarized in Section 5.

2. Related Works

2.1. Engine Fault Diagnosis

Artificial intelligence methods have been applied to engine fault diagnosis problems in many studies in recent years. The support vector machine (SVM) is a common machine learning classification method [10], and fault diagnosis is a typical classification problem, so the SVM model is often used to predict the occurrence of engine faults [14,15]. The Artificial Neural Network (ANN) is one of the most common methods used for engine fault diagnosis [3]. Chen et al. proposed neural network algorithms to realize engine misfire diagnosis through feature analysis [16]. Qiao et al. implemented a neural network method to analyze the instantaneous speed of a diesel engine and diagnose the cylinder wear status [17]. Di et al. applied a BP algorithm to the fault diagnosis of automotive engines, using indexes of the automobile exhaust as the inputs of the neural network [18]. With the development of deep learning, CNNs have been applied to fault diagnosis problems. Jing et al. used nonlinear time series modeling with deep belief networks and the Weibull distribution to detect and classify the health status of bearings [19]. Chen et al. extracted the time domain and frequency domain features of vibration data, fed them into a CNN, and finally identified and classified gearbox faults [20]. Wang et al. proposed a new intelligent diagnosis method for bearing faults that combines the symmetrized dot pattern with a squeeze-and-excitation-enabled convolutional neural network and reaches 99% classification accuracy [21].
Other machine learning methods have also been applied to fault detection. Justin et al. used a naive Bayesian classifier to classify and predict diesel engine faults [22]. Ma et al. proposed a method based on the Weibull distribution and a deep belief network to classify bearing states, and the results showed that the method performs well [23]. Recently, more and more researchers have combined different machine learning approaches to exploit the strengths of each method for particular fault diagnosis problems. To address the problem of data set imbalance, some researchers combine fuzzy systems with neural network algorithms: Doddi et al. implemented a monitoring system combining a fuzzy kernel c-means clustering algorithm and a type-2 optimal fuzzy neural network [24]. Naderkhani et al. analyzed and predicted fuzzy regression functions of non-fuzzy inputs and symmetrical trapezoidal fuzzy inputs based on an adaptive neuro-fuzzy inference system [25]. Li et al. proposed a method combining CNN and GRU to diagnose gear pitting faults, which achieves higher diagnostic accuracy than CNN or GRU alone [13].
There is also a combined model-based and data-driven diagnostic method, which takes advantage of physical models and historical data. One solution is to use data-driven methods to improve generalization ability based on physical models, which are called reference models [26]. For the problem of fault diagnosis, a differential equation can be established based on the physical parameters of the diagnosed system, then the state representation vector can be extracted to identify and collect the data of important parameters, and a data-driven model can be established [27]. Another method is to add expert experience to the Bayesian network, and design Bayesian Network classifiers to detect faults. The advantage of this method is that Bayesian Network classifiers use probabilistic reasoning and can be designed using physical knowledge or system data or both [28]. However, these methods of combining physical models and data-driven models are mostly used to pursue the accuracy of a single fault and the generalization performance of models.

2.2. Engine Fault Warning

Engine fault warning technology is based on engine failure detection technology, the difference being that the former adds time series information to predict the type and probability of possible failures in the future. Due to the different expected objectives of fault warning, there are two main forms of early equipment failure warning, one is fault trend prediction [26] and the other is residual life prediction [11,29].
Boškoski et al. proposed a bearing fault prediction method based on entropy [30]. Thumati et al. proposed a model-based fault early warning scheme for nonlinear discrete systems [31]. Rashid et al. preprocessed the data through the Fourier transform and proposed a new data mining algorithm to identify faults [32]. Kamal et al. used a variety of classification algorithms, including SVM, to analyze the extracted features and improve the performance of fault prediction [33].
The Recurrent Neural Network (RNN) is a neural network specialized in processing sequential data and is a common machine learning approach for fault warning [34], which makes this type of network well suited to recognizing defects in machines. Yuan et al. applied four models, namely RNN, AdaBoost-LSTM, conventional GRU, and conventional LSTM, to aero-engine fault prediction and, by comparing the results of the four models, found that LSTM has the best performance [12]. Chui et al. proposed a remaining useful life prediction algorithm combining RNN and LSTM to address the issue of unnecessary maintenance checks in run-to-failure maintenance [35]. Zhou et al. proposed a back propagation neural network fault diagnosis model to achieve high-precision vehicle chassis dynamometer control [36].
As the above studies show, many machine learning methods are currently used for fault diagnosis and early warning, and more studies are combining multiple machine learning methods to exploit their respective advantages; however, these studies mostly target one or two kinds of faults. Driven by practical application requirements, this study addresses the diagnosis and early warning of multiple common faults of a Bosch engine: a fault diagnosis model is established by combining multiple models, and the fault domain and time series information are then combined to establish an accurate fault warning model.

3. Methodology

3.1. The Hypotheses of Fault Domain

To model the engine fault problem, the state of the engine is regarded as a point in n-dimensional space. The normal or fault state of the engine corresponds to different state domains in this space. The fault diagnosis and early warning models in this study rely on two basic assumptions of the fault domain theory: the separability of fault types and the continuity of the movement of the engine state point.
(1)
Separability of fault types
The problem of classification is to find a boundary that divides the groups into different categories. This requires the distributions of the different groups to be effectively separable. Figure 1 shows an illustrative example of a non-separable distribution of groups and a separable distribution of groups. It can be observed that the distributions of the two fault types in Figure 1a are similar and difficult to distinguish, whereas the two fault types in Figure 1b can easily be separated by a boundary line. If the distribution of the fault types corresponding to each label satisfies the separability assumption, the probability value of each label output by the classification model represents the distance between the input sample and the label in an n-dimensional space composed of n features.
(2)
Continuity of the movement of the engine state
If the various fault states and normal states of the automobile engine satisfy the separability assumption, each engine state can be regarded as a domain with a different position, shape, and size in n-dimensional space. The state of an automobile engine at any time can be expressed as a point in one of the domains of the n-dimensional space. The set of domains representing fault states is called the fault domain, and the set of domains representing normal states is called the normal domain. The task of fault warning is to output warning signals when the engine state point moves from the normal domain toward, or close to, a fault domain. To determine when the warning signal is sent out, the fault domain can be expanded outward by a fixed distance to form a new domain, called the early warning domain. When the engine state point enters the early warning domain, the warning signal is output.
This fault warning approach assumes that the movement of the engine state point is continuous. When the engine state point moves from one domain into another, it passes through the boundary between the domains, that is, the warning domain, rather than jumping across the n-dimensional state space, so the warning signal can be issued when the automobile engine is about to move from a normal state to a fault state. As shown in Figure 2, the dashed light-colored regions are the early warning domains surrounding the different fault domains. When the movement of the engine state point satisfies the continuity assumption, the engine state first passes through the early warning domain when entering a fault state from the normal state. Since the probability value output by the classification model described above represents the distance between the state point and each state domain, early fault warning can be realized using the engine state classification model.

3.2. Fault Diagnosis Model

Fault diagnosis is achieved through a multi-classification model. The input of the fault diagnosis model is a multi-dimensional state vector, and the output is the probability of each fault type. To assess the possibility of multiple fault types rather than only reporting which kind of fault has occurred, the multi-classification model adopts soft classification, that is, it outputs the probability of occurrence of each fault type. For example, if the probability of fault type 1 is 0.1 and the probability of fault type 2 is 0.9, we have 90% confidence that the engine fault is fault type 2.
The Random Forest, k-Nearest Neighbor, and eXtreme Gradient Boosting models are combined by an ensemble learning method.
(1)
Random Forest (RF)
The bagging technique is used in RF when training each classifier: RF uses only part of the features of the sample to build each classifier, which yields diverse classifiers with different concerns. In RF, each classifier is a single decision tree [37]. By training multiple decision trees and combining their judgments by voting, the judgment result of the RF model is obtained. Because of the bagging technique, the performance of each classifier differs, which reduces the risk of over-fitting and enhances the generalization ability of the RF model, so the performance of RF is often better than that of a single decision tree.
(2)
K-Nearest Neighbor (KNN)
By calculating the distance (usually the Euclidean distance) between the test sample and each training sample in the training set, the KNN model finds the k nearest samples in the training set and then selects the most frequent class label among these k samples as the prediction result for the test sample [38]. The KNN model is highly sensitive to the population distribution, and its accuracy is limited when distinguishing populations with mixed distributions. Nevertheless, because of its simplicity and fast training speed, the KNN model performs well on simple, conventional classification problems. In the KNN algorithm, the distance weights of different features need to be set manually, which is cumbersome; in this study, the distance weights of all features are set equal to simplify the model.
(3)
eXtreme Gradient Boosting (XGB)
XGB is an improvement on the principle of the Gradient Boosting Decision Tree (GBDT): GBDT trains multiple classifiers serially, and each classifier focuses on learning the samples that the previous classifier misjudged, making up for the mistakes of the previous classifier; finally, the judgments of all classifiers are added together as the judgment of the whole model. Different from traditional GBDT, XGB expands the objective function to the second order instead of the first order. Second, GBDT finds a new fitting label for the next-level classifier, while XGB finds a new objective function. Finally, XGB adds L2 regularization (i.e., a squared regularization term) to the weights of the leaf nodes. These improvements enable XGB to achieve better performance than traditional GBDT in most machine learning competitions [39].
(4)
Ensemble Model
Since the above algorithms each have their limitations, combining multiple models can boost the accuracy. To reduce the error of the model while maintaining its generalization ability, the models are combined by aggregating the outputs of each model.
In this study, an ensemble model based on soft voting is adopted: RF, KNN, and XGB are used as sub-classifiers, each sub-classifier outputs the probabilities that the sample belongs to each label, the probability value of each label output by the ensemble model is the average of the sub-classifiers' probabilities for that label, and the label with the highest averaged probability is taken as the result of the ensemble model, as in Equation (1):
\bar{y}_i = \frac{\hat{y}_{i,\mathrm{RF}} + \hat{y}_{i,\mathrm{KNN}} + \hat{y}_{i,\mathrm{XGB}}}{3}    (1)
where \bar{y}_i denotes the result of the ensemble model for fault i, and \hat{y}_{i,\mathrm{RF}}, \hat{y}_{i,\mathrm{KNN}}, and \hat{y}_{i,\mathrm{XGB}} denote the results of the RF, KNN, and XGB models for fault i, respectively.
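A minimal Python sketch of this soft-voting combination is given below, assuming scikit-learn and xgboost are available; the hyperparameter values are illustrative assumptions, not the tuned values used in the paper. With its default settings, soft voting averages the per-label probabilities of the sub-classifiers, which is the combination rule of Equation (1).

# Sketch of the soft-voting ensemble of Section 3.2 (illustrative hyperparameters).
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier

rf = RandomForestClassifier(n_estimators=200, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)   # equal distance weights for all features, as in the text
xgb = XGBClassifier(n_estimators=200, max_depth=6, eval_metric="mlogloss")

# voting="soft" averages the per-label probabilities of the three sub-classifiers (Equation (1)).
ensemble = VotingClassifier(
    estimators=[("rf", rf), ("knn", knn), ("xgb", xgb)],
    voting="soft",
)
# ensemble.fit(X_train, y_train)           # X_train: engine state vectors, y_train: encoded labels
# proba = ensemble.predict_proba(X_test)   # one probability per fault/normal label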

3.3. Fault Early Warning Model

The fault early warning model is built on top of the fault diagnosis model. The input of the fault warning model is also a multi-dimensional state vector, and the output indicates which faults are about to occur. According to the fault domain theory described above, a fault early warning is triggered when the engine state point is close to a certain fault domain, i.e., when the probability value output by the fault diagnosis model exceeds its warning threshold. However, in most fault warning problems, the proportions of fault and normal samples differ greatly, showing an unbalanced distribution: the vast majority of the collected data are normal, and very few are fault data. As a result, the fault diagnosis model is more likely to identify a state point as normal. Specifically, the highest probability value output by the fault diagnosis model most often corresponds to the normal label, so faults close to the state point can be overlooked. The dual probability early warning model is proposed to solve this problem; its structure is shown in Figure 3. If the highest probability corresponds to the normal label, the second highest probability is checked against its warning threshold so that potential faults are not easily missed.
First, according to the input features of the data, the fault diagnosis model outputs probability values of the sample belonging to various fault labels and normal labels.
Second, the early warning model first checks the label with the highest probability value. If this label belongs to the fault types, the model checks whether the corresponding probability value exceeds the warning threshold; if it does, the engine state has entered the warning domain of that fault and the model outputs a fault warning signal. If the label with the highest probability belongs to the normal type, or it belongs to the fault types but the corresponding probability value does not reach the warning threshold, the model checks the label with the second highest probability value and makes the same decision.
Finally, if neither the label with the highest probability value nor the label with the second highest probability value reaches the warning condition, the model does not output a fault warning signal; that is, the engine state is considered normal at that sampling time.
It is worth noting that the second probability is checked only when the highest probability corresponds to a normal label, or to a fault label whose value does not exceed its warning threshold. Checking the second probability prevents some potential faults that have crossed their warning threshold from being missed. In principle, fault tolerance could be improved further by checking more probability values, such as the third highest, but good results can already be achieved using only the first and second probabilities. Adding the third probability does not noticeably improve the warning results but leads to a considerable increase in computation time, which is supported by the controlled tests in the subsequent discussion.
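A minimal sketch of this dual probability check is given below; proba is the probability vector output by the fault diagnosis model, and the per-fault thresholds are assumed to have been tuned as described in Section 3.4.2.

import numpy as np

def dual_probability_warning(proba, labels, thresholds, normal_label="normal"):
    """Return the fault label to warn about, or None if the state is treated as normal."""
    order = np.argsort(proba)[::-1]           # label indices sorted by descending probability
    for idx in order[:2]:                     # only the first and second highest probabilities
        label = labels[idx]
        if label == normal_label:
            continue                          # a normal label never triggers a warning
        if proba[idx] >= thresholds[label]:   # state point has entered this fault's warning domain
            return label
    return None                               # neither of the top two probabilities warrants a warning

Checking only the two largest probabilities is what keeps the per-sample cost low compared with the parallel structure evaluated in Section 4.3.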

3.4. Evaluation Metrics and the Setting of Warning Thresholds

3.4.1. Evaluation Metrics

The confusion matrix is used to count the numbers of correct and incorrect results of the classification model. Taking the binary classification task as an example, the structure of the confusion matrix is shown in Table 1,
where:
  • a represents the number of samples with a sample true label of A and a model-identified label of A;
  • b represents the number of samples with sample true label B and model-identified label A;
  • c represents the number of samples with sample true label A and model-identified label B;
  • d represents the number of samples with sample true label B and model-identified label B.
(1)
Precision
Precision represents the reliability of the model’s diagnostic results. It is calculated by Equation (2):
\mathrm{Precision}(A) = \frac{a}{a + b}    (2)
(2)
Recall
The recall rate of fault early warning is defined as the probability of the fault that occurs on that day being given an early warning by the model. The recall rate of early warning reflects the sensitivity of the early warning model. It is calculated by Equation (3):
\mathrm{Recall}(A) = \frac{a}{a + c}    (3)
(3)
F1-Score
The F1-Score is a composite measure of a model's performance on a class of labels, combining the information of both Precision and Recall to represent the model's overall performance. As with Precision and Recall, the higher the value, the better the performance of the model. It is calculated by Equation (4):
F1(A) = \frac{2}{\frac{1}{\mathrm{Precision}(A)} + \frac{1}{\mathrm{Recall}(A)}}    (4)
(4)
Advance time
The function of the early warning model is to warn of an impending fault in advance. The advance time of early warning refers to the time difference between receiving the early warning signal of a specific fault and the actual occurrence time of that fault. The longer the advance time, the longer the response time left for passengers and the greater the practical value of the early warning model.
(5)
Accuracy
The accuracy of fault early warning is defined as the probability of the occurrence of a specific fault on that day after receiving the warning signal of this fault. The accuracy of early fault warning reflects the credibility of the early warning model. It is calculated by Equation (5):
\mathrm{Acc} = \frac{a + d}{a + b + c + d}    (5)
(6)
False positive rate
The false positive rate is defined as the ratio of the fault alarm time length to the total travel time length for a fault that does not occur during vehicle driving. The false positive rate of early warning reflects the frequency of early warning of faults that will not occur during the actual driving of the vehicle and indicates the degree of interference of the early warning system to users.
(7)
Macro Average
The Macro Average of a metric is used to evaluate multi-classification models. It is the average of the evaluation metric over all classes. For example, the Macro Average of Precision is calculated by Equation (6):
\mathrm{Macro\ Average\ Precision} = \frac{1}{n} \sum_{i=1}^{n} P_i    (6)
(8)
Weighted Average
When the data set is unbalanced, the Macro Average alone is not sufficient and the Weighted Average should also be adopted. In the Macro Average, each class is given the same weight, which is not appropriate when the sample sizes are unbalanced. Therefore, each class can be weighted by its sample size. For Precision, the Weighted Average is calculated by Equation (7), where A, B, C, and D are the sample sizes of the corresponding classes and P_A, P_B, P_C, and P_D are their Precision values (a small code sketch follows this list):
\mathrm{Weighted\ Average\ Precision} = \frac{A \cdot P_A + B \cdot P_B + C \cdot P_C + D \cdot P_D}{A + B + C + D}    (7)
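As a small illustration, both averages can be obtained directly from scikit-learn; the labels below are toy placeholders, not data from the paper.

from sklearn.metrics import precision_recall_fscore_support

# Toy true and model-identified labels for illustration only.
y_true = ["normal", "normal", "4399|11", "5243|1", "normal", "4399|11"]
y_pred = ["normal", "4399|11", "4399|11", "5243|1", "normal", "normal"]

# Macro average: each class gets equal weight, as in Equation (6).
p_macro, r_macro, f1_macro, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)

# Weighted average: each class is weighted by its sample count, as in Equation (7).
p_w, r_w, f1_w, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)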

3.4.2. The Setting of Warning Thresholds

The warning thresholds of a fault warning model are a set of hyperparameters that are key to the model. A common way of setting the thresholds is to use the same value for all engine fault types; for example, setting the threshold to 0.5 means that a fault warning is triggered whenever the fault diagnosis model outputs a fault probability greater than 0.5. However, the distribution of each fault in the state space is different. If the same threshold is used for all faults, the model outputs warning signals too frequently for some faults while often missing others, and the overall fault warning accuracy is not high. Therefore, the warning threshold of each fault needs to be tuned so that the warning algorithm performs optimally for that fault.
When setting the warning threshold for each type of fault, the way the evaluation metrics of that fault change with the threshold should be considered. More specifically, when selecting thresholds, priority should be given to keeping accuracy and recall at a high level, followed by making the frequency of false alarms as low as possible, and then making the warning advance time as large as possible.
As a result, the proposed fault warning model does not predict with a fixed advance time; the advance time for each type of fault is different, reflecting the trade-off with warning accuracy and recall.
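The selection principle above can be sketched as a simple search over candidate thresholds for one fault. Here evaluate() is a hypothetical helper assumed to return the accuracy, recall, false positive rate, and advance time of the warning model for a given fault at a given threshold, measured on validation data; the 0.9 floors are assumed target levels, not values from the paper.

import numpy as np

def select_threshold(fault, evaluate, candidates=np.arange(0.05, 1.0, 0.05),
                     min_accuracy=0.9, min_recall=0.9):
    best, best_key = None, None
    for t in candidates:
        acc, rec, fpr, advance = evaluate(fault, t)
        if acc < min_accuracy or rec < min_recall:
            continue                           # accuracy and recall come first
        key = (-fpr, advance)                  # then lowest false positive rate, then longest advance time
        if best_key is None or key > best_key:
            best, best_key = t, key
    return best                                # None if no threshold meets the accuracy/recall floors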

4. Experiment and Discussion

4.1. Data Introduction and Processing

The data come from the real-time operation data of four fleets collected by Bosch in 2020, with a sampling interval of one second. At each sampling timestamp, the original data record 62 features representing the state of the engine, such as “vehicle speed”, “intake flow”, “intake temperature”, “DPF pressure difference”, “ECU mileage”, “engine speed”, and “total fuel consumption”. Table 2 shows sample records of the original data points.
“S1|S2” in Table 2 represents the fault type. In the original data, the fault codes output by two systems are used to determine a fault: S1 represents the first system and S2 represents the second system. For example, “4399|11” and “5243|1” represent two different faults. Normal data occupy 95% of this data set and fault data occupy 5%; there are 275,400 normal data points and 14,497 fault data points. It is worth noting that rows 1 (normal) and 3 (5243|1) in Table 2 have roughly the same vehicle speed, but the normal record shows a higher fuel consumption. Because these values are instantaneous, fuel consumption rises with a sudden growth of air intake before the engine speed and vehicle speed have increased.
After expert evaluation, 12 useless fields, such as latitude and longitude, were removed from the original data, leaving 50 potentially useful features. Then, the feature importance of the random forest is used to remove features that are not highly correlated with faults. Whenever a feature is used as the splitting feature during the construction of a tree, RF can accumulate the Gini reduction between the parent node and its child nodes caused by this feature; in this way, the discriminative power of each feature can be measured. For example, the influence of feature A is calculated as follows:
\mathrm{FeatureImportance}(A) = \sum_{i} \Delta \mathrm{Gini}_i
\Delta \mathrm{Gini}_i = 2\left[\hat{p}_m(1 - \hat{p}_m) - \hat{p}_l(1 - \hat{p}_l) - \hat{p}_r(1 - \hat{p}_r)\right]
where i indexes the parent nodes that use feature A as the splitting feature, \Delta\mathrm{Gini}_i represents the change in the Gini coefficient at the ith such split node, \hat{p}_m is the probability estimate of the sample in node m, \hat{p}_l is the probability estimate of the sample in the left child node l after node m splits, and \hat{p}_r is the probability estimate of the sample in the right child node r after node m splits.
The random forest is used to obtain the importance ranking of the features and select the relatively important ones, as shown in Figure 4. Features with less than 0.5% influence are removed, and the first 15 features in Figure 4 are also removed because of their low correlation with faults. Finally, the 35 most influential features are fed into the fault diagnosis and warning models. The data set uses the “S1|S2” feature as the status label of the engine, marking each record as a normal or fault state. A detailed description of the 50 features in Figure 4 is given in Table 3.
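A minimal sketch of this screening step is shown below, assuming the preprocessed records are held in a pandas DataFrame df containing the 50 candidate features plus the “S1|S2” label column; the 0.5% cutoff follows the text, while the other settings are illustrative.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

X = df.drop(columns=["S1|S2"])
y = df["S1|S2"]

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X, y)

# Gini-based importance of each feature, as in the formulas above.
importance = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)

selected = importance[importance >= 0.005].index.tolist()   # drop features below 0.5% influence
X_selected = X[selected]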
The original data are imbalanced, which makes the trained model insensitive to fault types with few samples. If the distributions of the training set and the test set differ, biased evaluation results may be obtained. Stratified sampling ensures that the proportion of each fault type and of the normal type is the same in the training set and the test set, which makes the test results more convincing. In this study, stratified sampling is used to divide the data into a training set and a test set at a ratio of 9:1, giving 260,907 training samples and 28,990 testing samples.
Since there is no separate validation set, model training is performed with ten-fold cross-validation. In n-fold cross-validation, the training set is divided into n parts by stratified sampling, of which n − 1 are used for training the model and the remaining one is used as the validation set. This is repeated until each part has been used for validation once, and the average performance over the n runs is used to aid hyperparameter tuning. Cross-validation comprehensively examines the influence of the hyperparameters on model performance and helps avoid overfitting. Here, ten-fold cross-validation is performed on the training data.
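The split and cross-validation can be sketched as follows, reusing X_selected and y from the feature-screening step and the soft-voting ensemble of Section 3.2; the label encoding is an implementation detail assumed here so that all three sub-classifiers accept the fault-code labels.

from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_score
from sklearn.preprocessing import LabelEncoder

y_enc = LabelEncoder().fit_transform(y)        # encode "S1|S2" fault codes as integers

# Stratified 9:1 split keeps the fault/normal proportions identical in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y_enc, test_size=0.1, stratify=y_enc, random_state=0)

# Ten-fold stratified cross-validation on the training set for hyperparameter tuning.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(ensemble, X_train, y_train, cv=cv, scoring="f1_macro")
print(scores.mean())                           # average validation performance across the ten folds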

4.2. Results of Fault Diagnosis Model

As there is no separate validation set, ten-fold cross-validation is used when training the models to avoid over-fitting. The evaluation metrics are described in Section 3.4.1. The macro average and weighted average of each evaluation metric are calculated to verify the performance of the models: the macro average reflects the average performance over all fault types on the same metric, while the weighted average indicates the performance of the model on the whole data set because it incorporates the proportion of each class of samples. To compare the ensemble classification model with single classification models in engine fault diagnosis, the RF, KNN, XGB, and ensemble models are built. Grid Search and Random Search are used to determine the optimal hyperparameters of each model. The results of each model on the test set are shown in Table 4.
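A minimal sketch of the hyperparameter search mentioned above is shown for one sub-model; the parameter grid is an illustrative assumption, not the search space used in the paper.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [100, 200, 400], "max_depth": [None, 10, 20]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=10, scoring="f1_macro")
# search.fit(X_train, y_train)
# best_rf = search.best_estimator_            # tuned RF to plug into the ensemble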
By comparing the evaluation metrics of each model, the ensemble model performs best. Consequently, the ensemble model is used as the core of the fault early warning model.

4.3. The Results of the Fault Early Warning Model

The mean values of the four evaluation metrics of the various faults are calculated under different warning thresholds and visualized, giving the trend of each metric as the early warning threshold increases, as shown in Figure 5.
The trend of the early warning accuracy with increasing warning threshold is shown in Figure 5a. The accuracy shows an upward trend, because the higher the probability of a momentary state under a fault label, the closer that state is to the real fault domain in the space and the more discriminative it is. Most faults exceed 90% accuracy when the warning threshold reaches 0.8, and the accuracy of fault “5268|4” increases rapidly with the warning threshold, which indicates that the warning algorithm is more sensitive to the threshold for this fault.
The trend of the early warning recall rate with increasing warning threshold is shown in Figure 5b. The warning recall rate decreases as the warning threshold grows. This is because raising the warning threshold screens out momentary states that are far from the fault domain, so more engine states that are moving quickly toward the fault domain are missed. As can be seen from the figure, the recall rate of many faults decreases rapidly as the warning threshold is raised. For these faults, the warning threshold should not be higher than 0.4, in which case a warning recall rate of more than 90% can be maintained. However, for some faults, such as “5243|1”, the recall rate always remains above 95%, so a higher warning threshold can be chosen to seek performance improvements in other metrics. As can be seen, recall rate and accuracy are a pair of performance metrics that trade off against each other, and it is important to balance the two when adjusting the warning threshold so that both are maintained at a high level.
The trend of the early warning false positive rate with increasing warning threshold is shown in Figure 5c: the overall false positive rate decreases as the warning threshold is raised. Since the engine spends most of the time in the normal domain, the warning false positive rate remains extremely low, between 0% and 3% for most faults. However, even a false positive rate of 1% may be unacceptable to passengers. Consequently, when adjusting the warning threshold, the false positive rate should be minimized while the safety of the system is guaranteed.
The trend of the warning advance time with increasing warning threshold is shown in Figure 5d: the overall advance time decreases as the warning threshold increases. It is worth mentioning that the model performs very well on this metric; even when the warning threshold is raised to a very high level of 0.95, the average warning advance time for each fault is above 200 min. This gives passengers sufficient time to cope with upcoming faults.
According to how the four evaluation metrics of each fault change with the warning threshold, and following the threshold setting principle described above, the best warning threshold is selected for each fault. The early warning performance of each fault under its best threshold is shown in Table 5.
According to Table 5, the dual probability fault early warning model built on the ensemble model shows good warning performance. The accuracy and recall rate of most fault types tested in the model reach 90%. The average advance time of the model is 440 min, which gives passengers enough time to deal with the faults that will occur. The average false positive rate of the model is about 0.27%, which means there is still room for optimization.
To illustrate the computational efficiency of the proposed dual probability early warning model, another, parallel early warning structure is constructed as a control experiment. The parallel structure does not consider only the fault labels with the first and second highest probability values; instead, the probability value of every fault output is compared with its warning threshold, and an alarm is triggered whenever a threshold is exceeded. Using the same computing equipment and data set, the early warning accuracy and calculation time obtained by the two structures are recorded in Table 6, where the calculation time is the average value obtained with ten-fold cross-validation. According to the comparison results in Table 6, the accuracy of faults “5557|2” and “55|4” increases by 4 percentage points, while the accuracy of the other fault types is not significantly improved. Because all probability values are checked instead of two, the calculation time increases to nearly four times that of the dual probability structure. In other words, the dual probability early warning model greatly reduces the computation time while achieving high warning accuracy.

5. Conclusions

To realize automatic identification of engine faults in automobiles, a fault early warning model of automobile engines based on soft classification is built; the model gives an early warning when the probability of an engine fault exceeds the warning threshold. The experimental results show that the accuracy and recall rate of most fault types tested in the model reach a high level of more than 90%, and the advance time for each fault type remains above 3 h. Overall, the model performs well on the fault early warning task, which reflects its practical value. However, the early warning accuracy of some fault types still needs to be improved, and the uneven distribution of sample sizes is one of the reasons. In addition, the comparison between the parallel warning structure and the dual-probability warning structure shows that the proposed structure significantly reduces the computation time while ensuring high warning accuracy, making it suitable for real-time warning of many kinds of faults.
In the future, the accuracy of the early warning model can be improved by increasing the number of training samples. In addition, the current early warning model has a high false positive rate for some faults, which may be mainly due to the limited ability of the features to express the characteristics of those faults. In further research, customized data collection for each fault should be considered.

Author Contributions

X.L.: conceptualization, methodology, writing—original draft; N.W.: conceptualization, methodology; Y.L.: writing—review and editing; Y.D.: data curation; J.Z.: visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Commission of Shanghai Municipality (grant number 22692107000).

Data Availability Statement

Restrictions apply to the availability of these data. Data was obtained from the Bosch Corporation. Because the data involves commercial privacy, the authors have no right to share the data.

Conflicts of Interest

The authors declare that they have no known competing financial interest or personal relationship that could have appeared to influence the work reported in this paper.

Abbreviations

RF    Random Forest
LSTM  Long Short-Term Memory
GRU   Gated Recurrent Unit
CNN   Convolutional Neural Network
LGBM  Light Gradient Boosting Machine
XGB   eXtreme Gradient Boosting
SVM   Support Vector Machine
ANN   Artificial Neural Network
RNN   Recurrent Neural Network
KNN   k-Nearest Neighbor
DPF   Diesel Particulate Filter
ECU   Electronic Control Unit
SCR   Selective Catalytic Reduction

References

1. Li, F.; Chen, J.; Liu, Z.; Lv, H.; Wang, J.; Yuan, J.; Xiao, W. A soft-target difference scaling network via relational knowledge distillation for fault detection of liquid rocket engine under multi-source trouble-free samples. Reliab. Eng. Syst. Saf. 2022, 228, 108759.
2. Talebi, S.; Madadi, A.; Tousi, A.; Kiaee, M. Micro Gas Turbine fault detection and isolation with a combination of Artificial Neural Network and off-design performance analysis. Eng. Appl. Artif. Intell. 2022, 113, 104900.
3. Zhu, S.; Tan, M.K.; Chin, R.K.Y.; Chua, B.L.; Hao, X.; Teo, K.T.K. Engine Fault Diagnosis using Probabilistic Neural Network. In Proceedings of the 2021 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), Kota Kinabalu, Malaysia, 13–15 September 2021; pp. 1–6.
4. Li, X.; Bi, F.; Zhang, L.; Yang, X.; Zhang, G. An Engine Fault Detection Method Based on the Deep Echo State Network and Improved Multi-Verse Optimizer. Energies 2022, 15, 1205.
5. Tang, D.; Bi, F.; Lin, J.; Li, X.; Yang, X.; Bi, X. Adaptive Recursive Variational Mode Decomposition for Multiple Engine Faults Detection. IEEE Trans. Instrum. Meas. 2022, 71, 1–11.
6. Mohammed, A.A. Performance analysis of variable valve timing engine to detect some engine faults by using Hilbert Huang transform. Appl. Acoust. 2022, 194, 108775.
7. Perla, F.; Richman, R.; Scognamiglio, S.; Wüthrich, M.V. Time-series forecasting of mortality rates using deep learning. Scand. Actuar. J. 2021, 2021, 572–598.
8. Yuan, Y.; Liu, X.; Ding, S.; Pan, B. Fault Detection and Location System for Diagnosis of Multiple Faults in Aeroengines. IEEE Access 2017, 5, 17671–17677.
9. Silva, A.; Zarzo, A.; González, J.M.M.; Munoz-Guijosa, J.M. Early fault detection of single-point rub in gas turbines with accelerometers on the casing based on continuous wavelet transform. J. Sound Vib. 2020, 487, 115628.
10. Huang, H.-Z.; Wang, H.-K.; Li, Y.-F.; Zhang, L.; Liu, Z. Support vector machine based estimation of remaining useful life: Current research status and future trends. J. Mech. Sci. Technol. 2015, 29, 151–163.
11. Deutsch, J.; He, D. Using Deep Learning-Based Approach to Predict Remaining Useful Life of Rotating Components. IEEE Trans. Syst. Man Cybern. Syst. 2018, 48, 11–20.
12. Yuan, M.; Wu, Y.; Lin, L. Fault diagnosis and remaining useful life estimation of aero engine using LSTM neural network. In Proceedings of the 2016 IEEE International Conference on Aircraft Utility Systems (AUS), Beijing, China, 10–12 October 2016; pp. 135–140.
13. Li, X.; Li, J.; Qu, Y.; He, D. Gear Pitting Fault Diagnosis Using Integrated CNN and GRU Network with Both Vibration and Acoustic Emission Signals. Appl. Sci. 2019, 9, 768.
14. Dejun, W.; Tianliang, X.; Chengdong, L.; Lihua, W. Fault diagnosis of automobile engine based on support vector machine. In Proceedings of the 2011 3rd International Conference on Advanced Computer Control, Harbin, China, 18–20 January 2011; pp. 320–324.
15. Zheng, J.-Y.; Yang, Z.-X.; Wu, G.-G.; Li, X.-M.; Wang, J. FTA-SVM-based fault recognition for vehicle engine. In Proceedings of the 2015 IEEE 12th International Conference on Networking, Sensing and Control, Taipei, Taiwan, 9–11 April 2015; pp. 180–184.
16. Chen, J.; Randall, R.B. Improved automated diagnosis of misfire in internal combustion engines based on simulation models. Mech. Syst. Signal Process. 2015, 64–65, 58–83.
17. Qiao, X.; Gu, C. Research on Wear Status of Diesel Engine Cylinder Based on BP Neural Network and Instantaneous Speed. J. Phys. Conf. Ser. 2019, 1237, 052011.
18. Di, L.; Jie, W. The application of improved BP neural network in the engine fault diagnosis. In Proceedings of the 31st Chinese Control Conference, Hefei, China, 25–27 July 2012; pp. 3352–3355.
19. Jing, L.; Zhao, M.; Li, P.; Xu, X. A convolutional neural network based feature learning and fault diagnosis method for the condition monitoring of gearbox. Measurement 2017, 111, 1–10.
20. Chen, Z.; Li, C.; Sanchez, R.-V. Gearbox Fault Identification and Classification with Convolutional Neural Networks. Shock. Vib. 2015, 2015, 390134.
21. Wang, H.; Xu, J.; Yan, R.; Gao, R.X. A New Intelligent Bearing Fault Diagnosis Method Using SDP Representation and SE-CNN. IEEE Trans. Instrum. Meas. 2020, 69, 2377–2389.
22. Flett, J.; Bone, G.M. Fault detection and diagnosis of diesel engine valve trains. Mech. Syst. Signal Process. 2016, 72–73, 316–327.
23. Ma, M.; Chen, X.; Wang, S.; Liu, Y.; Li, W. Bearing degradation assessment based on weibull distribution and deep belief network. In Proceedings of the 2016 International Symposium on Flexible Automation (ISFA), Cleveland, OH, USA, 1–3 August 2016; pp. 382–385.
24. Srilatha, D.; Shyam, G.K. Cloud-based intrusion detection using kernel fuzzy clustering and optimal type-2 fuzzy neural network. Clust. Comput. 2021, 24, 2657–2672.
25. Naderkhani, R.; Behzad, M.H.; Razzaghnia, T.; Farnoosh, R. Fuzzy Regression Analysis Based on Fuzzy Neural Networks Using Trapezoidal Data. Int. J. Fuzzy Syst. 2021, 23, 1267–1280.
26. Wang, Z.; Wang, L.; Tan, Y.; Yuan, J.; Li, X. Fault diagnosis using fused reference model and Bayesian network for building energy systems. J. Build. Eng. 2021, 34, 101957.
27. Xu, Q.; Du, X.; Ai, Q.; Liu, Y. A Combined Model-Based and Data-Driven Method for Early Sensor Fault Detection and Identification of an Electro-Hydraulic Servo System. Chin. J. Sens. Actuators 2020, 33, 1061–1073.
28. Atoui, M.A.; Cohen, A. Coupling data-driven and model-based methods to improve fault diagnosis. Comput. Ind. 2021, 128, 103401.
29. Meicun, S.; Qi, C. Fault trend prediction of device based on support vector regression. At. Energy Sci. Technol. 2011, 45, 972. Available online: https://www.osti.gov/etdeweb/biblio/22247197 (accessed on 8 September 2022).
30. Boškoski, P.; Gašperin, M.; Petelin, D.; Juričić, Đ. Bearing fault prognostics using Rényi entropy based features and Gaussian process models. Mech. Syst. Signal Process. 2015, 52–53, 327–337.
31. Thumati, B.T.; Jagannathan, S. A model based fault detection and prognostic scheme for uncertain nonlinear discrete-time systems. In Proceedings of the 2008 47th IEEE Conference on Decision and Control, Cancún, Mexico, 9–11 December 2008; pp. 392–397.
32. Rashid, M.M.; Amar, M.; Gondal, I.; Kamruzzaman, J. A data mining approach for machine fault diagnosis based on associated frequency patterns. Appl. Intell. 2016, 45, 638–651.
33. Jafarian, K.; Mobin, M.; Jafari-Marandi, R.; Rabiei, E. Misfire and valve clearance faults detection in the combustion engines based on a multi-sensor vibration signal monitoring. Measurement 2018, 128, 527–536.
34. Mikolov, T.; Karafiat, M.; Burget, L.; Cernocky, J.; Khudanpur, S. Recurrent Neural Network Based Language Model. Interspeech 2010, 2, 1045–1048.
35. Chui, K.T.; Gupta, B.B.; Vasant, P. A Genetic Algorithm Optimized RNN-LSTM Model for Remaining Useful Life Prediction of Turbofan Engine. Electronics 2021, 10, 285.
36. Zhou, Z.; Cheng, X.; Chang, H.; Zhou, J.; Zhao, X. A Self-Diagnostic Method for Automobile Faults in Multiple Working Conditions Based on SOM-BPNN. Comput. Intell. Neurosci. 2021, 2021, 6801161.
37. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
38. Kramer, O. K-Nearest Neighbors. In Dimensionality Reduction with Unsupervised Nearest Neighbors; Kramer, O., Ed.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 13–23.
39. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
Figure 1. An illustrative example of non-separable distribution and separable distribution. (a) Non-separable distribution form; (b) Separable distribution form.
Figure 2. Spatial distribution of engine state.
Figure 3. Structure of dual probability early warning threshold model.
Figure 4. The influence of each feature.
Figure 5. Evaluation metrics of various faults versus the early warning threshold. (a) Accuracy of various fault types vs. warning threshold; (b) Recall rate of various fault types vs. warning threshold; (c) False positive rate of various fault types vs. warning threshold; (d) Advance time of various fault types vs. warning threshold.
Table 1. Binary classification confusion matrix.

            True A    True B
Labeled A   a         b
Labeled B   c         d
Table 2. Record samples of the data points.

No.  S1|S2    Vehicle Speed (m/s)  Intake Flow (g/s)  Intake Temperature (°C)  DPF Pressure Difference  ECU Mileage (m)  Engine Speed (rpm)  Total Fuel Consumption (L/100 km)
1    Normal   35.2                 683                38.9                     35.0                     109,860          1132.0              32.2
2    4399|11  30.2                 385                32.6                     34.0                     125,689          1002.0              15.6
3    5243|1   36.2                 495                39.6                     39.0                     129,287          1092.0              16.6
Table 3. The 50 features in Figure 4.

Index  Feature                                  Index  Feature
1      DPF carbon loading                       26     Intake Flow Rate
2      Average temperature in 15 min            27     Cumulative hydrocarbon injection volume
3      Reference torque                         28     Urea Nozzle Duty Cycle
4      Real-time hydrocarbon injection volume   29     Oil pressure
5      Torque limiting switch                   30     Power Take Off switch
6      Brake switch                             31     Average temperature in 4 min
7      Exhaust brake switch                     32     Exhaust gas dew point information
8      Hand brake switch                        33     Average temperature in 5 min
9      Active regeneration status               34     Intake air temperature
10     Clutch switch                            35     Regeneration Request
11     Neutral state                            36     Battery voltage
12     Throttle opening angle                   37     Atmospheric temperature
13     Service regeneration state 1             38     DPF Differential Pressure
14     Service regeneration state 2             39     SCR upstream temperature
15     Torque Percentage                        40     Atmospheric Pressure
16     Frictional torque                        41     Urea tank temperature
17     Vehicle speed                            42     Metering valve flow control
18     Instantaneous fuel consumption           43     Water temperature
19     Idle Torque                              44     Urea level
20     Internal Torque                          45     Nitrogen and oxygen value
21     Injection volume                         46     ECU mileage
22     Actual Rail Pressure                     47     Total fuel consumption
23     Engine Speed                             48     SCR downstream temperature
24     Absolute boost pressure                  49     Cumulative engine runtime
25     Exhaust Flow Rate                        50     Total urea consumption
Table 4. Results of the fault diagnosis models on the test set.

Model            Macro Average                   Weighted Average
                 Precision  Recall  F1-Score     Precision  Recall  F1-Score
RF               0.92       0.76    0.81         0.97       0.97    0.97
KNN              0.73       0.59    0.63         0.93       0.94    0.93
XGB              0.89       0.79    0.83         0.97       0.97    0.97
RF + KNN + XGB   0.93       0.80    0.84         0.97       0.97    0.97
Table 5. Early warning performance of various faults.

Fault Label  Warning Threshold  Accuracy  Recall Rate  Advance Time  False Positive Rate
4399|11      0.35               0.94      0.94         488 min       0.881078%
5243|1       0.95               0.96      0.96         525 min       0.008634%
5266|4       0.85               0.91      0.78         453 min       0.129269%
5268|4       0.55               0.98      0.90         418 min       0.099155%
5393|22      0.6                0.99      0.96         304 min       0.006078%
5557|2       0.95               0.72      0.95         467 min       0.024249%
55|4         0.95               0.71      0.99         655 min       0.983239%
91|9         0.95               0.99      0.98         206 min       0.000001%
Average      —                  0.90      0.93         440 min       0.266462%
Table 6. Results of the control test.

Fault Label  Warning Threshold  Dual Probability Structure       Parallel Structure
                                Accuracy  Computing Time          Accuracy  Computing Time
4399|11      0.35               0.94      124.2 (s)               0.94      481.5 (s)
5243|1       0.95               0.96                              0.96
5266|4       0.85               0.91                              0.92
5268|4       0.55               0.98                              0.98
5393|22      0.6                0.99                              0.99
5557|2       0.95               0.72                              0.76
55|4         0.95               0.71                              0.75
91|9         0.95               0.99                              0.99
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
