Fault diagnosis of diesel engine valve clearance based on variational mode decomposition and random forest

Abstract: Diesel engines are widely used as power equipment in the automobile, ship and power equipment industries. Abnormal valve train clearance, caused by wear or faulty adjustment, is a typical diesel engine failure that may lead to performance degradation and even to valve fracture or cylinder hit. However, current diagnosis methods rely mainly on time-domain and angular-domain features, which are easily affected by working conditions or are hard to extract accurately because the diesel engine runs in a transient, non-stationary process. This work aims at diagnosing this fault mainly from frequency band features, which change when a valve clearance fault occurs. To extract a series of frequency band features adaptively, a decomposition technique based on improved variational mode decomposition is investigated. Because the connection between the features and the fault is fuzzy, the random forest algorithm is used to analyze the correspondence between features and faults. In addition, the feature dimension is reduced according to the importance score to improve operating efficiency. Experimental results under variable speed conditions show that the method based on variational mode decomposition and random forest can detect valve clearance faults effectively.


Introduction
As a large-scale reciprocating power machine, the diesel engine is widely used in industrial, military and agricultural fields. Because of its complex mechanical structure, its faults are diverse and complex. The major faults reported include misfire, piston slap, knock, fuel injection fault, and valve fault. Among them, valve fault is one of the most common fault modes in diesel engines. The valve clearance of a diesel engine, which controls the intake and exhaust of the cylinder, usually exceeds the normal value due to wear or faulty adjustment. An increase or decrease in valve clearance affects the performance and reliability of the entire equipment and causes unnecessary economic loss. Therefore, it is very important to apply condition monitoring and fault diagnosis techniques to diesel engines to ensure safety and reliability during operation [1,2].
Research on valve fault diagnosis falls into two types. The first identifies the state of the valve, distinguishing a good valve from a faulty one and raising an abnormality alarm. The second directly diagnoses the specific fault type of the valve [3]. The first type is the most widely studied and is very important for timely warning when the equipment has a problem. This article covers both layers: in addition to the abnormality alarm, it identifies the specific fault type.
This paper presents a new diesel engine valve fault diagnosis method based on the variational mode decomposition (VMD) and random forest (RF) algorithms. Following a two-time decomposition strategy, the signal is decomposed by the VMD method, with the value of the penalty function α determined by a criterion for optimal selection. After several intrinsic mode function (IMF) signals are acquired, the spectral energy characteristics of the IMFs are extracted. In addition, the time domain and frequency domain features of the original signals are combined with them to form an 18-dimensional feature vector. With RF used for feature selection and the low-dimensional features used as the input of the RF classifier, the alarm for an abnormal valve state and the identification of three types of valve clearance faults are finally realized. The analysis of actual data under variable speed conditions indicates that the combination of VMD and RF is an effective method for diagnosing valve clearance faults.
The organization of this paper is as follows. Section 2 introduces the diesel engine experimental test system. Section 3 introduces the principle of diagnosis based on the optimized VMD-RF algorithm. Application examples are described in Section 4. Finally, conclusions are given in Section 5.

Diesel engine experimental test system
To verify the effectiveness of the proposed method in valve fault diagnosis, all experiments are conducted on a TBD234V12 direct injection diesel engine. An increase in valve clearance increases the impact force when the valve is seated. Following the transmission path of vibration, the vibration generated by the valve is finally reflected on the surface of the cylinder head. Therefore, the upper surface of the cylinder head is chosen as the vibration measurement point. The diesel engine has 12 cylinders arranged in two columns, named column A and column B. The major features of the diesel engine are summarized in Table 1. The vibration signals and pulse signals are sampled using a data acquisition (DAQ) system, in which the DAQ card has a 16-bit analog-to-digital converter (ADC) resolution, a maximum sampling rate of 102.4 kS/s per channel, and up to 32 analog inputs. Signals are processed by a computer with 16 GB of random access memory (RAM) and a 3.10 GHz Intel i7 processor. The interconnection of the main test rig components is shown in Figure 1. The valve fault simulation experiments are carried out under variable speed conditions at 1500 r/min, 1800 r/min, and 2100 r/min, with the load kept constant at 1000 N·m. The accelerometer is mounted on the cylinder head through a screw connection. The sampling frequency is 51,200 Hz. The installation position of the acceleration sensor is shown in Figure 2.
In addition, a key phase sensor (i.e., an eddy current sensor) is mounted on the flywheel. Each time the crankshaft makes one revolution, a key phase pulse signal, used to identify the start of the working cycle, is recorded. The main parameters of the acceleration sensor and the eddy current sensor are shown in Table 2 and Table 3, respectively.
The tests fall into two groups: the normal clearance test and the abnormal clearance test. Because the velocity and acceleration of the valve in the cam buffer section are small and change little, a slight deviation of the valve clearance near the standard value has no significant impact on the operation of the unit. Therefore, a total of six valve clearances are set in the abnormal test group. The states and specific parameter settings are shown in Table 4. First, the lock nuts above the valves of cylinders B1 to B6 in column B are loosened; then the adjusting screws are turned to simulate six types of faults, following the cylinder number sequence in the table: small intake valve clearance increment (fault 1), large intake valve clearance increment (fault 2), small exhaust valve clearance increment (fault 3), large exhaust valve clearance increment (fault 4), small intake and exhaust valve clearance increment (fault 5), and large intake and exhaust valve clearance increment (fault 6).
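The role of the key phase pulse can be illustrated with a short sketch. It assumes the pulses have already been converted to sample indices (the name `pulse_indices` is hypothetical) and uses only the sampling frequency and speeds stated above; it is not the authors' MATLAB code.

```python
def samples_per_revolution(sampling_rate_hz, speed_rpm):
    """Number of samples recorded during one crankshaft revolution."""
    return sampling_rate_hz * 60.0 / speed_rpm


def split_by_keyphase(signal, pulse_indices):
    """Cut a record into per-revolution segments between consecutive key phase pulses."""
    return [signal[a:b] for a, b in zip(pulse_indices[:-1], pulse_indices[1:])]


# Sampling at 51,200 Hz, one revolution at each test speed spans:
for rpm in (1500, 1800, 2100):
    print(rpm, "r/min:", round(samples_per_revolution(51200, rpm), 1), "samples")
```

Because the segment length changes with speed, features computed per segment are not directly comparable across speeds, which is one motivation for the speed-robust features developed later.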

The basic principle of VMD algorithm
VMD is a recent method for decomposing non-stationary and non-linear signals, built on classical Wiener filtering, the Hilbert transform, and heterodyne demodulation. Its essence is to solve a variational problem iteratively. VMD has a rigorous mathematical foundation for the adaptive decomposition of signals [17].
According to the preset number of modal components, VMD decomposes the signal into K modes, each with a center frequency and limited bandwidth. Through continuous iteration, the center frequencies and bandwidths are updated to finally determine an ensemble of IMFs, written as $u_k(t)$, $k = 1, 2, \dots, K$. The input signal $f(t)$ can be restored by simple summation. The decomposition proceeds as follows:
1. The analytic signal of each mode function $u_k(t)$ is computed by the Hilbert transform, so that a unilateral spectrum is obtained:
$$\left[\delta(t) + \frac{j}{\pi t}\right] * u_k(t) \qquad (1)$$
2. Since the center frequency of the analytic signal of each mode must be adjusted, an exponential term $e^{-j\omega_k t}$ is applied to shift the spectrum of each mode to the baseband:
$$\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \qquad (2)$$
3. The bandwidth of each component is estimated through the Gaussian smoothness of the demodulated signal, that is, the squared $L^2$-norm of the gradient. Under the reconstruction constraint, the constrained variational problem is established:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \qquad (3)$$
where $\{u_k\} = \{u_1, \dots, u_K\}$ are the modes and $\{\omega_k\} = \{\omega_1, \dots, \omega_K\}$ are their center frequencies.
4. On this basis, the quadratic penalty factor α and the Lagrange multiplier λ(t) are introduced to turn the problem into an unconstrained variational problem. The quadratic penalty factor guarantees the reconstruction accuracy of the signal in the presence of Gaussian noise, while the Lagrange multiplier enforces the constraint strictly. The augmented Lagrangian is:
$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \qquad (4)$$
5. The alternating direction method of multipliers (ADMM) is used to solve the original minimization problem, finding the saddle point of the augmented Lagrangian by updating $\{u_k\}$, $\{\omega_k\}$ and $\lambda$ alternately. In this way the signal is adaptively decomposed into discrete modes.
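Steps 1 to 3 can be checked numerically on a single synthetic mode. The sketch below is illustrative only: it assumes NumPy, builds the analytic signal with an FFT-based Hilbert transform, and approximates the time derivative by finite differences. The function names and the 50 Hz test tone are hypothetical choices, not part of the original work.

```python
import numpy as np


def analytic_signal(x):
    """Step 1: analytic signal via the FFT (negative frequencies zeroed, positive ones doubled)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def bandwidth_cost(u, center_hz, fs):
    """Steps 2-3: shift the mode to baseband and take the squared L2 norm of its gradient."""
    t = np.arange(len(u)) / fs
    baseband = analytic_signal(u) * np.exp(-2j * np.pi * center_hz * t)
    derivative = np.diff(baseband) * fs  # finite-difference approximation of d/dt
    return float(np.sum(np.abs(derivative) ** 2) / fs)


fs = 1000.0
t = np.arange(1000) / fs
mode = np.cos(2 * np.pi * 50.0 * t)              # a narrow-band mode centered at 50 Hz
cost_matched = bandwidth_cost(mode, 50.0, fs)    # center frequency matches the mode
cost_detuned = bandwidth_cost(mode, 30.0, fs)    # center frequency is off by 20 Hz
print(cost_matched < cost_detuned)
```

The cost is smallest when the assumed center frequency coincides with the actual one, which is why minimizing it over $\omega_k$ drives each mode toward its own band.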

VMD algorithm optimization strategy
Before the vibration signal is decomposed by the VMD method, the number of modes K and the penalty function α must be defined. However, it is usually very difficult to determine K in practical applications. In addition, after manually adjusting the parameters many times, we found that no single value of K gives good decomposition performance under both normal and fault conditions. Therefore, based on the idea of wavelet packet decomposition, we propose a two-time decomposition strategy. While testing parameters with this strategy, we found that the penalty function α has a significant impact on the width of the decomposition band: the larger the value of α, the smaller the bandwidth of the IMF component; the smaller the value of α, the larger the bandwidth of the IMF component. To obtain a suitable penalty function α, we require that the IMF components after VMD decomposition satisfy the following two conditions [18]:
1. When the original signal is processed with the two-time decomposition strategy, its spectral energy leaks because of the VMD decomposition. For the IMF components to retain more valid information about the original signal, the energy leakage of the IMF component spectrum should be minimized.
2. If an IMF component after VMD decomposition contains more noise and the impact characteristics associated with the fault are not obvious, the sparsity of the component signal is weak and the power spectrum entropy is large. If the IMF component contains more fault characteristic information and there is a significant pulse in the waveform, the signal has strong sparse characteristics and the power spectrum entropy is small. Therefore, the IMF component should have the smallest power spectrum entropy.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 25 December 2019
Based on these two points, we propose a criterion for optimal selection, Tar (Eq. (5)), which combines p(α) and q(α) in a weighted sum, where p represents the energy leakage ratio, q represents the power spectrum entropy increment ratio, and α represents the penalty function. The weighting factor is set to 0.75 in this paper.
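The second condition rests on power spectrum entropy as a measure of sparsity. A minimal sketch of that quantity is given below, assuming NumPy and the usual Shannon-entropy definition over the normalized power spectrum; the exact normalization used by the authors is not stated, so this is an assumption, and the test signals are synthetic.

```python
import numpy as np


def power_spectrum_entropy(x):
    """Shannon entropy of the normalized one-sided power spectrum (assumed definition)."""
    power = np.abs(np.fft.rfft(x)) ** 2
    p = power / power.sum()
    p = p[p > 0]                      # skip empty bins so log() stays finite
    return float(-np.sum(p * np.log(p)))


rng = np.random.default_rng(0)
n = 1024
tone = np.sin(2 * np.pi * 64 * np.arange(n) / n)   # energy concentrated in one bin
noise = rng.normal(size=n)                          # energy spread over all bins
entropy_tone = power_spectrum_entropy(tone)
entropy_noise = power_spectrum_entropy(noise)
print(entropy_tone < entropy_noise)
```

A component dominated by noise spreads its power over many bins and scores high, whereas a component with concentrated spectral content scores low, which matches the reasoning in condition 2.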
Taking the vibration signal of the large exhaust valve clearance increment (fault 4) at 1500 r/min as an example, we treat the penalty function α as the independent variable, with α ∈ [10, 3000]. According to the above criterion for optimal selection, we obtain the parabolic curve of Tar as a function of α, as shown in the figure. The partially enlarged view shows that Tar reaches its minimum when α = 80. Therefore, the penalty function α = 80 is used in this paper. All program code in this paper is implemented in the MATLAB R2015b environment.
Following the two-time decomposition strategy, in the first decomposition the original signal D00 (fault 4) is decomposed by VMD into two modes, D11 and D12. The low frequency mode is then decomposed again into two modes, D21 and D22, and the high frequency mode into two modes, D23 and D24, as shown in Figure 4. The spectra of the different IMF components are well separated, and the spectral energy of each component is calculated as a fault feature that characterizes the valve clearance anomaly. This indicates that the VMD method with the two-time decomposition strategy can obtain the fault-related components of vibration signals.
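The two-time decomposition tree can be sketched as follows. This is a simplified stand-in, not the authors' MATLAB implementation: it assumes NumPy, works on the one-sided spectrum without mirror extension, and omits the Lagrange multiplier update, so the modes do not sum exactly to the input. Frequencies are normalized to cycles per sample, so the scale of `alpha` here need not match the paper's. All function names are hypothetical.

```python
import numpy as np


def vmd_sketch(signal, n_modes, alpha, n_iter=500, tol=1e-9):
    """Simplified VMD: alternating mode and center-frequency updates in the frequency domain."""
    n = len(signal)
    f_hat = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n)                      # 0 .. 0.5 cycles per sample
    u_hat = np.zeros((n_modes, freqs.size), dtype=complex)
    omega = (np.arange(n_modes) + 0.5) * 0.5 / n_modes  # evenly spread start values
    for _ in range(n_iter):
        previous = u_hat.copy()
        for k in range(n_modes):
            # Wiener-filter-like update of mode k on what the other modes leave over.
            residual = f_hat - u_hat.sum(axis=0) + u_hat[k]
            u_hat[k] = residual / (1.0 + alpha * (freqs - omega[k]) ** 2)
            # New center frequency: power-weighted mean frequency of the mode.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = (freqs * power).sum() / (power.sum() + 1e-30)
        change = np.sum(np.abs(u_hat - previous) ** 2) / (np.sum(np.abs(previous) ** 2) + 1e-30)
        if change < tol:
            break
    order = np.argsort(omega)                       # sort modes from low to high frequency
    modes = np.fft.irfft(u_hat[order], n=n, axis=1)
    return modes, omega[order]


def two_time_decomposition(signal, alpha):
    """D00 -> (D11, D12) -> (D21, D22, D23, D24), each split using two modes."""
    (d11, d12), _ = vmd_sketch(signal, 2, alpha)
    components, centers = [], []
    for branch in (d11, d12):
        modes, omega = vmd_sketch(branch, 2, alpha)
        components.extend(modes)
        centers.extend(omega)
    return np.array(components), np.array(centers)


def spectral_energy(x):
    """Sum of squared spectral magnitudes (assumed definition of the spectral energy)."""
    return float(np.sum(np.abs(np.fft.rfft(x)) ** 2))


# Synthetic test signal with four tones, two in the low band and two in the high band.
samples = np.arange(1000)
signal = sum(np.cos(2 * np.pi * f * samples) for f in (0.05, 0.15, 0.30, 0.40))
components, centers = two_time_decomposition(signal, alpha=80.0)
energies = [spectral_energy(c) for c in components]
print(np.round(centers, 3), np.round(energies, 1))
```

Splitting into two modes twice avoids having to fix a single K in advance, which is the practical motivation given in the text for the strategy.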

The introduction of RF
The RF algorithm belongs to the category of ensemble learning. It builds multiple individual decision trees from randomly sampled information to make predictions, and selects the mode of the predicted classes as the final result by voting [19]. The randomness of RF is reflected in two aspects: the randomness of sample selection and the randomness of feature attribute selection. Suppose that N decision trees are generated and that each sample has M feature attributes. The RF model uses bootstrap sampling to extract sample subsets from all training samples, and then randomly selects a specified number of feature attributes to build each decision tree. Out-of-bag (OOB) data, which are not in the sample subset, can be used to evaluate the internal classification error of the decision trees. The most likely target category of the OOB data is determined by voting. Through continuous iteration, the relationship between the OOB error and the number of decision trees is obtained, which provides a basis for selecting the number of decision trees [20]. The lower the OOB error, the higher the classification accuracy of the RF and the better the generalization performance of the model. The construction process of the RF classifier is shown in Figure 5.
In addition to excellent classification performance, RF can also calculate an importance score V for each feature. After the OOB error (B1) is calculated for each decision tree, interference noise is randomly added to one feature variable of the out-of-bag data while the values of the other features are kept fixed, and the OOB error (B2) is calculated again. The sum of the differences between the two OOB errors over all decision trees is divided by the number of decision trees N to obtain V. The importance score of any feature X can be expressed as:
$$V(X) = \frac{1}{N} \sum_{i=1}^{N} \left( B2_i - B1_i \right) \qquad (6)$$
where $B1_i$ and $B2_i$ are the OOB errors of the i-th decision tree before and after the disturbance of feature X. According to the importance scores of the feature parameters, feature selection can be performed on multi-dimensional features. Retaining the feature parameters with larger V values to optimize the feature vector can greatly improve the efficiency of program operation.
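The importance score can be illustrated with a deliberately small toy model. The sketch below assumes NumPy and replaces full decision trees with one-split stumps to stay short; it also disturbs a feature by permuting its OOB values, the commonly used variant, rather than adding noise as described in the text. Function names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_stump(X, y, feature_ids):
    """Best single-threshold rule among the candidate features (a one-node 'tree')."""
    best = None
    for j in feature_ids:
        for thr in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= thr
            l_lab = np.bincount(y[left]).argmax()
            r_lab = np.bincount(y[~left]).argmax()
            err = np.mean(np.where(left, l_lab, r_lab) != y)
            if best is None or err < best[0]:
                best = (err, j, thr, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, X):
    j, thr, l_lab, r_lab = stump
    return np.where(X[:, j] <= thr, l_lab, r_lab)


def oob_permutation_importance(X, y, n_trees=50, mtry=2, seed=0):
    """V(X_j): mean over trees of (OOB error after disturbing feature j) - (OOB error before)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    scores = np.zeros(m)
    for _ in range(n_trees):
        boot = rng.integers(0, n, size=n)               # bootstrap sample (with replacement)
        oob = np.setdiff1d(np.arange(n), boot)          # samples left out of the bootstrap
        feats = rng.choice(m, size=mtry, replace=False) # random feature subset
        stump = fit_stump(X[boot], y[boot], feats)
        b1 = np.mean(predict_stump(stump, X[oob]) != y[oob])
        for j in range(m):
            disturbed = X[oob].copy()
            disturbed[:, j] = rng.permutation(disturbed[:, j])
            b2 = np.mean(predict_stump(stump, disturbed) != y[oob])
            scores[j] += b2 - b1
    return scores / n_trees


# Synthetic data: only feature 0 carries class information.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
scores = oob_permutation_importance(X, y)
print(np.round(scores, 3))
```

Disturbing an informative feature raises the OOB error noticeably, while disturbing an irrelevant one leaves it almost unchanged, so the score separates the two.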

Proposed optimized VMD-RF analysis
In this paper, the advantage of VMD in decomposing non-stationary signals is combined with the RF algorithm to diagnose valve clearance faults of the diesel engine. First, the VMD parameters are determined by the optimization criterion. Then, for each IMF component obtained by the two-time decomposition strategy, spectral energy characteristics are extracted. In addition, some common time domain and frequency domain features are extracted from the original signal to form a comprehensive feature vector. Finally, the reduced feature parameters are determined through RF feature selection and input into the RF classifier for abnormality alarm and fault identification of valve clearance. The fault diagnosis process is shown in Figure 6.

Theoretical analysis of valve train
The valves allow the mixture of air and fuel to enter the cylinder and the exhaust residue to leave it after each combustion cycle. As shown in Figure 7, a camshaft uses a cam to push against a valve so that the valve opens when the camshaft rotates to the corresponding part of its profile. A spring located on top of the valve returns it to the closed position. The lift of the valve depends on the cam profile and the valve clearance [21]. When the valve has excessive clearance, the vibration energy induced by impacts during valve opening and closing increases [22]. Therefore, vibration energy can be considered as an identification feature for valve clearance faults.
However, if an inspector uses only the amplitude or energy of vibration signals to judge whether the valve clearance is proper, the result could be misleading. A change in the diesel engine's operating condition, such as working at different speeds, also affects the amplitude or vibration energy. Thus, there is a need for an analysis method that can obtain stable characteristics, which do not change with the rotational speed of the diesel engine, to characterize changes in valve clearance.

Feature extraction
Fault diagnosis can be regarded as a pattern recognition problem consisting of three phases: feature extraction, feature selection and feature classification. Among them, feature extraction is a key step. It usually involves two important aspects: the extracted features should be sensitive to the target fault, and they should be insensitive or less sensitive to other conditions. For this article, ideal features should effectively identify abnormal valve clearance under different rotational speeds of the diesel engine, that is, they should be stable with respect to rotational speed. The main purpose of this work is to develop a method to obtain desirable features for detecting and diagnosing abnormal valve clearance.
When the valve clearance becomes abnormal, the force of the valve impacting on the valve seat changes, and the spectral energy of each frequency component in the signal changes accordingly. The IMFs produced by VMD decomposition represent a set of stationary signals at characteristic scales, whose spectral energy can be used to characterize different working states of the valves. Therefore, the original signal is decomposed into D21, D22, D23 and D24 by the two-time decomposition strategy, and their spectral energies are taken as characteristic parameters.
The vibration signal of a diesel engine exhibits local impacts and a wide frequency band distribution, and is susceptible to external noise. If only the VMD method is used to decompose the vibration signal, the characteristic information of the valve fault cannot be completely extracted. To fully reflect the working state of the valve, the RMS value, peak value, peak-to-peak value, waveform index, pulse index, K factor, crest factor, kurtosis, and skewness are taken as time domain characteristic parameters representing the state of the vibration signal. When the valve clearance is abnormal, the main frequency of the cylinder head vibration signal may move to higher or lower frequencies, and its distribution is also an effective parameter reflecting the working state of the valve. Therefore, the center of gravity frequency, mean square frequency, frequency variance, RMS frequency and standard deviation of frequency are used as frequency domain characteristics to reflect different valve operating conditions.
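The time domain parameters listed above can be computed as in the following sketch. It assumes NumPy and common textbook definitions (for example, waveform index as RMS over mean absolute value and pulse index as peak over mean absolute value); the paper does not state its definitions, and the K factor is left out because its definition is not given in the text.

```python
import numpy as np


def time_domain_features(x):
    """Common time-domain statistics of a vibration record (assumed textbook definitions)."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x ** 2))
    peak = abs_x.max()
    mean_abs = abs_x.mean()
    centered = x - x.mean()
    std = centered.std()
    return {
        "rms": rms,
        "peak": peak,
        "peak_to_peak": x.max() - x.min(),
        "waveform_index": rms / mean_abs,     # also called shape factor
        "pulse_index": peak / mean_abs,       # also called impulse factor
        "crest_factor": peak / rms,
        "kurtosis": np.mean(centered ** 4) / std ** 4,
        "skewness": np.mean(centered ** 3) / std ** 3,
    }


# Synthetic check: a unit-amplitude 50 Hz sine sampled at 1 kHz for one second.
t = np.arange(1000) / 1000.0
features = time_domain_features(np.sin(2 * np.pi * 50.0 * t))
print({name: round(float(value), 4) for name, value in features.items()})
```

The dimensionless ratios (waveform index, pulse index, crest factor, kurtosis, skewness) do not scale with signal amplitude, which is relevant when amplitude itself varies with engine speed.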
To sum up, this paper extracts the spectral energy characteristics of the IMFs and combines them with time domain and frequency domain features to build the feature vectors. The specific steps are as follows:
1. Acquiring experimental data for normal and fault conditions of the diesel engine's valves.
2. Studying the frequency components of the vibration signal. After the penalty function α is determined by the criterion for optimal selection, each group of vibration signals is decomposed by the VMD method according to the two-time decomposition strategy.
3. Applying the Fourier transform to each IMF and calculating the corresponding spectral energy Ei, which is taken as the spectral energy characteristic.
4. Calculating the RMS value, peak value, peak-to-peak value, waveform index, pulse index, K factor, crest factor, kurtosis, and skewness of each group of signals.
5. Calculating the center of gravity frequency, mean square frequency, frequency variance, RMS frequency, and standard deviation of frequency of each group of signals.
6. Integrating the 4 IMF spectral energies, the 9 time domain features and the 5 frequency domain features into 18-dimensional feature vectors.
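Steps 3 and 5 can be sketched as below. The code assumes NumPy and the usual power-weighted definitions of the frequency statistics, with the spectral energy taken as the sum of squared spectral magnitudes; these definitions are assumptions, since the corresponding formulas are not reproduced in the text.

```python
import numpy as np


def frequency_domain_features(x, fs):
    """Power-weighted frequency statistics and spectral energy (assumed definitions)."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    total = power.sum()
    centroid = (freqs * power).sum() / total                 # center of gravity frequency
    mean_square = (freqs ** 2 * power).sum() / total         # mean square frequency
    variance = ((freqs - centroid) ** 2 * power).sum() / total
    return {
        "spectral_energy": float(total),
        "gravity_frequency": float(centroid),
        "mean_square_frequency": float(mean_square),
        "frequency_variance": float(variance),
        "rms_frequency": float(np.sqrt(mean_square)),
        "frequency_std": float(np.sqrt(variance)),
    }


# Synthetic check: a 50 Hz sine sampled at 1 kHz has all its power at 50 Hz.
t = np.arange(1000) / 1000.0
spectrum_features = frequency_domain_features(np.sin(2 * np.pi * 50.0 * t), fs=1000.0)
print({name: round(value, 3) for name, value in spectrum_features.items()})
```

Applying the same `spectral_energy` computation to D21 through D24 would give the four IMF features of step 3, and concatenating them with the time and frequency domain statistics would give the feature vector of step 6.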

Feature selection
Since each feature parameter has to be evaluated many times during training and testing, the RF algorithm is computationally expensive. If there are many decision trees, the operating efficiency becomes very low. Therefore, in practical applications, feature selection needs to be performed on the initial feature vectors to improve the execution efficiency of the algorithm.
To reflect the classification performance of the proposed method under variable speed conditions, the sample database is constructed by mixing the characteristic data obtained under different working conditions. From all normal and faulty data obtained in the experiments, 70% are randomly selected as the training set and the remaining 30% as the test set. The training set data are input into the RF algorithm. There are two important parameters in the RF model: the number of decision trees and the number of random feature variables. Figure 8 shows the relationship between the OOB error and the number of decision trees. As the number of decision trees increases, the OOB error gradually becomes stable. Taking both accuracy and computational efficiency into account, 100 decision trees are selected. The number of random feature variables, called mtry, takes the default value $\sqrt{M}$, rounded up to 5, where M is the total number of features. After the RF model is established, random disturbances are applied to the individual characteristic variables of all OOB samples, and the importance score of each characteristic variable is calculated by formula (6). Figure 9 shows the ranking of the importance scores of all characteristic variables. According to Figure 9, the importance scores of the waveform index and the pulse index are very low. To reduce the redundancy of the feature parameters, only the first 15 feature parameters with high importance scores are retained. However, the feature dimension cannot be too small; otherwise it is difficult to accurately detect the difference between distinct classification targets.
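The data split and the mtry setting can be sketched with the standard library alone. The sketch assumes that the square root is rounded up, which reproduces the stated value of 5 for 18 features; the sample count of 1000 is a placeholder, not the experimental data set.

```python
import math
import random


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Random split of a sample list into training and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def default_mtry(n_features):
    """Number of candidate features per split: square root of M, rounded up (assumption)."""
    return math.ceil(math.sqrt(n_features))


train, test = split_train_test(range(1000))   # 1000 is a placeholder sample count
print(len(train), len(test), default_mtry(18))
```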

Therefore, following the order of importance scores from high to low, feature vectors of 3 to 15 dimensions are selected in turn to identify the abnormal valve condition. Figure 10 shows the average recognition rate of the valve condition for different feature dimensions. As the feature dimension increases, the average recognition rate gradually increases. When the feature dimension is 8, the average recognition rate reaches its maximum and then remains basically constant, indicating that the optimal feature vector has 8 dimensions, which clearly highlights the difference between normal and abnormal valve conditions. Therefore, we choose 0.7 as the threshold to reduce the initial 18-dimensional feature vector to the optimal eigenvector. In order of importance score, the D22 spectral energy, D21 spectral energy, D23 spectral energy, skewness, RMS value, K factor, D24 spectral energy and peak-to-peak value are the characteristic parameters that contribute most to the classification performance of the RF model. Table 5 shows the optimal eigenvectors for each valve state of the diesel engine. Due to space limitations, 2 samples are listed for each valve state.
The optimal eigenvector is input into the RF algorithm to establish a classifier model that identifies the normal and abnormal valve states. Because a fault alarm is important for an actual diesel engine, the normal or abnormal state should be identified first, so that the on-site online monitoring system can warn the maintenance personnel in time when the valve clearance is abnormal. The 420 groups of characteristic data in the test set are input into the RF algorithm for abnormality alarm processing, of which 60 groups are normal characteristic data and the remaining 360 groups are abnormal characteristic data. Table 7 compares the classification results obtained with the RF-reduced feature vector and with the initial feature vector. The analysis shows that RF-based feature selection retains the main feature information: while the data dimension is reduced, the recognition rate remains basically the same and the computing time is significantly reduced, which effectively improves the efficiency of fault diagnosis. In addition, the method keeps the average recognition accuracy above 90% and can accurately raise an alarm when the valve clearance becomes abnormal.
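Threshold-based selection of the ranked features can be sketched as follows. The scores in the example are hypothetical placeholders chosen only to exercise the function; they are not the values behind Figure 9.

```python
def select_features(importance, threshold):
    """Names of features whose importance score reaches the threshold, highest first."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


# Hypothetical importance scores, for illustration only.
example_scores = {
    "D22 spectral energy": 1.9,
    "skewness": 1.1,
    "waveform index": 0.2,
    "D21 spectral energy": 1.6,
}
selected = select_features(example_scores, threshold=0.7)
print(selected)
```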
To help on-site personnel make a preliminary diagnosis of the fault type, once the system has raised an abnormality alarm it is necessary to decide which fault type the valve belongs to. Of the 1260 groups of abnormal valve data, 420 groups correspond to increased intake valve clearance (fault 1 and fault 2), 420 groups to increased exhaust valve clearance (fault 3 and fault 4), and 420 groups to increased intake and exhaust valve clearance (fault 5 and fault 6). A total of 360 groups are randomly selected as the test set, with 120 groups for each fault state. The rest of the data are used as the training set. The RF classifier is constructed again using the training set data, and the confusion matrix is shown in Table 8. According to Table 8, after the system alarms, it can also accurately identify the specific fault type of the valve clearance. Only 9 intake valve clearance faults are diagnosed as other faults, giving a recognition rate of 92.5%. Only 8 exhaust valve clearance faults are misdiagnosed as other faults, giving an accuracy of 93.33%. The result for faults of both valves is the same as for exhaust valve clearance faults. The overall recognition rate of valve faults reaches 93.06%, and satisfactory results are obtained under different speed conditions.
To highlight the advantages of the fault diagnosis method based on VMD and RF, it is compared with two other methods, as shown in Table 9. The method combining time and frequency domain statistical features with k-nearest neighbors (KNN) has a poor identification rate for increased exhaust valve clearance. Although the combination of empirical mode decomposition (EMD) and the decision tree algorithm achieves acceptable recognition accuracy for the three faults, its average recognition rate is only 84.76%. The combined VMD and RF method presented in this paper performs well in recognizing all three valve clearance faults, and its average recognition rate is clearly higher than that of the other two methods.
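The recognition rates quoted above follow directly from the counts in the text (120 test groups per fault type, with 9, 8 and 8 misdiagnosed). A short check, using only those counts:

```python
def recognition_rates(correct, totals):
    """Per-class and overall recognition rates from correct counts and class sizes."""
    per_class = [c / t for c, t in zip(correct, totals)]
    overall = sum(correct) / sum(totals)
    return per_class, overall


# Intake, exhaust, and both-valve faults: 120 test groups each, 9 / 8 / 8 misdiagnosed.
per_class, overall = recognition_rates([120 - 9, 120 - 8, 120 - 8], [120, 120, 120])
print([round(100 * r, 2) for r in per_class], round(100 * overall, 2))
```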

Conclusion
In this paper, a new fault identification method for diesel engine valve clearance is proposed, based on improved VMD and RF. It can not only raise an alarm for valve abnormality but also identify the specific fault type accurately. In the key step of non-stationary signal decomposition, a novel parameter optimization strategy for VMD is applied. Moreover, the RF algorithm is employed to rank the importance of all extracted features in order to reduce the dimension and improve the operating efficiency. The verification results under variable operating conditions of the diesel engine show that the fault diagnosis method can effectively recognize the specific fault type of valve clearance and help on-site diagnosis personnel make decisions. In addition, a comparative analysis with frequently used feature extraction and fault detection methods shows that the combined VMD and RF method presented in this paper performs best in recognizing the three valve clearance faults. In the future, the valve clearance detection approach will be further studied on different engines and finally built into a real-time monitoring system.

Figure 1. The interconnection of the main test rig components.

Figure 2. Installation diagram of cylinder head accelerometer.

Figure 4. The spectrum and mode obtained by VMD in fault 4 condition: (a) VMD decomposition tree, (b) spectrum, (c) mode.

Figure 8. Relationship between OOB error and number of decision trees.

Figure 10. Average recognition rate of the valve condition in different dimensions of feature.

Table 2. The main parameters of the acceleration sensor.

Table 3. The main parameters of the eddy current sensor.

Table 4. The experimental parameters of valve clearance faults.

Table 5. The optimal eigenvectors for each valve state.

In this experiment, 490 data samples were collected under each working condition of the diesel engine. The division of the experimental data under different speed conditions is shown in Table 6, where N1 represents the number of training samples and N2 represents the number of test samples.

Table 6. Division of experimental data set.

Table 7. The comparison result of the classification effect.

Table 9. Comparison of different fault diagnosis methods.