A Model-Free Diagnosis Approach for Intake Leakage Detection and Characterization in Diesel Engines

Abstract: Feature selection is an essential step for data classification used in fault detection and diagnosis processes. In this work, a new approach is proposed, which combines a feature selection algorithm and a neural network tool for leak detection and characterization tasks in diesel engine air paths. The Chi-square classifier is used as the feature selection algorithm and the neural network based on Levenberg-Marquardt is used in system behavior modeling. The obtained neural network is used for leak detection and characterization. The model is learned and validated using data generated by xMOD. This tool is used again for testing. The effectiveness of the proposed approach is illustrated in simulation when the system operates at low speed/load and the considered leak affecting the air path is very small.


Introduction
In order to reduce air pollution caused by automobile engines, several standards have been introduced. The first standard was proposed by the California Air Resources Board in 1970. Since 1993, marked by the introduction of the Kyoto Protocol, the European anti-pollution standards have become more stringent where the authorized emissions of a diesel vehicle have been decreased from (NOx = nil, classification is assured by the k-nearest neighbors rule including reject options. The authors of [16] propose to incorporate a genetic algorithm (GA) with Fisher Discriminant Analysis (FDA) in the key variables identification procedure. The GA is used to select the features that optimize the FDA classification success rate. This approach is applied to the data generated by the Tennessee Eastman Process (TEP) simulator. A new method for feature selection based on mutual information for fault detection and identification is proposed in [17]. The algorithm is based on two principal stages: first, the variables are sorted based on their shared mutual information with the class variable; second, the more informative variables are chosen by taking into account the classification error rate. Once more, the approach is applied to the TEP simulator. In [18], the authors use recursive feature elimination to select key variables using Support Vector Machines (SVM). The SVM is combined with time lags incorporated before every classification step. Finally, the number of relatively important variables retained by each classifier is determined by 10-fold cross-validation. Wang [19] introduces a neural network approach to vibration feature selection in mechanical systems fault detection. He proposes an artificial intelligence methodology for mechanical fault detection using vibration data, which includes intelligent feature optimization. He uses a back-propagation neural network twice, the first time for feature selection and the second for fault detection.
In this paper, a new methodology dealing with the problem of detecting and characterizing small leaks in diesel air paths is developed. To achieve this goal, a new scheme based on a neural network technique is proposed. The nominal mode (without leaks) and the leakage modes corresponding to several leak diameters are learned using a Levenberg-Marquardt algorithm. Before using the acquired data, a feature selection task is carried out in order to reduce the complexity of the problem. The main challenge of the proposed approach is the use of selected sensors only, leading to a reduced cost. The data for the considered modes are generated using the xMOD platform, which will be described later.
The paper is organized as follows. First, the considered problem is presented in Section 2. Section 3 describes the proposed approach in detail. A brief description of neural networks based on the steepest-descent and Gauss-Newton methods is given, and the main detection and characterization scheme is illustrated. After a brief description of the xMOD tool used for engine data collection, Section 4 gives some results obtained using our approach. Then, these results are discussed in order to illustrate the effectiveness of leakage detection and characterization.

Problem Statement
Over the past several years, anti-pollution standards have become more stringent, and the constraints on the automotive industry have consequently become very complex. The main objective of these standards is to reduce the emission levels of cars. In the case of diesel engines, there are several pollutants: carbon monoxide, unburned hydrocarbons, nitrogen oxides (NOx) and diesel particulate matter. Usually, the emission level increases with the appearance of faults in diesel engines, more precisely in diesel air paths. These faults can be due to sensor failures, actuator failures or system degradation. In this paper, the latter failure class is considered. More precisely, leakage detection and characterization in diesel air paths is studied. In addition to high emission levels, this failure causes multiple undesired effects, such as changes in the operating points of the air path subsystems, incomplete combustion in the cylinders, and the appearance of smoke together with reduced performance.
Often, this type of failure can be confused with the two other types of faults, i.e., sensor or actuator faults; consequently, it is very important to distinguish this fault from the others.
In addition to the main objective of this paper, the feature selection problem is considered. It is well known that today's vehicles are characterized by increased complexity due to the growing number of embedded sensors. Consequently, using only the subsets of sensor data that are correlated with the considered problem is highly desirable in such applications.
In this work, our main objective is to detect and characterize air leaks in diesel air paths regardless of their diameters. Before performing leak detection and characterization, we carry out a feature selection in order to reduce the data complexity.
It is important to specify that, for this application, small leaks are hidden and very difficult to detect because the system is not solicited at the considered operating points.

Proposed Approach
Nowadays, the neural network is an essential tool used in many research activities for complex industrial systems. An advantage of using neural networks to detect system faults is that they can interpret the measurement data. Indeed, a neural network has both the ability to generalize an obtained model and to apply the associative property to the available memory. The error tolerance characterizing the neural network effectively deals with the errors of the model. In addition, it can perform nonlinear mapping and also learn dynamic behaviors in order to generalize the obtained models.
Generally, the collected data for the detection and characterization process are noisy, but the error tolerance of neural networks enables the detection scheme to differentiate the pattern from noise. This property is a major advantage when solving fault detection and isolation problems. In addition, similar patterns are separated using the characterization property of a neural network.
Leak detection in intake systems is very difficult to achieve, especially when the operating point corresponds to low load and torque. In these conditions, the compressor in the air path is not solicited by the driver, leading to similar pressures in the intake system and the atmosphere. This constraint requires an improved detection and characterization algorithm. The proposed system must increase the accuracy of the model, enhance the performance of the vehicle and guarantee the management of small leaks. In this paper, the Levenberg-Marquardt (LM) algorithm is proposed to carry out the detection and characterization tasks. The LM algorithm is used to learn the diesel air path dynamics. Once the dynamics are modeled, the leak is detected and characterized by comparing the new measurements with the model established using a neural network. The proposed approach contains two blocks: the training block and the decision block. This approach is shown in Figure 1. The proposed approach is designed to operate in on-line mode; thus, a classic process of real data acquisition is adopted in this work. It is important to remember that in this application we use only the sensors selected by the feature selection algorithm, which summarize the intake behavior of the diesel air path. The data acquired in this step are sent to the decision block in order to detect and characterize the leaks affecting the vehicle.

Feature Selection
In this work, a very popular feature selection method is chosen: the Chi-square algorithm [20]. Chi-square is a simple and general algorithm which achieves feature ranking using a discretization process. This algorithm is combined with a neural network classifier to select the features that we must keep.
The Chi-square algorithm is based on the $\chi^2$ statistic and runs in two stages. The first stage proceeds in this manner:
1. Set the sigLevel to 0.5 for all features;
2. Sort each feature according to its values;
3. Compute the $\chi^2$ value for every pair of adjacent intervals, such that:

$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}}, \qquad E_{ij} = \frac{R_i C_j}{N}$$

where $k$ is the number of classes and $E_{ij}$ is the expected frequency of $A_{ij}$.

Firstly, the WEKA [21] data-mining tool is used to perform Chi-square ranking. The features are sorted according to their rank. Secondly, the most important features are selected using the neural network classifier, which will be described later. More precisely, the features are eliminated iteratively from least important to most important, and the weight of the eliminated features is evaluated according to the obtained classification Mean Squared Error (MSE).
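The iterative elimination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `backward_eliminate`, the acceptance rule (drop a feature when the MSE does not get worse), the `tolerance` parameter, the sensor names and the toy scoring function are all assumptions made for illustration. In practice, `evaluate_mse` would train the neural network classifier on the given feature subset and return its classification MSE.

```python
def backward_eliminate(ranked_features, evaluate_mse, tolerance=0.0):
    """Drop features from least to most important while the MSE does not degrade.

    ranked_features: feature names, most important first (e.g. a Chi-square ranking).
    evaluate_mse: hypothetical callable mapping a feature list to a classification MSE.
    """
    selected = list(ranked_features)
    best_mse = evaluate_mse(selected)
    for feature in reversed(ranked_features):      # least important first
        if len(selected) == 1:
            break
        candidate = [f for f in selected if f != feature]
        mse = evaluate_mse(candidate)
        if mse <= best_mse + tolerance:            # assumed acceptance rule
            selected, best_mse = candidate, mse
    return selected, best_mse


def toy_mse(features):
    """Stand-in for training a classifier; the numbers are made up."""
    informative = {"boost_pressure", "air_mass_flow"}
    missing = len(informative - set(features))
    noise = len(set(features) - informative)
    return 0.01 + 0.05 * missing + 0.002 * noise


# Hypothetical sensor names, ordered by an assumed ranking.
ranking = ["boost_pressure", "air_mass_flow", "egr_position", "coolant_temp"]
selected, mse = backward_eliminate(ranking, toy_mse)
print(selected, mse)
```

With the toy score, the two uninformative sensors are removed and the two informative ones are kept, since removing either of them raises the MSE.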

Training
Pattern classification using neural networks consists of determining the class boundaries using the classifier. The training phase of neural networks achieves this goal. In this paper, a gradient-based training algorithm is used. This category of algorithm is the most commonly used by researchers. Within this category, Hessian-based algorithms can significantly reduce the convergence time. The Levenberg-Marquardt algorithm [22] belongs to the category of Hessian-based techniques; it makes use of the advantages of Hessian-based algorithms in the optimization of nonlinear least squares.
The Levenberg-Marquardt algorithm is a well-known optimization technique. It locates the minimum of a function which is expressed as the sum of squares of nonlinear functions. This algorithm, widely used in many disciplines, is a combination of the Steepest-Descent and the Gauss-Newton methods. Depending on the distance between the current position and the best one, these techniques alternate: if the current position is far from the best one, the Steepest-Descent method is applied; otherwise, the Gauss-Newton method takes over. The Steepest-Descent technique used in the LM algorithm is slow, but it guarantees the convergence property. When the current position approaches the best one, the LM algorithm switches to the Gauss-Newton method, which converges rapidly.
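The switch between the two methods can be read directly from the standard damped update, shown here as a short worked derivation. The notation is an assumption for this illustration: $R$ is the error vector, $J$ is the Jacobian of $R$ with respect to the parameter vector $x$, and $\mu > 0$ is the damping (training) parameter.

```latex
\Delta x = -\left[J^T J + \mu I\right]^{-1} J^T R

% Small damping: the update tends to the Gauss-Newton step
\mu \to 0: \qquad \Delta x \to -\left[J^T J\right]^{-1} J^T R

% Large damping: \mu I dominates J^T J, giving a short step along
% the negative gradient of R^T R (whose gradient is 2 J^T R)
\mu \gg \lVert J^T J \rVert: \qquad \Delta x \approx -\frac{1}{\mu} J^T R
```

Increasing $\mu$ after a failed step therefore moves the method toward a cautious steepest-descent step, while decreasing $\mu$ after a successful step moves it toward the faster Gauss-Newton step.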
For neural network training, the objective function is an error of the form:

$$E = \sum_{k=1}^{p} \sum_{l=1}^{n_0} (y_{kl} - a_{kl})^2 \qquad (1)$$

where $y_{kl}$ are the real data of diesel engines, $a_{kl}$ are the network outputs, $p$ is the total number of samples and $n_0$ represents the total number of nodes in the output layer.
In this work, the neural network used contains five layers. The first layer is the input layer, which receives the data corresponding to the selected sensors used in this application. The three following layers are the hidden ones, which represent the network core. The last one is the output layer, which generates two signals when the detection task is considered or four signals when the detection and characterization tasks are considered together.
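To make the five-layer structure concrete, the following Python sketch builds one input layer, three hidden layers and one output layer, then runs a forward pass. Only the layer count and the two or four outputs come from the text. The number of inputs (6), the hidden layer width (8), the tanh activation, the linear output layer and the random initialization are assumptions chosen for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Weights and biases connecting a layer of n_in nodes to one of n_out nodes."""
    return {
        "W": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def build_network(n_inputs, hidden_sizes, n_outputs, seed=0):
    """Input layer + hidden layers + output layer, stored as weight blocks."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, x):
    """Propagate one sample; tanh on hidden layers, linear output (assumed)."""
    for index, layer in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(layer["W"], layer["b"])]
        is_output = index == len(layers) - 1
        x = z if is_output else [math.tanh(v) for v in z]
    return x


# Five node layers: 6 assumed inputs, three hidden layers, 2 or 4 outputs.
detector = build_network(6, (8, 8, 8), 2)      # detection mode
classifier = build_network(6, (8, 8, 8), 4)    # detection and characterization mode
print(forward(detector, [0.1] * 6))
print(forward(classifier, [0.1] * 6))
```

Five layers of nodes correspond to four weight blocks, which is why `build_network` returns a list of length four.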
The steps required in the neural network using the LM algorithm in batch-mode training are the following:
1. Compute the corresponding network outputs and evaluate the mean square error for all inputs as in Equation (1);
2. Calculate the Jacobian matrix $J(x)$ of the errors, where $x$ represents the weights and biases of the network;
3. Solve the equation which adapts the weights in order to obtain $\Delta x$. The update of the weight vector $\Delta x$ is computed as follows:

$$\Delta x = -\left[J^T(x) J(x) + \mu I\right]^{-1} J^T(x) R$$

where $\mu$ is the training parameter, $I$ is the identity matrix and $R$ is a vector of size $p n_0$ computed as follows:

$$R = \left[y_{11} - a_{11}, \; y_{12} - a_{12}, \; \ldots, \; y_{p n_0} - a_{p n_0}\right]^T$$

$J^T(x) J(x)$ is referred to as the Hessian matrix (more precisely, it is its Gauss-Newton approximation);
4. Recalculate the error using $x + \Delta x$. If the error is lower than the one calculated in step 1, reduce the training parameter $\mu$ by $\mu^-$, keep $x = x + \Delta x$ and return to step 1. If there is no reduction, increase $\mu$ by $\mu^+$ and go back to step 3. $\mu^+$ and $\mu^-$ are fixed by the user;
5. The algorithm is stopped in two cases: when the gradient is less than the predefined value, or when the error is below a given error objective.
Generally, the training step in neural networks is very complex and requires considerable computing resources, especially in on-line cases. In this work, the training problem is carried out in off-line mode; the obtained neural model is then used to detect and characterize leakage in on-line mode. The adopted neural network returns both the nominal behavior corresponding to the system without leakage and the faulty system behavior (occurrence of leakage).
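The batch-mode LM steps of this section can be sketched in plain Python on a deliberately tiny problem: a single tanh neuron with two parameters. This is an illustrative sketch, not the network used in this work. The toy model, the data, the multiplicative factors `mu_dec` and `mu_inc` (standing in for the user-fixed µ− and µ+), the tolerances and the cap on µ are assumptions. The Jacobian is taken as that of the error vector with entries y − a, so the update carries a minus sign.

```python
import math


def outputs(x, inputs):
    """Single tanh neuron a = tanh(w*u + b), with x = [w, b]."""
    w, b = x
    return [math.tanh(w * u + b) for u in inputs]


def errors(x, inputs, targets):
    """Error vector R with entries y_k - a_k."""
    return [y - a for y, a in zip(targets, outputs(x, inputs))]


def jacobian(x, inputs):
    """Jacobian of the error vector: one row [dR_k/dw, dR_k/db] per sample."""
    return [[-(1.0 - a * a) * u, -(1.0 - a * a)]
            for u, a in zip(inputs, outputs(x, inputs))]


def sse(r):
    """Sum of squared errors."""
    return sum(e * e for e in r)


def train_lm(inputs, targets, x, mu=0.01, mu_dec=0.1, mu_inc=10.0,
             max_iter=100, grad_tol=1e-10, err_goal=1e-12):
    r = errors(x, inputs, targets)                 # step 1: outputs and error
    err = sse(r)
    for _ in range(max_iter):
        J = jacobian(x, inputs)                    # step 2: Jacobian
        g1 = sum(row[0] * e for row, e in zip(J, r))
        g2 = sum(row[1] * e for row, e in zip(J, r))
        if math.hypot(g1, g2) < grad_tol or err < err_goal:
            break                                  # step 5: stopping criteria
        a11 = sum(row[0] * row[0] for row in J)    # entries of J^T J
        a12 = sum(row[0] * row[1] for row in J)
        a22 = sum(row[1] * row[1] for row in J)
        while mu < 1e12:
            # Step 3: solve (J^T J + mu*I) dx = -J^T R for the 2x2 case.
            h11, h22 = a11 + mu, a22 + mu
            det = h11 * h22 - a12 * a12
            dx = [-(h22 * g1 - a12 * g2) / det, -(h11 * g2 - a12 * g1) / det]
            trial = [x[0] + dx[0], x[1] + dx[1]]
            trial_r = errors(trial, inputs, targets)
            if sse(trial_r) < err:                 # step 4: accept, decrease mu
                x, r, err = trial, trial_r, sse(trial_r)
                mu *= mu_dec
                break
            mu *= mu_inc                           # step 4: reject, increase mu
        else:
            break                                  # no improving step was found
    return x, err


# Toy data generated by a neuron with w = 1.5 and b = -0.5.
inputs = [-2.0 + 0.5 * i for i in range(9)]
targets = outputs([1.5, -0.5], inputs)
x_fit, err_fit = train_lm(inputs, targets, [0.1, 0.0])
print(x_fit, err_fit)
```

For a real network, the 2x2 closed-form solve would be replaced by a general linear solver, and the Jacobian would be obtained by back-propagation over all weights and biases.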

Decision Block
The decision block is the most essential component of the proposed scheme, where the leaks in the intake of the air path are detected and characterized using the neural model developed in the training step. Direct interactions are established between the detection and characterization block and the neural network model in order to estimate the actual state of the system. The decision block works in two modes, "detection mode" or "detection and characterization mode". If the detection mode is considered, the decision block returns one of two possible outputs, "No Leakage" or "Leakage". In "detection and characterization mode", it returns one of four outputs: "No Leak", "Low Leakage", "Medium Leakage" and "High Leakage".
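A minimal Python sketch of the two decision modes follows. The class labels are those given above. The rule that maps network outputs to a label (picking the largest output signal) is an assumption, since the text does not specify it, and the function name `decide` is hypothetical.

```python
DETECTION_LABELS = ("No Leakage", "Leakage")
CHARACTERIZATION_LABELS = ("No Leak", "Low Leakage", "Medium Leakage", "High Leakage")


def decide(network_outputs, mode="detection"):
    """Map the output signals of the neural model to a leak label.

    Assumption: the class whose output signal is largest is selected.
    """
    if mode == "detection":
        labels = DETECTION_LABELS
    elif mode == "characterization":
        labels = CHARACTERIZATION_LABELS
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    if len(network_outputs) != len(labels):
        raise ValueError("number of outputs does not match the selected mode")
    best = max(range(len(labels)), key=lambda i: network_outputs[i])
    return labels[best]


print(decide([0.1, 0.9]))
print(decide([0.05, 0.10, 0.80, 0.05], mode="characterization"))
```

The length check guards against feeding a two-output detection network into the four-class mode, or the reverse.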

Application
A critical operating mode of the system is considered to illustrate the effectiveness of the proposed approach. This mode concerns the case of low engine load, speed and torque, where leak detection and characterization cannot always be achieved. In this application, the data acquisition is carried out using the xMOD software.

xMOD Software
xMOD is a software platform developed at IFP Energies Nouvelles that combines heterogeneous models with a virtual experimentation laboratory. These heterogeneous models are generated by different simulation tools, such as Matlab/Simulink, AMESim, Dymola, SimulationX and GT Power. Combining these tools means benefiting from the advantages of each modeling and simulation tool, and the user can freely select among them.
In this work, xMOD is used to simulate the operation of the diesel engine, especially the air path behavior. The simulation model produced by IFP Energies Nouvelles is used, to which a leak model has been added. The diameter of the leak can be freely adjusted. The simulation results can be recovered and stored in text files.

MSE: All Features vs. Selected Features
Before presenting the results with selected features, a comparison of the MSE evaluation for "all features" and "selected features" is presented in Table 1.
In order to illustrate the advantage of feature selection, the MSE values are shown together with their training run times. In this table, we can firstly observe that the MSE values corresponding to the use of all features are greater than the MSE values when only the selected features are used. Secondly, we observe that the run time corresponding to the use of all features is always higher than when the selected features are used. For example, when the torque value is set to 40 Nm, all MSE values of the all features case are greater than those corresponding to the selected features case. The same conclusion can be drawn for the remaining three cases, except for some values. The detection and characterization tasks are presented for the selected features case.

Detection Task Results
The first property of the proposed approach is the detection ability. In this situation, the neural network is trained on two classes: "No Leakage mode" and "Leakage mode". The training set consists of 10,000 samples without leaks and 10,000 samples with leaks. We choose three leak diameters: 0.1 mm, 0.4 mm and 0.9 mm. These results are obtained using only the selected features in the training algorithm.

Interpretation
Figures 2-10 show the effectiveness of the proposed approach, where we can see that the leak is detected for all considered diameters. The Mean Squared Error (MSE) values give information about the accuracy of the neural network used. From the obtained results, we can first remark that the MSE values increase when the torque values decrease. For example, in the first case (Figures 2-4), when the leak diameter is set to 0.1 mm, the MSE value decreases from 0.0240 (2.4%) to 0.00746 (0.7%) when the torque increases from 110 Nm to 150 Nm. This observation can be explained by the fact that, at lower torque, the air path system (compressor) works at a lower speed. In other words, at low speed, the mechanical compressor of the air path is not solicited. The same remark applies to cases 2 and 3.
Naturally, the leak is easily detected when it is large, but it becomes extremely difficult to detect when it is very small. The obtained results show that the proposed approach is efficient and that the leak is detected in all cases, even when its diameter is equal to 0.1 mm (almost negligible leakage). In addition, this approach gives better results at higher engine load, speed and torque, which is expected due to the higher flow in the air intake system.

Characterization Task Results
The second property of the proposed approach is the characterization ability, which is very important because it allows the severity of the leak to be estimated. The characterization of the leak diameter is thus highly desired, but it is often difficult to accomplish. In order to do this, the neural network is trained with the data of four modes: "No Leak", "Low Leakage", "Medium Leakage" and "High Leakage". The last three modes respectively correspond to "Leak = 0.1 mm", "Leak = 0.4 mm" and "Leak = 0.9 mm". The data of each mode contain 10,000 samples. The corresponding figures show the leak characterization in each mode. These figures show the effectiveness of the proposed approach in dealing with leak characterization. Firstly, we remark that the MSE value (0.0140) is low, which demonstrates the accuracy of the neural network. Secondly, the leakage class is found for each case.

Conclusions
A leak detection and characterization approach for diesel air paths has been developed. The proposed approach contains two blocks: a training block and a decision block. The first one is realized off-line and combines a feature selection algorithm with a neural network based on the Levenberg-Marquardt optimization. The LM algorithm was chosen for its accuracy and adaptability; it combines two different techniques according to the current position of the solution compared to the best one. The second block uses the neural model obtained in the training phase in order to detect and characterize leaks that appear in the air path system. The detection and characterization capability is evaluated using the MSE index.
The proposed approach effectively solves the leak detection and characterization problem, especially in the case of small leaks at critical operating points (low speed and torque). In order to validate this solution, the proposed algorithms will be implemented on a real diesel engine.
$A_{ij}$: number of samples in the $i$th interval and $j$th class; $R_i$: number of samples in the $i$th interval; $C_j$: number of samples in the $j$th class; $N$: total number of samples;
4. Merge the pair of adjacent intervals with the lowest $\chi^2$ value until the $\chi^2$ value of each pair of adjacent intervals exceeds sigLevel.
Stage 2:
1. Start with the sigLevel0 corresponding to the last sigLevel value determined in the first stage;
2. Associate sigLevel(i) with each feature and run merging;
3. Consistency test: if inconsistency < δ, merge intervals and decrease sigLevel(i); else, eliminate the $i$th feature for the next step.
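The statistic and the merging step listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the WEKA implementation used in this work: each interval is represented by its list of per-class sample counts, the merging threshold is passed directly as a number (`threshold`) rather than derived from sigLevel, and an expected frequency of zero is replaced by a small constant to avoid a division by zero.

```python
def chi_square(interval_a, interval_b):
    """Chi-square statistic for two adjacent intervals.

    interval_a[j] and interval_b[j] are the counts A_1j and A_2j of class j.
    """
    rows = [interval_a, interval_b]
    n_classes = len(interval_a)
    row_totals = [sum(row) for row in rows]                             # R_i
    col_totals = [rows[0][j] + rows[1][j] for j in range(n_classes)]    # C_j
    total = sum(row_totals)                                             # N
    if total == 0:
        return 0.0
    stat = 0.0
    for i in range(2):
        for j in range(n_classes):
            expected = row_totals[i] * col_totals[j] / total            # E_ij
            if expected == 0:
                expected = 0.1   # assumed guard against division by zero
            stat += (rows[i][j] - expected) ** 2 / expected
    return stat


def merge_intervals(intervals, threshold):
    """Repeatedly merge the adjacent pair with the lowest chi-square value
    until every adjacent pair exceeds the threshold."""
    intervals = [list(counts) for counts in intervals]
    while len(intervals) > 1:
        stats = [chi_square(intervals[i], intervals[i + 1])
                 for i in range(len(intervals) - 1)]
        lowest = min(range(len(stats)), key=stats.__getitem__)
        if stats[lowest] > threshold:
            break
        merged = [a + b for a, b in zip(intervals[lowest], intervals[lowest + 1])]
        intervals[lowest:lowest + 2] = [merged]
    return intervals


# Four intervals of one feature, two classes (made-up counts).
counts = [[5, 0], [4, 1], [0, 5], [1, 4]]
print(chi_square([10, 0], [0, 10]))
print(merge_intervals(counts, threshold=2.7))
```

In the made-up example, the first two and the last two intervals have similar class distributions and are merged, leaving two intervals that separate the classes well.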

Table 1. Mean Squared Error (MSE) evolution with torque variation.