Diesel Engine Fault Prediction Using Artificial Intelligence Regression Methods

Abstract: Predictive maintenance has been employed to reduce maintenance costs and production losses and to prevent any failure before it occurs. The framework proposed in this work performs diesel engine prognosis by evaluating the absolute value of the failure severity using random forest (RF) and multilayer perceptron (MLP) neural networks. A database was implemented with 3500 failure scenarios to overcome the problem of inducing destructive failures in diesel engines. Diesel engine failure signals were developed with the zero-dimensional thermodynamic model inside a cylinder coupled with the crankshaft torsional vibration model. Artificial neural networks and random forest regression models were employed for classifying and quantifying failures. The methodology was applied alongside an engine simulator to assess effectiveness and accuracy. The best-fitting performance was obtained with the random forest regressor with an RMSE value of 0.10 ± 0.03%.


Introduction
Diesel engines are widely used in industrial environments and are thus expected to be robust. Accordingly, these engines are usually under an effective maintenance program to avoid unplanned shutdowns due to defects [1]. A few important attributes influence the choice of a diesel engine for a given application, such as cost, performance, and useful life [2]. The major application of this engine in several sectors is related to the reclamation of the portion of crude petroleum that was once considered to be the refuse of the gasoline refining process. Over time, diesel engine manufacturing technology has evolved, allowing the fabrication of more robust, high-torque, and energy-efficient equipment that can be applied in environments requiring high power [3].
Diesel engines used in the offshore industry, such as in support vessels and oil production units, are subjected to an inhospitable environment, making them more susceptible to failure. Unfortunately, this kind of machine is prone to performance degradation and mechanical breakdowns [4]. In addition, this equipment is usually the primary power plant on ships. For instance, marine diesel engines are prone to damage due to harsh working conditions, which include, but are not limited to, the presence of water, salt corrosion, and abrupt vibrations. The failures of such equipment are responsible for approximately 41.6% of the marine accidents caused by mechanical faults [5]. Therefore, effective fault diagnosis is essential to improving machinery reliability [6–11]. Furthermore, the maintenance cost of diesel engines can vary from 10% to 20% of the total value of a vessel [12,13]. It is fundamental to have a system capable of preemptively identifying potential engine faults in order to avoid damage to the production, the workers, and the ship itself [4,14].
Recently, many methods have been proposed to detect diesel engine faults, which include oil analysis [15], thermodynamic parameters [16], and vibration analysis [17,18], among others. Furthermore, advances in automation and instrumentation technology have led to the development of efficient monitoring devices [19], contributing to predictive maintenance techniques. These methods typically use input data in the form of signals. Once the signal has been appropriately diagnosed, it can be used to determine the best course of action to take.
Intelligent fault diagnosis methods to estimate machine conditions have been widely applied [7,20–22]. In order to predict combustion failures in the cylinders of engines, the torsional vibration of high-power internal combustion engines was analyzed in [23] using the instantaneous angular velocity wave signal. Note that the crankshaft speed is unstable in these types of engines due to the cylinder firing order. In these tests, a magnetic pickup sensor was installed on the flywheel of the engine, and the fast Fourier transform, alongside time series analysis, was used to predict failures. Both techniques attained good results when applied to identifying combustion failures in the engine cylinders.
The work described in [24] acquired a vibration signal to observe failures in a four-stroke spark-ignition engine. Tests were performed for (i) normal operating conditions, (ii) a separation between the spark plug electrodes, (iii) open intake valves, and (iv) closed intake valves. The vibration signals were obtained using an accelerometer that had been placed on the cylinder head. In addition, a tachometer was employed to obtain the number of revolutions. A temporal analysis of the vibration signal was used, taking into account the angle of the crankshaft and using as reference the signals recorded when the engine was operating without defects. Their results showed that a non-intrusive methodology could be efficiently utilized for the diagnosis of both medium and large engines.
Reference [4] focused on a four-cylinder commercial diesel engine. In order to measure the vibration signal, four piezoelectric accelerometers were mounted on each cylinder head. Five types of faults were investigated: piston pin fault, piston ring fault, inlet valve fault, outlet valve fault, and connecting rod fault. The rotating speed of the engine was 1500 rpm, and each fault scenario was tested ten times. A method based on kernel independent component analysis and the Stockwell transform was used for preprocessing. Twelve features extracted from this initial procedure were used as input for an artificial neural network that performed automatic failure classification. The classifier obtained an accuracy of 93.33% in the failure scenarios.
The work described in [22] developed a methodology for managing the energy efficiency of a vessel, taking into account simulation data and system monitoring. Several critical situations in the main engine were analyzed and implemented as simulations with an engine room simulator for a certain navigation route of a very large crude carrier vessel. Based on the results obtained in that paper, it was verified that fuel consumption and the emission of gases into the atmosphere considerably increased when the vessel presented defective or inefficient operating conditions.
A four-stroke high-speed marine diesel engine failure simulator based on an adjusted dimensional thermodynamic model was used in [25]. The method was validated on a dataset of real engines. The developed model was able to generate, with great reliability, a large number of typical thermodynamic diesel engine faults and the standard operating behavior. The work in [26] presents a predictive maintenance system based on the fault diagnosis of diesel engines using the vibration response of the crankshaft and the variation in pressure curves inside the cylinders. That work developed a simulation model based on a zero-dimensional thermodynamic model. A total of 701 scenarios divided into four operating configurations were observed. Two machine learning algorithms were applied: random forest (RF) and multilayer perceptron (MLP) neural networks. The accuracy obtained by the algorithms was about 99.3%. The authors concluded that the signal-to-noise ratio must be small to guarantee good performance.
The proposed system possesses a modular architecture (equivalent to the ones described in [27,28]) for a condition-based maintenance system comprising twelve blocks. As shown in Figure 1, the system is composed of the start block (severity input); diesel engine performance simulation, which, in dynamic and thermodynamic model blocks, consists of solving the equations of the model based on the severity applied (start block) and on the returned fault signals (signals from a fault condition); the dataset generation block (split into the creation of fault and normal signals, which result from the normal and fault simulation blocks), which deals with the generation of the normal instances and their relationship with the fault examples; the additive white Gaussian noise (AWGN) process block (which adds noise to the signals); the dataset signal block (file in the form of a dataset according to the applied noise levels); the feature extraction block, which generates attributes from the dataset that are employed in the regression stage; the feature dataset block; the cross-validation block, which is responsible for evaluating the generalization abilities of the model; the regression block (composed of artificial neural networks and random forest-based regressors), which performs the machine learning tests; and the last (end) block, which presents the results.
In the remainder of this work, the acronym 3500-DEFault (diesel engine fault) represents the dataset of 3500 signals from a diesel engine simulation. It is also important to emphasize the differences between the main contributions of this work and the ones presented in [26]. Namely, the latter focused on diagnosing faults in diesel engines using machine learning algorithms that were trained using a dataset consisting of 701 fault events. This contrasts with the dataset employed herein, which comprises 3500 instances of fault event signals. These signals were obtained by developing improved versions of the algorithms responsible for generating them. Furthermore, the main idea of this work resides not only in identifying the existence of a fault but also in estimating its corresponding severity (in the form of a numerical value), which is usually the main object of interest in real-world applications. In the context of this work, severity assessment refers to the process of evaluating the gravity of an identified fault or malfunction in a diesel engine. It involves determining the extent to which the fault affects the engine's performance, safety, and reliability. Severity assessment is not performed in the literature, one of the main reasons being the lack of a suitable database of signals. Our approach is capable of predicting low-severity fault events up to ±50% variance in parameter input, effectively anticipating fault occurrence and thus enabling our prognosis system.
This work is organized as follows: Section 2 presents the main concepts of the diesel engine model that was used to generate the fault signals. This section also describes (i) the mathematical foundation of the dynamic torsional model of the crankshaft, (ii) the model verification, and (iii) the failure simulation model. The dataset and the feature extraction process are described in Sections 3 and 4, respectively. The results and discussions are detailed in Section 5. Finally, concluding remarks and future work ideas are presented in Section 6.

Diesel Engine Model
This section presents the main characteristics of the diesel engine model that was used to create the failure signals. These were later employed to train the regression models that were developed for prognosis by assessing the fault and corresponding severity. In order to simulate the behavior and operation of a diesel engine under normal and fault conditions, the following models were developed:
1. A zero-dimensional thermodynamic model of the process inside the cylinder.
2. A lumped mass model for the torsional vibration of the crankshaft.
3. A fault simulation model.
This work used the specifications and characteristic curves of a 4-stroke marine diesel engine with six cylinders, a turbocharger, and a common rail injection system [29]. Table 1 presents the technical specifications of this engine. The thermodynamic model was validated using the pressure curves inside the cylinder, while the torsional vibration model used the nominal torque curve, both provided by the manufacturer. The sections that follow present a summary of the diesel engine models that were developed during the course of this work.

Thermodynamic Model
The 0D model, based on the design data, can adequately describe the operation of the diesel engine used as a reference [30]. It was obtained with the application of the first law of thermodynamics, considering the control volume presented in Figure 2. This behavior leads to an expression for calculating the temperature of the gases inside the cylinder [29,30], given in Equation (1).
where the terms $d\Psi/d\theta$, $\delta Q_t/d\theta$, $\delta Q_w/d\theta$, and $dV/d\theta$ represent the rates, with respect to the crankshaft angle $\theta$, of (i) the temperature variation of the gases inside the cylinder, (ii) the heat supplied to the system due to the burning of fuel, (iii) the heat transferred by the cylinder walls, and (iv) the volume of the combustion chamber, respectively. In addition, $P$, $m$, and $c_v$ represent the instantaneous pressure of the gases, the mass of the mixture, and the specific heat at constant volume of the mixture, respectively. Furthermore, considering that the gas mixture inside the cylinder behaves as an ideal gas, Equation (2) is obtained.
where $dP/d\theta$ denotes the variation rate of the pressure $P$ (Pa) inside the cylinder with respect to the rotation angle of the crankshaft $\theta$ (degrees), $m$ is the mass of the gas mixture inside the cylinder (kg), $R$ is the constant of the gas mixture inside the cylinder (J/(kg·K)), $\Psi$ is the instantaneous temperature of the gases inside the cylinder (K), and $V$ is the instantaneous volume inside the cylinder (m³).
Equations (1) and (2) summarize the thermodynamic behavior of the diesel engine. However, it is worth mentioning that, depending on the application, the performance of the engine depends on several operational parameters that characterize the power and torque. More details of the presented 0D model can be found in [29].
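As a hedged sketch, a single-zone energy balance consistent with the symbols defined above, together with the differentiated ideal-gas law $PV = mR\Psi$, can be written as follows. Counting the wall heat term as a loss and treating the trapped mass $m$ as constant over the closed part of the cycle are assumptions made here; the exact expressions used by the authors are those of [29].

```latex
\begin{align*}
\frac{d\Psi}{d\theta} &= \frac{1}{m\,c_v}\left(\frac{\delta Q_t}{d\theta} - \frac{\delta Q_w}{d\theta} - P\,\frac{dV}{d\theta}\right), \\
\frac{dP}{d\theta} &= \frac{mR}{V}\,\frac{d\Psi}{d\theta} - \frac{P}{V}\,\frac{dV}{d\theta}.
\end{align*}
```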

Torsional Vibration Model
Considering the engine performance, the lumped mass model properly represents the dynamics of the crankshaft [29,31,32]. In the equivalent model, two dampers, the crankshaft pulley, the gear train, all six cylinders, and the flywheel were considered, totaling 11 degrees of freedom [32,33]. Therefore, by applying Newton's second law, Equation (3) is obtained.
where the parameters $[J]$, $[C]$, and $[K]$ denote the torsional inertia, damping, and stiffness matrices of the analyzed system, respectively, provided by the manufacturer; $\{\theta(t)\}$, $\{\dot{\theta}(t)\}$, and $\{\ddot{\theta}(t)\}$ are the response vectors of angular displacements, velocities, and accelerations of the crankshaft for each considered degree of freedom; and $\{M(t)\}$ is the vector of torques applied to the crankshaft, which depends on the consumer-requested power. Inertial and combustion torques are the two types of torque acting on the crankshaft. Inertial torques are due to the reciprocating movement of the component masses (piston, pin, rings, locks, and the fraction of the connecting rod that has reciprocating movement). They can be obtained with the kinematic analysis of the slider-crank mechanism. On the other hand, the combustion torque is due to fuel burning. It can be calculated using the pressure curve inside the cylinders estimated using the thermodynamic model. Equations (4) and (5) are the loads acting on the crankshaft due to the inertia of the system and the combustion inside the cylinder.
Considering the load distribution, the resulting torque in the crankshaft is given in Equation (6).
where $F_r$, $F_c$, and $M$ are the alternating force, the force due to combustion, and the torque exerted on the crank, respectively; the variables $m_r$, $r$, $\Omega$, and $l$ are the reciprocating mass, the crank radius, the crankshaft rotation in rad/s, and the ratio between the crank radius and the connecting rod length; and $\gamma$ is given by $\tan\gamma =$ . Further details on torsional vibration model development can be found in [32].
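For reference, an equation of motion consistent with the matrices and vectors defined for the lumped mass model is the standard second-order system below. This is a sketch inferred from those definitions rather than a reproduction of the authors' typeset equation; the exact formulation is given in [32].

```latex
\[
[J]\{\ddot{\theta}(t)\} + [C]\{\dot{\theta}(t)\} + [K]\{\theta(t)\} = \{M(t)\}
\]
```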

Model Validation
The simulation results were compared with the experimental pressure and torque curves provided by the manufacturer. Figure 3 compares the pressure curves inside the cylinder at speeds of 2500 rpm and 2100 rpm and shows good agreement between the experimental and simulated data. In addition, the errors of maximum pressure and indicated mean pressure were 0% and 5%, respectively. This allows us to conclude that the model satisfactorily and reliably represents the engine performance. The 0D model was only applied to one cylinder. However, it can be expanded to all cylinders, provided that the ignition order is respected [29]. This behavior is shown in Figure 4, which presents the pressure curves (the excitation torques acting on the crankshaft were calculated). Note that it is possible to observe, in Figure 3, the simulated and experimental (measured) pressure profiles $P_1(\theta)$. Figure 5 shows the instantaneous torque curve due to each cylinder.
[Figure 4: Pressure curves according to the ignition order at 2500 RPM. Axis label: pressure (bar).]
[Figure 5: Axis labels: crankshaft angle (degree), torque (Nm).]
Figure 6 shows the torsional vibration response in the region near the flywheel at 2500 rpm. Note that the torque supplied by the engine is the average value of the torsional vibration function in the time domain. Comparing the simulated torques and the nominal torques provided by the manufacturer, an error of 7% was obtained, which allows us to conclude that the torsional vibration model is valid and reliable for this engine.

Fault Simulation Model
Diesel engine operation depends on the functioning of its subsystems and components [34]. Any fault they may suffer during the engine's useful life affects the operating condition of the engine to a greater or lesser degree (depending on the nature and type of fault). Although the failure model developed in [29,33] is more general in the sense that it also simulates structural failures in the crankshaft, in this paper, only three types of failure of thermodynamic nature are considered: (i) failure of the intake system, which can lead to pressure alteration inside the intake manifold, $\Delta P_i$, due to turbocharger malfunction, corrosion of the intake valve, etc.; (ii) failure of the injection system, which can change the mass of injected fuel, $\Delta m_c$, due to the variation in injection pressure, corrosion, nozzle clogging, etc.; and (iii) loss of compression ratio, $\Delta r$, due to piston corrosion. It should also be noted that $\Delta P_i$ simultaneously affects all cylinders, since it is a global failure. On the other hand, $\Delta m_c$ and $\Delta r$ separately affect each cylinder, i.e., they are local failures. The fault vector is defined as presented in Equation (7).
where $j$ represents the cylinder number and each element of $\Delta f$ is a fault parameter, defined as a percentage of the value under the normal operating condition. Therefore, the pressure curves inside the six cylinders, $P_j$, are represented in Equation (8).
where $OP$ is the set of operational parameters that influence engine performance under a given normal operating condition. Equation (8) enables the torsional vibration for the fault condition to be simulated.
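The way percentage fault parameters act on nominal values can be sketched as follows. This is only an illustration under stated assumptions: the container `FaultVector`, the function `apply_fault`, the dictionary keys, and the treatment of each severity as a reduction relative to the nominal value are hypothetical and are not taken from the authors' simulator.

```python
from dataclasses import dataclass, field

N_CYL = 6  # six-cylinder engine, as in the text


@dataclass
class FaultVector:
    """Hypothetical container: each severity is a percentage of the nominal value."""
    d_intake_pressure: float = 0.0  # global fault, shared by all cylinders
    d_fuel_mass: list = field(default_factory=lambda: [0.0] * N_CYL)          # local
    d_compression_ratio: list = field(default_factory=lambda: [0.0] * N_CYL)  # local


def apply_fault(nominal, fault):
    """Return per-cylinder parameters after applying the percentage reductions.

    Assumption: a severity of s percent scales the nominal value by (1 - s/100).
    """
    cylinders = []
    for j in range(N_CYL):
        cylinders.append({
            "intake_pressure": nominal["intake_pressure"] * (1 - fault.d_intake_pressure / 100),
            "fuel_mass": nominal["fuel_mass"] * (1 - fault.d_fuel_mass[j] / 100),
            "compression_ratio": nominal["compression_ratio"] * (1 - fault.d_compression_ratio[j] / 100),
        })
    return cylinders
```

The global parameter is applied identically to every cylinder, while the local parameters are indexed by cylinder, mirroring the distinction between global and local failures.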

Dataset
The dataset used to train and test regression models must emulate all operating scenarios under study.In our case, the dataset has to include signals containing enough knowledge to distinguish different diesel engine defects as well as the corresponding severity levels.The dataset employed in this work covers four operational conditions that will be described in Sections 3.1.1-3.1.4.
Engine rotation at 2500 RPM was applied in all scenarios because it presented the lowest joint error value (between the simulated and experimental instances) in estimating the maximum and mean pressures of the burning cycle.The joint error was obtained in the validation step of the dynamic/thermodynamic models by taking into consideration the deviation that exists between the values produced by the models and the ones provided by the manufacturer of the equipment.
The dataset contains 3500 distinct signals, which were constructed using the model detailed in Section 2, for four distinct operational conditions: the normal condition, "pressure reduction in the intake manifold", "compression ratio reduction in the cylinders", and "reduction of the amount of fuel injected into the cylinders". The 3500 scenarios consist of 250 instances from the normal class, 250 instances from the "pressure reduction in the intake manifold" class, 1500 instances from the "compression ratio reduction in the cylinders" class, and 1500 instances from the "reduction of the amount of fuel injected into the cylinders" class. This dataset is named Diesel Engine Faults Features Dataset (3500-DEFault) and is publicly available among the Mendeley datasets [35].

Fault Classes
This section presents the scenarios analyzed in this research work. Four different types of scenarios were evaluated, corresponding to the four operational conditions of the dataset. In the normal condition, the 27 input severity parameters chosen from the thermodynamic/dynamic models ($\Delta P_a$, $\Delta P_r$, $\Delta T_p$, $\Delta r_i$, $\Delta m_{a_i}$, $\Delta m_{c_i}$, and $\Delta \theta_{inj_i}$) cover a span between 0 and 0.1% of maximum severity (with uniform probability distribution). This stage aims to emulate the real engine condition without faults, where the machine variables vary within a slight range near optimal functioning.
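A minimal sketch of how one normal-class instance could be drawn is given below, assuming three global parameters and four per-cylinder parameters over six cylinders, which gives the 27 parameters mentioned above. The parameter names and the function `sample_normal_instance` are hypothetical labels for illustration, not identifiers from the authors' code.

```python
import random

GLOBAL_PARAMS = ["dP_a", "dP_r", "dT_p"]              # 3 global parameters
LOCAL_PARAMS = ["dr", "dm_a", "dm_c", "dtheta_inj"]   # 4 parameters per cylinder
N_CYL = 6


def sample_normal_instance(rng, max_severity=0.1):
    """Draw one normal-class severity set, uniform in [0, max_severity] percent."""
    names = GLOBAL_PARAMS + [
        f"{p}_{i}" for p in LOCAL_PARAMS for i in range(1, N_CYL + 1)
    ]
    return {name: rng.uniform(0.0, max_severity) for name in names}


rng = random.Random(0)          # fixed seed only to make the sketch repeatable
instance = sample_normal_instance(rng)
```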

Compression Ratio Reduction in the Cylinders
To generate this type of fault, the use of all cylinders is required. Many instances with severity values $\Delta r_i \in \{1.0, 1.2, 1.4, \ldots, 50\}$ (in %) corresponding to the cylinders $i \in \{1, 2, \ldots, 6\}$ were evaluated. This strategy allowed 250 distinct examples per cylinder to be constructed, thus producing a total of 1500 instances of this class.

Reduction in the Amount of Fuel Injected into the Cylinders
The scenarios of this condition also require all cylinders. Several scenarios with severity values $\Delta m_{c_i} \in \{1.0, 1.2, 1.4, \ldots, 50\}$ (in %) with respect to the cylinders $i \in \{1, 2, \ldots, 6\}$ were considered. A total of 250 different scenarios were addressed for each cylinder, generating a total of 1500 instances of this class.

Additive White Gaussian Noise Process
In 3500-DEFault, different noise levels were applied with a signal-to-noise ratio (in dB) of $L \in \{0, 15, 30, 60\}$, which was obtained using additive white Gaussian noise (AWGN). Noise addition was implemented to investigate its influence on the regression performance and to emulate a measured signal. The clean vector of engine signals $V_s$ is defined in Equation (9).
Figure 7 shows the variables $P(\theta)$, $M(t)$, and $T(\theta)$ with different levels $L$ of AWGN, labeled as $\tilde{V}_s = [\tilde{M}(t, L), \tilde{P}(\theta, L), \tilde{T}(\theta, L)]$. Note that $\tilde{V}_s$ is a vector with the original signals corrupted by noise. The addition of white noise for each element $V_{s_i}$ of $V_s$ is described in Equation (10).
The operations carried out in Equation (10) and Figure 7 can be summarized as presented in Equation (11).
where the white noise ν can be characterized as a random process of zero mean and variance defined by the chosen SNR level L. The SNR is defined as shown in Equation (12).
where $E[\cdot]$ is the expected value and the variable $A_V$ is the maximum value for each element $V_{s_i}$ of vector $V_s$.
[Figure 7: Signals with added white Gaussian noise $\nu$ at $L$ = 15 dB and $L$ = 0 dB signal-to-noise ratio (SNR). Axis label: torque (kNm).]
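The noise-addition step can be sketched as below. The sketch assumes a peak-based SNR, in which the noise variance is $A_V^2 / 10^{L/10}$ with $A_V$ the maximum absolute value of the signal; this reading of the SNR definition, the function name `add_awgn`, and the sine test signal are assumptions for illustration only.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise to a signal for a target SNR in dB.

    Assumption: SNR is measured against the peak value A_V of the signal,
    so noise_variance = A_V**2 / 10**(snr_db / 10).
    """
    peak = max(abs(v) for v in signal)               # A_V
    noise_var = peak ** 2 / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_var)
    return [v + rng.gauss(0.0, sigma) for v in signal]


rng = random.Random(42)                              # fixed seed for repeatability
clean = [math.sin(2 * math.pi * k / 100) for k in range(1000)]  # stand-in signal
noisy = {L: add_awgn(clean, L, rng) for L in (0, 15, 30, 60)}   # the four SNR levels
```

With this convention, 0 dB gives noise whose standard deviation equals the signal peak, while 60 dB leaves the signal almost untouched.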

Data Normalization
In statistical studies, normalization or feature scaling is widely used to standardize data and thus optimize data processing. In machine learning, normalization plays a significant role when attributes can hinder data processing, such as features with dynamic ranges of different orders of magnitude, measured on different scales, that do not equally contribute to model fitting. Normalization is a way to standardize and minimize the problems arising from these dispersions. In addition, by not analyzing data considered inconsistent, data processing is also more efficient [36]. This work employs minimum-maximum normalization, which uses the minimum and maximum values of each feature to map it to a common range, as is illustrated in Equation (13).
where $X_n$ is the normalized $X$ array, $\min(X)$ is the lowest value of the array $X$, and $\max(X)$ is the highest value of the array $X$. Minimum-maximum normalization sets all dynamic data ranges to a scale from zero to one and decreases the overall standard deviation. However, this normalization may exclude outliers, which can bring important information to the analysis of the dataset [37].
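As a minimal sketch, minimum-maximum normalization of a single feature column, $X_n = (X - \min(X)) / (\max(X) - \min(X))$, can be written as follows. The function name `min_max_normalize` and the handling of a constant column (mapped to zeros) are choices made for this illustration.

```python
def min_max_normalize(column):
    """Scale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: the formula is undefined, so return zeros by convention.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


scaled = min_max_normalize([2.0, 4.0, 6.0])
```

In practice the minimum and maximum would be computed on the training portion only and then reused for the test portion, so that no information from the test set leaks into the scaling.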

Partitioning the Dataset
The dataset was divided into two parts, namely, training and test sets. The training samples were used to teach the regressor about the data pattern, whereas the test samples were used to evaluate the chosen regressor model. The training set can be subdivided into training and validation subsets, with the latter being used to fine-tune the algorithm model.
When the dataset used is sufficiently large, the hold-out technique [38] is used to separate 70% of the instances for training, 10% for validation, and 20% for testing. This technique might generate overfitting in the classifier, especially when the dataset is not large enough to properly train the classifier. In these cases, cross-validation techniques, such as K-fold, can be applied. K-fold makes the classification algorithms more robust to overfitting, producing classification models with greater generalization capacity [39–41].
The K-fold procedure randomly divides the dataset into K blocks of approximately equal size. Subsequently, it uses K − 1 blocks to perform the training of the model and the remaining block to perform the test. This process is then repeated so that each block is used once as test data. The performance of this method is given by the mean performance over the K test blocks together with the associated standard deviation. The values of K usually chosen are K = 5 or K = 10 [41]. In this work, five K-fold blocks were used, as is depicted in Figure 8, where the gray blocks correspond to the test sets, and the white blocks are the training sets.
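The partitioning described above can be sketched with the standard library alone. The function name `k_fold_indices` and the fixed seed are assumptions for this illustration; any equivalent shuffling and splitting scheme would serve.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs over a shuffled index list."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k blocks of near-equal size
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(3500, k=5)   # 3500 instances, five blocks as in this work
```

Each instance appears in exactly one test block, and each split trains on the other four blocks.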

Dataset Regression
Regression is a statistical method for finding the relationship among variables [42]. Regression algorithms are employed to predict the outcome based on the relationship among variables obtained from the dataset [43]. The regression algorithm output is a real number that depends on a hypothesis function defined according to the problem [44]. The hypothesis function is defined with hidden parameters [45]. These hidden parameters are optimized for the training set input in the training phase [46].

Artificial Neural Networks
Artificial neural networks (ANNs) are computational structures with interconnected nodes that mathematically model, in a simplified way, the basic principle of cognitive processing of neurons in the human brain [47–49]. Using algorithms and mathematical models, ANNs can recognize hidden patterns and estimate non-linear relationships among elements in a database. Classification and estimation are examples of tasks that a neural network can perform after a training phase. Neural network accuracy can be continually improved by training and feeding it with new data [47].
This paper used the sigmoidal activation function, which describes a non-linear output for each neuron. Consequently, there is non-linearity between the two layers, which is ideal for recognizing non-linear failure patterns. The sigmoidal activation function was selected because it presented good results in related research, as described in [50,51]. Additionally, this paper uses the multi-layer feedforward network called multilayer perceptron (MLP) [52]. MLP was chosen due to its hidden layer, which can be used to recognize and classify problems that are not linearly separable. A linear activation function is used in the output layer, and the sigmoidal activation function is used in the hidden layer.
The trained network is tested by obtaining a prediction for each test point [47]. We compute the error with respect to the test dataset using quadratic loss as in the training phase.
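The forward pass of such a network, with one sigmoidal hidden layer and a linear output neuron, can be sketched as follows, together with the quadratic loss. The function names, the weight layout, and the single-output form are assumptions for illustration; the layer sizes and training procedure used by the authors are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_predict(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer with sigmoid activation, linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def quadratic_loss(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because the output neuron is linear, the prediction is an unbounded real number, which suits a severity value expressed as a percentage.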

Random Forest
Random forest (RF) is an ensemble supervised machine learning algorithm that can be used for classification and regression tasks [53]. This algorithm is formed by decision trees, which are simple predictor elements. An ensemble classifier is often more powerful than the individual predictors that form it [54]. The choice of random forest was due to its excellent predictive ability not only in diesel engine-related research, as can be seen in [55,56], but also in fault detection [26,57]. This paper used this algorithm in the regression task.
Each model of the set is used to create a prediction for a new sample. The average of the results of the individual trees gives the output of the regressor. Since the algorithm randomly chooses the predictors in each division, the correlation among trees decreases. Combining strong, complex predictors of low bias in this way yields an RF algorithm with low variance, which decreases the error rates. Each predictor is independently chosen. Because of this independence, the RF has an effective noise response. RF is computationally more efficient than bagging, since in constructing the algorithm, it only needs to analyze a part of the original predictors in each division. However, RF needs to use many trees to form the regressor set. In addition, RF also presents a high level of parallelization, which allows high computational efficiency to be achieved [53].
Training and test errors tend to level off after some trees have been fitted.The difference between the bagging algorithm and the RF algorithm is that the latter uses a modified tree learning algorithm that selects, at each candidate split within the learning process, a random subset of the features [58].
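The three ingredients described above (bootstrap sampling, a random subset of features at each split, and averaging of tree outputs) can be illustrated with a deliberately small sketch. For brevity it uses depth-one trees (stumps), so each tree has a single split; this simplification, the function names, and the default settings are assumptions for illustration and do not correspond to the configuration used in this work.

```python
import random


def fit_stump(X, y, feature_ids):
    """Best single-split regression stump over the given candidate features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, ml, mr)
    if best is None:
        # No valid split on the candidate features: predict the sample mean.
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]


def fit_forest(X, y, n_trees=50, n_candidates=1, seed=0):
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]          # bootstrap sample
        feats = rng.sample(range(n_features), n_candidates)  # random feature subset
        forest.append(fit_stump([X[i] for i in rows], [y[i] for i in rows], feats))
    return forest


def predict(forest, x):
    """Average the outputs of all stumps for one input vector."""
    outputs = [ml if x[f] <= t else mr for f, t, ml, mr in forest]
    return sum(outputs) / len(outputs)
```

Setting `n_candidates` equal to the number of features removes the random feature selection and reduces the sketch to plain bagging, which is the difference between the two algorithms noted above.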

Regression Metrics
Let $P$ be a discrete variable. The point-to-point correspondence between the numerical solution and the observational measures of the same variable provides a quantitative test to measure the ability of the model (or dexterity) to reproduce or estimate observed data [59]. Let $P_{is}$ and $P_{io}$ be the severity of the simulated failure and that of the failure observed at the same point at time $i$ in a numerical domain with $N$ samples, respectively. The fundamental quantity for the study of errors is the difference $d_i$ between the predicted or simulated values of input variable $x$ and output variable $y$ at point $i$ ($i = 1, 2, 3, \ldots, N$) at time $t$, where $P_{is} = P(x, y, t)$, and the measured or observed values of the same variables at the same points $x$ and $y$ at time $t$, where $P_{io} = O(x, y, t)$, which is simply expressed in Equation (14).
Essentially, $d_i = 0$ indicates an exact simulation for that point $i$, while $d_i \gg 0$ or $d_i \ll 0$ characterizes non-exact simulations. The further away from zero the value of $d_i$ is, the more inaccurate the simulation is. That is, $d_i$ is equivalent to the error $e_i$ of a given analysis [60]. Although $\sum d_i$ provides an idea of the quality of the simulation for a given variable, it does not explain the particular sources or characteristics of the magnitudes of the errors. Note that from the basic quantity $d_i$, it is possible to derive errors that reflect different components of the total error [60].
The metric adopted in this work is the root mean square error (RMSE), which is derived from $d_i$ and points to the trend or bias. The bias measures the tendency of the model to overestimate or underestimate the severity of the failure relative to what was observed. This trend can be approximated as is described in Equation (15).
RMSE is used to express the accuracy of the numerical results, with the advantage that it also presents error values in the same dimension as the analyzed variable [59]. Let $h \in \mathcal{H}$ be a candidate hypothesis to approximate the unknown function $f: \mathcal{X} \to \mathcal{Y}$, and let $D = [(x(1), y(1)), \ldots, (x(N), y(N))]$ denote the labeled sample. In principle, considering $D \subset \{\mathcal{X}, \mathcal{Y}\} \mid D \in \mathbb{R}$, the RMSE is presented in Equation (16).
Since hypothesis $h$ can be seen as a certain regressor configuration, motivated by training with a specific portion of the dataset, where $h[x(n)]$ is the adjustment of the regression variable, for each of the K successive K-fold implementations of the cross-validation step, the average value of RMSE ($\mu_{RMSE}$) after K implementations is described by Equation (17), and the associated standard deviation of the RMSE ($\sigma_{RMSE}$) is detailed in Equation (18).
where $j$ represents the implementations of the cross-validation set. $\mu_{RMSE} \pm \sigma_{RMSE}$ was adopted as the metric for evaluating the regressors to quantify the performance of adjusting the regression curves of the severity variable, using first- and second-order statistics.
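A minimal sketch of the metric is given below: the RMSE of one fold, followed by the mean and standard deviation over the K folds. Dividing by K (rather than K − 1) in the standard deviation is an assumption of this sketch, as are the function names.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted severities."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def summarize_folds(fold_rmses):
    """Return (mean, standard deviation) of the per-fold RMSE values."""
    k = len(fold_rmses)
    mu = sum(fold_rmses) / k
    sigma = math.sqrt(sum((r - mu) ** 2 for r in fold_rmses) / k)  # assumes 1/K
    return mu, sigma
```

For example, per-fold values of 1.0 and 3.0 would be reported as 2.0 ± 1.0 under this convention.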

Pearson Correlation Coefficient
In order to quantify the performance of the regression curves, the Pearson correlation coefficient ρ was used. This metric measures the degree of linear correlation between two quantitative variables. It is a dimensionless index with values between −1 and 1 inclusive that reflects the intensity of a linear relationship between two sets of data. ρ = 1 describes a perfect positive correlation between the two variables. ρ = −1 characterizes a perfect negative correlation between the two variables, i.e., if one increases, the other one always decreases. ρ = 0 signifies that the two variables do not linearly depend on each other. However, a non-linear dependency may still exist; thus, a result of ρ = 0 must be investigated by other means.
Let $X = \{x_1, x_2, \ldots, x_n\}$ be the values of a set of points for a regression, with i = 1, . . ., n, and let $Y = \{y_1, y_2, \ldots, y_n\}$ be a set of points representing a perfect regression. The Pearson correlation coefficient is defined as shown in Equation (19).
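The behavior of ρ in the three cases discussed above can be illustrated with a short sketch. It uses the standard sample definition of the Pearson coefficient (covariance divided by the product of the standard deviations); the function name `pearson` and the toy data are hypothetical and serve only as an illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / norm


# Toy data (hypothetical): predictions close to a perfect regression line
predicted = [1.0, 2.1, 2.9, 4.2]
perfect = [1.0, 2.0, 3.0, 4.0]
rho = pearson(predicted, perfect)

# A purely quadratic dependency gives rho = 0 although y depends on x
rho_nonlinear = pearson([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
```

The second call shows why ρ = 0 alone does not rule out a dependency: the relation y = x² is deterministic, yet its linear correlation vanishes on a symmetric interval.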

Feature Extraction
Feature extraction was performed by computing the maximum and mean pressure values from the six cylinder pressure signals, $s_{p_1}(n)$, $s_{p_2}(n)$, $s_{p_3}(n)$, $s_{p_4}(n)$, $s_{p_5}(n)$, and $s_{p_6}(n)$, and acquiring spectral details from the torsional vibration signal, $s_v(n)$ [29]. The distinguishing fault attributes are presented in Sections 4.1.1-4.1.3.

Estimation of Maximum Pressure Inside the Cylinders
These features correspond to the maximum value of each discrete pressure curve related to each cylinder, $s_{p_1}(n)$, $s_{p_2}(n)$, $s_{p_3}(n)$, $s_{p_4}(n)$, $s_{p_5}(n)$, and $s_{p_6}(n)$, yielding $M_{p_1}$, $M_{p_2}$, $M_{p_3}$, $M_{p_4}$, $M_{p_5}$, and $M_{p_6}$, respectively. These features are represented by Equation (20).

Estimation of Mean Pressure Inside the Cylinders
These features correspond to the first-order expected values (means) of the separate discrete pressure curves related to each cylinder, $s_{p_1}(n)$, $s_{p_2}(n)$, $s_{p_3}(n)$, $s_{p_4}(n)$, $s_{p_5}(n)$, and $s_{p_6}(n)$, yielding $\mu_{p_1}$, $\mu_{p_2}$, $\mu_{p_3}$, $\mu_{p_4}$, $\mu_{p_5}$, and $\mu_{p_6}$, respectively. These features are represented by Equation (21).

$$\mu_{p_i} = E[s_{p_i}(n)] = \frac{1}{N}\sum_{n=1}^{N} s_{p_i}(n), \quad i = 1, \ldots, 6 \tag{21}$$
where E[·] is the expectation operator, which yields the mean µ of each cylinder pressure curve, and N denotes the number of samples of $s_{p_i}(n)$.
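The maximum and mean pressure features can be sketched as follows. This is a minimal illustration, assuming each cylinder pressure curve is available as a list of samples; the function name `pressure_features` and the toy curves are hypothetical.

```python
def pressure_features(pressure_signals):
    """Return (max_values, mean_values) with one entry per cylinder.

    pressure_signals: list of pressure curves, one list of samples per cylinder.
    """
    max_values = [max(s) for s in pressure_signals]             # M_p_i
    mean_values = [sum(s) / len(s) for s in pressure_signals]   # mu_p_i
    return max_values, mean_values


# Hypothetical toy curves (arbitrary units), one per cylinder
signals = [[1.0, 4.0, 2.0 + i] for i in range(6)]
M_p, mu_p = pressure_features(signals)
```

Each cylinder contributes two scalars, so the six cylinders yield twelve pressure attributes in total.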
where F(k) represents the harmonic frequency of the torsional spectrum $S_v(k)$, $F_s$ denotes the sampling frequency, k is the frequency bin related to a frequency (Hz), P(k) represents the phase (degrees) of the torsional spectrum $S_v(k)$, arg[·] denotes the complex argument of the spectrum, and A(k) denotes the amplitude (N·m) of the torsional spectrum $S_v(k)$.
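A minimal sketch of how the three spectral quantities can be obtained from a sampled signal is given below. It assumes the usual discrete Fourier transform relations, F(k) = k F_s / N and P(k) = arg[S_v(k)]; the single-sided amplitude scaling 2|S_v(k)|/N, the direct DFT (instead of an FFT routine), and the function name `spectral_features` are assumptions made for illustration and are not taken from the original work.

```python
import cmath
import math


def spectral_features(signal, fs, n_bins):
    """Frequency (Hz), amplitude, and phase (degrees) of the first n_bins non-DC bins."""
    n = len(signal)
    features = []
    for k in range(1, n_bins + 1):
        # Direct DFT of bin k; an FFT routine would be preferred for long signals.
        s_k = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        freq = k * fs / n                        # F(k), in Hz
        amp = 2.0 * abs(s_k) / n                 # A(k), single-sided scaling (assumed)
        phase = math.degrees(cmath.phase(s_k))   # P(k) = arg[S_v(k)], in degrees
        features.append((freq, amp, phase))
    return features


# Toy signal (hypothetical): a 2 Hz cosine of amplitude 3 sampled at 64 Hz for 1 s
signal = [3.0 * math.cos(2 * math.pi * 2 * i / 64) for i in range(64)]
features = spectral_features(signal, fs=64.0, n_bins=4)
freq_2, amp_2, phase_2 = features[1]  # bin k = 2
```

For the toy cosine, the bin at k = 2 recovers the frequency, amplitude, and zero phase of the input, which is a quick sanity check of the scaling convention.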

Feature Vector
Compared with the technique described in [27,28], the methodology proposed in this work differs in that feature extraction is applied to the torsional vibration signal and the maximum and mean pressure values are included in the feature vector. The feature vector is expressed in Equation (26), which concatenates the three spectral variables previously described, and is composed, in total, of 84 attributes. The diesel engine employed was a six-cylinder four-stroke engine (intake, compression, combustion, and exhaust), which resulted in 24 half-orders that needed to be described in order to fully characterize each stroke of each cylinder. More details can be found in [29].
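The dimension of the feature vector can be checked with a short sketch. It assumes that each of the three spectral variables contributes one value per half-order (3 × 24 = 72) and that the maximum and mean pressures contribute one value per cylinder each (2 × 6 = 12), which is consistent with the stated total of 84 attributes; the ordering of the groups and the function name `build_feature_vector` are hypothetical.

```python
N_CYLINDERS = 6     # six-cylinder engine
N_HALF_ORDERS = 24  # half-orders retained from the torsional spectrum


def build_feature_vector(freqs, phases, amps, max_pressures, mean_pressures):
    """Concatenate spectral and pressure features into one flat list."""
    groups = [
        ("freqs", freqs, N_HALF_ORDERS),
        ("phases", phases, N_HALF_ORDERS),
        ("amps", amps, N_HALF_ORDERS),
        ("max_pressures", max_pressures, N_CYLINDERS),
        ("mean_pressures", mean_pressures, N_CYLINDERS),
    ]
    vector = []
    for name, values, expected in groups:
        if len(values) != expected:
            raise ValueError(f"{name}: expected {expected} values, got {len(values)}")
        vector.extend(values)
    return vector


# Placeholder values, used only to check the resulting dimension
vector = build_feature_vector(
    [0.0] * 24, [0.0] * 24, [0.0] * 24, [0.0] * 6, [0.0] * 6
)
```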

Results and Discussions
In the fault regression experiment, a procedure similar to the one presented in Pestana [26,61] was adopted. In this procedure, the ability of a system to identify fault severity is evaluated by adding the maximum and average pressure, and spectral measures for each possible fault. In this work, ANN and RF regressors were trained to identify fault severity. For both regressors, the feature vector was used as the input. The outputs of the regressors were the fault severity variables, one for each fault severity. The number of trees of the RF regressor was obtained empirically as in [26]. The ANN was implemented as an MLP. The ANN input layer had the same dimension as the feature input vector. The ANN had one hidden layer, whose number of neurons was empirically obtained by tuning the hyperparameters, as discussed in Section 5.1. Finally, the ANN output layer had a number of neurons equal to the number of severity variables to be estimated.
For the RF regressor, the 3500-DEFault dataset was divided into two disjoint sets: approximately 80% of the signals were used for training, while 20% were employed for testing. For the ANN regressor, on the other hand, the 3500-DEFault dataset was divided into three disjoint sets, and approximately 70%, 10%, and 20% of the signals were utilized for training, validation, and testing, respectively. Each set must represent the fault severity intensity data with maximum variability. The RF learning (training) process was performed by applying the bootstrap aggregating (bagging) algorithm. The ANN learning process was performed by applying the scaled conjugate gradient backpropagation algorithm to reduce the training time. The validation set was employed to avoid overtraining, which results in a loss of generalization capability [27].
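The two partitioning schemes can be sketched as follows. This is a minimal illustration based on a seeded random shuffle; the function name `split_indices` is hypothetical, and a plain shuffle does not by itself guarantee that every subset covers the full range of severities, so a stratified procedure may be needed in practice.

```python
import random


def split_indices(n_samples, fractions, seed=0):
    """Shuffle range(n_samples) and cut it into disjoint subsets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    subsets, start = [], 0
    for i, frac in enumerate(fractions):
        # The last subset takes the remainder so that every sample is used once.
        is_last = i == len(fractions) - 1
        stop = n_samples if is_last else start + round(frac * n_samples)
        subsets.append(indices[start:stop])
        start = stop
    return subsets


# RF: training / testing; ANN: training / validation / testing
rf_train, rf_test = split_indices(3500, [0.8, 0.2])
ann_train, ann_val, ann_test = split_indices(3500, [0.7, 0.1, 0.2])
```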
To avoid biased performance estimates in regression, the K-fold cross-validation technique was applied in all regression tests. To do so, the 3500-DEFault dataset was divided into K = 5 folds so that the test subset rotated across the folds. Each split maintained the proportions of 80% and 20% for training and testing.
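A minimal sketch of the rotation of the test fold and of the resulting mean and standard deviation of the RMSE is shown below. The helper names, the contiguous (unshuffled) folds, the population form of the standard deviation, and the stand-in regressor `fit_mean` are assumptions made for illustration; they do not reproduce the regressors used in this work.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx); each contiguous fold is the test set once."""
    fold_size = n_samples // k
    for j in range(k):
        start = j * fold_size
        stop = n_samples if j == k - 1 else start + fold_size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx


def cross_validated_rmse(x, y, fit, k=5):
    """Return (mean, std) of the test RMSE over k folds; fit() returns a predictor."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(x), k):
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [model(x[i]) for i in test_idx]
        scores.append(rmse([y[i] for i in test_idx], preds))
    return statistics.mean(scores), statistics.pstdev(scores)


def fit_mean(x_train, y_train):
    """Hypothetical stand-in regressor: always predicts the training mean."""
    mean_y = sum(y_train) / len(y_train)
    return lambda _x: mean_y


x = list(range(20))
y = [2.0 * v for v in x]
mu_rmse, sigma_rmse = cross_validated_rmse(x, y, fit_mean, k=5)
```

With 3500 samples and K = 5, each fold holds 700 test samples and leaves 2800 for training, which matches the 80%/20% proportions stated above.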
The tables and plots in this section are related to the fault regression experiment and present the regression RMSE performance on the test data from the 3500-DEFault dataset. In these tables and plots, $X_i$ represents the observed test-set elements, and $\hat{X}_i$ is the vector of predicted test-set elements. The total RMSE regression performance is represented by W ± σ, where W is the total accuracy and σ is its standard deviation during K-fold validation.
Empirical tests were performed to determine whether the execution time of each test was reduced for K = 5 [62]. By using K = 5 instead of the usual K = 10, the total execution time (ToE) of each regression, which depends on the method and the noise levels involved, was significantly reduced. The ToE was reduced due to the size of the dataset and the parameters that were optimized in each regressor. In the simplest optimization regression (i.e., ANN), when K = 5, 13,000 rounds (5 folds × 13 operating variables × 4 noise levels × 50 neuron configurations in the hidden layer) were performed. The computational ToE was measured in seconds, and the regression was performed on a personal computer with an Intel Core i5-2500 CPU and 8 GB of RAM, without a GPU, using two physical cores for parallel processing.

Regressor Hyperparameter Tuning
Hyperparameter tuning is the step dedicated to adjusting the hyperparameters of a regressor in order to improve its training results and, consequently, its testing performance. In the case of the ANN, the hyperparameter to be tuned is the number of neurons, while in RF, the tuning variable is the number of trees. The optimization is implemented during training by varying each hyperparameter over a predefined set of values and monitoring where the RMSE (i.e., its mean and standard deviation) reaches a minimum. For each hyperparameter simulation, K = 5 is used in the cross-validation step in order to limit bias in the results and to extract the first- and second-order statistics from them. The tuning curves are evaluated using the average value of the RMSE, $\mu_{RMSE}$, and the variability of the RMSE, $\sigma_{RMSE}$, for each simulated hyperparameter during training.
Figure 10 presents the tuning curve plot for each hyperparameter. In these plots, $\mu_{RMSE}$ and $\sigma_{RMSE}$ are shown with respect to the number of neurons, in the case of the ANN, or the number of trees, in the case of RF. In Figure 10, the x-axis shows the range of values assigned to the simulations of the regressor hyperparameters.
For the ANN, the number of neurons of the hidden layer in the hyperparameter simulation was tested over the set H = {1, 2, . . ., 1000}. For RF, the number of simulated trees was tested over the set B = {1, 2, . . ., 1000}. After analyzing each tuning curve in Figure 10, 120 neurons were selected for the ANN hidden layer and 100 trees for the RF regressor. Table 2 summarizes these values.
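One way to turn a tuning curve into a choice of hyperparameter is sketched below. The selection rule (minimizing the sum of the mean and the standard deviation of the per-fold RMSE) is an assumption chosen to illustrate how first- and second-order statistics can be combined; it is not stated in this work, where the values were selected by inspecting the curves. The function name and the per-fold scores are hypothetical.

```python
import statistics


def select_hyperparameter(cv_scores):
    """Pick the value minimizing mean + std of the per-fold RMSE (assumed rule).

    cv_scores maps a hyperparameter value to its list of per-fold RMSE values.
    """
    def score(value):
        folds = cv_scores[value]
        return statistics.mean(folds) + statistics.pstdev(folds)

    best = min(sorted(cv_scores), key=score)
    folds = cv_scores[best]
    return best, statistics.mean(folds), statistics.pstdev(folds)


# Hypothetical per-fold RMSE values (%), for illustration only
scores = {
    10: [0.9, 1.1, 1.0, 1.2, 0.8],
    100: [0.5, 0.5, 0.5, 0.5, 0.5],
    500: [0.3, 0.7, 0.4, 0.6, 0.4],
}
best, mu, sigma = select_hyperparameter(scores)
```

In this toy case the setting with 500 has a slightly lower mean RMSE than the setting with 100, but its larger fold-to-fold variability makes the more stable setting preferable under the assumed rule.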
One can notice in Figure 10 that the tuning curves tend either to reach a minimum after a transient or to stabilize. Moreover, there is an intrinsic decrease in the standard deviation of the validation set. This decrease emphasizes the importance of using second-order statistics in addition to first-order statistics when choosing these parameters, since smaller variability in the training error leads to more stable results.

Regression Tests
After adjusting the hyperparameters of the regressors in the tuning step (described in Section 5.1), regression tests were implemented according to the studies described in [20,21]. The hyperparameters used in all regression tests are the ones shown in Table 2. To evaluate the influence of noise on regression performance, different noise levels were applied to the original signals of the 3500-DEFault dataset. Tables 3 and 4, and Figures 11 and 12 show the results when additive white Gaussian noise (AWGN) was applied at SNR levels of 60, 30, 15, and 0 dB. In the tests, all scenarios of the 3500-DEFault dataset were considered; thus, all types of faults were included. Different fault classes with varying levels of severity were used to investigate the performance of the regressor. The regressor output layer must be able to predict different variables. In addition to obtaining the results with the analysis of the degree of severity of each variable (failure), the output variable precision was also considered. The tests were defined by the severity level that had to be predicted by the regressor. The severity levels considered in the tests belonged to S ∈ {1.0, 1.1, 1.2, . . ., 50}%. Figures 11 and 12 show the prediction curves of the tests with SNRs equal to 60 and 0 dB, respectively. Tables 3 and 4 present a summary of the RMSE values considering regressor and AWGN level L.
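The application of AWGN at a prescribed SNR can be sketched as follows. The sketch assumes that the SNR is defined from the mean power of the clean signal, so that the noise variance equals the signal power divided by 10^(SNR/10); the function name `add_awgn` and the test signal are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR (dB) computed from the clean signal and its noisy version."""
    p_signal = sum(c * c for c in clean) / len(clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(p_signal / p_noise)


# Hypothetical test signal: a sampled sinusoid
clean = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(20000)]
noisy_15 = add_awgn(clean, 15.0, random.Random(0))
noisy_0 = add_awgn(clean, 0.0, random.Random(1))
```

At 0 dB the noise carries as much power as the signal itself, which explains why the dispersion of the predictions grows markedly at that level.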
By analyzing Figure 11 and Tables 3 and 4, it is possible to see good performance of the regressors for a high SNR, with an RMSE of less than 1%. The low RMSE values are a consequence of the low dispersion of the feature vector elements and the low noise level present in the signals coming from the machine. In Figure 12, the RMSE increases in the presence of a low SNR, and one can notice a significant dispersion of the predictions when compared with the ideal prediction curve. By examining Tables 3 and 4, it is possible to observe that the highest dispersion occurs for $\Delta m_{c_j}$. Tables 3 and 4 show the prediction performance in each failure class, where three failure classes were studied: $\Delta P_r$, $\Delta r_j$, and $\Delta m_{c_j}$.
The $\Delta P_r$ fault, which is global in nature and affects the entire engine combustion dynamics, was easier to predict because it had strong coupling with the operating parameters used in this work. The $\Delta r_j$ and $\Delta m_{c_j}$ faults, which are local faults that only affect the combustion dynamics of one cylinder, were more difficult to predict because they were only coupled with the variables of the cylinder in question. The regression performance for the failures referring to the $\Delta m_{c_j}$ variables was lower than for the others, indicating that this failure class was the most difficult to predict. One possibility for improving the $\Delta m_{c_j}$ prediction performance would be to add more discriminating features and perform specific adjustments in the regressor.
The Pearson correlation coefficient, displayed in Figures 11 and 12, shows a moderate correlation between the vector of points predicted by the regressor and the perfect prediction curve. For signals with 60 dB AWGN, the prediction was moderate, whilst for signals with 0 dB AWGN, the prediction was not as accurate. The latter was caused by the dispersion of the prediction points due to the high level of noise in the tests. As expected, the ANN and RF regressors presented stable results with low ToE due to the noise level and the defined adjustment. The obtained results are consistent with [21]. Overall, the RF was the most stable regressor across the four addressed noise levels. The RF presented the lowest RMSE amongst the analyses considered. In addition, the RF also presented the lowest dispersion among the cross-validation procedures.

Conclusions and Future Work
A quantitative framework for severity analysis composed of signal processing and regression techniques based on machine learning is proposed in this work. Compared with traditional classification methods reported in other studies, the proposed framework does not solely aim at identifying patterns of diesel engine failure. Instead, it assesses the diesel engine health conditions in the deterioration process by evaluating the absolute value of the failure severity.
The proposed scheme is crucial, since in practice, a diesel engine undergoes a deterioration process over its service life before its functional failure occurs. The assessment was accomplished by extracting representative statistical parameters from the signals measured inside the cylinders and from the frequency response of the torsional vibration signal of the engine flywheel.
The evaluation of the fault severity level was performed using machine learning techniques that took into account the extracted features. This approach presents an important advantage over traditional numerical methods: it is able to predict the severity in a shorter time. This time advantage is a consequence of employing a trained regressor. In traditional numerical methods, on the other hand, there is always a prior need for convergence to estimate the severity level.
The hyperparameters of the regressors had to be fine-tuned to obtain the best regression performance. Furthermore, the regression analysis with four different levels of white noise was fundamental, as it allowed the robustness of each regressor to be assessed, given the intrinsic increase in the noise level of the measurement. Therefore, it was possible to define the best regressor for signals with a low SNR. The RF was the most stable regressor across the four addressed noise levels and presented the lowest RMSE and the lowest dispersion for the cross-validation procedure.
In this work, the only rotation frequency considered was 2500 RPM. Future work will further expand the dataset, covering other rotation frequencies. Furthermore, new features will be incorporated to improve the regression results of fuel failure under high-noise conditions. Other regressors based on kernel machines will also be studied, such as support vector machines for regression (SVM) and Gaussian process (GP) regression.

Figure 1. The 12-block diagram of the proposed system.

Figure 2. Control volume of the thermodynamic analysis.

Figure 3. Validation of the developed thermodynamic model for different rotations: (a) 2100 and (b) 2500 RPM.

Figure 5. Torque curves according to the ignition order at 2500 RPM.

Figure 6. Torsional vibration for normal signal of crankshaft at 2500 RPM.

• Normal operation;
• Pressure reduction in the intake manifold;
• Compression ratio reduction in the cylinders;
• Reduction in the amount of fuel injected into the cylinders.

3.1.1. Normal
This case represents engine operation without faults. A total of 250 different signals were generated by simulating using the ∆ f (n)

Figure 7. Process of applying additive noise: (a) M(t) with 15 dB, (b) M(t) with 0 dB, (c) P(θ) with 15 dB, (d) P(θ) with 0 dB, (e) T(θ) with 15 dB, and (f) T(θ) with 0 dB. Note that the black line represents the original signal without noise, while the gray line denotes the signal with noise. The new variable with AWGN is $\tilde{V}_{s_i}(\cdot) = V_{s_i}(\cdot) + \nu$, where ν is white Gaussian noise at a signal-to-noise ratio (SNR) of L = [15, 0] dB.

Figure 8. K-fold blocks. Gray blocks correspond to the test sets, and white blocks, to the training sets.

Figure 9. Box plots of the 3500-DEFault dataset for feature subsets F(k), P(k), $\mu_{p_i}$, and $M_{p_i}$, respectively, for different AWGN levels: (a) 60 dB, (b) 30 dB, (c) 15 dB, and (d) 0 dB. The plot shows the median; 25% quartile; 75% quartile; and the lower and upper ranges (whiskers), which are the min. and max. of each distribution, respectively.

Figure 10. Error plots of several tuning curves with AWGN SNR level L = 60 dB in the training step: (a,d) $\Delta P_r$, and ANN and RF regressors, respectively; (b,e) $\Delta r_1$, and ANN and RF regressors, respectively; and (c,f) $\Delta m_{c_1}$, and ANN and RF regressors, respectively. In the above graphs, the dotted lines represent $\mu_{RMSE}$, and the whiskers represent $\sigma_{RMSE}$ after 5-fold cross-validation.

Figure 11. Plots of several regression tests with AWGN SNR level L = 60 dB: (a,d) $\Delta P_r$, and ANN and RF regressors, respectively; (b,e) $\Delta r_1$, and ANN and RF regressors, respectively; and (c,f) $\Delta m_{c_1}$, and ANN and RF regressors, respectively. In the above graphs, the black lines represent the perfect predictions, and the black dots represent the true (x-axis) vs. predicted (y-axis) elements. All graphs have an interval between zero and fifty (x- and y-axes), that is, with the same amplitudes as the severity values of the dataset.

Figure 12. Plots of several regression tests with AWGN SNR level L = 0 dB: (a,d) $\Delta P_r$, and ANN and RF regressors, respectively; (b,e) $\Delta r_1$, and ANN and RF regressors, respectively; and (c,f) $\Delta m_{c_1}$, and ANN and RF regressors, respectively. In the above graphs, the black lines represent the perfect predictions, and the black dots represent the true (x-axis) vs. predicted (y-axis) elements. All graphs have an interval between zero and fifty (x- and y-axes), that is, with the same amplitudes as the severity values of the dataset.

Table 2. Selected values of the hyperparameters of each regressor.

Table 3. Summary of RMSE of each failure parameter (FP) in the regression tests for several AWGN levels and ANN regressors.

Table 4. Summary of RMSE of each failure parameter (FP) in the regression tests for several AWGN levels and RF regressors.