In-Cylinder Pressure Based Engine Knock Classification Model for High-Compression Ratio, Automotive Spark-Ignition Engines Using Various Signal Decomposition Methods

Abstract: Engine knock determination has been conducted in various ways for spark timing calibration. In the present study, a knock classification model was developed using a machine learning algorithm. Wavelet packet decomposition (WPD) and ensemble empirical mode decomposition (EEMD) were employed to characterize the in-cylinder pressure signals from the experimental engine. WPD was used to calculate 255 features from seven decomposition levels, and EEMD provided a total of 70 features from its intrinsic mode functions (IMFs). The experimental engine was operated at advanced spark timings to induce knocking under various engine speed and load conditions. Three knock intensity metrics were employed to determine that the dataset included 4158 knock cycles out of a total of 66,000 cycles. The classification model trained with the 66,000 cycles achieved an accuracy of 99.26% in knock cycle detection. The neighborhood component analysis revealed that seven features contributed significantly to the classification. The classification model retrained with the seven significant features achieved an accuracy of 99.02%. Although the misclassification rate increased in the normal cycle detection, the feature selection decreased the model size from 253 to 8.25 MB. Finally, the compact classification model achieved an accuracy of 99.95% with the second dataset obtained at the knock borderline (KBL) timings, which validates that the model is sufficient for the KBL timing determination.


Introduction
Knocking is the most critical combustion phenomenon in spark-ignition engines. Knock is the noise perceived via the engine structure due to the spontaneous ignition of a portion of the end-gas [1]. Moreover, the end-gas area is larger in the knocking cycle than in the normal cycle [2]. Knocking can lead to severe damage to in-cylinder components such as the piston, piston rings, cylinder liner, spark plug, and valves. In recent years, knock phenomena have become more significant owing to the increased compression ratio and power density of modern gasoline engines. The knock tendency is a function of various parameters, including the compression ratio, fuel grade, and spark timing. Knock suppression can be achieved by the use of exhaust gas recirculation (EGR), proper injection techniques, the enhancement of the tumble and squish flows [3], modification of the autoignition characteristics by hydrogen enrichment [4], and water addition [5]. For a specific engine under given operating conditions and a known fuel, spark timing control is an effective method for the suppression of knocking. Although it mitigates knocking, retarding the spark timing deteriorates the fuel economy. Thus, the spark timing is generally set at the point at which the first knock is observed. Such a spark timing is referred to as the knock borderline (KBL) timing or knock-limited maximum brake torque timing [6]. Therefore, significant research attention has been directed toward the development of a reliable knock detection method, possibly by the examination of individual engine cycles in real time and the distinction between knock cycles and normal non-knock cycles.
It should be noted that preprocessing can distort the in-cylinder pressure characteristics required for the accurate evaluation of the knock intensity. Maurya et al. [7] applied various digital signal processing techniques to the in-cylinder pressure traces from a homogeneous charge compression ignition (HCCI) engine. The power spectrum results revealed that smoothing and averaging eliminated most of the high-frequency components from the in-cylinder pressure signal.
Several knock intensity metrics have been developed for the quantification of the knocking intensity, most of which are based on the in-cylinder pressure oscillation. Shu et al. [8] evaluated two knock intensity parameters, the maximum amplitude of pressure oscillation (MAPO) [9] and the knock intensity factor for the 20 crank angle (CA) range (KI20) [10], at various engine speeds. It was concluded that the KI20 knock metric was superior to MAPO because it carries more significant pressure oscillation information.
Shen et al. [11] developed a knock intensity metric based on the resonant frequency of the in-cylinder pressure. The in-cylinder pressure traces from the engine speed within the range of 800-2000 rev/min were employed for the development of the parameters. The proposed knock metric exhibited a high signal-to-noise ratio and required a shorter computational time than the reference knock parameter based on the average energy calculated over a frequency range of 6-25 kHz.
Ettefagh et al. [12] investigated knock detection based on the engine block vibration using an auto regressive moving average (ARMA) model. The vibration signal from an accelerometer was obtained during the operation of the experimental engine at 600 rev/min. It was found that the fourth moving average parameter of the ARMA model exhibited significant discrepancies between the non-knock and knock phenomena. Moreover, the proposed method could detect the weakest knock.
Bares et al. [13] suggested a knock probability estimation based on the in-cylinder temperature and exogenous noise. The trapped mass estimated from the in-cylinder temperature improved the accuracies of the air mass flow rate measurement and knock probability estimation. The results revealed that the random nature of knocking can be attributed to the irregularity in the in-cylinder temperature. Moreover, the model could predict the knock probability during the estimation of the trapped mass from the resonant content of the in-cylinder pressure.
Molinaro et al. [14] developed pattern classification techniques for knock detection based on the vibration signal from a piezoelectric accelerometer. The classification techniques were evaluated using 144 vibration cycles, which consisted of 88 knock cycles and 56 normal cycles. In a validation with 500 cycles at five engine speeds, the classification techniques decreased the false alarm rates to 0-5% at all of the engine speeds, whereas the conventional detector exhibited knock detection errors of 2-30%.
To obtain accurate results, an appropriate signal processing technique should be selected for the given in-cylinder pressure signals from the engine. Liu et al. found that the deflation method is effective in vibration signal separation [15]. Moreover, reweighted singular value decomposition (RSVD) exhibited a remarkable denoising performance in the detection of faults in rotary machines by enhancing the weak significant features [16].
Since Huang et al. first introduced empirical mode decomposition (EMD) [17], it has been widely used in signal analysis and fault diagnosis [18]. Because it separates high-frequency oscillating signal components from low-frequency oscillating components [19], the EMD method is suitable for in-cylinder pressure signal processing. The numbers of peaks, zero crossings, entropy, and sifting were extracted from the resultant intrinsic mode functions (IMFs).
Wavelet packet decomposition (WPD) is a promising decomposition method that has been applied to the evaluation of harmonic distortion in power systems [20]. WPD is essentially the same as the discrete wavelet transform (DWT), except that at decomposition level n it provides 2^n sets of wavelet coefficients, whilst DWT produces only n + 1 sets of wavelet coefficients. Subasi et al. [21] concluded that WPD was superior to EMD and DWT in the seizure onset determination in electroencephalogram recordings.
Energies 2021, 14, 3117
Recent approaches based on neural networks exhibit improved knock prediction performances. Bennett et al. [22] reconstructed the in-cylinder pressure from accelerometer measurements using a recurrent nonlinear autoregressive with exogenous input (NARX) neural network. The neural network model was then trained using the robust adaptive gradient descent (RAGD) algorithm. The trained model exhibited a 5% error in the peak in-cylinder pressure values and 2-CA error in the peak in-cylinder pressure location at an engine speed of 1000 rev/min. A highly accurate in-cylinder pressure estimation technique based on the cylinder block vibration can provide a substantial cost reduction in knock prediction. Multilayer neural networks can potentially be used to extract the in-cylinder pressure information, peak pressure, and timing from the crank speed signal [23].
In this study, several conventional knock classification methods were reviewed and applied to the present experimental data for comparison with the performance of the proposed machine learning model. An algorithm was developed that can detect whether an individual cycle is a knock or normal cycle based on the in-cylinder pressure trace. The feature extraction was based on WPD and EEMD, and the model was trained via a supervised machine learning technique. In most previous studies, the in-cylinder pressure or accelerometer signals of several representative knock cycles from limited engine operating conditions were analyzed using various signal processing techniques. The proposed knock determination algorithm is a standalone model that can classify individual cycles in real time. The model was trained using knock cycles with significantly different intensities under various operating conditions, and its accuracy was verified using a different dataset acquired under normal operating conditions. In addition, the proposed model was sufficiently compact to be embedded in a microcontroller for real-time monitoring.

Instrumentation
The experimental engine was a naturally aspirated, four-cylinder, spark-ignition engine for automotive applications. The engine specifications are listed in Table 1. An eddy-current (EC) dynamometer was coupled with the engine for loading. A stock electric starter motor was employed to initiate the engine rotation, given that an EC dynamometer cannot start an engine. When the engine speed reached a given level, fuel injection and spark ignition were provided for the engine to run. The dynamometer controlled the engine for operation at the target engine speed. Figure 1 shows the schematic of the experimental facility.
The spark timing was controlled using the engine control system, which consisted of a stock engine control unit (ECU) and emulator (ETK, ETAS, Bosch). The fuel injection timings and valve timings were not changed. Moreover, the stock components in the fuel injection and spark ignition systems were employed as obtained.
The external engine cooling system circulated the engine coolant through a water-cooled heat exchanger. The stock engine thermostat was disabled; thus, the coolant fluid was consistently directed toward the external cooling system. The coolant temperature at the engine inlet was maintained at the desired temperature using a chilled water system from the laboratory and an electric heater. The heater was used only for the engine startup and low load operations.
Four piezoelectric pressure transducers (6113B, Kistler, Switzerland) were installed on the four cylinders for in-cylinder pressure measurements. Each transducer was installed in a hole with an M10 thread machined in the engine head near the spark plug. It should be noted that the pressure transducer is a reliable tool for in-cylinder pressure measurements in internal combustion engine research. The specification of the pressure transducer is listed in Table 2. The crank angle was determined using an encoder installed on the crankshaft, with a resolution of 0.1 CA. The corresponding sampling frequency was 180 kHz at the engine speed of 3000 rev/min. A high-speed data acquisition and combustion analyzer system (Indicom, AVL, Austria) was employed to display the real-time in-cylinder pressure traces, and to record the pressure data over 300 consecutive cycles.
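The relation between the encoder resolution and the sampling frequency quoted above can be checked with a short calculation. The sketch below assumes one sample per encoder increment; the function name is illustrative and not part of the original setup.

```python
def sampling_frequency_hz(speed_rev_per_min: float, ca_resolution_deg: float) -> float:
    """Samples per second when one sample is taken per crank-angle increment."""
    revs_per_second = speed_rev_per_min / 60.0
    samples_per_rev = 360.0 / ca_resolution_deg
    return revs_per_second * samples_per_rev


# 3000 rev/min and 0.1 CA are the values stated in the text.
print(sampling_frequency_hz(3000.0, 0.1) / 1e3, "kHz")
```

Because the acquisition is triggered by crank angle, the sampling frequency scales linearly with engine speed, so the frequency band that can be resolved differs between operating points.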

Engine Experiments
Two experiments were conducted to obtain two separate datasets. The two datasets were obtained at three different speed levels and under two load conditions. In particular, the engine was operated under conditions that are common in daily use. The target engine speed and brake mean effective pressure (BMEP) of the two datasets are presented in Tables 3 and 4. After the engine reached a steady condition at the desired point, the spark timing was advanced in steps of 0.75 CA until severe knocking was observed. The resultant spark timings indicate a substantial advance, as shown in Tables 3 and 4. Data were then acquired repeatedly, 5-15 times, to obtain a total of 30 knock cycles at each point. Each run acquired 1200 consecutive cycles from the four cylinders. The first dataset included 66,000 cycles from 55 runs at the six points, whereas the second dataset consisted of 25,200 cycles from 21 runs at five points. The first dataset was employed for the development of the model, whereas the second dataset, which was acquired at the KBL spark timings, was employed for the validation of the model.
The KBL spark timing at each operating point was first determined from the spark timing sweep. The KBL timing was determined based on the knock audibility. A 10-mm diameter copper tube with a length of 6 m was fixed on a bolt head of the engine cylinder block near Cylinder 1. The sound due to knocking was transmitted through the copper tube, thus rendering the knocking audible to an engine operator at the other end of the tube. Moreover, the test cell door was kept closed to filter out the engine noise from the knocking sound. This is the standard procedure in spark timing calibration that has been employed in the automotive industry for many decades.
In the first experiment, the spark timing of Cylinder 1 was advanced to induce severe knocking, whereas the spark timings of the other three cylinders were kept at the KBL timing. The tables below reveal that a minimum of three sets of data logging were carried out at all operating points. The spark timing was advanced until the number of knock cycles reached 30 over 300 consecutive cycles. While advancing the spark timing, the operator acquired 300 cycles at each timing and then checked the knock intensities of the 300 cycles. The Indicom combustion analyzer provided the knock intensity parameter referred to as KI_AVL. As shown in Table 3, data logging was carried out more often at Points 1, 3, and 5 than at the other three points, given that knocking occurred less frequently under low load conditions.
Table 4 lists the conditions of the second experiment, in which the spark timings of all the cylinders were maintained at the KBL timings for normal engine operation. Thus, only a few mild knock cycles were obtained from this experiment. The classification model was validated with the mild knock cycles from this second experiment. For the model to determine the KBL timing, it is crucial that it can identify low-intensity knock cycles. Figure 2 shows the flow chart of the present study.

In-Cylinder Pressure Trace
The Kistler pressure transducer transmitted a voltage signal in the range of 0-5 V for a gauge pressure of 0-12 MPa. The voltage signal was first converted to the gauge pressure values using the calibration curve. The pressure was then pegged under the assumption of a polytropic process.
The 300 consecutive cycles from each of the four cylinders, which amounted to 1200 cycles, were obtained under each set of operating conditions. The individual in-cylinder pressure traces were used in several conventional knock determinations using the combustion timings, the knock intensity calculation, and signal processing procedures for the development of the classification model.
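The voltage conversion and pegging steps described in this section can be sketched as follows. This is only an illustration under stated assumptions: the calibration curve is taken to be linear over 0-5 V and 0-12 MPa, the pegging uses a two-point polytropic method, and the exponent of 1.32 and the reference volumes are hypothetical values not given in the text.

```python
def volts_to_gauge_mpa(volts, v_full=5.0, p_full_mpa=12.0):
    """Assumed linear calibration: 0..v_full V maps to 0..p_full_mpa MPa."""
    return [v * p_full_mpa / v_full for v in volts]


def polytropic_peg_offset(p1_meas, p2_meas, vol1, vol2, n=1.32):
    """Offset to add to the measured pressure so that p * V**n takes the same
    value at two points of the compression stroke (two-point pegging).

    n is an assumed polytropic exponent; vol1 and vol2 are the cylinder
    volumes at the two reference crank angles.
    """
    ratio = (vol1 / vol2) ** n                    # p2_abs / p1_abs
    p1_abs = (p2_meas - p1_meas) / (ratio - 1.0)  # absolute pressure at point 1
    return p1_abs - p1_meas


# Hypothetical check: build two absolute pressures on a polytrope, shift them
# by a known offset, and recover that offset.
true_offset = 0.05
p1_abs, v1, v2 = 0.10, 500.0, 250.0
p2_abs = p1_abs * (v1 / v2) ** 1.32
offset = polytropic_peg_offset(p1_abs - true_offset, p2_abs - true_offset, v1, v2)
print(offset)
```

The recovered offset is added to every sample of the cycle, which converts the relative signal of the piezoelectric transducer into an absolute pressure.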

Heat-Release-Rate Analysis
A heat-release-rate (HRR) analysis was conducted to determine the combustion timings. The HRR also reveals the autoignition phenomena in knock cycles. The HRR was estimated from the measured in-cylinder pressure under the assumption of a uniform temperature throughout the cylinder during the combustion process [1]. A Savitzky-Golay filter was applied to the in-cylinder pressure to eliminate the pressure oscillation [24]. Preliminary filter tests confirmed that the characteristics of the combustion event were preserved in the smoothed pressure trace, whereas the oscillation amplitude decreased. The net heat release rate equation (Equation (1)) was derived from the first law of thermodynamics, whereas a heat transfer estimation would be required to obtain the gross heat release rate.
The reliability of the net heat release rate can be significantly improved by the use of the temperature-dependent specific heat ratio in place of a constant value. Gatowski et al. [25] formulated the specific heat ratio equation (Equation (2)) based on a single-cylinder engine experiment with indolene (C18H25) under stoichiometric operation.
The in-cylinder gas was assumed to be an ideal gas. The mass of the gas in the cylinder was then calculated from the residual mass fraction in [26]. It should be noted that Equation (3) was formulated under the assumption that the exhaust process is a polytropic process. The ideal gas law with the trapped mass was used to determine the bulk-gas temperature during the combustion process, which was substituted into Equation (2) to determine the specific heat ratio. Moreover, the trapped mass can also be determined from the direct transformation of the in-cylinder pressure resonance phenomenon [27].
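The heat release calculation outlined in this section can be illustrated with a minimal sketch. It uses the standard single-zone first-law form dQ/dθ = γ/(γ − 1) · p · dV/dθ + 1/(γ − 1) · V · dp/dθ, with the bulk-gas temperature from the ideal gas law. The linear specific heat ratio and its coefficients are placeholders only; they are not taken from Equation (2) of the source, and the gas constant and function names are likewise assumptions.

```python
def gamma_linear(temp_k, gamma0=1.38, slope_per_k=8.0e-5, t_ref_k=300.0):
    """Placeholder temperature-dependent specific heat ratio (assumed values)."""
    return gamma0 - slope_per_k * (temp_k - t_ref_k)


def net_heat_release_rate(theta_deg, p_pa, vol_m3, mass_kg,
                          r_gas=287.0, gamma_fn=gamma_linear):
    """Net heat release rate in J/deg at the interior crank-angle samples.

    Derivatives are central differences; the bulk temperature follows from
    the ideal gas law with the trapped mass.
    """
    hrr = []
    for i in range(1, len(theta_deg) - 1):
        dtheta = theta_deg[i + 1] - theta_deg[i - 1]
        dp = (p_pa[i + 1] - p_pa[i - 1]) / dtheta
        dv = (vol_m3[i + 1] - vol_m3[i - 1]) / dtheta
        temp = p_pa[i] * vol_m3[i] / (mass_kg * r_gas)
        g = gamma_fn(temp)
        hrr.append(g / (g - 1.0) * p_pa[i] * dv + vol_m3[i] / (g - 1.0) * dp)
    return hrr
```

In practice the pressure passed to such a routine would be the smoothed trace, since differentiating an unfiltered knocking trace amplifies the oscillation.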

Knock Cycle Determination
There are many schemes for determining engine knock. Since no particular scheme is widely accepted as a universal knock detector, several methods were combined to identify individual knock cycles. During data acquisition in the engine experiments, the spark timing was advanced based on the combustion sound transmitted through the copper tube and the knock intensity parameter provided by the combustion analyzer in the data acquisition (DAQ) system. Subsequent data analysis determined individual knock cycles based on the shape of the in-cylinder pressure trace and the knock intensity developed in the present study.
Engine knocking is initiated by autoignition, which is revealed in the net apparent heat release rate. The subsequent high-speed pressure wave propagates through the entire cylinder and produces a ringing pattern on the in-cylinder pressure trace. The in-cylinder pressure and heat release rate traces of each cycle were visually examined for autoignition and ringing. However, when knocking occurred late in the combustion phase, the autoignition and ringing patterns were rather inconclusive. Figure 3 shows the signal processing procedure for individual knock cycle determination. First, the measured in-cylinder pressure trace was filtered using a bandpass filter, as shown in Figure 3b. The high and low cut-off frequencies were adjusted for optimal filtering. Then, the portions of the signal with absolute values lower than 0.012 MPa were filtered out. Figure 3c shows the substantial distinction between the two cycles. Finally, the integration of the filtered signal in Figure 3c provided the quantitative intensity of the knocking. The resultant integration of knock intensity (IKI) in this study was compared with the knock intensity provided by the high-speed data acquisition system employed for the in-cylinder pressure measurements. The knock intensity parameter of the combustion analyzer in the DAQ system, denoted as KI_AVL, is used as a knock intensity measure in the automotive industry.
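The three steps of the IKI procedure (bandpass filtering, amplitude gating, integration) can be sketched as below. The filter design and cut-off frequencies are not given in the text, so a difference of two moving averages is swapped in as a crude band-pass stand-in; integrating the absolute value of the gated signal is also an assumption, and all window widths are hypothetical.

```python
import math


def moving_average(x, width):
    """Centered moving average; the window is truncated at the signal edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def crude_bandpass(x, short_width=3, long_width=31):
    """Short average minus long average: removes the slow pressure trend and
    the fastest noise. A stand-in for the unspecified bandpass filter."""
    short = moving_average(x, short_width)
    long_ = moving_average(x, long_width)
    return [s - l for s, l in zip(short, long_)]


def integrated_knock_intensity(p_mpa, dtheta_deg, threshold_mpa=0.012):
    """Gate out samples below the threshold, then integrate |signal| over CA."""
    band = crude_bandpass(p_mpa)
    gated = [v if abs(v) >= threshold_mpa else 0.0 for v in band]
    return sum(abs(v) for v in gated) * dtheta_deg


# Synthetic traces: a flat trace, and the same trace with a short ringing burst.
normal = [2.0] * 200
knock = [p + (0.2 * math.sin(2.0 * math.pi * i / 6.0) if 80 <= i < 140 else 0.0)
         for i, p in enumerate(normal)]
print(integrated_knock_intensity(normal, 0.1), integrated_knock_intensity(knock, 0.1))
```

The gating step is what makes the metric exactly zero for cycles whose filtered oscillation never reaches the threshold, so any nonzero value marks a candidate knock cycle.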

WPD Method
The knock detection model was developed using MATLAB. WPD was employed to extract features from the in-cylinder pressure signal [28]. A previous report confirmed the effectiveness of the WPD method for in-cylinder pressure characterization [29]. Whereas the Fourier transform decomposes a signal using sines and cosines, the wavelet packet decomposes the signal into many subspaces using wavelet functions [28,30]. The decomposition includes the translation and scaling of the selected wavelet function, in addition to the projection of the signal onto the subspace. The number of decompositions was set as expressed by Equation (4). After decomposition, features such as the root mean square, the variance of the signal, and other statistically significant parameters were extracted from each subspace at each level. The WPD method provided 255 features from seven decomposition levels for the training of the machine learning model. Table 5 lists the numbers of features from the seven decomposition levels.
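The structure of a wavelet packet tree and a per-subspace feature can be illustrated with a small sketch. The source does not state which wavelet was selected, so the Haar wavelet is assumed here for simplicity, and the root mean square is used as the only feature. A full tree over levels 0 to 7 has 255 nodes, which coincides with the reported feature count, although the per-level breakdown of Table 5 is not reproduced here and may differ.

```python
import math


def haar_split(x):
    """One Haar analysis step: approximation and detail halves of a node."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_tree(x, max_level):
    """Return a list of levels; level j holds 2**j coefficient sets."""
    levels = [[list(x)]]
    for _ in range(max_level):
        nxt = []
        for node in levels[-1]:
            nxt.extend(haar_split(node))  # both halves are split again in WPD
        levels.append(nxt)
    return levels


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


# Synthetic 256-sample signal (length chosen as a multiple of 2**7).
signal = [math.sin(2.0 * math.pi * i / 32.0) + 0.2 * math.sin(2.0 * math.pi * i / 4.0)
          for i in range(256)]
tree = wavelet_packet_tree(signal, 7)
features = [rms(node) for level in tree for node in level]
print([len(level) for level in tree], len(features))
```

Unlike the DWT, which would split only the approximation branch, every node is split at every level, which is why the number of subspaces doubles from one level to the next.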

EEMD Method
The EEMD method is essentially EMD with added white noise. After white noise is added to the signal, the mode decomposition is performed as in EMD, and the procedure is repeated over an ensemble of noise realizations. The added white noise forces the ensemble to exhaust all other components, and the ensemble averaging cancels the noise so that only the signal remains. EEMD is superior to EMD since the white noise addition solves the mode-mixing problem in EMD [31].
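The noise-assisted ensemble idea can be sketched in plain Python. This is a simplified illustration, not the implementation used in the study: the envelopes are piecewise linear rather than cubic splines, a fixed number of sifting iterations replaces a stopping criterion, and the noise level, number of trials, and all function names are assumptions.

```python
import math
import random


def _extrema(x):
    """Indices of interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def _envelope(x, idx):
    """Piecewise-linear envelope through the extrema, held flat at both ends."""
    pts = [(0, x[idx[0]])] + [(i, x[i]) for i in idx] + [(len(x) - 1, x[idx[-1]])]
    env, k = [], 0
    for i in range(len(x)):
        while k < len(pts) - 2 and i > pts[k + 1][0]:
            k += 1
        (i0, y0), (i1, y1) = pts[k], pts[k + 1]
        env.append(y0 + (y1 - y0) * (i - i0) / (i1 - i0))
    return env


def emd(x, max_imfs=4, sifts=10):
    """Simplified EMD: returns (list of IMFs, residue)."""
    residue, imfs = list(x), []
    for _ in range(max_imfs):
        h = residue
        for _sift in range(sifts):
            maxima, minima = _extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            upper, lower = _envelope(h, maxima), _envelope(h, minima)
            h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
        if h is residue:  # too few extrema to sift: stop extracting modes
            break
        imfs.append(h)
        residue = [r - v for r, v in zip(residue, h)]
    return imfs, residue


def eemd(x, trials=20, noise_std=0.1, max_imfs=3, seed=0):
    """Average the IMFs of noise-added copies; a missing IMF counts as zero."""
    rng = random.Random(seed)
    sums = [[0.0] * len(x) for _ in range(max_imfs)]
    for _ in range(trials):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        imfs, _res = emd(noisy, max_imfs=max_imfs)
        for k, imf in enumerate(imfs):
            for i, v in enumerate(imf):
                sums[k][i] += v
    return [[v / trials for v in s] for s in sums]
```

A call such as `eemd(trace, trials=20)` returns the ensemble-averaged modes, from which per-mode features like the numbers of peaks and zero crossings could then be counted.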

Neighborhood Component Analysis (NCA)
Based on the prediction results, feature selection was carried out by conducting an NCA for classification [32]. The NCA was conducted to evaluate the influence of the individual features on the prediction results. The NCA impact ranking of the features facilitated the selection of the significant features. The model was then retrained using the selected features. The retraining decreased the size of the model without substantially impacting the prediction accuracy. The regularization parameter was optimized on the basis of the regression loss. The value of the regularization parameter that exhibited the lowest regression loss was 1 × 10^−5.
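How feature weights and the regularization parameter enter an NCA can be illustrated with one common formulation of the regularized objective: the average leave-one-out probability of picking a same-class reference point under a weighted distance, minus a penalty on the squared weights. Whether the study used exactly this form is an assumption; the kernel width of 1, the toy data, and the function name are hypothetical.

```python
import math


def nca_objective(X, y, w, lam=1e-5):
    """Regularized NCA objective for feature weights w (higher is better).

    Distance: d(i, j) = sum_r w_r**2 * |x_ir - x_jr|; reference point j is
    picked with probability proportional to exp(-d(i, j)), with j != i.
    """
    n = len(X)
    total = 0.0
    for i in range(n):
        kern = []
        for j in range(n):
            if j == i:
                kern.append(0.0)  # a point never picks itself
                continue
            d = sum(wr * wr * abs(a - b) for wr, a, b in zip(w, X[i], X[j]))
            kern.append(math.exp(-d))
        z = sum(kern)
        total += sum(k for k, yj in zip(kern, y) if yj == y[i]) / z
    return total / n - lam * sum(wr * wr for wr in w)


# Toy data: feature 0 separates the classes, feature 1 is unrelated noise.
X = [[0.0, 0.5], [0.1, -0.3], [0.2, 0.8], [1.0, 0.6], [1.1, -0.4], [1.2, 0.7]]
y = [0, 0, 0, 1, 1, 1]
print(nca_objective(X, y, [3.0, 0.0]), nca_objective(X, y, [0.0, 3.0]))
```

Maximizing such an objective drives the weights of uninformative features toward zero, and the resulting weight ranking is what allows a small subset of features to be retained.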

Supervised Learning
A supervised learning algorithm was employed to obtain an inferred function based on the class labels. Several decision tree, k-nearest neighbors (KNN), and ensemble algorithms were evaluated. Although singular value decomposition (SVD) is a well-known signal denoising tool [16], it was excluded because it required more computational time than the algorithms evaluated in this study, which were less complex. During the evaluations, 30% of the dataset was employed for validation. The two critical performance parameters considered were the accuracy and the training time. Given that the training should be completed prior to the embedding of the algorithm into the prediction model, the training time did not influence the performance of the prediction model. However, a decrease in the training time of the selected algorithm decreased the overall time required for the development of the model.
Misclassification was weighted so as to favor the detection of all knock cycles. The misclassification of a normal cycle as a knock cycle was considered overprediction, whereas the misclassification of a knock cycle as a normal cycle was considered underprediction. The misclassification of several normal cycles as knock cycles was considered acceptable, provided that the model could identify all the knock cycles. Therefore, the resultant model tended to overpredict the knock cycles. In addition, the correction of the overpredicted cycles is more effective than the detection of missed knock cycles, given that the proportion of knock cycles is significantly smaller than that of normal cycles (4158 of the 66,000 cycles in the first dataset).
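The distinction between overprediction and underprediction can be made concrete with a small sketch that counts both error types and combines them into an asymmetric cost. The cost values are hypothetical; the source does not state the weights that were used.

```python
def confusion_counts(actual, predicted):
    """Count outcomes for binary labels, where 1 = knock and 0 = normal."""
    counts = {"knock_hit": 0, "normal_hit": 0, "over": 0, "under": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["knock_hit"] += 1
        elif not a and not p:
            counts["normal_hit"] += 1
        elif p:
            counts["over"] += 1   # normal cycle flagged as knock
        else:
            counts["under"] += 1  # knock cycle missed
    return counts


def weighted_cost(counts, under_cost=5.0, over_cost=1.0):
    """Asymmetric cost that penalizes missed knock cycles more heavily."""
    return under_cost * counts["under"] + over_cost * counts["over"]


def accuracy(counts):
    return (counts["knock_hit"] + counts["normal_hit"]) / sum(counts.values())


actual = [1, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1]
c = confusion_counts(actual, predicted)
print(c, weighted_cost(c), accuracy(c))
```

With a cost of this kind, a classifier that misses one knock cycle is penalized as much as one that raises several false alarms, which pushes the trained model toward overprediction.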

Results and Discussion
From the knock cycle determination process, 4158 of the 66,000 cycles from the 55 runs were classified as knock cycles. Point 3 exhibited the highest number of knock cycles from 14 repeated runs. The numbers of knock cycles at the six operating points are listed in Table 6. Figure 4 presents the in-cylinder pressure traces of one run at Point 6, which consisted of 209 knock cycles and 91 normal cycles. The 16 selected knock cycles varied from mild to severe ringing. Note that some retarded knock cycles exhibited only mild ringing. The significant increase in combustion with respect to the normal cycles indicated the autoignition prior to the knocking phenomena [2,33].
The heat release analysis results revealed the difference in the combustion timings between the knock and normal cycles, as shown in Figure 5. Three knock cycles with different knock intensities were selected for comparison. All three knock cycles exhibited autoignition with various magnitudes and similar onset timings. The net heat release rate traces indicate that the combustion advancement and autoignition were greater in the more severe knocking cases.
Figure 5. In-cylinder pressure and apparent heat release traces of three knock cycles.

Combustion Characteristics
Given that the knock cycles exhibited a more rapid pressure rise and higher peak in-cylinder pressure than the average normal cycle, the pressure rise rate (PRR) and peak in-cylinder pressure (PIP) of the knock and normal cycles were investigated, as shown in Figure 6. With respect to the normal cycles, a linear relationship was observed between the PRR and PIP, whereas the two parameters of the knock cycles were slightly dispersed. Moreover, PRR and PIP were greater in the knock cycles than in the normal cycles. However, there was a substantial number of knock cycles in which PRR and PIP were similar to those of the normal cycles. As can be seen in Figure 5 (knock cycle + normal cycle average), there were several knock cycles that did not exhibit severe ringing, as observed from the in-cylinder pressure trace.
Although the discrepancies between the knock and normal cycles were significant for all four parameters, the combustion timings and durations were insufficient in the individual knock cycle detection,  The knock cycles exhibited earlier CA10 (crank angle of 10% heat release) and CA50 (crank angle of 50% heat release), and consequently shorter durations between CA10 and CA50 than the normal cycles as shown in Figures 7 and 8. Although the discrepancies between the knock and normal cycles were significant for all four parameters, the combustion timings and durations were insufficient in the individual knock cycle detection, given that several normal cycles exhibited combustion timings that were as advanced as the knock cycles.   Figure 9 reveals that IKI and KI_AVL were in good agreement, with the exception of the cycles with very low knock intensities, i.e., normal cycles. The IKI analysis was applied only for the cycles wherein the filtered signal peaked above 0.12. Based on this criteria, 4158 cycles were identified as knock cycles. Thus, the IKI value of all the normal cycles was zero. Moreover, KI_AVL was calculated for all cycles. The two knock parameters indicated that the knock cycles in this study exhibited a wide range of knock intensities.    Figure 9 reveals that IKI and KI_AVL were in good agreement, with the exception of the cycles with very low knock intensities, i.e., normal cycles. The IKI analysis was applied only for the cycles wherein the filtered signal peaked above 0.12. Based on this criteria, 4158 cycles were identified as knock cycles. Thus, the IKI value of all the normal cycles was zero. Moreover, KI_AVL was calculated for all cycles. The two knock parameters indicated that the knock cycles in this study exhibited a wide range of knock intensities.  Figure 9 reveals that IKI and KI_AVL were in good agreement, with the exception of the cycles with very low knock intensities, i.e., normal cycles. 
The IKI analysis was applied only for the cycles wherein the filtered signal peaked above 0.12. Based on this criteria, 4158 cycles were identified as knock cycles. Thus, the IKI value of all the normal cycles was zero. Moreover, KI_AVL was calculated for all cycles. The two knock parameters indicated that the knock cycles in this study exhibited a wide range of knock intensities. Figure 10 shows that the first two IMF contained the high-frequency signal components due to the knocking phenomena. The peak and amplitude of the 1st and 2nd IMF seemed to be adequate for knock cycle determination. The pressure rise due to autoignition was not shown in any IMF, which was exhibited in the heat release rate traces as in Figure 5. Figure 9 reveals that IKI and KI_AVL were in good agreement, with the exception of the cycles with very low knock intensities, i.e., normal cycles. The IKI analysis was applied only for the cycles wherein the filtered signal peaked above 0.12. Based on this criteria, 4158 cycles were identified as knock cycles. Thus, the IKI value of all the normal cycles was zero. Moreover, KI_AVL was calculated for all cycles. The two knock parameters indicated that the knock cycles in this study exhibited a wide range of knock intensities.   Figure 10 shows that the first two IMF contained the high-frequency signal components due to the knocking phenomena. The peak and amplitude of the 1st and 2nd IMF seemed to be adequate for knock cycle determination. The pressure rise due to autoignition was not shown in any IMF, which was exhibited in the heat release rate traces as in Figure 5.
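The two per-cycle quantities discussed above, the burn angles (CA10, CA50) and the peak-based knock screening, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the final cumulative heat release as the 100% reference, the linear interpolation, and the use of the absolute peak of the filtered signal are all assumptions made for illustration. Only the 0.12 threshold comes from the text, and its unit is not stated in this section.

```python
def burn_angle(crank_deg, cum_heat_release, fraction):
    """Crank angle at which cumulative heat release first reaches
    `fraction` of its final value (assumed 100% reference).
    Linear interpolation between samples is an assumption."""
    target = fraction * cum_heat_release[-1]
    for i in range(1, len(cum_heat_release)):
        q0, q1 = cum_heat_release[i - 1], cum_heat_release[i]
        if q0 < target <= q1:
            a0, a1 = crank_deg[i - 1], crank_deg[i]
            return a0 + (target - q0) * (a1 - a0) / (q1 - q0)
    raise ValueError("target fraction not reached in this trace")


def is_knock_candidate(filtered_pressure, threshold=0.12):
    """True when the filtered signal peaks above the threshold.
    Taking the absolute peak is an assumption."""
    return max(abs(x) for x in filtered_pressure) > threshold


# Hypothetical trace, for illustration only.
crank = [0.0, 5.0, 10.0, 15.0, 20.0]       # deg after top dead centre
cum_hr = [0.0, 10.0, 50.0, 90.0, 100.0]    # cumulative heat release

ca10 = burn_angle(crank, cum_hr, 0.10)
ca50 = burn_angle(crank, cum_hr, 0.50)
print(ca10, ca50, ca50 - ca10)             # CA10, CA50, CA10-CA50 duration
print(is_knock_candidate([0.01, -0.05, 0.20, -0.15]))
```

Because CA10, CA50, PRR and PIP overlap between the two classes, such scalars alone cannot separate individual cycles, which motivates the decomposition-based features used in the classification model.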

Supervised Learning Model for Knock Classification
All the algorithms demonstrated accuracies of 99.9–100%, as shown in Table 7. Thus, the algorithm for the model development was selected with respect to the training time. The boosted tree ensemble and random undersampling boost (RUSB) algorithms demonstrated the shortest training times of 6.5 and 5.9 s, respectively. Based on the comparison, the ensemble algorithm was selected for the predictive model. The proposed model was trained with 46,200 cycles, which amounted to 70% of the total dataset (66,000 cycles). The training dataset included 43,289 normal cycles and 2911 knock cycles. The remaining 30% of the dataset was employed for validation. The validation results in Figure 11 reveal that the proposed model detected 99.92% of the knock cycles, with a 55% misclassification of the normal cycles as knock cycles. The higher accuracy of the knock cycle prediction resulted from the misclassification weighting, which was greater by a factor of 10 for the knock cycle prediction. In other words, the misclassification weighting was set such that the model does not miss knock cycles, even though several normal cycles were misclassified as knock cycles.
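The dataset split and the effect of a 10:1 misclassification weighting can be checked with a short sketch. The section does not describe how the weighting enters the ensemble training, so the decision-stage rule below (predict "knock" when its expected cost is lower) is a swapped-in, textbook illustration of cost-sensitive classification, not the authors' mechanism. All function names are hypothetical.

```python
# Split arithmetic, using only the counts reported in the text.
total_cycles = 66_000
total_knock = 4_158
train_cycles = 46_200
train_knock = 2_911

train_normal = train_cycles - train_knock        # normal cycles in training
valid_cycles = total_cycles - train_cycles       # remaining 30%
valid_knock = total_knock - train_knock          # knock cycles left for validation
valid_normal = valid_cycles - valid_knock


def decision_threshold(cost_missed_knock, cost_false_alarm):
    """Knock probability above which predicting 'knock' has the lower
    expected cost (standard two-class Bayes decision rule)."""
    return cost_false_alarm / (cost_false_alarm + cost_missed_knock)


def classify(p_knock, cost_missed_knock=10.0, cost_false_alarm=1.0):
    """Cost-weighted decision; the 10:1 ratio follows the text."""
    return p_knock > decision_threshold(cost_missed_knock, cost_false_alarm)


def rates(tp, fn, fp, tn):
    """Knock detection rate and normal-as-knock misclassification rate."""
    return tp / (tp + fn), fp / (fp + tn)


print(train_normal, valid_cycles, valid_knock, valid_normal)
print(decision_threshold(10.0, 1.0))   # 1/11, well below the unweighted 0.5
print(classify(0.2), classify(0.05))
```

With a 10:1 weighting, a cycle is flagged as knock at a much lower predicted probability than with equal costs, which is consistent with the reported trade-off: very few missed knock cycles at the expense of more normal cycles flagged as knock.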

Neighborhood component analysis (NCA) was conducted to evaluate the weights of the 325 features obtained from the WPD and EEMD. The feature weights represent the contributions to the classification accuracy. Figure 12 presents the weights of the 325 features employed in the proposed model. The NCA results indicated that 37 features were significant in the classification. The numbers of significant features from EEMD and WPD were 5 and 32, respectively. Among the EEMD features, only the peak values of the IMFs exhibited significance. The decomposition levels of the 32 significant WPD features are listed in Table 8. As can be seen from the table, seven levels of decomposition were necessary to obtain all significant features.

The proposed model was retrained on the same dataset using only the significant features. Based on the NCA results, many features exhibited negligible weights. The features with near-zero weights contributed significantly less to the accuracy than those with high weights. Thus, low-weight feature elimination can significantly reduce the size of the model without impacting the accuracy. Figure 13 reveals that the reduced feature set improved the model accuracy. The retrained model with the 37 selected features exhibited a substantial improvement in the normal cycle detection. In the literature, the use of selected features has resulted not only in a reduction of the computational time, but also in an improvement of the classification accuracy by reducing potentially misleading information and overfitting [34].

The second dataset was employed for the validation of the classification model. The engine was operated under five different engine speed and load conditions. A total of 21 runs were taken from repeated operations. The operating conditions are presented in Tables 1 and 2. Unlike in the first experiment, the spark timings were set at the KBL timing to evaluate the performance of the classification model under normal engine operating conditions. Hence, the second dataset included only 25 knock cycles out of a total of 25,200 cycles.

The validation results revealed that the original and retrained models achieved knock detection accuracies of 99.9%, as shown in Figure 14. The two models demonstrated different performances in the normal cycle prediction. The removal of the insignificant features increased the overestimation rate, i.e., the misclassification of normal cycles as knock cycles, from 10% to 30%.
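The feature-reduction step described above, keeping only the features whose NCA weights are not negligible, can be sketched as follows. The weights are taken as given here. The relative cutoff `rel_threshold`, the function names, and the toy data are hypothetical, since this section does not state the criterion that yielded the 37 significant features.

```python
def select_features(weights, rel_threshold=0.02):
    """Indices of features whose weight exceeds a fraction of the
    largest weight. The cutoff value is an assumption."""
    cutoff = rel_threshold * max(weights)
    return [i for i, w in enumerate(weights) if w > cutoff]


def reduce_features(rows, selected):
    """Keep only the selected feature columns of each cycle's feature row."""
    return [[row[i] for i in selected] for row in rows]


# Toy example with six features (the study used 325).
weights = [0.0, 1.5, 0.001, 0.4, 0.0, 2.0]
features = [
    [0.3, 1.2, 0.0, 5.1, 0.9, 2.2],   # cycle 1
    [0.1, 0.8, 0.0, 4.7, 1.1, 0.4],   # cycle 2
]

selected = select_features(weights)
print(selected)
print(reduce_features(features, selected))
```

The classifier would then be retrained on the reduced feature rows, so that both the stored model and the per-cycle feature extraction shrink.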

Conclusions
A classification model was developed to distinguish between individual knock cycles and normal cycles using a machine learning algorithm with in-cylinder pressure data from 66,000 cycles under various engine speed and load conditions. The findings of this study are as follows:
The heat release analysis revealed that there were significant differences between the combustion timings of the normal and knock cycles. The CA10 and CA50 of the knock cycles were significantly earlier than those of the normal cycles. In addition, the knock cycles exhibited higher PRR and PIP values, which can be attributed to the progression of combustion with severe ringing. However, the parameters derived from the heat release analysis and in-cylinder pressure did not yield an acceptable level of accuracy in the classification of the knock and normal cycles, given that several knock and normal cycles exhibited similar values.
The WPD and EEMD were effective signal processing techniques for the cycle characterization. The 325 features from seven decomposition levels were sufficient for the training of the classification model, which demonstrated an accuracy of 99.9% in the knock cycle detection. It should be noted that the model achieved a very high accuracy both in the validation test and with the normal engine operation data, which included only mild knock cases.
The classification model using the features from the WPD and EEMD demonstrated an accuracy of 99.26% in the knock cycle detection. Moreover, the model demonstrated an accuracy of 99.96% using a different dataset. The second dataset was obtained from the engine experiments at the KBL spark timings. In particular, the model detected all the knock cycles, including those during the engine operation under the normal operating conditions. It should be noted that the knock intensities of the knock cycles from the normal operating conditions were significantly lower.
The NCA was found to effectively decrease the size of the classification model. When only the significant features, i.e., the significantly weighted features based on NCA, were employed to increase the compactness of the model, the size of the classification model decreased by 58%, whereas the knock detection accuracy was maintained at 99%. However, the compact model exhibited an increase in overestimation, i.e., the misclassification of normal cycles as knock cycles, from 10% to 30%.
It should be noted that the knock cycles were initially classified based on the in-cylinder pressure trace, HRR, IKI, and KI_AVL. In the actual engine calibration process, knock determination is often based on the engine sound. Even though the in-cylinder pressure-based determination is reliable, the model should be further validated against the acoustic knock detection method used in the spark timing calibration process. Once sufficiently validated, the proposed knock classification model can be implemented in the engine management system for real-time knock detection.