Data-Based In-Cylinder Pressure Model with Cyclic Variations for Combustion Control: A RCCI Engine Application

Cylinder pressure-based control is a key enabler for advanced pre-mixed combustion concepts. Besides guaranteeing robust and safe operation, it allows for cylinder pressure and heat release shaping. This requires fast control-oriented combustion models. Over the years, mean-value models have been proposed that predict either combustion measures (e.g., Gross Indicated Mean Effective Pressure, or the crank angle where 50% of the total heat is released) or the full in-cylinder pressure. However, these models are not able to capture cyclic variations. Capturing these variations is important in the control design for combustion concepts, like Reactivity Controlled Compression Ignition, that can suffer from large cyclic variations. In this study, the in-cylinder pressure and cyclic variation are modelled using a data-based approach. The model combines Principle Component Decomposition and Gaussian Process Regression. A detailed study is performed on the effects of the different hyperparameters and kernel choices. The approach is applicable to any combustion concept, but most valuable for advanced combustion concepts with large cyclic variation. The potential of the proposed approach is demonstrated for a Reactivity Controlled Compression Ignition engine running on Diesel and E85. The prediction quality of the evaluated combustion measures has an overall accuracy of 13.5% and 65.5% in mean behaviour and standard deviation, respectively. The peak-pressure rise-rate is traditionally hard to predict; in the proposed model it has an accuracy of 22.7% and 96.4% in mean behaviour and standard deviation, respectively. This Principle Component Decomposition-based approach is an important step towards in-cylinder pressure shaping. The use of Gaussian Process Regression provides important information on cyclic variation and gives next-cycle control information on safety and performance criteria.


Introduction
Concerns about global warming require a significant reduction in CO2 emissions for on-road applications. This has resulted in interest in highly efficient and low-carbon propulsion methods in the transportation sector. This trend [...] The move to model-based Cylinder Pressure-Based Control (CPBC) requires a Control-Oriented Model (COM) of the in-cylinder pressure. These COMs can help in improving controller design and calibration, and can be embedded in the controller. The model should give a relation between the in-cylinder mixture composition, intake manifold pressure and intake manifold temperature, and the resulting in-cylinder pressure. The computation time of this COM should be below the duration of a combustion cycle to make sure a new control action has been determined before the start of the next combustion cycle. In the case of Reactivity Controlled Compression Ignition (RCCI), a description of the cycle-to-cycle variations should be present in the COM.
A distinction can be made between two types of models: first-principle physics-based models and data-based models. Physics-based models use first-principle physical relations to capture the combustion behaviour. In contrast, purely data-based models use black-box modelling methods where measurements are used to create a mapping from input to output.
To model important combustion measures (e.g., Gross Indicated Mean Effective Pressure ($\mathrm{IMEP_g}$), or the crank angle where 50% of the total heat is released (CA50)), basic first-principle models have been proposed [9,10,11,12]. These models provide a deterministic and dynamic view of the relation between actuation and combustion measures without determining the full in-cylinder pressure. To add new combustion measures, these models have to be extended. This can be time consuming and reduces the flexibility of these models during combustion control development.
To model the full in-cylinder pressure, more complex first-principle models have been proposed. These include the multi-zone model of Bekdemir et al. [13] and the fluid dynamic model of Klos and Kokjohn [14]. The complexity of these models results in computation times that exceed the combustion time, so they are not directly suited as COM. A reduction in computation time is achieved by using static, data-driven, deterministic regression models to capture the behaviour of important combustion measures.
On the other hand, data-based models have been developed. A Gaussian Process Regression (GPR) model to map in-cylinder conditions to combustion measures has been proposed by Xia et al. [15]. A state-space model identified from data to model combustion phasing and peak pressure rise-rate has been proposed by Basina et al. [16]. These models only provide information on the modelled combustion measures. Therefore, they have to be extended to include other measures.
To capture the full in-cylinder pressure using data, Principle Component Decomposition (PCD) models have been proposed. These models consist of a weighted sum of principle components, where the weights are modelled using regression methods. Pan et al. [17] use a deterministic neural network to capture the behaviour of the weights. On the other hand, Vlaswinkel et al. [18] use a GPR model to capture the behaviour of the weights. The latter approach makes it possible to include cycle-to-cycle variation in the model. The use of the PCD of the in-cylinder pressure has already been proposed in several control and detection methods. Henningsson et al. [19] used this decomposition as input to a virtual emission sensor. They were able to predict the air-to-fuel ratio and NOx emissions quite accurately. Panzani et al. [20] and Panzani et al. [21] proposed this decomposition for knock detection and avoidance. They used the decomposition to derive a measure of closeness to engine knocking. Vlaswinkel and Willems [22] used this decomposition as an alternative method to maximise the thermal efficiency. They used the decomposition to derive a measure of closeness of a measured in-cylinder pressure to an idealised thermodynamic cycle.
In this study, we extend the work of Vlaswinkel et al. [18] with an extensive analysis of: 1) the comparison of different kernels in the GPR approach with regard to the prediction quality of important combustion measures; 2) the effects of modelling correlated processes as uncorrelated Gaussian processes; and 3) the use of a data set with a wide range of operating conditions to show the effectiveness of the model. This work is organised as follows. In Section 2, an overview is given of the experimental setup and the used data set. Section 3 describes the data-based combustion model including cycle-to-cycle variation. A detailed analysis of the effect of different hyperparameters is presented in Section 4. The prediction quality of the combustion model is demonstrated and validated in Section 5.

Single Cylinder Engine
In this section, we give a description of the setup and the used data sets. A discussion is provided on the chosen inputs to the model and how these are determined.

System Description
In this study, a modified PACCAR MX13 engine is used, as shown in Figure 1. Cylinders 2 to 6 have been removed and only cylinder 1 is operational. To keep the engine running at a constant speed, the electric motor of the engine dynamometer provides the required torque. The focus is on RCCI combustion with a single injection of diesel to auto-ignite the well-mixed charge of E85, air and recirculated exhaust gas. The injection of diesel does not ignite the mixture itself; the ignition is caused by the increased temperature as a result of cylinder compression. Therefore, there is a clear temporal separation between the injection of diesel and combustion. The Direct Injection (DI) of diesel is handled by a Delphi DFI21 injector connected to a common rail. The E85 Port Fuel Injection (PFI) is handled by a Bosch EV14 injector fitted into the intake channel and set at 5 bar. Both the DI and PFI fuel mass flows are measured using a Siemens Sitrans FC Mass 2100 Coriolis mass flow meter coupled with Mass 6000 signal converters. Boosted intake air is supplied at 8 bar, and its pressure and temperature are regulated using a pressure regulator and an electric heater, respectively. The Exhaust Gas Recirculation (EGR) fraction is regulated by the EGR and back-pressure butterfly valves. The EGR flow is cooled down to approximately room temperature by a cooled stream of process water. The condensation tank collects the condensation from the EGR flow and is drained regularly. The expansion and mixing tanks dampen pressure fluctuations in the intake and exhaust manifold as a result of single cylinder operation. The in-cylinder pressure is sampled at 0.2 °CA with a Kistler 6125C uncooled pressure transducer and amplified with a Kistler 5011B. A Leine Linde RSI 503 encoder provides crank angle information at a 0.2° interval. A Bronkhorst IN-FLOW F-106BI-AFD-02-V digital mass flow meter is used to measure the mass of the intake air flow. The pressures and temperatures at different locations in the air path are measured every combustion cycle using Gems Sensors & Controls 3500 Series pressure transmitters and Type-K thermocouples, respectively. The concentrations of CO2 in the intake and exhaust flows are measured using a Horiba MEXA 7100 DEGR system. Table 1 shows the specifications of the engine setup.

Data Set for Model Training and Validation
The model relates in-cylinder conditions, determined at intake valve closing, to a resulting in-cylinder pressure. These conditions consist of a range of parameters related to engine speed, cylinder wall temperature, and mixture composition, pressure and temperature. Since the engine is running at a single speed and at steady-state conditions, the most relevant changes throughout the data set are a result of differences in mixture composition, pressure and temperature. These can be described using intake and fuelling conditions. The chosen measurable parameters used to describe in-cylinder conditions are:
• Total injected fuel energy $Q_\mathrm{total} = m_\mathrm{PFI}\,\mathrm{LHV_{PFI}} + m_\mathrm{DI}\,\mathrm{LHV_{DI}}$, where $m_\mathrm{PFI}$ and $m_\mathrm{DI}$ are the injected masses of PFI and DI fuels, and $\mathrm{LHV_{PFI}}$ and $\mathrm{LHV_{DI}}$ are the lower heating values of the PFI and DI fuels;
• Energy-based blend ratio $\mathrm{BR} = m_\mathrm{PFI}\,\mathrm{LHV_{PFI}} / Q_\mathrm{total}$;
• Start-of-injection of the directly injected fuel $\mathrm{SOI_{DI}}$;
• Pressure at the intake manifold $p_\mathrm{im}$;
• Temperature at the intake manifold $T_\mathrm{im}$; and
• EGR ratio $X_\mathrm{EGR} = \mathrm{CO_{2,in}} / \mathrm{CO_{2,out}}$, with $\mathrm{CO_{2,in}}$ and $\mathrm{CO_{2,out}}$ the concentration of CO2 as a fraction of the volume flow at the intake and exhaust, respectively.
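As an illustration of how the six inputs could be assembled per cycle, the following minimal sketch assumes that the total fuel energy is the sum of mass times lower heating value, that the blend ratio is the PFI share of that energy, and that the EGR ratio is the ratio of intake to exhaust CO2 concentration. The class name, function name and all numerical values are hypothetical placeholders, not measurements from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InCylinderConditions:
    q_total: float      # total injected fuel energy [J] (assumed definition)
    blend_ratio: float  # energy share of the port-injected fuel [-]
    soi_di: float       # start of injection of the DI fuel [deg CA]
    p_im: float         # intake manifold pressure [Pa]
    t_im: float         # intake manifold temperature [K]
    egr_ratio: float    # CO2-based EGR ratio [-] (assumed definition)


def in_cylinder_conditions(m_pfi, m_di, lhv_pfi, lhv_di,
                           soi_di, p_im, t_im, co2_in, co2_out):
    """Collect the six model inputs from per-cycle measurements."""
    e_pfi = m_pfi * lhv_pfi          # energy of the PFI fuel [J]
    e_di = m_di * lhv_di             # energy of the DI fuel [J]
    q_total = e_pfi + e_di
    return InCylinderConditions(
        q_total=q_total,
        blend_ratio=e_pfi / q_total,
        soi_di=soi_di,
        p_im=p_im,
        t_im=t_im,
        egr_ratio=co2_in / co2_out,
    )


# Placeholder numbers for illustration only; they are not measurements.
s_icc = in_cylinder_conditions(
    m_pfi=80e-6, m_di=20e-6, lhv_pfi=29.0e6, lhv_di=42.5e6,
    soi_di=-50.0, p_im=1.5e5, t_im=313.0, co2_in=0.02, co2_out=0.08,
)
```

Keeping the inputs in one immutable record makes it harder to mix up the order of the six variables when they are later stacked into the regression input vector.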
The variation in the in-cylinder conditions for the training data and validation data is shown in Figure 2. The diagonal shows the distribution of each measure for the in-cylinder conditions. The off-diagonal shows the joint distribution of the measures used for the in-cylinder conditions. The data set contains 95 different measurements consisting of $n_\mathrm{cyc} = 50$ consecutive cycles each. Both small and large cycle-to-cycle variations, and non-firing behaviour are present within the data set. In this work, each cycle is used and no averaging over the $n_\mathrm{cyc}$ in-cylinder conditions and in-cylinder pressure traces in a measurement is performed before analysis. The data set is randomly split into a training set of $n_\mathrm{train} = 75$ measurements and a validation set of the remaining $n_\mathrm{val} = 20$ measurements.
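A minimal sketch of such a split is given below. It assumes the split is made per measurement, so that all 50 cycles of one measurement end up in the same set; the function name and the random seed are arbitrary choices for illustration.

```python
import numpy as np

N_MEAS, N_TRAIN, N_CYC = 95, 75, 50   # values stated in the text


def split_measurements(n_meas=N_MEAS, n_train=N_TRAIN, seed=0):
    """Return sorted index arrays (train, validation) over measurements."""
    rng = np.random.default_rng(seed)      # the seed is an arbitrary choice
    perm = rng.permutation(n_meas)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])


train_idx, val_idx = split_measurements()
# Each measurement contributes all of its cycles to exactly one set.
n_train_cycles = train_idx.size * N_CYC
n_val_cycles = val_idx.size * N_CYC
```

Splitting on measurements rather than on individual cycles avoids having cycles from the same operating point in both the training and the validation set.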

Combustion Model
In this section, the data-based approach to model the in-cylinder pressure is introduced. It is based on the method presented in Vlaswinkel et al. [18]. The approach combines Principle Component Decomposition (PCD) and Gaussian Process Regression (GPR). To describe the in-cylinder pressure during the compression and power stroke, the PCD is used to minimise the amount of information required by separating the influence of the in-cylinder conditions $s_\mathrm{ICC}$ and the crank angle $\theta$ into two different mappings. GPR gives the possibility to model the in-cylinder pressure and cycle-to-cycle variation at different in-cylinder conditions.

Principle Component Decomposition of the In-Cylinder Pressure
The in-cylinder pressure $p(\theta, s^*_\mathrm{ICC})$ at crank angle $\theta \in \{-180^\circ, -180^\circ + \Delta\mathrm{CA}, \ldots, 180^\circ - \Delta\mathrm{CA}, 180^\circ\}$, with $\Delta\mathrm{CA}$ the crank angle resolution, is decomposed as
$$p(\theta, s^*_\mathrm{ICC}) = w^\top(s^*_\mathrm{ICC})\, f(\theta), \tag{1}$$
where $w(s^*_\mathrm{ICC})$ is a vector of weights and $f(\theta)$ is the vector of principle components. In these vectors, the $i$th element is related to the $i$th Principle Component (PC). The in-cylinder conditions $s^*_\mathrm{ICC} \in S^* \subset S$ are elements of the set $S^*$, which contains all in-cylinder conditions present in the training set, while the set $S$ spans the modelled operation domain. It is assumed that the in-cylinder pressure during the intake stroke is equal to $p_\mathrm{im}$. The PCs are computed using the eigenvalue method. The $n_\mathrm{train} \cdot n_\mathrm{cyc}$ in-cylinder pressures $p(\theta, s^*_\mathrm{ICC})$ contained in the training set are used. The vector $F_i$ is the $i$th unit eigenvector of the matrix $P P^\top$, where $P \in \mathbb{R}^{n_\mathrm{CA} \times n_\mathrm{train} n_\mathrm{cyc}}$ with $n_\mathrm{CA}$ the number of crank angle values. The elements of matrix $P$ are defined as
$$P_{ab} = p(\theta_a, s^*_{\mathrm{ICC},b}), \tag{2}$$
such that the $a$th row of $P$ contains the values of the in-cylinder pressure at the $a$th crank angle for all $s^*_\mathrm{ICC} \in S^*$ and the $b$th column of $P$ contains the full in-cylinder pressure at all $\theta \in \{-180^\circ, -180^\circ + \Delta\mathrm{CA}, \ldots, 180^\circ - \Delta\mathrm{CA}, 180^\circ\}$ for the $b$th $s^*_\mathrm{ICC}$. The $i$th PC is defined as
$$f_i(\theta_a) = F_{i,a}, \tag{3}$$
with $F_{i,a}$ the $a$th element of $F_i$. The weight related to the $i$th PC is given by
$$w_i(s^*_{\mathrm{ICC},b}) = \sum_{a=1}^{n_\mathrm{CA}} f_i(\theta_a)\, p(\theta_a, s^*_{\mathrm{ICC},b}). \tag{4}$$
The training set generates a single set of PCs. These PCs are ordered by relevance, where $i = 1$ is the most relevant PC. The determination of the PCs and the required number of PCs will be done later in this study.
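The decomposition can be sketched in a few lines of NumPy. The example below assumes that the PCs are the unit eigenvectors of $P P^\top$ sorted by decreasing eigenvalue and that the weights follow from projecting each trace onto the PCs. The function name and the synthetic traces are illustrative assumptions and do not represent measured data.

```python
import numpy as np


def principal_components(P, n_pc):
    """Decompose pressure traces into PCs and weights.

    P    : (n_CA, n_traces) array, one pressure trace per column.
    n_pc : number of principal components to keep.
    Returns F (n_CA, n_pc) with the PCs as columns and
    W (n_pc, n_traces) with the weights of every trace.
    """
    eigval, eigvec = np.linalg.eigh(P @ P.T)    # symmetric eigenproblem
    order = np.argsort(eigval)[::-1][:n_pc]     # most relevant PC first
    F = eigvec[:, order]                        # unit eigenvectors of P P^T
    W = F.T @ P                                 # projection of each trace
    return F, W


# Synthetic stand-in for measured traces: three fixed shapes with random
# amplitudes, so three PCs describe the data exactly.
rng = np.random.default_rng(0)
theta = np.linspace(-180.0, 180.0, 361)         # 1 deg CA resolution
shapes = np.stack([np.exp(-(theta / 40.0) ** 2),
                   np.exp(-((theta - 10.0) / 15.0) ** 2),
                   np.ones_like(theta)])
P = shapes.T @ rng.uniform(1.0, 50.0, size=(3, 40))

F, W = principal_components(P, n_pc=3)
P_rec = F @ W                                   # weighted sum of PCs
```

Because the eigenvectors are orthonormal, the weights are obtained by a plain matrix product and no least-squares fit is needed.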

Gaussian Process Regression to Capture Effects of In-Cylinder Conditions
GPR is used to estimate the behaviour of $w(s_\mathrm{ICC})$ over the full operation domain $S$. To include cycle-to-cycle variations, $w(s_\mathrm{ICC})$ is described by a stochastic process as
$$w(s_\mathrm{ICC}) \sim \mathcal{N}\big(\hat{w}(s_\mathrm{ICC}),\, W(s_\mathrm{ICC})\big), \tag{5}$$
with expected value $\hat{w}(s_\mathrm{ICC})$ and covariance matrix $W(s_\mathrm{ICC})$. During this study, the correlation between output variables is neglected (i.e., $W(s_\mathrm{ICC})$ is a diagonal matrix), since most literature on GPR assumes the output variables to be uncorrelated. This might affect the quality of the prediction of the cycle-to-cycle variation.
To improve the prediction accuracy and determination of the hyperparameters, normalised in-cylinder conditions $\bar{s}_\mathrm{ICC}$ and weights $\bar{w}_i(s^*_\mathrm{ICC})$ will be used. The in-cylinder condition scaling uses the mean $\mu_{s^*_\mathrm{ICC},j}$ and standard deviation $\sigma_{s^*_\mathrm{ICC},j}$ of the $j$th in-cylinder condition variable over the full training set $S^*$ as
$$\bar{s}_{\mathrm{ICC},j} = \frac{s_{\mathrm{ICC},j} - \mu_{s^*_\mathrm{ICC},j}}{\sigma_{s^*_\mathrm{ICC},j}}. \tag{6}$$
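The weight scaling uses the mean $\mu_{w_i}$ and standard deviation $\sigma_{w_i}$ of the $i$th weight over the full training set $S^*$ as
$$\bar{w}_i(s^*_\mathrm{ICC}) = \frac{w_i(s^*_\mathrm{ICC}) - \mu_{w_i}}{\sigma_{w_i}}. \tag{7}$$
Following [23], the scaled expected value and scaled covariance matrix without correlation can be computed as
$$\hat{\bar{w}}_i(s_\mathrm{ICC}) = K(\bar{s}_\mathrm{ICC}, \bar{s}^*_\mathrm{ICC}, \phi)\,\big[K(\bar{s}^*_\mathrm{ICC}, \bar{s}^*_\mathrm{ICC}, \phi) + \varphi_n I\big]^{-1}\,\bar{w}_i \tag{8}$$
and
$$\bar{W}_{ii}(s_\mathrm{ICC}) = K(\bar{s}_\mathrm{ICC}, \bar{s}_\mathrm{ICC}, \phi) - K(\bar{s}_\mathrm{ICC}, \bar{s}^*_\mathrm{ICC}, \phi)\,\big[K(\bar{s}^*_\mathrm{ICC}, \bar{s}^*_\mathrm{ICC}, \phi) + \varphi_n I\big]^{-1}\,K(\bar{s}^*_\mathrm{ICC}, \bar{s}_\mathrm{ICC}, \phi), \tag{9}$$
where $K(\cdot, \cdot, \phi)$ is the kernel, $\phi$ and $\varphi_n$ are the kernel's hyperparameters, and $\bar{w}_i$ is the vector of scaled weights related to the $i$th PC in the training set. The selection of both elements will be discussed in the next section.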
To optimise the set of hyperparameters $\phi$ and $\varphi_n$ found in the kernels, the marginal log-likelihood is maximised for each PC separately. The marginal log-likelihood is often used in determining the hyperparameters in GPR and does not depend on the kernel type. It is given by
$$\log p(\bar{w}_i \mid s^*_\mathrm{ICC}, \phi, \varphi_n) = -\tfrac{1}{2}\,\bar{w}_i^\top K_{s^*_\mathrm{ICC}}^{-1}\,\bar{w}_i - \tfrac{1}{2}\log\det K_{s^*_\mathrm{ICC}} - \tfrac{n_\mathrm{train} n_\mathrm{cyc}}{2}\log 2\pi, \tag{10}$$
where $\bar{w}_i$ is a vector of the weights related to the $i$th PC at measured $s^*_\mathrm{ICC}$ in the training set and $K_{s^*_\mathrm{ICC}} := K(\bar{s}^*_\mathrm{ICC}, \bar{s}^*_\mathrm{ICC}, \phi) + \varphi_n I$. Finally, the scaled expected value and scaled covariance matrix are descaled to complete the description of (5). The descaled expected value is given by
$$\hat{w}_i(s_\mathrm{ICC}) = \hat{\bar{w}}_i(s_\mathrm{ICC})\,\sigma_{w_i} + \mu_{w_i} \tag{11}$$
and the descaled covariance matrix is given by
$$W_{ii}(s_\mathrm{ICC}) = \bar{W}_{ii}(s_\mathrm{ICC})\,\sigma_{w_i}^2. \tag{12}$$
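The scaling, prediction, log-likelihood and descaling steps for a single PC weight can be sketched as follows. This is a minimal illustration under several assumptions: a Matérn kernel with $\nu = 3/2$ and a scalar length scale, a signal amplitude that enters as `phi_f ** 2`, `phi_n` acting as a noise variance on the diagonal, and hand-picked hyperparameter values instead of optimised ones. The function names and the toy data are hypothetical.

```python
import numpy as np


def matern32(A, B, phi_f, phi_l):
    """Matern kernel with nu = 3/2 and a scalar length scale (no ARD)."""
    r = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)) / phi_l
    return phi_f ** 2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def gpr_single_weight(S_train, w_train, S_query, phi_f, phi_l, phi_n):
    """Predict mean and variance of one PC weight, plus the log-likelihood."""
    # Scale inputs and weights with training-set statistics.
    mu_s, sd_s = S_train.mean(axis=0), S_train.std(axis=0)
    mu_w, sd_w = w_train.mean(), w_train.std()
    Sb, Qb = (S_train - mu_s) / sd_s, (S_query - mu_s) / sd_s
    wb = (w_train - mu_w) / sd_w

    # K_s* = K(s*, s*) + phi_n I, factorised once with Cholesky.
    K = matern32(Sb, Sb, phi_f, phi_l) + phi_n * np.eye(len(Sb))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, wb))   # K_s*^-1 wb

    Kq = matern32(Qb, Sb, phi_f, phi_l)
    mean_b = Kq @ alpha                                    # scaled mean
    v = np.linalg.solve(L, Kq.T)
    var_b = phi_f ** 2 - (v ** 2).sum(axis=0)              # scaled variance

    # Marginal log-likelihood; log det K = 2 * sum(log(diag(L))).
    log_ml = (-0.5 * wb @ alpha - np.log(np.diag(L)).sum()
              - 0.5 * len(wb) * np.log(2.0 * np.pi))

    # Descale the mean and the variance.
    return mean_b * sd_w + mu_w, var_b * sd_w ** 2, log_ml


# Toy data on a small grid of two inputs; not engine measurements.
g1, g2 = np.meshgrid(np.linspace(0.0, 1.0, 5), np.linspace(300.0, 330.0, 4))
S_train = np.column_stack([g1.ravel(), g2.ravel()])
w_train = np.sin(3.0 * S_train[:, 0]) + 0.01 * S_train[:, 1]
mean, var, log_ml = gpr_single_weight(S_train, w_train, S_train,
                                      phi_f=1.0, phi_l=1.0, phi_n=1e-6)
```

In practice the three hyperparameters would be found by maximising `log_ml`, for instance by passing its negative to a general-purpose optimiser; repeating the procedure for every PC gives the diagonal of the covariance matrix.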

Reconstructing the In-Cylinder Pressure with Cycle-to-Cycle Variation
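The PCs $f(\theta)$ (Section 3.1) and the estimated behaviour of $w(s_\mathrm{ICC})$ (Section 3.2) can be combined to reconstruct a predicted in-cylinder pressure $p(\theta, s_\mathrm{ICC})$. Using (1), the mean and variance of the in-cylinder pressure can be described by
$$\mathrm{E}\big[p(\theta, s_\mathrm{ICC})\big] = \hat{w}^\top(s_\mathrm{ICC})\, f(\theta) \tag{13}$$
and
$$\mathrm{Var}\big[p(\theta, s_\mathrm{ICC})\big] = f^\top(\theta)\, W(s_\mathrm{ICC})\, f(\theta), \tag{14}$$
respectively.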
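For a diagonal covariance matrix, the reconstruction reduces to two matrix-vector products. The sketch below assumes the PCs are stored as columns of an array `F`; the function name and the random placeholder inputs are illustrative only.

```python
import numpy as np


def pressure_mean_and_variance(F, w_mean, w_var):
    """Combine PCs with predicted weights.

    F      : (n_CA, n_pc) PCs as columns, so row a holds f(theta_a).
    w_mean : (n_pc,) predicted weight means.
    w_var  : (n_pc,) predicted weight variances (diagonal of W).
    """
    p_mean = F @ w_mean              # w_hat^T f(theta) for every theta
    p_var = (F ** 2) @ w_var         # f^T W f with diagonal W
    return p_mean, p_var


rng = np.random.default_rng(1)
F = np.linalg.qr(rng.standard_normal((361, 8)))[0]   # placeholder orthonormal PCs
w_mean = rng.standard_normal(8)
w_var = rng.uniform(0.1, 1.0, 8)
p_mean, p_var = pressure_mean_and_variance(F, w_mean, w_var)
p_std = np.sqrt(p_var)               # cycle-to-cycle spread per crank angle
```

If correlation between the weights were included, `w_var` would become a full matrix and the variance would require the complete quadratic form instead of the element-wise square.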

Combustion Model Identification
The PCD and GPR require the selection of the number of PCs as well as the kernel type and hyperparameters. The training set is used to determine the PCs and values for the hyperparameters, while the validation set is used to determine the required number of PCs $n_\mathrm{PC}$ and the best performing kernel type. For this selection, an assessment is made on the prediction accuracy of combustion measures that are relevant for control. To this end, the Mean Absolute Error (MAE) is analysed, which is defined as
$$\mathrm{MAE} = \frac{1}{n_\mathrm{val}} \sum_{k=1}^{n_\mathrm{val}} \big| z_{\mathrm{meas},k} - z_{\mathrm{model},k} \big|,$$
where $n_\mathrm{val}$ is the number of validation measurements, and $z_\mathrm{meas}$ and $z_\mathrm{model}$ are the combustion metrics resulting from the measured in-cylinder pressure or modelled in-cylinder pressure, respectively. The following combustion measures are studied:
• gross Indicated Mean Effective Pressure $\mathrm{IMEP_g} = \frac{1}{V_d} \int_{-180^\circ}^{180^\circ} p(\theta)\, \mathrm{d}V(\theta)$ with displacement volume $V_d$;
• peak pressure $\max(p(\theta))$;
• peak pressure rise-rate $\max\left(\frac{\mathrm{d}p}{\mathrm{d}\theta}\right)$;
• crank angle where 50% of the total heat is released (CA50), with the heat release $Q$ determined following [24];
• burn duration CA75 − CA25, with CA75 and CA25 computed in a similar fashion as CA50; and
• burn ratio $R_b$ [25].
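Some of these measures follow directly from a pressure trace. The sketch below computes the gross IMEP, the peak pressure rise-rate and the MAE; it assumes a slider-crank volume function, trapezoidal integration and finite-difference derivatives. The geometry values are placeholders and not the specifications of Table 1, and the heat-release-based measures are left out because they depend on the model of [24].

```python
import numpy as np


def cylinder_volume(theta_deg, bore, stroke, conrod, compression_ratio):
    """Slider-crank cylinder volume [m^3]; theta = 0 is top dead centre."""
    theta = np.deg2rad(theta_deg)
    crank = stroke / 2.0
    area = np.pi / 4.0 * bore ** 2
    v_d = area * stroke                              # displacement volume
    v_c = v_d / (compression_ratio - 1.0)            # clearance volume
    piston = crank * np.cos(theta) + np.sqrt(
        conrod ** 2 - (crank * np.sin(theta)) ** 2)
    return v_c + area * (conrod + crank - piston), v_d


def imep_gross(p, volume, v_d):
    """Trapezoidal integral of p dV over compression and power stroke."""
    return np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(volume)) / v_d


def peak_pressure_rise_rate(p, theta_deg):
    """Maximum of dp/dtheta from finite differences [Pa/deg CA]."""
    return np.max(np.gradient(p, theta_deg))


def mean_absolute_error(z_meas, z_model):
    """MAE between measured and modelled values of one combustion measure."""
    return np.mean(np.abs(np.asarray(z_meas) - np.asarray(z_model)))


theta = np.linspace(-180.0, 180.0, 1801)             # 0.2 deg CA resolution
# Placeholder geometry, not the values of Table 1.
volume, v_d = cylinder_volume(theta, bore=0.13, stroke=0.16, conrod=0.26,
                              compression_ratio=16.0)
# Idealised trace: constant 10 bar during expansion, zero otherwise.
p = np.where(theta > 0.0, 10.0e5, 0.0)
imep = imep_gross(p, volume, v_d)
```

For the idealised trace the integral equals the constant pressure times the displacement volume, so the computed IMEP should be close to 10 bar, which gives a quick sanity check of the volume function.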

Selection of Principal Components
The first hyperparameter is the number of PCs $n_\mathrm{PC}$. The GPR formulation proposed in Section 3.2 is not used in this part of the discussion. Figure 3 shows the four most relevant PCs derived from the training data, as discussed in Section 3.1. This figure illustrates that adding more PCs adds higher frequency components to the in-cylinder pressure. Figure 4 shows the absolute error in the corresponding combustion metrics by comparing measurements and model results. The modelled, decomposed in-cylinder pressure is based on an increasing number of PCs using (3) to compute the required weights. Each measured cycle in the validation set is analysed separately. The figure indicates the minimum, maximum, median, first and third quartile, while the crosses show outliers. It can be seen that the largest improvement is made at lower numbers of PCs. From the used training and validation sets, it is concluded that having more than eight PCs gives a negligible improvement. Therefore, $n_\mathrm{PC} = 8$ is used in this study.
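The effect of the number of PCs can be explored with a short truncation study. The sketch below uses the RMS reconstruction error of validation traces as a simpler stand-in for the combustion-measure errors of Figure 4. It relies on the fact that the left singular vectors of the training matrix are the unit eigenvectors of $P P^\top$; the function names and the synthetic traces are assumptions for illustration.

```python
import numpy as np


def truncation_rms_error(P_train, P_val, n_pc_max):
    """RMS reconstruction error of validation traces for n_PC = 1..n_pc_max."""
    # Left singular vectors of P_train are the unit eigenvectors of
    # P_train P_train^T, already sorted from most to least relevant.
    U = np.linalg.svd(P_train, full_matrices=False)[0]
    errors = []
    for n_pc in range(1, n_pc_max + 1):
        F = U[:, :n_pc]
        P_rec = F @ (F.T @ P_val)            # weights, then weighted sum of PCs
        errors.append(np.sqrt(np.mean((P_val - P_rec) ** 2)))
    return np.array(errors)


rng = np.random.default_rng(2)
theta = np.linspace(-180.0, 180.0, 361)


def synthetic_traces(n):
    """Bell-shaped placeholder traces with random phasing and width."""
    centre = rng.uniform(0.0, 15.0, n)
    width = rng.uniform(15.0, 30.0, n)
    return 50.0 * np.exp(-((theta[:, None] - centre) / width) ** 2)


errors = truncation_rms_error(synthetic_traces(60), synthetic_traces(20),
                              n_pc_max=12)
```

The RMS error cannot increase when a PC is added, because the PCs are orthonormal and each new PC only extends the subspace onto which the traces are projected. Errors in derived measures such as CA50 do not have this guarantee, which is why they are evaluated separately.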

Selection of Kernel
Another important aspect in the quality of the model lies in the chosen kernel, which describes the correlation between all measured $w(s^*_\mathrm{ICC})$ and the to-be-predicted mean $\hat{w}(s_\mathrm{ICC})$ and variance $W(s_\mathrm{ICC})$. The kernel types compared in this study rely on a distance measure between $\bar{s}_\mathrm{ICC}$ and $\bar{s}'_\mathrm{ICC}$, where $\bar{s}_\mathrm{ICC}$ and $\bar{s}'_\mathrm{ICC}$ are scaled in-cylinder conditions. Each element of the kernel is computed individually. The three kernels used in this work have, respectively, the sets of hyperparameters $\phi = \{\varphi_f, \Phi_l\}$; $\phi = \{\varphi_f, \Phi_l\}$; and $\phi = \{\varphi_f, \varphi_\alpha, \Phi_l\}$. For each kernel, a distinction is made between with and without Automatic Relevance Determination (ARD). In the case where ARD is not used, the hyperparameter $\Phi_l$ reduces to a scalar. In the case where ARD is used, the hyperparameter $\Phi_l$ is a diagonal matrix with unique elements on the diagonal. The hyperparameters are determined by maximising the marginal log-likelihood as described in (10) using the training set.
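To make the role of the hyperparameters concrete, the sketch below implements three standard stationary kernels whose hyperparameter sets match the ones listed: squared exponential, Matérn with $\nu = 3/2$, and rational quadratic. Only the Matérn kernel is named explicitly in this study, so the other two families, the textbook form of the distance measure and all function names are assumptions.

```python
import numpy as np


def scaled_distance(A, B, phi_l):
    """Distance between rows of A and B after dividing by the length scales.

    phi_l is a scalar without ARD, or one length scale per input with ARD
    (the diagonal of Phi_l).
    """
    diff = (A[:, None, :] - B[None, :, :]) / np.asarray(phi_l)
    return np.sqrt((diff ** 2).sum(axis=-1))


def squared_exponential(A, B, phi_f, phi_l):
    r = scaled_distance(A, B, phi_l)
    return phi_f ** 2 * np.exp(-0.5 * r ** 2)


def matern32(A, B, phi_f, phi_l):
    r = scaled_distance(A, B, phi_l)
    return phi_f ** 2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def rational_quadratic(A, B, phi_f, phi_alpha, phi_l):
    r = scaled_distance(A, B, phi_l)
    return phi_f ** 2 * (1.0 + r ** 2 / (2.0 * phi_alpha)) ** (-phi_alpha)


rng = np.random.default_rng(3)
S = rng.standard_normal((5, 6))                       # five scaled conditions
K_iso = matern32(S, S, phi_f=1.5, phi_l=2.0)          # without ARD
K_ard = matern32(S, S, phi_f=1.5,
                 phi_l=np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0]))  # with ARD
K_se = squared_exponential(S, S, phi_f=1.5, phi_l=2.0)
K_rq = rational_quadratic(S, S, phi_f=1.5, phi_alpha=1.0e6, phi_l=2.0)
```

With ARD, a large length scale for one input flattens the kernel along that direction, which indicates that the corresponding in-cylinder condition has little influence on the weight. For a very large `phi_alpha` the rational quadratic kernel approaches the squared exponential kernel.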

For the studied combustion measures, Tables 2 and 3 show the mean absolute error in the mean behaviour and in the standard deviation, respectively. For each combustion metric, the best result is highlighted. In some cases the difference between the best and second best option is negligible. The Matérn kernel with $\nu = 3/2$ gives the best result for most combustion metrics in both the mean behaviour and the standard deviation for the used data sets. The resulting MAE of the mean-value behaviour is comparable to or lower than the modelling errors found in literature [14,13,9,10,11,17,16,12,15].

Validation of the Prediction Quality of the Combustion Model
The main goal of this work is to predict the in-cylinder pressure and cycle-to-cycle variation. In this section, the outcome of the model is compared to measurements using the validation data set. The hyperparameters shown in Table 4 are used. These choices for hyperparameters give the overall best prediction for the used data set, as discussed in Section 4.

Variation in Start-of-Injection Directly Injected Fuel
Figure 5 shows the modelled mean value and cycle-to-cycle variation of important combustion parameters over a range of $\mathrm{SOI_{DI}}$ and the nominal conditions shown in Table 5. The quality of the prediction is classified in Table 6. Except for the peak-pressure rise rate, the mean value of the model is similar to that of the measurements. The modelled trend of the peak-pressure rise rate seems to correspond to the measured values. The standard deviation of the model only matches the measurements for $\max(p(\theta))$. The trend of the standard deviation of the model for $\max\left(\frac{\mathrm{d}p}{\mathrm{d}\theta}\right)$ and $R_b$ seems correct, but it is either too high or too low. The standard deviation of the model does not match the measurements for $\mathrm{IMEP_g}$, CA50 and CA75 − CA25.

Variation in Intake Manifold Temperature
Figure 6 shows the modelled mean value and cycle-to-cycle variation of important combustion parameters over a range of $T_\mathrm{im}$ and the nominal conditions shown in Table 5. The quality of the prediction is classified in Table 7. Similarly to the sweep of $\mathrm{SOI_{DI}}$, the mean value of the model is similar to that of the measurements except for the peak-pressure rise rate. The modelled trend of the peak-pressure rise rate seems to correspond to the measured values. The standard deviation of the model only matches the measurements for $\max(p(\theta))$ and CA50. The trend of the standard deviation of the model for CA75 − CA25 seems correct, but it is too high. The standard deviation of the model does not match the measurements for $\mathrm{IMEP_g}$, $\max\left(\frac{\mathrm{d}p}{\mathrm{d}\theta}\right)$ and $R_b$.

Discussion
In both sweeps, the predicted standard deviations do not always match the measurements. In (5), $w_i(s_\mathrm{ICC})$ and $w_j(s_\mathrm{ICC})$ $\forall i, j \in \{1, 2, \ldots, m\}$ are assumed to be independent to align with the available GPR literature; however, this independence is not necessarily the case. To evaluate the correlation between weights at a fixed $s_\mathrm{ICC}$, the Pearson correlation matrix $R$ is used. Its elements are given by
$$R_{ij} = \frac{\mathrm{E}\big[\big(w_i(s_\mathrm{ICC}) - \mu_{w_i}(s_\mathrm{ICC})\big)\big(w_j(s_\mathrm{ICC}) - \mu_{w_j}(s_\mathrm{ICC})\big)\big]}{\sigma_{w_i}(s_\mathrm{ICC})\,\sigma_{w_j}(s_\mathrm{ICC})},$$
where $\mu_{w_i}(s_\mathrm{ICC})$ and $\sigma_{w_i}(s_\mathrm{ICC})$ are the mean and standard deviation of the measured weights at $s_\mathrm{ICC}$, respectively, and the expectation is taken over the measured cycles. The values of $R$ range from -1 to 1. When an element of $R$ is zero, there is no correlation between the two variables. However, when an element is -1 or 1, there is full correlation between the two variables. The determinant of $R$ can be used as a measure for the amount of correlation, where $\det(R)$ ranges from 0 to 1. If $\det(R) = 1$, all variables are fully uncorrelated. However, if $\det(R) = 0$, the variables are linearly dependent, for example because at least two variables are fully correlated.
Table 7: The quality of the predictions for each combustion measure in Figure 6 where four options are distinguished: 1) prediction is correct (✓), 2) the trend is followed but the predicted values are too high (↑), 3) the trend is followed but predicted values are too low (↓), and 4) the prediction is incorrect (×).
Figure 7 shows the distribution of the weights for 50 consecutive cycles running at a constant $s^*_\mathrm{ICC} \in S^*$ with the least amount of coupling according to the determinant of the Pearson correlation matrix. In Figure 7, the weights have been scaled as $\bar{w}_i(s_\mathrm{ICC}) = \frac{w_i(s_\mathrm{ICC}) - \mu_{w_i}(s_\mathrm{ICC})}{\sigma_{w_i}(s_\mathrm{ICC})}$ to emphasise the coupling. The corresponding symmetric Pearson correlation matrix has $\det(R) = 0.23$. This shows that some of the weights are significantly correlated, as is also illustrated in Figure 7. Therefore, it is no surprise that the predicted cycle-to-cycle variation deviates from the measurements in the proposed model. This emphasises the importance of developing GPR methods that include correlation between the outputs.
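The correlation check can be reproduced with a few lines of NumPy. The sketch below assumes the weights of consecutive cycles at one operating point are stored row-wise; the function name and the two small test cases are illustrative and not taken from the measurements.

```python
import numpy as np


def correlation_determinant(W_cycles):
    """W_cycles: (n_cyc, n_pc) weights of consecutive cycles at one s_ICC."""
    R = np.corrcoef(W_cycles, rowvar=False)   # Pearson correlation matrix
    return R, np.linalg.det(R)


# Two constructed cases: uncorrelated columns and fully coupled columns.
x = np.array([1.0, -1.0, 1.0, -1.0])
y = np.array([1.0, 1.0, -1.0, -1.0])
_, det_uncorrelated = correlation_determinant(np.column_stack([x, y]))
_, det_coupled = correlation_determinant(np.column_stack([x, 2.0 * x + 1.0]))
```

The first case gives a determinant close to one and the second a determinant close to zero, which are the two limits described in the text. Evaluating the function for every operating point in the training set allows the points to be ranked by their amount of coupling.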

Conclusions
In this study, a data-based model for the in-cylinder pressure and corresponding cycle-to-cycle variations is proposed. This model combines a PCD of the in-cylinder pressure and GPR to map in-cylinder conditions to a resulting in-cylinder pressure and the corresponding size of the cycle-to-cycle variation.
In the presented approach, the correlation between $w_i(s_\mathrm{ICC})$ and $w_j(s_\mathrm{ICC})$ has been neglected for ease of implementation. To improve the accuracy of the cycle-to-cycle variations, this correlation should be added. However, very few approaches are known in literature that extend the GPR framework to include correlation between model outputs.
The proposed data-based modelling approach is successfully applied to an experimental RCCI engine set-up.
The assumption that the model can be split in a general principal component part and operating condition dependent weights is confirmed. A detailed analysis of the hyperparameters for the PCD and GPR has been performed. It was found that, for the used data set, more than eight PCs do not improve the accuracy of the decomposition based on important combustion measures. For the GPR, the Matérn kernel with $\nu = 3/2$ and without ARD gives the best results. The prediction quality of the evaluated combustion measures has an overall accuracy of 13.5% and 65.5% in mean behaviour and standard deviation, respectively. The peak-pressure rise-rate is traditionally hard to predict; in the proposed model it has an accuracy of 22.7% and 96.4% in mean behaviour and standard deviation, respectively.
In conclusion, the mean-value performance of our model is comparable to or better than that of models found in literature. This shows that, even when neglecting correlation, the model performs well. The model can be used for in-cylinder pressure shaping as proposed in Vlaswinkel and Willems [22]. Furthermore, it can be used in model-based optimisation approaches that take into account cycle-to-cycle variations and safety criteria. When combined with the PCD-based emission model of Henningsson et al. [19], the model provides a base for optimisation approaches with emission constraints.

Figure 1: Schematic of the single cylinder PACCAR MX13 engine equipped with Exhaust Gas Recirculation (EGR), Direct Injection (DI) and Port Fuel Injection (PFI).

Figure 2: Distribution of the in-cylinder conditions of the training (black) and validation data (red)

Figure 3: Four most relevant Principle Components (PCs) resulting from the used training data
Gross indicated mean effective pressure (solid) and 5% cov($\mathrm{IMEP_g}$) (dashed)

Figure 5: Average and cycle-to-cycle variation of important combustion measures (black) and the measured distribution (red) for different values of $\mathrm{SOI_{DI}}$ and the nominal conditions shown in Table 5 using the hyperparameters as shown in Table 4.

Figure 6: Average and cycle-to-cycle variation of important combustion measures (black) and the measured distribution (red) for different values of $T_\mathrm{im}$ and the nominal conditions shown in Table 5 using the hyperparameters as shown in Table 4.
Figure 7: Distribution of the weights for 50 consecutive cycles running at a constant $s^*_\mathrm{ICC} \in S^*$ with the least amount of coupling according to the determinant of the Pearson correlation matrix. The weights have been scaled as $\bar{w}_i(s_\mathrm{ICC}) = \frac{w_i(s_\mathrm{ICC}) - \mu_{w_i}(s_\mathrm{ICC})}{\sigma_{w_i}(s_\mathrm{ICC})}$ to emphasise the coupling.

Table 1 :
Specifications of the engine setup

Table 2 :
Mean absolute error in the mean behaviour of important combustion metrics for the validation set using different kernels with $n_\mathrm{PC} = 8$. The best result for each combustion metric is highlighted.

Table 3 :
Mean absolute error in the standard deviation of important combustion metrics for the validation set using different kernels with $n_\mathrm{PC} = 8$. The best result for each combustion metric is highlighted.

Table 4 :
Selected hyperparameters and kernel used during the validation in Section 5


Table 5 :
Nominal operating conditions of the simulated model for the results shown in Figures 5 and 6. For reference, the ranges in experiments are indicated.

Table 6 :
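The quality of the predictions for each combustion measure in Figure 5 where four options are distinguished: 1) prediction is correct (✓), 2) the trend is followed but the predicted values are too high (↑), 3) the trend is followed but predicted values are too low (↓), and 4) the prediction is incorrect (×).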