As discussed in Section 1, the first step of the current study is to build prediction models whose estimates of the engine and SCR system output are as close as possible to the actual output y for an input vector x, using optimal SVM (support vector machine) models. The SVM is particularly appropriate for obtaining the prediction model of a stochastic system from a small amount of experimental data [25]. After obtaining the prediction models, the non-dominated sorting genetic algorithm II (NSGA-II) was implemented to deal with the trade-off between the two conflicting objective functions of NOx conversion efficiency and NH3 slip at certain working points. The whole procedure of modeling and multi-objective optimization was implemented in MATLAB® (MathWorks, Inc., Natick, MA, USA).
2.2. Data Preprocessing
Generally, there are large numerical discrepancies among the input variables, and, as a consequence, variables with small numerical ranges may appear to have little influence on the system output y. Therefore, in order to eliminate the negative impact caused by these discrepancies, each variable was normalized to vary from 0 to 1. For each column of the data sample xi, the normalized data sample x̄i is given by:

x̄i = (xi − xi(min)) / (xi(max) − xi(min)),

where xi is the original data, x̄i is the normalized data, and xi(max) and xi(min) are the maximum and minimum in xi, respectively. Subsequently, the normalized data were used as the inputs for the SVM models.
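This min-max scaling can be sketched in Python (a minimal illustration; the original study used MATLAB, and the column values below are hypothetical):

```python
def min_max_normalize(column):
    """Scale one column of samples to the [0, 1] range (min-max normalization)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no spread to rescale; map every entry to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical example: one input column (e.g., exhaust temperatures in degrees C)
temps = [180.0, 240.0, 300.0, 420.0]
print(min_max_normalize(temps))  # [0.0, 0.25, 0.5, 1.0]
```

Each input column is scaled independently, so variables with very different physical magnitudes contribute on an equal footing to the SVM training.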
2.3. Building and Optimizing the SVM Models
The Support Vector Machine (SVM) [26] was developed by Vapnik and Cortes in 1995. As a novel kind of machine learning method, the SVM is gaining increasing popularity because of its many attractive features and promising empirical performance. A detailed description of SVM theory can be found in references [27,28,29].
Here, only a brief description is given. A support vector machine takes advantage of the kernel function to map the input data onto a high-dimensional feature space. Linear regression is then performed in the high-dimensional feature space. As a result, non-linear problems can be addressed in a linear space through non-linear feature mapping. After training on the input data sample, the SVM model can be used to predict variables whose values are unknown.
The final prediction function used by an SVM is as follows:

f(x) = Σi ai K(xi, x) + b,

where ai is a Lagrange multiplier and xi is a feature vector corresponding to a training sample. The coefficients ai and the constant b are optimized during training. C is a penalty factor, which indicates the degree of attention paid to outliers and bounds the range of ai; the larger the value of C, the more attention is paid to the outliers. The kernel function K(xi, x) is one of the most important parts of the SVM model. There are four common kinds of kernel functions. Among them, the Gaussian radial basis function kernel is most commonly used because of its effectiveness and speed in the modeling process [30].
The Gaussian function takes the form:

K(x, xi) = exp(−g‖x − xi‖²),

where g is the parameter of the kernel function, which is as important as the penalty factor C, and x and xi represent independent variables.
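The prediction function and the Gaussian kernel above can be sketched together as follows. The support vectors, multipliers ai, and bias b below are hypothetical placeholders for values that SVM training (a QP solver, not shown here) would produce:

```python
import math

def rbf_kernel(x, xi, g):
    """Gaussian radial basis kernel: K(x, x_i) = exp(-g * ||x - x_i||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-g * sq_dist)

def svm_predict(x, support_vectors, alphas, b, g):
    """SVM prediction: f(x) = sum_i a_i * K(x_i, x) + b."""
    return sum(a * rbf_kernel(x, xi, g)
               for a, xi in zip(alphas, support_vectors)) + b

# Hypothetical trained model: two support vectors in a 2-D normalized input space.
# In practice the QP solver bounds each multiplier by the penalty factor C.
svs = [[0.2, 0.5], [0.8, 0.1]]
alphas = [1.5, -0.7]
print(svm_predict([0.2, 0.5], svs, alphas, b=0.1, g=2.0))
```

Because the kernel evaluates similarity to each support vector, the non-linear regression is carried out without ever forming the high-dimensional feature map explicitly.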
2.4. Model Parameter Optimization with Grid Search and GA
Based on the preceding discussion, it is clear that the penalty factor C and the kernel parameter g are vital to the SVM model: their selection directly determines its performance. In this study, a multi-algorithm combined method is proposed for optimizing these model parameters.
A Genetic Algorithm (GA) [31] is a kind of optimization algorithm that uses genetics to simulate the natural evolution process. Three basic operators of selection, crossover and mutation are used to generate better offspring populations in order to find exact or approximate solutions to optimization problems. GAs have been successfully applied to a broad spectrum of optimization problems owing to their high effectiveness and low time consumption [32].
Therefore, in this study, a genetic algorithm was used to optimize the C and g parameters of the SVM model. Furthermore, the k-fold cross-validated root mean squared error (RMSE) [33] was selected as the fitness of the objective function for the GA, to reflect the prediction accuracy of the SVM model:

RMSE = √((1/N) Σi (yi − ŷi)²),

where N is the number of samples, and y and ŷ represent the actual and predicted values, respectively. The main objective of the GA was to yield the smallest k-fold cross-validated RMSE by searching for the best combination of the C and g parameters for the SVM model.
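The k-fold cross-validated RMSE fitness can be sketched as follows; `fit` and `predict` are hypothetical stand-ins for SVM training and prediction, and the interleaved fold assignment is an illustrative choice:

```python
import math

def kfold_cv_rmse(xs, ys, fit, predict, k=5):
    """k-fold cross-validated RMSE: hold out each fold in turn, train on the
    rest, and pool the squared prediction errors over all N held-out samples."""
    n = len(xs)
    sq_err = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # simple interleaved fold assignment
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys))
                 if i not in test_idx]
        model = fit(train)
        for i in test_idx:
            sq_err += (ys[i] - predict(model, xs[i])) ** 2
    return math.sqrt(sq_err / n)

# Toy check with a mean predictor standing in for the SVM:
fit = lambda train: sum(y for _, y in train) / len(train)
predict = lambda model, x: model
print(kfold_cv_rmse([0, 1, 2, 3], [1.0, 1.0, 1.0, 1.0], fit, predict, k=2))
```

Because every sample is held out exactly once, this score penalizes overfitting to any single train/test split, which is why it serves well as the GA fitness.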
In a GA, prior to the initial random population generation, the range of each parameter needs to be given, and at present these ranges are almost always set empirically. If the ranges are too wide, the optimization process becomes time-consuming; on the other hand, if the ranges are too narrow, the best parameters may fall outside them. For this reason, a grid search is proposed in this study for determining the rough scope. In addition, existing studies have shown that identifying a good parameter pair by searching over exponentially growing sequences is more practical and less time-consuming, for example, C = 2⁻², 2⁻¹, ⋯, 2¹⁰; g = 2⁻¹, 2⁰, ⋯, 2¹⁰ [34].
The procedure of optimizing the C and g parameters of the SVM with grid search and GA is shown in Figure 2 and proceeds as follows:
Step 1: Involves the use of a grid search over the ranges C ∈ [2⁻¹⁰, 2¹⁵] and g ∈ [2⁻⁵, 2¹⁰] according to the exponential sequences mentioned earlier. The k-fold cross-validated RMSE of the SVM model is evaluated for every pair of C and g parameters in the process of finding the best pair.
Step 2: Based on the best pair of C and g parameters determined in Step 1, the ranges of each parameter in the GA are established. Subsequently, a two-dimensional random initial population, which is binary encoded, is created. Each dimension (chromosome) represents C and g, respectively.
Step 3: Involves the calculation of the fitness function of all populations given by the k-fold cross-validated RMSE for the SVM model.
Step 4: Based on fitness level (k-fold cross-validated RMSE), a better offspring population is generated through the use of the three basic genetic operators of selection, crossover and mutation. In this study, the probabilities for the selection, crossover and mutation operators were set to 0.9, 0.8 and 0.05, respectively.
Step 5: In this step, Step 4 is repeated until the stopping criterion (100 generations for the current study) is satisfied.
After these steps, the best pair of C and g was obtained, ending the optimization process for the SVM models.
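The coarse grid search of Step 1 can be sketched as follows. This is an illustration only: `cv_rmse` is a hypothetical callable standing in for training the SVM and computing its k-fold cross-validated RMSE, and the toy objective below replaces it with a function whose minimum is known:

```python
import itertools

def grid_search(cv_rmse, c_exps=range(-10, 16), g_exps=range(-5, 11)):
    """Coarse grid search over the exponential sequences C = 2^-10..2^15 and
    g = 2^-5..2^10, scoring each (C, g) pair by its cross-validated RMSE."""
    best = None
    for ce, ge in itertools.product(c_exps, g_exps):
        C, g = 2.0 ** ce, 2.0 ** ge
        score = cv_rmse(C, g)
        if best is None or score < best[0]:
            best = (score, C, g)
    # The winning (C, g) pair then seeds the narrower search ranges for the GA.
    return best

# Toy stand-in objective with a known minimum at C = 2^3, g = 2^1:
toy = lambda C, g: (C - 8.0) ** 2 + (g - 2.0) ** 2
print(grid_search(toy))  # (0.0, 8.0, 2.0)
```

The grid search is cheap because it evaluates only 26 × 16 candidate pairs, while the subsequent GA refines the pair continuously inside the much smaller region the grid identifies.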
2.5. Multi-Objective Optimization
After building and optimizing the three prediction models for the upstream and downstream NOx emissions as well as NH3 slip, the ultimate goal of this study was to optimize the urea injection amount to maximize the NOx conversion efficiency while minimizing the NH3 slip, a situation that characterizes a typical multi-objective optimization problem.
A multi-objective optimization problem is usually concerned with maximizing or minimizing a number of objective functions in the presence of certain inequality and equality constraints, as well as other constraints in the form of lower and upper bounds defining the decision variable space. Generally, multi-objective optimization problems do not have a single optimal solution but a Pareto optimal set, because the goals conflict and it is virtually impossible for all objectives to achieve their optima simultaneously.
The NSGA-II genetic algorithm was proposed by Deb et al. [35] to help reduce the computational complexity based on a certain number of decision variables and a given population of solutions, to preserve the elite members of a population of solutions, and to eliminate the need for a sharing parameter associated with other multi-objective evolutionary algorithms such as the Pareto Archived Evolution Strategy (PAES) [36] and the Strength Pareto Evolutionary Algorithm (SPEA) [37]. NSGA-II takes advantage of non-dominated sorting and crowding distance, so the algorithm has the capacity to approximate the best Pareto frontier and ensure that the obtained Pareto optimal solutions are well spread. A specific description of NSGA-II can be found in references [35,38,39].
The basic procedure for executing NSGA-II for the current multi-objective problem is summarized in Figure 3 and the steps are as follows:
Step 1: Involves fixing the parameters and the range of the decision variable (urea injection amount). In the current study, an initial solutions population of 100 and a maximum number of generations of 200 were used for the multi-objective optimization. The lower and upper bounds for the urea injection amount variable were 0 and 2000 mL/h, respectively.
Step 2: Based on the parameters and ranges determined in Step 1, a random initial population, which is binary encoded, was created.
Step 3: Involves the calculation of the ranks and crowding distance of the population, followed by the application of the tournament selection method for selecting the best solutions in a particular population for creating a mating pool for producing child solutions based on non-dominated sorting.
Step 4: Involves the generation of offspring and parent populations through crossover and mutation operators to ensure that the diversity within different generations of solutions is preserved. The best individuals were selected as the new population from the combined offspring and parent population based on non-dominated sorting. In this study, probabilities of 0.9 and 0.1 were used for the crossover and mutation operators of the multi-objective optimization framework, respectively.
Step 5: In this step, a repetition of Step 3 and Step 4 was carried out until the stopping criterion (200 generations for the current study) was satisfied.
After these steps, the optimal Pareto set of the decision variable (urea injection amount) was obtained.
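The non-dominated sorting at the heart of NSGA-II can be illustrated with a minimal Pareto-front filter. Both objectives are treated as minimized, so NOx conversion efficiency is negated; the candidate objective values below are hypothetical, not results from the study:

```python
def dominates(a, b):
    """Solution a dominates b when a is no worse in every objective and
    strictly better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep only the non-dominated solutions, i.e. NSGA-II's first front."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (-NOx conversion efficiency, NH3 slip in ppm) pairs, one per
# candidate urea injection rate:
candidates = [(-0.95, 30.0), (-0.90, 10.0), (-0.80, 25.0), (-0.97, 50.0)]
print(pareto_front(candidates))
```

Here the third candidate is dominated (it converts less NOx than the second while slipping more NH3) and is discarded; the surviving front expresses exactly the trade-off that the decision maker must resolve when choosing a urea injection amount.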