A Many-Objective Optimization for an Eco-Efﬁcient Flue Gas Desulfurization Process Using a Surrogate-Assisted Evolutionary Algorithm

: The ﬂue gas desulfurization process in coal-ﬁred power plants is energy and resource-intensive but the eco-efﬁciency of this process has scarcely been considered. Given the ﬂuctuating unit load and complex desulfurization mechanism, optimizing the desulfurization system based on the traditional mechanistic model poses a great challenge. In this regard, the present study optimized the eco-efﬁciency from the perspective of operating data analysis. We formulated the issue of eco-efﬁciency improvement into a many-objective optimization problem. Considering the complexity between the system inputs and outputs and to further reduce the computational cost, we constructed a Kriging model and made a comparison between this model and the response surface methodology based on two accuracy metrics. This surrogate model was then incorporated into the NSGA-III algorithm to obtain the Pareto-optimal front. As this Pareto-optimal front provides multiple alternative operating options, we applied the TOPSIS to select the most appropriate alternative set of operating parameters. This approach was validated using the historical operation data from the desulfurization system at a coal-ﬁred power plant in China with a 600 MW unit. The results indicated that the optimization would cause an improvement in the efﬁciency of desulfurization and energy efﬁciency but a slight increase in the consumption of limestone slurry. This study attempted to provide an effective operating strategy to enhance the eco-efﬁciency performance of desulfurization systems.


Introduction
China has set particularly ambitious targets for energy saving and carbon emission mitigation and is increasingly concerned about the eco-efficiency of industrial production [1]. Between 2019 and 2024, China was forecast to account for 40% of global renewable capacity expansion [2]. However, as China's electricity structure is still coal-dominated, coal demand and production remain high even in the context of COVID-19. The national bureau of statistics in China declared that 68.6% of the primary energy supply and 57.7% of the total energy consumption in 2019 was covered by coal [3]. Hitherto, coal is still supposed to be the primary energy source in the next decades [4]. As coal-fired power generation is widely recognized as a high pollution process with considerable atmospheric emissions, how to elevate the environmental performance has been a critical challenge for the sustainable transformation of coal-fired power plants.
The supercritical/ultra-supercritical technology and clean coal technology have shown remarkable progress and contributed to significant emission reductions in coal-fired power The supercritical/ultra-supercritical technology and clean coal technology have shown remarkable progress and contributed to significant emission reductions in coalfired power generation [5]. Nonetheless, the flue gas still contains substantial pollutants, including SO2 among others. Such pollutants can cause serious environmental issues, including fine sulfate particles [6] and acid rain. Eliminating sulfur dioxide emissions is one of the pivotal steps in flue gas treatment. The problem with this step is the massive consumption of energy and. For example, the wet desulfurization system consumes a tremendous amount of limestone slurry and has various energy-intensive pumps. More importantly, the control of the desulfurization process is commonly based on the rule-ofthumb which induces huge energy and slurry wastes [7]. This phenomenon also has significant potential for optimization for the desulfurization in coal-fired power plants.
Previous works on the ecological performance of the desulfurization system in a coalfired power plant are scarce. Additionally, the optimization of desulfurization considered only one or two objectives. In the present study, the ecological performances we attempt to optimize are systematic energy use, resources consumption, and desulfurization efficiency. An eco-efficient operation of the system tends to essentially synthesize these three environment-related indicators which are typically conflicting in nature [8]. The primary goal of this study was to figure out the optimal operation with high eco-efficiency, and the outline is presented in Figure 1. This optimization towards eco-efficiency with three objectives is a many-objective problem. An increased number of objectives can exponentially increase the non-dominated solutions and make the optimization computationally expensive, thereby, bringing obstacles into the subsequent decision-making work [9]. Thus, we applied the Non-Dominant Sorting Genetic Algorithm-III (NSGA-III) following the work by the authors of [9,10] to optimize the eco-efficiency of the WFGD system of a coal-fired power plant. To reduce the computational cost of NSGA-III, we incorporated a surrogate model (Kriging model) into this algorithm to approximate the objective function revealing the relationship between the input variables and the outputs of the system, i.e., the values of ecological indicators. Given that the dense representation of the Pareto-optimal front may hinder the eco-efficient decision-making, the technique for order preference by similarity to ideal solution (TOPSIS) was adopted to select the best operational parameter to set as the optimal operation. The present study provides a robust empirical basis to develop eco-efficient WFGD strategies and facilitate the sustainable transformation of coal-fired power plants.

Literature Review
The desulfurization system can be optimized in different ways, e.g., using alternative sorbent composition and redesigning the structures of critical components. Researchers have attempted to explore effective and affordable fume sorbents to enhance desulfur-Sustainability 2021, 13, 9015 3 of 17 ization performance. Magnesium oxide [11], iodine [12], magnesium hydrate [13], and amine [14] were demonstrated to be rather efficient but price prohibitive for massive applications. The modified structural property and device of the absorber promoted the gas−liquid mass transfer resulting in higher desulfurization efficiency. Chen and colleagues [15] designed a flow pattern controlling device with a perforated plate to enhance the desulfurization performance. Dou et al. [16] introduced an electrostatic spaying absorber to increase the contact area between SO 2 and Ca(OH) 2 slurry droplets. Fang et al. [17] simulated the flow in the spray tower numerically and found that a specific increase in the flue gas entrance velocity and a decrease in the inlet angle can extend the residence time of flue gas. Another important type of desulfurization enhancement method is based on the absorption mechanism. These approaches identified the characteristics of the desulfurization process and modified the influential factors. Michalski [18] studied the scrubber aerodynamic properties based on the balance of forces on droplets falling in the spray tower. Their model unveiled the relationship between pressure drop, residence time, and droplet concentration. Dou et al. [19] applied the two-film theory of mass transfer to the SO 2 removal process and further investigated how the droplet size of the spray, the flow rates of gas and liquid, and the pH value of the liquid compound influenced the reaction between SO 2 and slurry. Shen et al. [20] illustrated the oxidation mechanism in the magnesium-based wet flue gas desulfurization (WFGD) process. Understanding the kinetics and mechanism of sulfite oxidation contributes to the optimization of the FGD system. However, uncovering the mechanism of WFGD in specific cases requires solid knowledge of the physicochemical absorption and the characteristics of the relevant devices in the tower. Additionally, the excessive assumptions made in the mechanism models and the high computation cost limits the accurate capture of the complexities and characteristics of the WFGD system [21].
The desulfurization performance is closely related to a wide range of variables, which technically complicate the improvement of the WFGD process. Fortunately, the operation data can be monitored and stored by the distributed control system and the management information system, respectively, in the coal-fired power plant. Such an enormous amount of data makes artificial intelligence and data mining techniques alternative options to address the optimization of the desulfurization process. Liu et al. [7] adopted data mining techniques to optimize the economic cost and desulfurization efficiency under varying conditions. Uddin et al. [22] incorporated the Monte Carlo experiment into artificial neural network process models to establish the relationship between nine input and output variables. They further determined the optimal control variables and demonstrated the environmental benefits of the optimization. Qiao et al. [23] proposed an improved fuzzy clustering algorithm integrating K-means, information entropy, and C-means to select the optimum operation data with minimum cost. The basic idea of this type of approach is to extract the operation parameters with the best performance based on the historical operation data presenting the varying states of the desulfurization process.
Operation strategy optimization for the desulfurization system was primarily performed from the perspectives of economic cost, desulfurization efficiency, and energy consumption. For example, Wang et al. [24] optimized the combination of pumps with different types to reduce the power cost. They also investigated the relationship between the pH of slurry and the efficiency of desulfurization to obtain the optimal pH under the different power of the ultra-supercritical boil. Guo et al. [25] developed an artificial neural network for the prediction of SO 2 emissions and built a cost model of the desulfurization system. A particle swarm optimization was then designed to minimize the total operation cost subject to the SO 2 emission standard. Most of these studies optimized the desulfurization system to achieve one single or specific two goals. Bi-objective optimization problems are common in industrial cases [26][27][28], and multi-objective optimization in the literature has seldom involved over three objectives.

Problem Formulation
The limestone-gypsum wet flue gas desulfurization is the most widely used technology for SO 2 removal in coal-fired power plants due to its high efficiency, reliability, and cost-effectiveness [29]. As depicted in Figure 2, the general limestone-gypsum WFGD system is comprised of the sorbent preparation system, the gypsum production system, the flue gas system, the SO 2 absorption system, and the auxiliary system. The boiler flue gas pressurized by the booster fan enters the absorption tower from the bottom and flows upward, while the limestone slurry droplets are sprayed by the atomizer in the opposite direction. A series of chemical reactions and heat transfer processes simultaneously occur in the tower. These reactions amongst SO 2 in the flue gas, the limestone slurry, and the O 2 from oxidation fans generate calcium sulfate dihydrate, i.e., gypsum. The mist eliminator on the top of the absorption tower separates and captures the tiny droplets in the desulfurized flue gas flows to protect the flue pipes from erosion. The flue gas after mist elimination is directly exhausted into the atmosphere through the chimney. problems are common in industrial cases [26][27][28], and multi-objective optimization in the literature has seldom involved over three objectives.

Problem Formulation
The limestone-gypsum wet flue gas desulfurization is the most widely used technology for SO2 removal in coal-fired power plants due to its high efficiency, reliability, and cost-effectiveness [29]. As depicted in Figure 2, the general limestone-gypsum WFGD system is comprised of the sorbent preparation system, the gypsum production system, the flue gas system, the SO2 absorption system, and the auxiliary system. The boiler flue gas pressurized by the booster fan enters the absorption tower from the bottom and flows upward, while the limestone slurry droplets are sprayed by the atomizer in the opposite direction. A series of chemical reactions and heat transfer processes simultaneously occur in the tower. These reactions amongst SO2 in the flue gas, the limestone slurry, and the O2 from oxidation fans generate calcium sulfate dihydrate, i.e., gypsum. The mist eliminator on the top of the absorption tower separates and captures the tiny droplets in the desulfurized flue gas flows to protect the flue pipes from erosion. The flue gas after mist elimination is directly exhausted into the atmosphere through the chimney. Prior studies [29,30] have shown that the performance of the WFGD system is affected by multiple factors, such as the flue gas flow rate, the slurry flow rate, the air flow rate, the pH value of the limestone slurry, and the SO2 inlet concentration. The operational parameters, particularly the liquid gas ratio (L/G) and the calcium sulfur ratio (Ca/S), exert significant influence on sorbent consumption, SO2 removal efficiency, energy use, and equipment corrosion. It should be noted that SO2 concentration and the flow rate of the inlet flue gas are non-adjustable depending on the coal type and plant capacity. The number of circulation pumps, the flow rate of the limestone slurry, and the flow rate of air were selected as controllable parameters in this study. These parameters, to some extent, determine the pH and density of the slurry, the L/G ratio, and the Ca/S ratio affecting the performance indicators of the WFGD system.
The objective was to optimize the eco-efficiency indicators, i.e., energy use, limestone slurry consumption, and desulfurization efficiency while satisfying the regulation of SO2 emission. This optimization was based on historical operation data containing empirical Prior studies [29,30] have shown that the performance of the WFGD system is affected by multiple factors, such as the flue gas flow rate, the slurry flow rate, the air flow rate, the pH value of the limestone slurry, and the SO 2 inlet concentration. The operational parameters, particularly the liquid gas ratio (L/G) and the calcium sulfur ratio (Ca/S), exert significant influence on sorbent consumption, SO 2 removal efficiency, energy use, and equipment corrosion. It should be noted that SO 2 concentration and the flow rate of the inlet flue gas are non-adjustable depending on the coal type and plant capacity. The number of circulation pumps, the flow rate of the limestone slurry, and the flow rate of air were selected as controllable parameters in this study. These parameters, to some extent, determine the pH and density of the slurry, the L/G ratio, and the Ca/S ratio affecting the performance indicators of the WFGD system.
The objective was to optimize the eco-efficiency indicators, i.e., energy use, limestone slurry consumption, and desulfurization efficiency while satisfying the regulation of SO 2 emission. This optimization was based on historical operation data containing empirical knowledge of the operating characteristics of the WFGD system. We performed the system optimization for an industrial case, namely, the WFGD of a coal-fired power plant with a 600 MW unit at Chaozhou, Guangdong Province, China. The energy-consuming equipment in the desulfurization system mainly includes four circulation pumps, two oxidation fans, and two pressurized fans. Limestone slurry is the primary resource consumed in the SO 2 removal process. The vector of the input variables was expressed by x = [x 1 , x 2 , x 3 , x 4 ]. These variables refer to the flow rate of inlet SO 2 , the air flow rate, the power load, and the slurry flow rate. According to the legislation proclaimed by the local bureau of environmental protection, the concentration of outlet SO 2 must be no more than 35 mg/m 3 . The optimization problem can be formulated as follows: where f 1 (x), f 2 (x), and f 3 (x) denote the desulfurization efficiency, power load, and limestone slurry flow rate, respectively. g 1 (x) is the concentration of outlet SO 2 at sample x, and g 2 (x) is the desulfurization efficiency at sample x. g 1 (x) and g 2 (x) were constructed by a surrogate model due to the complexity between the variables. X is the set of samples used in this study. Figure 3 presents 900 samples of critical variables related to this optimization problem taken at 1 min intervals from the desulfurization system. The historical data of 15 h of desulfurization operations were obtained from the management information system at a coal-fired power plant in Chaozhou. The energy use in this figure indicates the total energy consumption of the aforementioned energy-consuming equipment in one minute. Sustainability 2021, 13, x FOR PEER REVIEW 6 of 18

Method Description
Understanding the relationship between the inputs and outputs of the WFGD system is a prerequisite of optimization. To reduce the computational cost in the many-objective optimization, we utilized a surrogate model to approximate this relationship based on historical operation data. Commonly used surrogate models in the optimization domain include Response Surface Methodology (RSM), Support Vector Machine (SVM), Artificial Neural Networks (ANN), and the Kriging model. SVM and ANN, however, have often been regarded as 'black box' and were computationally intensive when applied to problems with high dimensions and nonlinearity in particular. Therefore, the Kriging model was adopted in this study and performed based on the MATLAB toolbox "DACE" [31]. RSM was then employed as a comparative basis to test the accuracy of the surrogate

Method Description
Understanding the relationship between the inputs and outputs of the WFGD system is a prerequisite of optimization. To reduce the computational cost in the many-objective optimization, we utilized a surrogate model to approximate this relationship based on historical operation data. Commonly used surrogate models in the optimization domain include Response Surface Methodology (RSM), Support Vector Machine (SVM), Artificial Neural Networks (ANN), and the Kriging model. SVM and ANN, however, have often been regarded as 'black box' and were computationally intensive when applied to problems with high dimensions and nonlinearity in particular. Therefore, the Kriging model was adopted in this study and performed based on the MATLAB toolbox "DACE" [31]. RSM was then employed as a comparative basis to test the accuracy of the surrogate model. Then, the surrogate-assisted NSGA-III was applied to optimize the desulfurization efficiency, energy use, and limestone slurry consumption. Finally, the optimal scheme of operation conditions was selected from the Pareto front by the TOPSIS method. The Kriging model is a statistical interpolation technique proposed by the author of [32], based on the optimal linear unbiased estimation [33]. This model utilizes the training samples to establish the surrogate model for the prediction of the system output. The prediction result can provide not only the mean value but also the standard deviation [34]. To approximate the real function, the Kriging model is expressed as follows: where F(β, x) is the regression part, β is the coefficient vector, x is the variable vector, Z(x) denotes a Gaussian process with mean 0 and variance σ 2 . For any two points x i and x j , their covariance can be formulated as: is the correlation function of two points and is critical to the accuracy of approximation. n is the dimension of variables and θ k is the coefficient representing the correlation between two points in the k-th direction. Then, the correlation between any two points in the samples can be used to formulate the correlation matrix as: According to the characteristics of the surrogate, i.e., the unbiased estimation and minimum variance, the parameters β and σ 2 can be obtained as below. For the detailed derivation procedure, see the work by the authors of [31].
where F is the vector of polynomial function, . . , f p (x)] T , and Y is the response vector of the samples. However, both parameters are associated with coefficient θ. The definition of θ is based on the maximum likelihood estimation. Assuming a Gaussian process, the optimal θ satisfies the following optimization problem: where |R| is the determinant of R. This unconstrained optimization problem can be solved using an intelligent algorithm. With the optimal θ, the predicted mean value corresponding to an unknown point x * is expressed as: where the r(x*) is the vector representing the correlation between x * and the training samples as below: For the surrogate model, the Latin hypercube sampling method is usually adopted to sample training data [35]. This method can ensure the uniform distribution of samples in a multi-dimensional space.

Model Accuracy Metrics
To test the performance of the surrogate model, this study used two commonly known metrics: R-squared (R 2 ) and Root Mean Squared Error (RMSE), as presented below: where y i and y i are the true response values from the sample and the prediction values from the surrogate model, respectively. y i is the mean of the true response values, and n t is the number of samples. A lower RMSE and a greater R 2 (closer to 1) indicate a higher global approximation of the model. It should be noted that data in the test set should keep a certain distance from the data in training set to avoid over-optimistic testing results.

Surrogate-Assisted NSGA-III Algorithm
As larger number of dimensions and objectives in the optimization problem would exponentially increase the computational cost, the surrogate-assisted NSGA-III algorithm in the present study employed the Kriging model to approximate each objective function. The basic framework of this algorithm is presented in Figure 4. NSGA-III and NSGA-II have a similar structure but differ in their selection mechanisms. The selection process in NSGA-II depends on the crowding distance while in NSGA-III, the diversity amongst the population members is maintained and improved by adaptively updating the wellspread reference points. The primary stages in this algorithm are highlighted in Figure 4, including the genetic operators, reference-point determination, non-dominated sorting, normalization, and association operation. The following section briefly describes these five stages.
(1) Genetic operators Genetic operators are of importance to control the optimization process of evolutionary algorithms. In this section, the crossover and mutation operators were applied to create a new population with the same size as the initial population. Two parents x 1 (t) and x 2 (t) from the t-th generation were randomly picked and the crossover operation was performed using the simulated binary crossover technique [36] to generate two offspring x 1 (t + 1) and x 2 (t + 1): where the random parameter µ j ∈ U(0, 1), and usually η = 1. γ is a spreading factor representing the ratio of absolute difference in offspring to that of the parent.  (1) Genetic operators Genetic operators are of importance to control the optimization process of evolutionary algorithms. In this section, the crossover and mutation operators were applied to create a new population with the same size as the initial population. Two parents x1(t) and x2(t) from the t-th generation were randomly picked and the crossover operation was performed using the simulated binary crossover technique [36] to generate two offspring x1(t + 1) and x2(t + 1): where the random parameter   0,1 j U   , and usually η = 1. γ is a spreading factor representing the ratio of absolute difference in offspring to that of the parent.
The mutation process adopted the polynomial mutation method [37], and the operator is expressed following the equations as below: The mutation process adopted the polynomial mutation method [37], and the operator is expressed following the equations as below: if u > 0.5 (16) where x t and x t are the solutions before and after the mutation, random parameter µ j ∈ U(0, 1), and η is constant. This operator uses a polynomial probability distribution to perturb a solution in the vicinity of a parent solution.
(2) Determination of reference points on a Hyper-Plane The predefined reference points guarantee the diversity of individuals in the population. For a problem with M objectives, the reference points are distributed on a (M-1)dimensional hyperplane. Based on Das and Dennis's approach [38], if H divisions are considered for each objective, the total quantity of reference points P are given by: Suppose the set of reference points is s = (s 1 , s 2 , . . . , s M ), s i = (s i1 , s i2 , . . . , s iM ) should satisfy: Let set X = {x k |1 ≤ k ≤ P}, x k is an arbitrary combination of (M − 1) elements from the set {0/H,1/H, . . . , (M + H − 2)/H}, x kj denotes the j-th element in x k , s kj is the j-th element of the k-th reference point and is determined by: (3) Non-dominated sorting The population P t in the t-th generation in conjunction with the new population Q t created by the crossover and mutation results in a greater population R t with a total of 2 N individuals, among which N individuals should be selected as offspring for the nextgeneration P t + 1 , as shown in Figure 5. The detailed selection process using non-dominated sorting (Algorithm 1) is presented in the form of pseudocode below. Notably, the sorting process was based on the corresponding output value of the individual determined by the Kriging surrogate model.

H  
Suppose the set of reference points is s = (s1, s2,…, sM), si = (si1, si2,…, siM) should satisfy: Let set X = {xk|1 ≤ k ≤ P}, xk is an arbitrary combination of (M − 1) elements from the set {0/H,1/H,…, (M + H − 2)/H}, xkj denotes the j-th element in xk, skj is the j-th element of the k-th reference point and is determined by: (3) Non-dominated sorting The population Pt in the t-th generation in conjunction with the new population Qt created by the crossover and mutation results in a greater population Rt with a total of 2 N individuals, among which N individuals should be selected as offspring for the nextgeneration Pt + 1, as shown in Figure 5. The detailed selection process using non-dominated sorting (Algorithm 1) is presented in the form of pseudocode below. Notably, the sorting process was based on the corresponding output value of the individual determined by the Kriging surrogate model.  for j∈s k do 7: n j = n j -1 8: if n j = 0 then 9: m = m + 1; save j to the level F m 10: end 11: end 12: end 13: repeat step [5][6][7][8][9][10][11][12] (4) Normalization of individuals The ideal point of the t-th population is determined by identifying the minimum value at each objective function, expressed as z = z min 1 , z min 2 , . . . , z min M . Each objective value is then translated by the following equation: Then, the extreme point of each objective is determined via the achievement scalarizing function: These M extreme points constitute an (M − 1) dimensional hyper-plane. Let a i denote the intercept of the i-th objective axis. The objective function can be normalized by: (5) Association operation After the normalization process, we need to associate each individual with a reference point. A reference line was defined by connecting the reference point and the origin. Then, the perpendicular distance of each individual to the reference lines should be calculated. The basic idea is that, in the normalized space, the individual closest to a reference line is regarded as associated with the corresponding reference point. For the reference point with less associated individuals, these individuals have a higher chance of being preserved to ensure the diversity of the population. For details on the association procedure, see [10].

TOPSIS for Multicriteria Decision-Making
For the high-dimensional and multi-objective problem, the Pareto front provides numerous options of operational parameters and hampers informed decision-making. This section applied the TOPSIS to select the optimum decision scheme from the Pareto-optimal front. The basic idea is that the best alternative should have the largest distance to the negative ideal solution but the closest distance to the ideal solution [39]. The detailed procedure of this method is presented below.
Step 1: construct a decision matrix A: where a ij is the rating value of the i-th alternative on the j-th criterion.
Step 2: normalize the decision matrix. As the various scales and units of criteria may lead to an unfair comparison, the element in A should be normalized by the following formula: where b ij is the corresponding normalized element of a ij and an element of the normalized decision matrix B.
Step 3: construct the weighted normalized decision matrix V as below: where w i is the weight on i-th criterion, and ∑ n i=1 w i = 1.
Step 4: determine the ideal (V + ) and negative ideal (V − ) solutions as follows: where J + and J − are the benefit criterion and cost criterion, respectively.
Step 5: determine the distance of each alternative to the ideal and negative ideal solution in Euclidean sense as follows: where D + i and D − i are the distances from the i-th alternative to the ideal and negative ideal solution, respectively.
Step 6: calculate the relative closeness (RC) of each alternative to the ideal solution as below: Step 7: rank the alternatives according to their RC values, and regard the one with the highest RC as the best option.

Data Description and Validation for the Kriging Model
In the present case study, we applied the Latin hypercube sampling method to select operation data under the steady-state conditions in Figure 3 to guarantee uniform distribution in a multi-dimensional space. The Kriging model for mapping the system inputs to the output was performed on the training data. This surrogate model reflected the underlying relationship between the input and output variables. Let x 1 denote the inlet SO 2 flow rate, x 2 the air flow rate, x 3 the energy consumption, x 4 limestone slurry flow rate, y 1 the outlet SO 2 concentration, and y 2 the desulfurization efficiency. A sample i is expressed as s i = [X i , Y i ] = [x 1 , x 2 , x 3 , x 4 , y 1 , y 2 ], in which the vectors X and Y represent the system input and output, respectively.
The data of the training set and test set in this study contained 156 and 44 samples, respectively. To obtain a preliminary understanding of the system inputs and outputs, we performed the analysis of variation (ANOVA) test on the Minitab platform to examine the effects of the input variables on the outlet SO 2 concentration ( Figure 6) and desulfurization efficiency (Figure 7). The outlet SO 2 concentration is primarily determined by the inlet SO 2 flow rate and slurry flow rate, while the desulfurization efficiency is merely dominated by the inlet SO 2 flow rate. The Kriging model was built based on the training data. We compared the accuracy of the Kriging model and traditional response surface methodology in terms of the R 2 and RMSE. Table 1 presents their performance in relation to the training data and training data. It can be observed from this table that the Kriging model perfectly fits the training data in both system outputs. This phenomenon is in line with the work by the author of [40]. In the test data set, the Kriging model still had an edge over the RSM on the prediction of y 1 and y 2 in terms of two accuracy metrics. Particularly, the RMSE of the desulfurization efficiency prediction by the Kriging model was 0.21. To a large extent, the test set validated the effectiveness of this surrogate model. ogy in terms of the R 2 and RMSE. Table 1 presents their performance in relation to the training data and training data. It can be observed from this table that the Kriging model perfectly fits the training data in both system outputs. This phenomenon is in line with the work by the author of [40]. In the test data set, the Kriging model still had an edge over the RSM on the prediction of y1 and y2 in terms of two accuracy metrics. Particularly, the RMSE of the desulfurization efficiency prediction by the Kriging model was 0.21. To a large extent, the test set validated the effectiveness of this surrogate model.

Many-Objective Optimization Using Kriging-Assisted NSGA-III
This optimization problem had three objectives: maximizing the desulfurization efficiency, minimizing the energy consumption, and limestone slurry consumption. As previously mentioned, the inlet SO2 flow rate is uncontrollable. We conducted a Kolmogorov-Smirnov test on the historical data. The statistical analysis demonstrated that this parameter satisfied the triangular distribution, namely, tri (24.34, 30.8, 50.15). We performed the Kriging-assisted NSGA-III algorithm on the MATLAB R2018b. The size of the initial population in this algorithm was set as 80, the maximum generation was 50, the number of reference points was 136, the number of divisions was 15, the rate of crossover was 0.5, and the rate of mutation was 0.02. Hereby, we fixed the flow rate of the inlet SO2 at the mean value of 36.01 kg/min. The optimal solutions associated with these objectives were presented in the Pareto front. As shown in Figure 8, the Pareto front is displayed on a curved surface, on which each point implies a non-dominated optimal solution. As there are 80 individuals in the population of each generation, the Pareto front provides 80 opti-

Many-Objective Optimization Using Kriging-Assisted NSGA-III
This optimization problem had three objectives: maximizing the desulfurization efficiency, minimizing the energy consumption, and limestone slurry consumption. As previously mentioned, the inlet SO 2 flow rate is uncontrollable. We conducted a Kolmogorov-Smirnov test on the historical data. The statistical analysis demonstrated that this parameter satisfied the triangular distribution, namely, tri (24.34, 30.8, 50.15). We performed the Kriging-assisted NSGA-III algorithm on the MATLAB R2018b. The size of the initial population in this algorithm was set as 80, the maximum generation was 50, the number of reference points was 136, the number of divisions was 15, the rate of crossover was 0.5, and the rate of mutation was 0.02. Hereby, we fixed the flow rate of the inlet SO 2 at the mean value of 36.01 kg/min. The optimal solutions associated with these objectives were presented in the Pareto front. As shown in Figure 8, the Pareto front is displayed on a curved surface, on which each point implies a non-dominated optimal solution. As there are 80 individuals in the population of each generation, the Pareto front provides 80 optimal solutions, or more specifically, 80 possible combinations of operating parameters for decision-makers.

Multi-Criteria Decision-Making for the WFGD System
Selecting a proper solution from 80 alternative solutions in the Pareto front is a typical multi-criteria decision-making problem. The selection process considered stakeholder's interests in the optimization objectives, also referring to the criteria. TOPSIS was applied to select the most appropriate alternative solution to guide the operating practice.
The decision matrix was directly derived from the optimization results of NSGA-III, i.e., the Pareto-optimal front. The normalization process was then performed on the decision matrix, as the objectives had unequal importance for varying stakeholders. In this regard, different weighting factors should be allocated to the criteria. Commonly used methods to determine the weighting factors include the analytic hierarchy process (AHP), the Delphi method, the entropy method, and the principal components analysis [41]. The present study adopted the AHP to determine the weighting factors of desulfurization efficiency (f1), energy consumption (f2), and limestone slurry consumption (f3), and the results were 0.2, 0.5, and 0.3, respectively. The weighted normalized decision matrix was constructed subsequently. The ideal and negative ideal solutions were V + = {0.2 0.5 0.3} and V − = {1.997 0.1195 0.0462}. We further calculated the Euclidean distance of each alternative to the ideal and negative ideal solution and ranked the alternatives according to their relative closeness. The results indicated that the optimal solution in the Pareto front was [f1, f2, f3] = [98.11%, 4498.8 kW, 0.272 m 3 /min].
To further illustrate the eco-efficiency of the optimization, Table 2 lists the empirical and optimal values related to average desulfurization efficiency, energy consumption, and limestone slurry consumption. As evident from this table, the desulfurization efficiency was improved by 0.23%. Although the number seems quite small, this improvement implies a substantial reduction in SO2 due to its huge emission base. Additionally, the efficiency of limestone slurry decreased by 34.6% but the energy efficiency increased by 45.8%. Increased SO2 absorption would inevitably lead to a higher consumption of limestone slurry. However, the reduction in energy use is enough to offset this negative effect. From the perspective of total environmental impact, the optimal eco-efficient operating condition would gain significant environmental benefits even though the limestone consumption would increase slightly.

Multi-Criteria Decision-Making for the WFGD System
Selecting a proper solution from 80 alternative solutions in the Pareto front is a typical multi-criteria decision-making problem. The selection process considered stakeholder's interests in the optimization objectives, also referring to the criteria. TOPSIS was applied to select the most appropriate alternative solution to guide the operating practice.
The decision matrix was directly derived from the optimization results of NSGA-III, i.e., the Pareto-optimal front. The normalization process was then performed on the decision matrix, as the objectives had unequal importance for varying stakeholders. In this regard, different weighting factors should be allocated to the criteria. Commonly used methods to determine the weighting factors include the analytic hierarchy process (AHP), the Delphi method, the entropy method, and the principal components analysis [41]. The present study adopted the AHP to determine the weighting factors of desulfurization efficiency (f 1 ), energy consumption (f 2 ), and limestone slurry consumption (f 3 ), and the results were 0.2, 0.5, and 0.3, respectively. The weighted normalized decision matrix was constructed subsequently. The ideal and negative ideal solutions were V + = {0.2 0.5 0.3} and V − = {1.997 0.1195 0.0462}. We further calculated the Euclidean distance of each alternative to the ideal and negative ideal solution and ranked the alternatives according to their relative closeness. The results indicated that the optimal solution in the Pareto front was To further illustrate the eco-efficiency of the optimization, Table 2 lists the empirical and optimal values related to average desulfurization efficiency, energy consumption, and limestone slurry consumption. As evident from this table, the desulfurization efficiency was improved by 0.23%. Although the number seems quite small, this improvement implies a substantial reduction in SO 2 due to its huge emission base. Additionally, the efficiency of limestone slurry decreased by 34.6% but the energy efficiency increased by 45.8%. Increased SO 2 absorption would inevitably lead to a higher consumption of limestone slurry. However, the reduction in energy use is enough to offset this negative effect. From the perspective of total environmental impact, the optimal eco-efficient operating condition would gain significant environmental benefits even though the limestone consumption would increase slightly.

Conclusions
The desulfurization system is energy and resource-intensive in the coal-fired power plant and its eco-efficiency requires considerable improvement. This paper developed a mathematical model for a many-objective optimization of the operating parameters from the perspective of eco-efficiency. Based on the historical operating data, a Kriging model was constructed to reveal the relationship between the input and output variables of the WFGD system. We incorporated the Kriging model as a surrogate of objective function into the NSGA-III algorithm to reduce the computational cost. The Pareto-optimal front was then obtained from this evolutionary algorithm. TOPSIS was utilized to select an appropriate solution from the Pareto-optimal front. The results indicated that the optimization of the operating parameters would improve the desulfurization efficiency and energy use but increase the limestone slurry slightly. The optimal operating scheme would gain overall environmental benefits. When the external factor changes, for example, combustion of another type of coal or the installation of new equipment substituting outdated equipment, the optimization method proposed in the present study might be adopted to rapidly improve the eco-efficiency based on historical operational data. Additionally, our operation strategy can also be extended to a wide range of applications, for example, dust elimination and the denitration process in the flue gas treatment.
A limitation of the present study is that we conducted the optimization merely based on the selective and offline operating data at steady-state conditions. Follow-up works on the sufficient utilization of operating data and a deeper exploration of the optimization strategy would be more desirable. As the operating data was monitored with high frequency, efforts could be made for a real-time optimization that adaptively enhances the overall performance of the desulfurization system. However, in this case, the highly efficient optimization algorithm would pose a greater challenge to the online optimization of the WFGD system in future work. This study has validated the effectiveness via the industrial case of WFGD. It provides a scientific basis for developing eco-efficient operating strategies in the WFGD system and will enhance the environmental performance of the desulfurization process.
Funding: This study was financially supported by STU Scientific Research Foundation for Talents (NTF20019), Academic funding project for top notch talents of disciplines (majors) in Col-leges and universities (gxbjzd2021083), Key projects of Humanities and Social Sciences in Anhui Province (SK2019A0529), and Provincial natural science foundation of Anhui (2008085ME150).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable. Data Availability Statement: All data generated or analyzed during this study are included in this published article.