Parameter Estimation of Water Quality Models Using an Improved Multi-Objective Particle Swarm Optimization

Water quality models are of great importance for developing policies to control water pollution, with the model parameters playing a decisive role in the simulation results. It is necessary to introduce estimation through multi-objective parameters, which is often affected by noise in the data, into water quality models. This paper presents a multi-objective particle swarm optimization algorithm, which is based on the Mahalanobis distance operation, mechanism of cardinality preference and advection-diffusion operator. The Mahalanobis distance operation can effectively reduce the influence of noise in the data on model calibration. The mechanism of cardinality preference and the use of the advection-diffusion operator can prevent non-dominated solutions from falling into the local optimum. Four cases were used to test the proposed approach. The first two cases with true Pareto fronts show that this approach can accurately estimate the true Pareto front with a good distribution, even in the presence of noise. Furthermore, the application of the approach was tested by the O’Connor model and Crops of Engineers Integrated Compartment Water Quality Model. We show that our approach can produce satisfactory results for the multi-objective calibration of complex water quality models. In general, the proposed approach can provide accurate and efficient parameter estimation in water quality models.


Introduction
Water quality models play a very important role in water quality research that assesses ecosystem health and aids in the development of water pollution control strategies [1,2].Most water quality models are characterized by differential equations with multiple indicators and a large number of parameters [3,4].These parameters may be difficult to obtain by experimental measurements or may not even have physical measurements.Furthermore, the selection of these parameters plays a decisive role in the simulation results.Therefore, the calibration of models is crucial in the successful application of water quality models [5,6].
In general, model calibration can be manual or automatic.The manual trial-error process is usually inefficient, time consuming and labor intensive.It strongly depends on the experiential knowledge of the modeler.Implementation of this approach is becoming increasingly difficult due to increases in the dimensions of parameters and number of objective functions [7][8][9].The shortcomings of the trial-error approach can be overcome by automatic calibration, which utilizes the performance of the optimization algorithms and computational efficiency of digital computers.This method has become more and more popular with the increase in computation efficiency [10][11][12].The most common method for the automatic calibration of water quality models involves the translation of multiple indicators into a single aggregated scalar with weighted sums [5,13,14].However, these indicators, which reflect the characteristics of the ecosystem from different aspects, may be inconsistent and conflicting.Therefore, more than one objective of the automatic calibration for water quality models should be considered simultaneously.The essence of automatic calibration is transformed into solving multi-objective optimization problems (MOPs) [15][16][17][18].
A set of solutions denoted as the Pareto optimal set can be obtained as the result of trade-off among the indicators [19][20][21][22].A variety of multi-objective evolutionary algorithms (MOEAs) have been developed, which are considered as being effective for solving MOPs [23][24][25][26][27][28][29].Multi-objective particle swarm optimization (MOPSO) is a mainstream algorithm of MOEAs [30,31].It is a highly competitive algorithm because it is easier to implement and attain convergence compared to other evolutionary algorithms.Furthermore, it is especially suitable for solving MOPs [29,32,33].
There are three aspects that should be considered to improve the quality of solutions when using MOPSO to solve MOPs.First, it is important to properly redefine the global best guide (gbest) distribution to obtain a set of non-dominated solutions, which will directly affect the search ability and convergence of the algorithm [34,35].Some methods for updating gbest are based on the topological structure of particles, such as the sigma method [35], minimal particle angle method [36], fitness sharing mechanism [37], niching mechanism [38], crowding distance [39] and its deformation [40], solution distribution entropy [41], dynamic neighborhood strategy [42] and decomposition approach [43,44] among others.The second point involves maintaining non-dominated solutions, which are stored in a so-called external repository.These solutions are identified through the search process.The size of the external repository is usually controlled to balance the tradeoff between a good Pareto optimal set and the computational cost [45].There are many strategies to attain this goal, such as the ε-dominance method [46,47], fuzzy clustering technique [38], adaptive grid mechanism [28], the measurements of parallel cell distance [48], preference order scheme [49] and so on.The third aspect involves controlling and maintaining the diversity of particles.The methods include the multi-swarm strategy [45,50], adaptive control of vital parameters [40,51], mutation operator [39], local search scheme [52], population recombination strategy [53], jump improved operator and proportional distribution mechanism [34].
The performance metrics of the MOPSO algorithms are usually based on standard test functions.These algorithms have been widely applied in modeling [10,54,55], although few studies have been examined water quality models [18].To calibrate water quality models, the observed data are inevitably accompanied by noise.There is a lack of knowledge and validation on whether these methods are able to obtain reasonable results in the case of noise in the data.The noise in the data has a significant impact on the calibration results of water quality models, but the existing MOPSO algorithms do not take these issues into account.Therefore, it is important to note whether the algorithm has an anti-noise performance.
A novel MOPSO algorithm is proposed, which has the purpose of performing parameter estimation of water quality models with additional noise.This algorithm is inspired by mathematical statistics and fluid mechanics theory.The shortcomings of traditional MOPSO algorithms can be overcome using this algorithm.The algorithm is tested on four cases, which include several standard test functions, test functions under different noise strengths and two water quality models.

Method
The proposed MOPSO algorithm is based on the Mahalanobis distance operation, mechanism of cardinality preference and advection-diffusion operator, so the algorithm was abbreviated as

Mahalanobis Distance Operation
The Mahalanobis distance is widely used in many fields, such as electronics and information systems [56,57], machine learning and pattern recognition [58][59][60][61] as well as water quality assessment [62].The Mahalanobis distance in these applications usually replaces the Euclidean distance for clustering analysis.However, it is less frequently used in the research of MOEAs.The similarity of points in a solution space is usually measured by the Euclidean distance [29,45] and Manhattan distance [39].For these distance metrics, each objective value of the point is equally important and is considered to be independent from the others.This assumption may not always be satisfied for water quality models.The Mahalanobis distance takes into account unequal variances and correlations among points in the objective space.It has the ability to evaluate distances by assigning different weights or important factors to the points in the objective space [63][64][65].Therefore, the Mahalanobis distance is more appropriate for the calibration of water quality models in ecosystems.The distribution information of each function was applied with the covariance matrix of the multiobjective vectors in the Mahalanobis distance, which minimizes the effect of noise in this operation.It is the one of the most important basic features of the MCAD-MOPSO algorithm for the personal best position (pbest) selection, gbest selection and external repository updating.
Definition 1 (Mahalanobis distance).The Mahalanobis distance between the two particles in objective space is defined as:

Mahalanobis Distance Operation
The Mahalanobis distance is widely used in many fields, such as electronics and information systems [56,57], machine learning and pattern recognition [58][59][60][61] as well as water quality assessment [62].The Mahalanobis distance in these applications usually replaces the Euclidean distance for clustering analysis.However, it is less frequently used in the research of MOEAs.The similarity of points in a solution space is usually measured by the Euclidean distance [29,45] and Manhattan distance [39].For these distance metrics, each objective value of the point is equally important and is considered to be independent from the others.This assumption may not always be satisfied for water quality models.The Mahalanobis distance takes into account unequal variances and correlations among points in the objective space.It has the ability to evaluate distances by assigning different weights or important factors to the points in the objective space [63][64][65].Therefore, the Mahalanobis distance is more appropriate for the calibration of water quality models in ecosystems.The distribution information of each function was applied with the covariance matrix of the multi-objective vectors in the Mahalanobis distance, which minimizes the effect of noise in this operation.It is the one of the most important basic features of the MCAD-MOPSO algorithm for the personal best position (pbest) selection, gbest selection and external repository updating.
Definition 1 (Mahalanobis distance).The Mahalanobis distance between the two particles in objective space is defined as: where are objective functions of the ith and jth particles; T means the transposed matrix; and S is the covariance matrix of the multi-objective functions, which is calculated by the particle's objective functions.

Mechanism of Cardinality Preference
The cardinality preference mechanism was added to update the gbest and pbest in the algorithm to ensure that the particles move toward the sparse regions of the search space for maintaining the diversity of particles.
Definition 2 (Particle inducer).In the particle inducer, we defined ∀x i ∈ A, y j ∈ B, i = 1, 2, • • • , N, j = 1, 2, . . ., L, A = φ, B = φ.The inducer of particle x i is denoted as s x i .For particles in B, y k is the most similar particle to x i , meaning that y k is the inducer of x i .Therefore, it is defined as: Definition 3 (Particle induced set).The particle set composed of the particles in set A induced by y k is called the particle y k induced set I y k on set A. It is abbreviated as the induced set and is defined as follows: The definition of the induced set is based on Mahalanobis distance and reflects the similarity of particles.In a similar way, the set induced by each element in B can also be obtained.The gbest and pbest are updated based on the definition of the particle induced set.

Updated gbest
The gbest is updated by the following three steps.
(1) Compute the cardinality of the induced set of each particle in the external repository E on the particle population, corresponding to Equations ( 2) and (3).(2) Sort each particle in the external repository according to the cardinality of their induced set in ascending order.(3) Randomly select one particle from the specified top portion in the external repository, such as 10%, as the gbest.
In this way, the non-dominated solutions in the external repository with smaller cardinality are selected as the gbest of the particles.This strategy enhances the search ability of the algorithm.It also makes it easy for particles to explore the sparse area to prevent the solutions from falling into the local optimum and maintain the diversity of the non-dominated solutions.

Updated Pbset
The updated pbset is based on the Pareto dominance relationship between pbset and the current particle x i .Otherwise, pbset is updated according to the induced set cardinality in the following steps: (1) Compare the cardinality of the induced set of the current particle x i and its pbest.The elements in the induced set are on the external repository E. The pbest is updated with a greater particle cardinality, which means that the pbest becomes closer to the external repository.This strategy can accelerate the convergence rate of the algorithm.

Advection-Diffusion Operator
In the MOPSO algorithm, particles may aggregate after several generations.Therefore, it is difficult for the algorithm to explore potentially better solutions.The advection-diffusion operator was introduced to prevent the solutions from falling into the local optimum.
For each particle with n dimensions ), the proposed advection-diffusion of particles is depicted in Figure 2.
(1) Advection motion (a) Arbitrarily select the sth dimension of each particle in I y k .

(b)
The advection magnitude for each where C 0 is a constant, while u kup and u klow are the upper and lower boundaries of x is , respectively.
(2) Diffusion motion (a) The diffusion motion D is for sth dimension of particle x i in I y k is determined as The magnitude of advection-diffusion for each particle decreases as the number of iterations increases: where t is the number of iterations; maxgen is the maximum number of iterations; and α (α = 1.5) and β (β ∈ (0, 1)) are adjustment indices.When t ≥ maxgen • β, this operator should not be employed.
(1) Compare the cardinality of the induced set of the current particle i x and its pbest.The elements in the induced set are on the external repository E.
( The pbest is updated with a greater particle cardinality, which means that the pbest becomes closer to the external repository.This strategy can accelerate the convergence rate of the algorithm.

Advection-DiffusionOperator
In the MOPSO algorithm, particles may aggregate after several generations.Therefore, it is difficult for the algorithm to explore potentially better solutions.The advection-diffusion operator was introduced to prevent the solutions from falling into the local optimum.
For each particle with n dimensions , , , , the proposed advection-diffusion of particles is depicted in Figure 2.
(1) Advection motion (a) Arbitrarily select the sth dimension of each particle in (b) The magnitude of advection-diffusion for each particle decreases as the number of iterations increases: where t is the number of iterations; maxgen is the maximum number of iterations; and where the weight parameter w is equal to 4. The values of the cognitive parameter c 1 and social parameter c 2 are both 0.5; while r 1 and r 2 are random numbers from 0 to 1. Compute the new position of x t i : The computational complexity of the algorithm is dominated by calculating the non-dominated relation of the particles, working out the induced set for particles and sorting the particle cardinality.The objective function computation has O(MN) complexity if there are N particles and M objective functions.Therefore, the complexity of non-dominated relation computation is O(MN 2 ) with N particles in archive.The main complexity of working out the induced set for particles involves the search of the minimum Mahalanobis distance, which requires a computational cost of O(N 2 ).The complexity of sorting the particle cardinality is O(Nlog2N).In summary, the overall complexity of the proposed algorithm is O(MN 2 ), which is the same as the MOPSO-CD [39] and NSGA-II algorithms [23] and being slightly more complex than MOPSO [28].

Results and Discussion
The proposed approach is tested by four cases.Case I is based on several test functions that have a true Pareto front.Several performance metrics from the standard literature are also shown, such as generational distance, error ratio and spacing.Case II adds noise of different strengths in test functions to test our approach's ability to reduce the influence of noise.The applicability of our approach is further tested in Case III, which is the O'Connor model, and Case IV, which is the Crops Engineers Integrated Compartment Water Quality Model (CE-QUAL-ICM) in Chaohu Lake.

Case I
Multi-objective optimization test problems are crucial in determining the effectiveness of the detection algorithm in solving the multi-objective optimization problems.Currently, some well-known test functions have been widely used in the testing of various multi-objective algorithms, such as ZDT [66], DTLZ [67] and so on.ZDT was proposed by Zitzler et al. and contains six problems (ZDT1-ZDT6).These problems have different characteristics of the Pareto front.ZDT1-ZDT6 are based on two objective functions to reflect essential aspects of multi-objective optimization.DTLZ1-DTLZ7were proposed by Deb et al., which can have more than two objective functions to test the ability of algorithms to optimize objective functions.
Several test functions, such as ZDT1-ZDT3, ZDT6, DTLZ1 and DTLZ2, were adopted to test the performance of algorithms in this present paper.ZDT1 has a convex Pareto front, while ZDT2 has a non-convex Pareto front.The Pareto front of ZDT3 is characterized by five discontinuous convex regions.ZDT6 has a complex Pareto front, which is a discontinuous non-convex region with a non-uniform distribution of the solution.Therefore, the test problem for ZDT6 is more difficult than that for ZDT1-ZDT3.DTLZ1 has a linear Pareto front with more than two objective functions.Furthermore, the search space contains multiple local Pareto fronts, which makes it difficult for the multi-objective optimization algorithm to achieve the global optimization.DTLZ2 has a spherical non-linear Pareto front.These test functions have different characteristics of Pareto fronts, which examine the capability of multi-objective algorithms to find and produce a quality distribution of the Pareto front.
There are three performance metrics (generational distance, spacing and error ratio) for multi-objective optimization, which are selected to compare the performance of the proposed algorithm to other popular multi-objective algorithms, such as MOPSO [28], NSGA-II [23] and MOPSO-CD [39].The parameter setting for each algorithm is shown in Table 1.The parameter dimensions for ZDT1-ZDT3, ZDT6 were set to 30 and the number of objective functions was 2. For the DTLZ1 and DTLZ2, the parameter dimensions and the number of objective functions were 10 and 3, respectively.The following are the results of 30 independent runs of each algorithm.The true Pareto front and median results with respect to the generational distance metric produced by the proposed algorithm for the six test functions are shown in Figure 3.The comparison of results of the different performance metrics of four algorithms are shown in Table 2.
The generational distance (GD) [68] is used to estimate how far the members in the non-dominated set are from their nearest members in the Pareto optimal set.It is measured in objective space and is defined as: where s is the number of members in the non-dominated set found thus far; and d i is the Euclidean distance between the members in the non-dominated set and their nearest members in the Pareto optimal set.If GD is 0, all non-dominated solutions obtained are in the Pareto optimal set.In Table 2, the average performance of the MOPSO-CD and our approach with respect to GD is far better than that of NSGA-II and MOPSO for the six test functions.For ZDT1 and ZDT2, the average performance of our approach is equivalent to that of MOPSO-CD with respect to GD.For ZDT3 and ZDT6, our approach has the best average performance of the four algorithms, with GDs of 0.0011 and 0.0181, respectively.The results of our approach are also best for all test functions with three objects, especially DTLZ1.Furthermore, Table 2 and Figure 3 shows that our approach is able to find good solutions that are accurate estimates of the true Pareto optimal set.The GD can be applied to estimate the convergence of the optimal iterations [28,68].When there is a small difference of GD values between the front and back generation, the results can be considered as having converged.The threshold of this difference is set to be 5% times the GD of this generation.Overall, the convergence of proposed algorithms is about the same as MOPSO-CD, while it is slightly better than MOPSO and NSGA-II.This is especially the case in complexity test functions, such as ZDT6, DTLZ1 and DTLZ2.The error ratio (ER) [28] indicates the percentage of solutions (from the non-dominated set obtained so far) that are not members of the Pareto optimal set and is defined as: where s is the numbers of members in the non-dominated set found thus far.If e i = 1, it indicates that the solution i is the member of the Pareto optimal set, while e i = 0 otherwise.A value of 0 for ER indicates that all non-dominated solutions obtained by an algorithm belong to the Pareto optimal set.Table 2 shows that all the non-dominated solutions obtained by the proposed approach so far are almost in the Pareto optimal set within a certain precision (<0.01) for most of the test functions.
The average performance of NSGA-II and MOPSO with respect to ER is equal to 1 for the six test functions.Essentially, these two algorithms may fail to find the true Pareto optimal set for the six test functions.For ZDT1-ZDT3 and DTLZ2 the ER evaluation of our approach is 0. In addition, for ZDT6, the ER evaluation of our approach is 0.0094, while MOPSO-CD has a value of 0.0124.These results indicate that MOPSO-CD is better than NSGA-II and MOPSO for evaluating ER, but is slightly inferior to our proposed approach.This is also seen in the ER results of DTLZ1.These results indicate that our approach has better performance than the other three algorithms because they have relatively larger ER values than our approach inmost test functions.Spacing (SP) [69] is used to measure the distribution of solutions throughout the set of non-dominated solutions and is defined as: where s is the number of members in the non-dominated set found so far; m is the numbers of objectives; and d is the mean of d i .If SP is 0, it indicates that all of the currently available members of the Pareto front are equidistantly spaced.In Table 2, although the average performance of the proposed algorithm for SP is better than those of NSGA-II and MOPSO in some of the test functions, including DTLZ1 and DTLZ2, the results of our approach for the ranking of SP are lower than the results of MOPSO-CD.The reason for this difference may be because our approach is based on the Mahalanobis distance for measuring the degree of similarity between the solutions.Therefore, there is a natural inconsistency with SP, which is based on the Euclidean distance for measuring the equidistance of the solutions.In addition, if the non-dominated solutions produced by the algorithm are not part of the true Pareto front, the SP metric is irrelevant [28].For an example, the SP for the MOPSO is 0.0016, which is better than the SP value of our algorithm (0.0083) in ZDT2.However, the ER for the MOPSO is 1, which indicates that the non-dominated solutions obtained by MOPSO are far from the true Pareto front.
Based on the above discussion, it is shown that the proposed approach can generate a better set of non-dominated solutions for the six test functions.

Case II
For testing the ability of the proposed approach in minimizing the influence of noise, different noise strengths from Gaussian distributions N(0, 0.05) and N(0, 0.15) were added to the test functions in Case II.The setting of the algorithm-related parameters is the same as in Case I.The comparison of the algorithm results is shown in Tables 3 and 4.
Table 3 shows that the average performance of the two algorithms in all of the test functions with noise N(0, 0.05) is inferior to that without noise, which is shown in Table 2.However, the proposed algorithm has an overwhelming advantage compared to MOPSO-CD in all of the performance tests.The average performance of our approach with respect to GD and ER for ZDT1 is 0.0079 and 0.0127, respectively, which is 5.4-5.8 times better than that of MOPSO-CD.For ZDT2, the average performance of MOPSO-CD with respect to GD and ER is 0.0571 and 0.1302, which is approximately 158% and 350% more than the GD (0.0221) and ER (0.0290) of our approach, respectively.The GD and ER of the MOPSO-CD for ZDT3 are approximately 6 and 7 times that of our approach, respectively.Compared to other ZDT series test functions, the superior performance of our approach is not as significant in ZDT6, but its corresponding GD, SP and ER are also approximately 1/3-7/8 of those of MOPSO-CD.In addition, the average performance of our approach with respect to SP for all of the test functions are better than those of MOPSO-CD.The SP is 0.0521 in our approach and 0.1571 in MOPSO-CD for ZDT3, which is 70% higher than ours.For DTLZ1 and DTLZ2 test functions, the performance of our algorithm is far better than MOPSO-CD with the results of the MOPSO-CD approach being 2-5 times our results.
Table 3 shows that our approach has better stability compared to MOPSO-CD in all test functions.The standard deviation of the GD values from the MOPSO-CD for ZDT1 and ZDT3 is approximately 6 times that of our approach, while this difference is around 2-9 times for DTLZ1 and DTLZ2.The above results show that when noise N(0, 0.05) is added to the test functions, our approach has better results than MOPSO-CD in terms of the average performance metrics and standard deviation, which means that proposed algorithm can effectively reduce the influence of noise on the optimization results.
To test the performance of the algorithm under stronger noise, noise according to N(0, 0.15) was added to the test functions.The test results of the performance of the two algorithms can be seen in Table 4.The advantages of our approach are not as obvious as in Table 3, but are 1.5-6 times better than the results of MOPSO-CD.This may be because the signal-to-noise ratio decreases with the increase in the disturbance intensity, causing the value to become submerged by noise.Nevertheless, our approach is still superior to MOPSO-CD in almost all of the test functions in Table 4.For instance, the ER of MOPSO-CD for DTLZ2 is 1, while the ER of our algorithm is only about 0.5.This indicates that the MOPSO-CD algorithm is hardly to obtain the true Pareto optimal set, while our algorithm was able to obtain this set even under the stronger noise.
The convergence of the proposed algorithm under noise is better than MOPSO-CD.There are 1500-2000 fewer function evaluations from our algorithm compared to MOPSO-CD for ZDT1, ZDT2, ZDT3 and ZDT6 under N(0, 0.05) noise.For DTLZ1 and DTLZ2, this difference is 1200.The lead of the proposed algorithm is slightly larger under the stronger noise.Our approach has a good performance compared to MOPSO-CD with respect to GD, SP and ER almost in all of the test functions with different noise strengths.This result indicates that our approach can produce a well-distributed non-dominated solution set even under different strengths of noise.The stability of our approach, which has a smaller standard deviation with respect to GD, SP and ER, is also better than that of MOPSO-CD.Under different noise strengths, our approach has a good performance for convergence, distribution and anti-noise.

Case III
The O'Connor model is a one-dimensional steady-state river model based on the Streeter-Phelps model, which takes into account the degradation, sedimentation, oxygen consumption of organic matter decomposition oxygen demand (CBOD) and nitrification oxygen demand (NBOD), as well as the role of dissolved oxygen (DO) re-oxygenation [70].It contains four parameters: the CBOD degradation coefficient K 1 , re-oxygenation coefficient K 2 , CBOD settlement coefficient K 3 and NBOD degradation coefficient K N .The four parameters are a good measure of the performance of the algorithm in model applications, in which these values vary widely.These values have a range of 0.9-4.According to the measured results of a river section on the Taihu Plain in China, the value of the initial conditions and parameters are shown in Table 5.In the following section, the two algorithms are used for parameter estimation of the O'Connor model.The three objection functions, which are the relative errors of CBOD, NBOD and OD, are optimized by MOPSO-CD and the proposed algorithm.The population size, number of iterations and size of the external repository are set to 40, 50 and 100, respectively.The calibration initially occurs in regard to the variables CBOD, NBOD, and OD without noise, with the results shown in Table 6.In Table 6, although the range and standard deviation of the parameters obtained by MOPSO-CD are small, which shows more stability than ours, the K 1 , K 3 and K N obtained do not cover the true parameter values.This means that the parameters from MOPSO-CD are unreliable and cannot be used in modeling.Compared with the MOPSO-CD results, the parameters generated by the proposed algorithm cover these true values and the means of the parameters obtained by our algorithm are close to the true values.For example, the mean of K N obtained by our algorithm is 0.127, while it is 0.072 for MOPSO-CD.The relative error for the true values (0.125) are 1.6% and 42.4%, respectively.Since the means of the parameters are often used for calculating the model in practice, the above results indicate the advantage of our algorithm.Taking OD as an example, the changes in OD (0-100 km) are shown in Figure 4, which correspond to different non-dominated solutions obtained.The line generated by the balance solution (BS) is also shown, which has the smallest sum of the distance to the true parameters.In the following section, the two algorithms are used for parameter estimation of the O'Connor model.The three objection functions, which are the relative errors of CBOD, NBOD and OD, are optimized by MOPSO-CD and the proposed algorithm.The population size, number of iterations and size of the external repository are set to 40, 50 and 100, respectively.The calibration initially occurs in regard to the variables CBOD, NBOD, and OD without noise, with the results shown in Table 6.In Table 6, although the range and standard deviation of the parameters obtained by MOPSO-CD are small, which shows more stability than ours, the K1, K3 and KN obtained do not cover the true parameter values.This means that the parameters from MOPSO-CD are unreliable and cannot be used in modeling.Compared with the MOPSO-CD results, the parameters generated by the proposed algorithm cover these true values and the means of the parameters obtained by our algorithm are close to the true values.For example, the mean of KN obtained by our algorithm is 0.127, while it is 0.072 for MOPSO-CD.The relative error for the true values (0.125) are 1.6% and 42.4%, respectively.Since the means of the parameters are often used for calculating the model in practice, the above results indicate the advantage of our algorithm.Taking OD as an example, the changes in OD (0-100 km) are shown in Figure 4, which correspond to different non-dominated solutions obtained.The line generated by the balance solution (BS) is also shown, which has the smallest sum of the distance to the true parameters.Since the relationship between the variables and parameters is non-linear and they have complex correlations with each other, the range of OD values obtained by our approach is narrower than that of MOPSO-CD even if the parameter range of our approach is larger than that of MOPSO-CD.This result indicates that there is less uncertainty in the results worked by our approach.Although the result of the true parameters is covered by the results of two algorithms, the OD curve generated by Since the relationship between the variables and parameters is non-linear and they have complex correlations with each other, the range of OD values obtained by our approach is narrower than that of Water 2018, 10, 32 14 of 23 MOPSO-CD even if the parameter range of our approach is larger than that of MOPSO-CD.This result indicates that there is less uncertainty in the results worked by our approach.Although the result of the true parameters is covered by the results of two algorithms, the OD curve generated by the BS of our approach is closer to the true value curve.The maximum error of the curve from the BS of our approach is 1.2%, while the maximum error is approximately 15.6% in MOPSO-CD.The maximum OD position calculated by MOPSO-CD is different from the true position.These results showed that there would be a large ecological risk if the MOPSO-CD results were applied to water quality predictions.Our results are almost identical with the true position at the same time.
The noise according to the N(0, 0.05) and N(0, 0.15) are added to the observed values and the optimum objection functions are the same as the condition without noise.The estimation of the parameters is conducted according to the relative errors of CBOD, NBOD and OD.The calibration results of the two algorithms are shown in Tables 7 and 8.The OD curves with different noise levels are shown in Figures 5 and 6. the BS of our approach is closer to the true value curve.The maximum error of the curve from the BS of our approach is 1.2%, while the maximum error is approximately 15.6% in MOPSO-CD.The maximum OD position calculated by MOPSO-CD is different from the true position.These results showed that there would be a large ecological risk if the MOPSO-CD results were applied to water quality predictions.Our results are almost identical with the true position at the same time.
The noise according to the N(0, 0.05) and N(0, 0.15) are added to the observed values and the optimum objection functions are the same as the condition without noise.The estimation of the parameters is conducted according to the relative errors of CBOD, NBOD and OD.The calibration results of the two algorithms are shown in Tables 7 and 8.The OD curves with different noise levels are shown in Figures 5 and 6     The range of K 1 , K 2 and K 3 obtained by MOPSO-CD still does not cover the true value, while the range of parameters obtained by our method covered the true values.In the case of adding noise according to N(0, 0.05) being used to multiply the calculated values, there are only 8 non-dominated solutions generated by MOPSO-CD.The range of OD values obtained by MOPSO-CD does not cover the true value curve and has large errors (Figure 5).The position and value of the maximum OD for the BS curve are very different from the true value curve with an error of 23%.The shape of the curve determined by BS also diverges from the true value curve.These parameters can hardly be used for model prediction due to the large risk.By contrast, our approach generates more non-dominated solutions and the OD curves completely cover the true value.The BS curve is not as good as that without noise, but the errors of the value and phase are much smaller than those of the MOPSO-CD.Therefore, our approach can be applied in a practical environment with noise.
Due to the increase in noise strength, the range of parameters obtained by MOPSO-CD and our approach expanded, while the range of K 3 by MOPSO-CD failed to cover the true value.The range of the maximum OD calculated by MOPSO-CD is 0.038-0.082,which is significantly larger than that from our approach (0.035-0.059; Figure 6).This result indicates that the uncertainty of the maximum OD generated by MOPSO-CD is higher.The OD curves obtained by the two algorithms cover the true value curve.However, the true value curve is at the lower edge of the OD curve area of MOPSO-CD (Figure 6a).The shape of the OD curves and position of the maximum OD of the MOPSO-CD results have large discrepancies from the true values.By contrast, the true value curve is located in the central region of the OD curves produced by our approach (Figure 6b).For both the shape and position of the maximum OD, the OD curves of our approach are very close to the true value curve.In addition, the curve of BS obtained by MOPSO-CD is very different from the true value curve at the position of the maximum OD.This shows that the results from our approach are better.This also shows that our approach has more advantages in the case of noise.
In general, the O'Connor model calibration results from our approach are consistent with the true parameter values in a water ecosystem with noise, even if the range of parameters is quite different.Its performance and calibration results are superior to those of the classic MOPSO-CD.

Case IV
CE-QUAL-ICM is used in the fourth case.It was developed by the U.S. Army Corps of Engineers Engineer Research Development Center [71].The model computes physical properties, multiple forms of algae, nitrogen, phosphorus, carbon, silica, chemical oxygen demand, salinity, temperature, metal and dissolved oxygen cycles.It involves interactions between various algae and various nutrients, which takes into account the process of algal growth, metabolism, predation and settling.At present, Water 2018, 10, 32 16 of 23 CE-QUAL-ICM has been successfully applied to the study of eutrophication processes of various water bodies [72,73].However, it is considered to be difficult to calibrate due to inevitable data noise and the high dimensionality of the parameter space.
A lake is an open system for material and energy exchange with the environment.We used Chaohu Lake in eastern China as an example to study the parameter optimization problem of the CE-QUAL-ICM model.Chaohu Lake is approximately 780 km 2 , with a depth of 1-7 m.As it has been seriously polluted, it is one of the most eutrophic lakes in China and has suffered from frequent outbreaks of algae in recent decades.Therefore, the eutrophication of Chaohu Lake is a significant concern [74][75][76].Nine major rivers around Chaohu Lake were considered in the model calibration.Although the lake is large, a proper water quality model can capture the main characteristics of the water quality changes over time in Chaohu Lake, even when ignoring spatial heterogeneity and hydrodynamic influences [77,78].
The cyanobacteria and DO are two important water quality indicators and selected for use in the model parameter estimation in the Chaohu Lake water quality model.All data for the modeling were collected from 25 sites that were monitored about weekly in July-September in 2009.It is necessary to select the parameters for the model calibration.The 15 parameters with the greatest influence on the model variables were identified, with their names and ranges provided in Table 9.The other parameters used the default values [71].
are the root mean square error (RMSE) of DO and cyanobacteria, respectively, are optimized by the proposed algorithm.
where O are the observations; P are the predictions; and D is the total number of observations.The ranges of the parameters for the Pareto set is shown in Table 9.The population size, number of iterations and size of the external repository are set to 20, 30 and 50 here, respectively.Table 9 shows that the range 2 obtained for most of the parameters by calibration is significantly less than the original range 1.Hence, the uncertainty in most of the parameters can be reduced by our approach.The Pareto front in the objective space is shown in Figure 7.There are 15 non-dominated solutions that are obtained by our approach.From the definition of the non-dominated solution set, none of the corresponding tradeoffs can be considered to be better than the others if there is a lack of information by which to judge.However, a decision-maker can choose an appropriate solution based on the importance of the objectives for their own interests.The leftmost point of the Pareto front corresponding to the parameter vector assigns the highest fit to the cyanobacteria calibration at the expense of the weakest DO calibration fit.The rightmost point indicates that the highest importance is assigned to DO.The solution with the solid circle in Figure 7 is a better tradeoff of the calibration of the cyanobacteria and DO.The objection functions OF 1 and OF 2 for the selected solution are equal to 0.80 mg/L and 1.79 µg/L, respectively.The selected solution is used to simulate the DO and cyanobacteria in summer in Chaohu Lake (Figure 8).Comparing the simulated results of the model with the observed values, the average relative errors of cyanobacteria and DO were approximately 29% and28%.The Pearson correlation coefficients between the simulated and observed values of cyanobacteria and DO are 0.59 and 0.64, respectively.The simulation results are consistent with the observed values and trends in Chaohu Lake.These correlation coefficients were not very large, suggesting the trade-off between RMSE of DO and cyanobacteria and the possibility of a non-linear relationship between simulated and observed values.The selected solution is used to simulate the DO and cyanobacteria in summer in Chaohu Lake (Figure 8).Comparing the simulated results of the model with the observed values, the average relative errors of cyanobacteria and DO were approximately 29% and 28%.The Pearson correlation coefficients between the simulated and observed values of cyanobacteria and DO are 0.59 and 0.64, respectively.The simulation results are consistent with the observed values and trends in Chaohu Lake.These correlation coefficients were not very large, suggesting the trade-off between RMSE of DO and cyanobacteria and the possibility of a non-linear relationship between simulated and observed values.
Figure 9 presents the simulation results of phosphate (PO 4 ) and nitrate (NO 3 ), which are the parameters of the selected solution.Their average relative errors and Pearson correlation coefficients for PO 4 and NO 3 were approximately 24% and 0.62 as well as 27% and 0.66, respectively.The simulation trends of PO 4 and NO 3 are also consistent with the observed values.This result shows that the parameters obtained by the calibration of the cyanobacteria and DO can also be used to simulate other environmental variables, such as PO 4 and NO 3 .This result indicates that there are correlations among variables in the water quality model.
The above results show that our approach has the ability to obtain good results in the multi-objective calibration of a complex water quality model, even though there is unavoidable observation noise in the ecosystem model.The selected solution is used to simulate the DO and cyanobacteria in summer in Chaohu Lake (Figure 8).Comparing the simulated results of the model with the observed values, the average relative errors of cyanobacteria and DO were approximately 29% and28%.The Pearson correlation coefficients between the simulated and observed values of cyanobacteria and DO are 0.59 and 0.64, respectively.The simulation results are consistent with the observed values and trends in Chaohu Lake.These correlation coefficients were not very large, suggesting the trade-off between RMSE of DO and cyanobacteria and the possibility of a non-linear relationship between simulated and observed values.Figure 9 presents the simulation results of phosphate (PO4) and nitrate (NO3), which are the parameters of the selected solution.Their average relative errors and Pearson correlation coefficients for PO4 and NO3 were approximately 24% and 0.62 as well as 27% and 0.66, respectively.The simulation trends of PO4 and NO3 are also consistent with the observed values.This result shows that the parameters obtained by the calibration of the cyanobacteria and DO can also be used to simulate other environmental variables, such as PO4 and NO3.This result indicates that there are correlations among variables in the water quality model.The above results show that our approach has the ability to obtain good results in the multiobjective calibration of a complex water quality model, even though there is unavoidable observation noise in the ecosystem model.

Conclusions
The water quality models involve problems with multiple objectives and parameters, which increase the difficulty of model calibration.Furthermore, models are often affected by noise in the data.This paper presents a novel algorithm (MCAD-MOPSO) to reduce the impact of noise on the parameter estimation of water quality models.The proposed approach is based on the Mahalanobis distance operation, mechanism of cardinality preference and advection-diffusion operator.The Mahalanobis distance operation can effectively reduce the influence of data noise on the calibration of a water quality model.The mechanism of cardinality preference can use more populated information to enhance the search ability of the algorithm and force the particles to move toward the sparse regions of the search space.The advection-diffusion operator can prevent the solutions from falling into the local optimum, explore potentially better solutions and maintain the diversity of solutions in the external repository.
The first two cases are based on test functions and metrics from the literature, with addition of noise to the test functions.In the absence of noise, the average performance of our approach is significantly better than that of NSGA-II and MOPSO, while it is even better than MOPSO-CD regarding most test functions.This result shows that our approach can produce a good approximation with a good distribution of the true Pareto front in all test functions.When adding the

Conclusions
The water quality models involve problems with multiple objectives and parameters, which increase the difficulty of model calibration.Furthermore, models are often affected by noise in the data.This paper presents a novel algorithm (MCAD-MOPSO) to reduce the impact of noise on the parameter estimation of water quality models.The proposed approach is based on the Mahalanobis distance operation, mechanism of cardinality preference and advection-diffusion operator.The Mahalanobis distance operation can effectively reduce the influence of data noise on the calibration of a water quality model.The mechanism of cardinality preference can use more populated information to enhance the search ability of the algorithm and force the particles to move toward the sparse regions of the search space.The advection-diffusion operator can prevent the solutions from falling into the local optimum, explore potentially better solutions and maintain the diversity of solutions in the external repository.
The first two cases are based on test functions and metrics from the literature, with addition of noise to the test functions.In the absence of noise, the average performance of our approach is significantly better than that of NSGA-II and MOPSO, while it is even better than MOPSO-CD regarding most test functions.This result shows that our approach can produce a good approximation with a good distribution of the true Pareto front in all test functions.When adding the different noise strengths from the Gaussian distribution to the test functions, the results show that the ability of our approach to reduce noise is better than that of MOPSO-CD.It has a good performance with respect to GD, SP and ER in all of the test functions.The average performance with respect to GD, ER and SP of MOPSO-CD for all of the test functions is approximately 1.3-7.5, 1.4-6.4 and 1.2-3.0times that of our approach corresponding to the noise of N(0,0.05), while this is approximately 1.4-7.0,1.2-1.8 and 1.2-4.3times for the noise of N(0,0.15),respectively.Even if the effect of noise is larger, our approach has the ability to accurately approximate the true Pareto front with a good distribution.
The application of the proposed algorithm is further tested on two frequently used models, the O'Connor model and CE-QUAL-ICM model, in the ecosystem.The four parameters of the O'Connor model were calibrated by MOPSO-CD and our approach.The results show that regardless of different noise strengths, the range of parameters obtained by our approach covered the true parameter values.The means of the parameters calculated by the proposed approach were closer to their true values.The OD prediction of our approach had less uncertainty, while the BS curve was closer to the true value curve.Moreover, the of the maximum OD value and position obtained by our approach was much smaller than that of the MOPSO-CD.The performance and calibration results of our approach were superior to those of the classic MOPSO-CD.Finally, in a shallow lake, a complex water quality model (CE-QUAL-ICM) was calibrated using our approach.The results show that our approach performed well in the calibration of the lake ecosystem model with added noise from natural observations.The selected non-dominated solution obtained good results for the simulation of the cyanobacteria biomass, DO as well as the PO 4 and NO 3 concentrations.
In summary, our approach can effectively reduce the influence of noise in data on the multi-objective calibration of the models, which the traditional algorithm does not take into account.We conclude that our approach is a highly competitive algorithm for optimization problems, regardless of the standard test function or the complex model in the ecosystem.Our approach provides a useful exploration for parameter estimation of water quality models.
. The main aims of the present study are:(1) To reduce the influence of noise in the data on the calibration of water quality models; (2) to maintain the diversity of solutions and prevent the solutions from falling into the local optimum; and (3) to enhance the search ability of algorithms and better guide the search towards the true Pareto front.The flowchart of the proposed approach is shown in Figure1.The improved part of the approach is shown in the color block diagram and is further discussed in Sections 2.1-2.3.Water 2018, 10, 32 3 of 23 the solutions from falling into the local optimum; and (3) to enhance the search ability of algorithms and better guide the search towards the true Pareto front.The flowchart of the proposed approach is shown in Figure 1.The improved part of the approach is shown in the color block diagram and is further discussed in Sections 2.1-2.3.

Figure 1 .
Figure 1.Flowchart of the model calibration with the MCAD-MOPSO.

Figure 1 .
Figure 1.Flowchart of the model calibration with the MCAD-MOPSO.
indices.When t maxgen β ≥ ⋅ , this operator should not be employed.

Figure 2 .
Figure 2. Advection-diffusion of particles in an induced set.Figure 2. Advection-diffusion of particles in an induced set.

Figure 2 .
Figure 2. Advection-diffusion of particles in an induced set.Figure 2. Advection-diffusion of particles in an induced set.

2 .
Initialize the particle's velocity v 0 i = 0, particle's position x 0 i , personal best position pbest 0 i = x 0 i and iteration counter t = 0 (b) Evaluate f (x 0 i ) (c) Initialize the external repository E 0 , storing the non-dominated solutions found in particle population in to E (d) Initialize the gbest 0 in accordance with Section 2.2.1.Repeat Until the Maximum Number of Iterations Is Reached (a) Compute the new velocity of each particle using the following formula:

Figure 3 .
Figure 3. Pareto front produced by the proposed algorithm for the six test functions.

Figure 4 .
Figure 4. OD prediction from 0 to 100 km.The red line is generated by true parameters.The gray lines are generated by non-dominated solutions, while the blue line is generated by the BS.

Figure 4 .
Figure 4. OD prediction from 0 to 100 km.The red line is generated by true parameters.The gray lines are generated by non-dominated solutions, while the blue line is generated by the BS. .

Figure 5 .
Figure 5. OD prediction for 0-100 km under the condition of noise according to N(0, 0.05) multiplied by the calculated values obtained.The red line is generated by true parameters.The gray lines are generated by non-dominated solutions, while the blue line is generated by BS.

Figure 5 .Figure 5 .Figure 6 .
Figure 5. OD prediction for 0-100 km under the condition of noise according to N(0, 0.05) multiplied by the calculated values obtained.The red line is generated by true parameters.The gray lines are generated by non-dominated solutions, while the blue line is generated by BS.

Figure 6 .
Figure 6.OD prediction for 0-100 km under the condition of noise according to N(0, 0.15) being multiplied by the calculated values obtained.The red line is generated by true parameters.The gray lines are generated by non-dominated solutions, while the blue line is generated by BS.

Water 2018 ,
10, 32 17 of 23on the importance of the objectives for their own interests.The leftmost point of the Pareto front corresponding to the parameter vector assigns the highest fit to the cyanobacteria calibration at the expense of the weakest DO calibration fit.The rightmost point indicates that the highest importance is assigned to DO.The solution with the solid circle in Figure7is a better tradeoff of the calibration of the cyanobacteria and DO.The objection functions OF1 and OF2 for the selected solution are equal to 0.80 mg/L and 1.79 μg/L, respectively.

Figure 7 .
Figure 7. ThePareto front in the objective space.

Figure 7 .
Figure 7. ThePareto front in the objective space.

Figure 7 .
Figure 7. ThePareto front in the objective space.

Figure8.
Figure8.The simulation of cyanobacteria and DO in the summer in Chaohu Lake.(a) DO, (b) Cyanobacteria.The solid line shows the result of the simulation.The diamond box represents the observed value.

Figure 8 .Figure9.
Figure 8.The simulation of cyanobacteria and DO in the summer in Chaohu Lake.(a) DO, (b) Cyanobacteria.The solid line shows the result of the simulation.The diamond box represents the observed value.Water 2018, 10, 32 18 of 23

Figure 9 .
Figure 9.The simulation of phosphate and nitrates in the summer in Chaohu Lake.(a) PO 4 , (b) NO 3 .The solid line shows the result of the simulation.The diamond box represents the observed value.
• • • x jn are independent variables of the optimized function that are marked by the ith and jth particles; f If |I x i | > I pbest , update pbest = x i .Otherwise, do not update pbest.
) If Use the advection-diffusion operator on each particle, as stated in Section 2.3.(c) Maintain the particles within the search space.If decision variable x i goes beyond its boundaries, the reflection should be used.(d) Evaluate f (x t+1 i ) (e) Update the external repository using the Pareto dominant relationship.If the external repository is full, reduce the non-dominated solutions in the external repository as follows: First, sort the cardinality of the induced set of particles in the external repository in ascending order.Second, a non-dominated solution in the external repository is randomly selected and removed from the specified bottom portion, such as the bottom 10%.It is likely that the non-dominated solutions in dense areas are replaced.(f) Update the gbest t in accordance with Section 2.2.1.Update the pbest t i of each particle in accordance with Section 2.2.2.
(g) Increment the iteration counter t.

Table 1 .
Parameter setting of the multi-objective algorithms.

Table 2 .
Algorithm performance metrics for the six test functions.Test Metrics NSGA-II MOPSO MOPSO-CD Proposed Algorithm Figure 3. Pareto front produced by the proposed algorithm for the six test functions.

Table 2 .
Algorithm performance metrics for the six test functions.

Table 3 .
Algorithm performance metrics for the six test functions with noise N(0, 0.05).

Table 4 .
Algorithm performance metrics for the six test functions with noise N(0, 0.15).

Table 5 .
The values of the parameters and initial conditions of a river section.

Table 6 .
Calibration results of the O'Connor model without noise.

Table 5 .
The values of the parameters and initial conditions of a river section.

Table 6 .
Calibration results of the O'Connor model without noise.

Table 7 .
O'Connor model results with noise N(0, 0.05) multiplied by the calculated values.

Table 8 .
O'Connor model results with noise N(0, 0.15) being used to multiply the calculated values.

Table 7 .
O'Connor model results with noise N(0, 0.05) multiplied by the calculated values.