Application of a Hybrid Optimized BP Network Model to Estimate Water Quality Parameters of Beihai Lake in Beijing

Abstract: Freshwater resources face numerous crises and pressures from both artificial and natural processes, so predicting water quality is crucial for water environment protection departments. This paper proposes a hybrid optimized algorithm that combines particle swarm optimization (PSO) and a genetic algorithm (GA) with a BP neural network; it predicts water quality as a time series and performs well on Beihai Lake in Beijing. The data set consists of six water quality parameters: Hydrogen Ion Concentration (pH), Chlorophyll-a (CHLA), Hydrogenated Amine (NH4H), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), and electrical conductivity (EC). The performance of the model was assessed through the maximum absolute percentage error (APE_max), the mean absolute percentage error (MAPE), the root mean square error (RMSE), and the coefficient of determination (R²). The results show that the BP neural network optimized with PSO and GA predicts the water quality parameters with reasonable accuracy, suggesting that the model is a valuable tool for lake water quality estimation. The hybrid optimized BP model also has a higher prediction capacity and better robustness than the traditional BP neural network, the PSO-optimized BP neural network, and the GA-optimized BP neural network.


Introduction
Human activities are considered a major cause of water pollution, which calls for urgent action around the world. Some of the main emission sources that can significantly affect surface water quality are the discharge of hazardous substances from industrial processes and urban waste water, accidental pollution by hazardous substances, and diffuse pollution originating from agricultural areas [1].
Numerous water quality parameters are measured to indicate lake water status and to guide decision makers towards optimal and sustainable measures. Dissolved oxygen (DO) content is one of the most important water quality parameters, as it directly indicates the status of the aquatic ecosystem and its ability to sustain aquatic life. DO is considered to be the most badly affected among all water quality parameters [1]. So far, many methods have been used to predict water quality: [2] used the grey relational method to predict and evaluate the water quality of rivers in China; [3] developed a Bayesian approach to river WQM combined with inverse methods to support practical adaptive water quality management under uncertainty for the Hun-Taiz River in northeastern China; [4] used genetic algorithm (GA) and geographic information system
Beijing (39°28′ N-41°05′ N, 115°25′ E-117°30′ E), the capital of China, is located at a central latitude and belongs to the eastern warm temperate monsoon zone, with a semi-humid continental climate and four distinct seasons. Human activities and climate change have important effects on water quality. The local government faces serious challenges in pollution control and natural resource management of lakes in Beijing, especially for landscape water and drinking water in lakes and reservoirs. As an issue of national concern, lakes have received increasing attention in numerous studies of their physical, chemical (e.g., total nitrogen [17], total phosphorus [18], heavy metals [19], etc.), and biological (e.g., phytoplankton [20], aquatic fish [21], and aquatic plants [22], etc.) parameters, as well as the influences of land use [23] and eutrophication [24].
In this research, we focus on the water quality parameters of Beihai Lake. Beihai Lake, connecting Qianhai Lake in the north and Zhonghai Lake and Nanhai Lake in the south, is the largest landscape lake around the Forbidden City.
The location of Beihai Lake is shown in Figure 1. Water of Beihai Lake comes from Miyun and Guanting Reservoirs, which are very important drinking water areas in Beijing. The water quality parameters of reclaimed landscape water can reach the national water quality standards Class IV at most, which is the minimum standard for industry and recreation [25].
We used continuous time series water quality monitoring data from the Beijing Water-affair Authority covering about 120 h in August 2013. The water quality parameters measured include pH, CHLA (mg/L), NH4H (mg/L), DO (mg/L), BOD (mg/L), and EC (µS/cm). Basic statistics of the measured water quality variables of Beihai Lake are shown in Table 1.


Data Preparation and Input Selection
Because the water quality parameters in this test come from different Internet of Things (IoT) collection devices with different sampling intervals, and a large amount of data was also entered manually, the original data are not satisfactory, so the priority is to preprocess them. We eliminated invalid and discrete data; the remaining data were normalized and grouped into training samples and test samples. First, the standard deviation was calculated using Equation (1). In addition, water parameters usually have different dimensions and orders of magnitude, so the data were converted to dimensionless values of the same order of magnitude by a normalization method (Equation (2)). After the models have been successfully executed, their outputs, which are normalized values, are converted back to original values by the inverse transformation in Equation (3). These procedures were done with SPSS 18.
where SD = standard deviation of X_i; X_i = input data; X̄ = arithmetic average of X; X′ = value after normalization, which is also the input to the models; X_max = maximum value of X; and X_min = minimum value of X. After this linear normalization, the experimental data values are mapped to [0, 1].
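The normalization and its inverse can be illustrated with a short sketch. It assumes the usual min-max form, X′ = (X_i − X_min) / (X_max − X_min), which is consistent with the symbols defined above and with the stated mapping to [0, 1]. The function names and the sample readings are hypothetical and are not taken from the study.

```python
def min_max_normalize(values):
    """Map values linearly onto [0, 1]; returns (normalized, x_min, x_max)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("all values are identical; min-max scaling is undefined")
    return [(x - x_min) / span for x in values], x_min, x_max


def inverse_min_max(normalized, x_min, x_max):
    """Convert normalized model outputs back to the original units."""
    return [x * (x_max - x_min) + x_min for x in normalized]


# Made-up DO readings in mg/L, for illustration only.
do_readings = [6.0, 7.5, 9.0, 8.25]
scaled, lo, hi = min_max_normalize(do_readings)
restored = inverse_min_max(scaled, lo, hi)
```

The minimum and maximum must be kept from the normalization step, since the inverse transformation needs the same two values to recover predictions in original units.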

Back Propagation Neural Network (BPNN)
BP is a multilayer feedforward neural network; it is also called the error back propagation neural network after its method of propagating the error backwards. According to incomplete statistics, 80-90% of the neural network models in use adopt the BP network or some variant of it. The core process consists of four parts: forward calculation, feedback calculation of the local gradient, weight correction between neurons, and calculation of the total mean squared error (Equation (4)). A sigmoid function (Equation (5)) is taken as the transfer function.
where N = sample number and c = collection of all output units.
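As a rough illustration of the two quantities named above, the sketch below implements a logistic sigmoid transfer function and a total mean squared error summed over all output units and averaged over the N samples. The 1/(2N) scaling follows a common textbook convention and is an assumption here, as are the function names; the source equations are not reproduced in this text.

```python
import math


def sigmoid(x):
    """Logistic sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def total_mse(targets, outputs):
    """Total mean squared error over N samples and all output units.

    targets, outputs: one list per sample, each holding one value per output unit.
    The 1/(2N) factor is an assumed convention, not taken from the source.
    """
    n = len(targets)
    squared = 0.0
    for t_row, o_row in zip(targets, outputs):
        squared += sum((t - o) ** 2 for t, o in zip(t_row, o_row))
    return squared / (2 * n)


# Two samples, one output unit each (illustrative numbers only).
error = total_mse([[1.0], [0.0]], [[0.5], [0.5]])
```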

Optimizing BPNN Using PSO
Particle swarm optimization (PSO) is inspired by the behavior of bird flocks and fish schools. It was developed by Eberhart and Kennedy in 1995 [26]. Each member of the population is called a particle and represents a potential feasible solution; the best position found by the particles is taken as the global optimal solution. The population searches for the global optimal solution in the d-dimensional space. Each particle decides its own flight direction from the value of the fitness function and its velocity, and gradually moves to a better region to finally find the global optimal solution. Particles in a swarm are described by position vectors and velocity vectors, and the candidate solutions held in the position vectors correspond to the weight values in the network. The velocity vector and position vector are updated with Equations (6) and (7), respectively. Using PSO to optimize the BP neural network accelerates convergence and reduces the possibility of falling into a local extremum.
where v_id(t) is the d-dimensional velocity component of particle i at generation t; x_id(t) is the d-dimensional position component of particle i at generation t; gBest_id(t) is the d-dimensional component of gBest_i, the individual optimal position of particle i in the evolution to generation t; zBest_d(t) is the d-dimensional component of zBest_i, the optimal position of the whole particle swarm in the evolution to generation t; ω = inertia weight; c_1 and c_2 = acceleration constants; and rand_d is a random number in [0, 1]. The inertia weight ω governs the balance between global and local optimization in PSO. The larger the ω value is, the stronger the global optimization ability is; conversely, the smaller the ω value is, the stronger the local optimization ability is. A linear decrement is used to adjust the weight, as given in Equation (8).
where ω_max = maximum inertia weight; ω_min = minimum inertia weight; t = current iteration number; and G_max = maximum number of generations. The particle swarm optimization algorithm proceeds as follows.
Step 1: Initialize swarm
According to the problem to be optimized, the particle swarm's velocity v_i, position x_i, population size N, individual extreme value gBest_i, and global extreme value zBest_i are initialized.

Step 2: Calculate the particle fitness value
The mean square error (MSE) in Equation (9) is selected as the objective function to calculate the fitness value of the initial particle swarm.

Step 3: Update individual extremum gBest_i
The individual fitness value calculated in Step 2 is compared with the fitness value of the individual extreme value gBest_i. If the individual fitness value is better, the current position of the individual is regarded as its historical optimal position, that is, the new individual extreme value gBest_i. Otherwise, the current individual extreme value gBest_i is maintained until a better individual appears.

Step 4: Update global extremum zBest_i
The fitness values of gBest_i and zBest_i are compared. If the fitness value of gBest_i is better, the optimal position of that individual is taken as the historical optimal position of the group, namely the global extreme value. Otherwise, the current global extreme value is maintained until a better individual extreme value appears.

Step 5: Update the particle's speed and position
Update the particle speed v_i and position x_i according to the particle swarm optimization speed and position updates (Equations (6) and (7)).

Step 6: End particle swarm optimization algorithm
The end condition of the algorithm (maximum iteration number or target fitness value) is checked. If the condition is not met, jump to Step 2; otherwise, output the global optimal solution zBest_i.
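The six steps can be sketched on a toy objective as follows. This is a minimal illustration and not the authors' MATLAB implementation: it assumes the standard PSO velocity and position updates and a linearly decreasing inertia weight, ω = ω_max − (ω_max − ω_min)·t / G_max, which match the symbol definitions above. The values ω_max = 0.9 and ω_min = 0.4, the velocity clamp, and the sphere test function are assumptions added for the example; c_1 = c_2 = 1.49 is the value reported later in the paper.

```python
import random


def pso(fitness, dim, n_particles=30, g_max=100, w_max=0.9, w_min=0.4,
        c1=1.49, c2=1.49, bound=5.0, seed=0):
    """Minimize `fitness` over a dim-dimensional space with basic PSO."""
    rng = random.Random(seed)
    # Step 1: initialize positions, velocities, and the extreme values.
    x = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    # Step 2: fitness of the initial swarm.
    g_best = [p[:] for p in x]                 # individual bests (gBest in the text)
    g_best_fit = [fitness(p) for p in x]
    k = min(range(n_particles), key=g_best_fit.__getitem__)
    z_best, z_best_fit = g_best[k][:], g_best_fit[k]   # global best (zBest)

    for t in range(g_max):                     # Step 6: stop after g_max generations
        w = w_max - (w_max - w_min) * t / g_max    # linearly decreasing inertia
        for i in range(n_particles):
            # Step 5: velocity and position update.
            for d in range(dim):
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (g_best[i][d] - x[i][d])
                           + c2 * rng.random() * (z_best[d] - x[i][d]))
                v[i][d] = max(-bound, min(bound, v[i][d]))  # assumed velocity clamp
                x[i][d] += v[i][d]
            fit = fitness(x[i])
            # Steps 3 and 4: update individual and global extreme values.
            if fit < g_best_fit[i]:
                g_best[i], g_best_fit[i] = x[i][:], fit
                if fit < z_best_fit:
                    z_best, z_best_fit = x[i][:], fit
    return z_best, z_best_fit


def sphere(p):
    """Toy objective standing in for the network's MSE."""
    return sum(c * c for c in p)


best, best_fit = pso(sphere, dim=3)
```

When PSO is used to train a BP network, `fitness` would instead decode a particle into weights and thresholds and return the training MSE.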

Optimizing BPNN Using GA
The GA is an optimization algorithm that simulates Darwinian evolutionary mechanisms to find the optimal solution. It features adaptability, randomness, and high parallelism. Following the principle of survival of the fittest, GA repeats the operations of selection, crossover, and mutation with individual fitness as the evaluation standard, eliminates chromosomes with poor adaptability, retains the fitter individuals, and forms a new population. The algorithm proceeds as follows.

Step 1: Population initialization
Within a certain range, a random initial population with the number of N is generated, and each individual in the population becomes a chromosome.

Step 2: Code the population
The initial population is coded according to binary rules, that is, as strings made up of the digits 0 and 1.

Step 3: Calculate fitness
Mean square error (MSE) is selected as the objective function, and the fitness value of each individual in the population is calculated from it.
where f_i represents individual fitness; Y_i = actual output value of the sample; Ŷ_i = expected output value of the sample; X = number of samples.

Step 4: Select operator
According to the fitness value of each individual, individuals with high fitness are selected to carry over to the next iteration, while those with low fitness are less likely to enter the next iteration or may even be eliminated. Roulette wheel selection is used, so the probability of an individual being selected is proportional to its fitness. The selection probability is as follows.
where N = initial population number, f_k is the fitness value of individual k, and P_k is the probability that individual k is selected.

Step 5: Crossover operator
The selected individuals pair with each other according to the principle of arithmetic crossover, exchange some genes, and form new individuals, which have the characteristics of their parents. The arithmetic crossover operator is as follows [27].
where x_1 and x_2 represent two parent individuals, x′_1 and x′_2 represent two offspring individuals, and a is a random number between 0 and 1.

Step 6: Mutation operator
By replacing certain alleles on individual chromosomes with a certain mutation probability, new individuals different from their parents are created, which widens the range of individuals in the population.
Repeat Step 3 to Step 6. The algorithm converges by iteration. When the iteration number reaches the maximum iteration number T, the individuals with the maximum fitness obtained in the evolutionary process are taken as the output of the optimal solution.
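The selection, crossover, and mutation operators can be sketched as below. The sketch assumes the common forms P_k = f_k / Σ f_k for roulette wheel selection and x′_1 = a·x_1 + (1 − a)·x_2, x′_2 = a·x_2 + (1 − a)·x_1 for arithmetic crossover, which agree with the symbol definitions above. It works on real-valued genes rather than the binary coding of Step 2, and it expects fitness values where larger is better (for an MSE objective, a transform such as 1/MSE would be needed). All function names and the mutation range are hypothetical.

```python
import random


def selection_probabilities(fitness):
    """P_k = f_k / sum of all f; assumes non-negative fitness, larger is better."""
    total = sum(fitness)
    return [f / total for f in fitness]


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    r = rng.random()
    cumulative = 0.0
    for individual, p in zip(population, selection_probabilities(fitness)):
        cumulative += p
        if r < cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off


def arithmetic_crossover(x1, x2, a):
    """Blend two real-valued parents with weight a in [0, 1]."""
    child1 = [a * u + (1 - a) * w for u, w in zip(x1, x2)]
    child2 = [a * w + (1 - a) * u for u, w in zip(x1, x2)]
    return child1, child2


def mutate(individual, p_mut, rng, low=-1.0, high=1.0):
    """Replace each gene with a random value with probability p_mut."""
    return [rng.uniform(low, high) if rng.random() < p_mut else g
            for g in individual]


rng = random.Random(0)
parents = [[0.0, 2.0], [4.0, 6.0]]
c1, c2 = arithmetic_crossover(parents[0], parents[1], a=0.25)
```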

The Combined Model of PSO, GA, and BPNN
Both PSO and GA are optimization algorithms that simulate the adaptability of populations of individuals on the basis of natural characteristics. Both use certain transformation rules to solve the problem by searching the solution space, and both have parallel search features, which reduces the possibility of the BPNN falling into a local minimum. Both algorithms have good optimization performance but also have disadvantages and limitations. In both PSO and GA, parameters are determined by experience, which can cause premature convergence or slow convergence and ultimately affect the optimization performance.
PSO and GA optimize the BPNN by optimizing its connection weights and thresholds. The hybrid algorithm is built on the PSO algorithm, with the genetic operators added into the PSO process. It combines the advantages of the two algorithms: less computation, fast convergence, and good global convergence performance. The steps of PSO-GA-BP are as follows.
Step 1. BP neural network initialization. According to the input and output dimensions of the model, the hierarchical structure of the neural network and the number of nodes in the hidden layer are determined.
Step 2. Particle swarm initialization. According to the network structure, the particle parameters and the number of particles are determined. The velocity and position of the particles are encoded by binary code. The mean square error (MSE) is selected to calculate the fitness value.
Step 3. Calculate the fitness value. Calculate the fitness value of each particle and determine whether the target condition is met. If the target condition is met, the output result is obtained; if it is not, the individual optimum and global optimum of the particles are updated.
Step 4. PSO with added crossover operator. The selection and crossover steps of the genetic algorithm are added to particle swarm optimization: particles with better fitness are selected by the roulette wheel method, the positions and velocities of the particles are crossed with probability Pa = 0.4, and the particles with better fitness after crossover are put into the swarm for the next iteration.
Step 5. PSO with added mutation operator. The mutation step of the genetic algorithm is added to particle swarm optimization. With probability Pb = 0.01, the mutation operation is carried out on the position and velocity of the particles with poor fitness, and the mutated particles are put back into the swarm.
Step 6. Calculate the fitness value with the fitness function and update gBest and zBest of the particle swarm.
Step 7. Determine whether the target value set for the particle swarm has been met or the maximum number of generations has been reached. If so, the optimal solution zBest is output; if not, jump to Step 3 and continue the iteration.
Step 8. Decode the optimal solution after the iteration is completed and substitute it into the preset BP neural network as the initial weights and thresholds. This yields the PSO-GA-BP neural network model.
The flow chart of the PSO-GA-BP neural network prediction model is shown in Figure 2.
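The steps above can be sketched as a single loop on a toy problem. This is an illustrative reading of the procedure, not the authors' code, and it makes several labeled assumptions: particles are real-valued rather than binary coded, only positions (not velocities) are crossed and mutated, "poor fitness" is taken to mean the worse half of the swarm, the roulette wheel uses weights 1/MSE, and the inertia bounds 0.9 and 0.4 are invented for the example. Pa = 0.4, Pb = 0.01, c_1 = c_2 = 1.49, a population of 50, and 200 generations are the values reported in the paper. The toy fitness fits a straight line instead of a BP network.

```python
import random


def hybrid_pso_ga(fitness, dim, n_particles=50, g_max=200, c1=1.49, c2=1.49,
                  w_max=0.9, w_min=0.4, p_cross=0.4, p_mut=0.01,
                  bound=1.0, seed=0):
    """PSO with GA crossover and mutation; minimizes `fitness` (an MSE)."""
    rng = random.Random(seed)
    x = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    fit = [fitness(p) for p in x]
    g_best, g_fit = [p[:] for p in x], fit[:]        # individual bests
    k = min(range(n_particles), key=fit.__getitem__)
    z_best, z_fit = x[k][:], fit[k]                  # global best

    for t in range(g_max):
        w = w_max - (w_max - w_min) * t / g_max
        # PSO move (assumed to precede the genetic operators).
        for i in range(n_particles):
            for d in range(dim):
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (g_best[i][d] - x[i][d])
                           + c2 * rng.random() * (z_best[d] - x[i][d]))
                v[i][d] = max(-bound, min(bound, v[i][d]))
                x[i][d] += v[i][d]
            fit[i] = fitness(x[i])
        # Crossover: roulette-wheel partner, keep the child only if it is better.
        weights = [1.0 / (f + 1e-12) for f in fit]
        for i in range(n_particles):
            if rng.random() < p_cross:
                j = rng.choices(range(n_particles), weights=weights)[0]
                a = rng.random()
                child = [a * u + (1 - a) * s for u, s in zip(x[i], x[j])]
                child_fit = fitness(child)
                if child_fit < fit[i]:
                    x[i], fit[i] = child, child_fit
        # Mutation on the worse half of the swarm.
        median = sorted(fit)[n_particles // 2]
        for i in range(n_particles):
            if fit[i] >= median:
                x[i] = [rng.uniform(-bound, bound) if rng.random() < p_mut else g
                        for g in x[i]]
                fit[i] = fitness(x[i])
        # Update individual and global bests.
        for i in range(n_particles):
            if fit[i] < g_fit[i]:
                g_best[i], g_fit[i] = x[i][:], fit[i]
                if fit[i] < z_fit:
                    z_best, z_fit = x[i][:], fit[i]
    return z_best, z_fit


# Toy stand-in for network training: recover slope 0.5 and intercept -0.25.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.5 * u - 0.25 for u in xs]


def line_mse(params):
    slope, intercept = params
    return sum((slope * u + intercept - y) ** 2 for u, y in zip(xs, ys)) / len(xs)


solution, solution_mse = hybrid_pso_ga(line_mse, dim=2)
```

In the setting described in the text, `dim` would be the total number of BP weights and thresholds, and the returned vector would seed the network before gradient-based training (Step 8).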


Evaluation of Performance
In this study, a time series data set was divided into two subsets for training and testing the models; the first 70% of the data set was used to train and the remaining (30%) was used to test the models. To evaluate the performance of the PSO-GA-BPNN model and other prediction models, some comparison standards were employed as follows.
where APE is the absolute percentage error, MAPE is the mean absolute percentage error, and RMSE is the root mean square error. In Equations (14)-(17), y is the measured quality parameter in period t, ŷ is the predicted quality parameter, and n is the total number of periods. R² is the coefficient of determination, and ȳ and ŷ̄ are the averages of the measured and predicted quality parameter, respectively. The APE shows how far each predicted value deviates from the measured value, even when the error is very small. APE_max also shows the point at which the worst prediction occurs. MAPE reflects the average prediction accuracy, which helps to show the performance of the prediction model. RMSE is recognized as one of the most important indicators for evaluating the performance of prediction models; the smaller the value of RMSE, the better the prediction of the model. The value of R² (between 0 and 1) indicates how closely the measured value is related to the predicted value, that is, the degree of fit between them. The closer the value of R² is to 1, the better the fitting result. Normally, MAPE and RMSE are used to evaluate accuracy, while APE and R² are more suitable for evaluating the robustness of the model.
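The four indicators can be computed as in the sketch below. It assumes the conventional definitions: APE_t = |y_t − ŷ_t| / |y_t| × 100%, MAPE as the mean of the APE values, RMSE as the square root of the mean squared difference, and R² as the squared correlation between measured and predicted values (chosen because the definitions above mention the averages of both series). The function names and the three sample values are hypothetical.

```python
import math


def ape(y, y_hat):
    """Absolute percentage error for each period, in percent."""
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(y, y_hat)]


def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    errors = ape(y, y_hat)
    return sum(errors) / len(errors)


def rmse(y, y_hat):
    """Root mean square error, in the units of y."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    """Squared correlation between measured and predicted values."""
    y_bar = sum(y) / len(y)
    p_bar = sum(y_hat) / len(y_hat)
    cov = sum((a - y_bar) * (p - p_bar) for a, p in zip(y, y_hat))
    var_y = sum((a - y_bar) ** 2 for a in y)
    var_p = sum((p - p_bar) ** 2 for p in y_hat)
    return cov * cov / (var_y * var_p)


# Made-up DO values in mg/L, for illustration only.
measured = [8.0, 10.0, 12.0]
predicted = [8.8, 10.0, 11.4]
ape_max = max(ape(measured, predicted))
```

For these made-up values the per-period errors are 10%, 0%, and 5%, so APE_max singles out the first period while MAPE averages over all three.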

Results and Discussion
The algorithm was implemented in the mathematical software MATLAB R2018a, which supports algorithm development, data visualization, data analysis, and numerical calculation and offers simple operation and fast calculation. The structure of the BP neural network was 5-11-1. The maximum number of iterations was 2000, the error precision threshold ε was 0.0001, and the learning rate η was 0.005. In both the PSO algorithm and the GA, the population size was set at 50 and the number of generations was 200. The learning factors were c_1 = c_2 = 1.49. pH, CHLA, NH4H, BOD, and EC were used as inputs, and the subsequent DO value was the predicted output of the models. Following the steps of the neural network training algorithm, once the prediction model was properly trained it could predict the dissolved oxygen concentration of Beihai Lake. To test the algorithm's performance, we compared the PSO-GA-BP neural network algorithm with the common BP neural network model, the PSO-BP neural network model, and the GA-BP neural network model. The prediction performance of the four models is shown in Figures 3 and 4, while the water quality prediction results are listed in Tables 2 and 3. Figure 3 shows the prediction performance of these models intuitively, making it easy to identify the time points at which the predicted value differs most from the measured (real) value. Table 2 lists the predicted and observed values in time series. Figure 3 shows that the BP neural network prediction models optimized with the genetic algorithm and particle swarm optimization are much better than the traditional BP neural network prediction model. The BP neural network model without optimization shows poor prediction, while the curve of the PSO-GA-BPNN model is the closest to the curve of real values.

For the four models in our testing, the results of APE_max, MAPE, RMSE, and R² are shown in Table 3. One can see exactly how each model performed and the degree to which the four models differ. Figure 4 shows R², the linear formula, and the linear curve. PSO-GA-BPNN shows the best prediction effect in terms of both accuracy and robustness. As can be seen from Table 3, the root mean square error dropped from 1.2733, 0.7873, and 0.4019 to 0.3596; the maximum absolute percentage error dropped from 45.4614%, 47.5328%, and 31.7989% to 16.2661%; the mean absolute percentage error dropped from 25.1506%, 15.7102%, and 8.4506% to 6.7219%; and the coefficient of determination rose from 0.2957, 0.5333, and 0.7818 to 0.9276. PSO-GA-BP achieved the best model performance in all evaluation indices. Specifically, compared with the common BP neural network model, MAPE fell by 18.4 percentage points and RMSE shrank to roughly a quarter (about 28%) of the worst result in terms of accuracy, while APE_max fell by 29.2 percentage points and R² increased by 0.632 in terms of robustness. We conclude that the model based on combined particle swarm optimization and genetic algorithm can better fit the complex dynamic nonlinear relationship between the water ecological environment factors and dissolved oxygen. Furthermore, the improved prediction results of the PSO-GA-BPNN algorithm correspond to the real observations of Beihai Lake.
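The size of the improvement over the common BP model can be recomputed directly from the Table 3 values quoted in the text. The short check below uses only those quoted numbers, taking the first value of each series as the common BP model and the last as PSO-GA-BP; the dictionary layout and variable names are hypothetical.

```python
# Values quoted in the text for the common BP model and for PSO-GA-BP.
baseline = {"RMSE": 1.2733, "APE_max": 45.4614, "MAPE": 25.1506, "R2": 0.2957}
hybrid = {"RMSE": 0.3596, "APE_max": 16.2661, "MAPE": 6.7219, "R2": 0.9276}

# Differences in percentage points (APE_max, MAPE) and absolute units (R2).
mape_drop = baseline["MAPE"] - hybrid["MAPE"]
ape_max_drop = baseline["APE_max"] - hybrid["APE_max"]
r2_gain = hybrid["R2"] - baseline["R2"]

# RMSE of the hybrid model as a fraction of the baseline RMSE.
rmse_ratio = hybrid["RMSE"] / baseline["RMSE"]

print(f"MAPE drop:    {mape_drop:.2f} points")
print(f"APE_max drop: {ape_max_drop:.2f} points")
print(f"R2 gain:      {r2_gain:.3f}")
print(f"RMSE ratio:   {rmse_ratio:.3f}")
```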

Conclusions
Water quality prediction plays an important role in the control, management, and planning of water quality. The common BP neural network prediction model has many weaknesses, so in this study we combined the genetic algorithm and the particle swarm optimization algorithm with the BP neural network and established the PSO-GA-BPNN prediction model. The improved model integrates self-learning, bionic, and nonlinear approximation techniques. This network learning model achieved fast convergence, avoidance of local minima, stronger stability, and suitable results, and it improved prediction accuracy and robustness. With current hardware, however, the training time of the PSO-GA-BPNN model is relatively long.
There is an urgent requirement for new methods to deal with abnormal data. In complex aquatic ecosystems, the proposed model can meet the management requirements of water quality monitoring and early warning. It is strongly recommended to establish different predictive models according to diversified weather conditions and complex surface environments, and to combine those prediction models to improve the prediction accuracy because water quality is heavily affected by hydrological, meteorological, and surficial factors.