Annual Runo ﬀ Forecasting Based on Multi-Model Information Fusion and Residual Error Correction in the Ganjiang River Basin

: Accurate forecasting of annual runo ﬀ time series is of great signiﬁcance for water resources planning and management. However, considering that the number of forecasting factors is numerous, a single forecasting model has certain limitations and a runo ﬀ time series consists of complex nonlinear and nonstationary characteristics, which make the runo ﬀ forecasting di ﬃ cult. Aimed at improving the prediction accuracy of annual runo ﬀ time series, the principal components analysis (PCA) method is adopted to reduce the complexity of forecasting factors, and a modiﬁed coupling forecasting model based on multiple linear regression (MLR), back propagation neural network (BPNN), Elman neural network (ENN), and particle swarm optimization-support vector machine for regression (PSO-SVR) is proposed and applied in the Dongbei Hydrological Station in the Ganjiang River Basin. Firstly, from two conventional factors (i.e., rainfall, runo ﬀ ) and 130 atmospheric circulation indexes (i.e., 88 atmospheric circulation indexes, 26 sea temperature indexes, 16 other indexes), principal components generated by linear mapping are screened as forecasting factors. Then, based on above forecasting factors, four forecasting models including MLR, BPNN, ENN, and PSO-SVR are developed to predict annual runo ﬀ time series. Subsequently, a coupling model composed of BPNN, ENN, and PSO-SVR is constructed by means of a multi-model information fusion taking three hydrological years (i.e., wet year, normal year, dry year) into consideration. Finally, according to residual error correction, a modiﬁed coupling forecasting model is introduced so as to further improve the accuracy of the predicted annual runo ﬀ time series in the veriﬁcation period.


Introduction
Hydrological forecasting, especially medium and long-term runoff forecasting, is an indispensable part of water resources management and water conservancy projects' operation [1][2][3]. Forecasting at different time scales can provide valuable information for flood control, power generation, water supply, and drought resistance [4][5][6][7]. Medium and long-term runoff forecasting, with a forecast period of patterns. The prediction results indicated that the proposed hybrid model was a far better technique compared to the original support vector machine (SVM) model [27].
Apart from selecting appropriate forecasting models, identifying key predictors driving runoff variation is another step towards developing a reliable forecasting model [28]. Rough set (RS), global sensitivity analysis (GSA), factor analysis (FA), principal component analysis (PCA), Gamma test (GT), and forward selection (FS) techniques are used to reduce the number of input variables for recognizing forecasting factors [29]. With the development of information theory, mutual information (MI) as a measure representing information between two random variables provides optional means for screening forecasting factors [30]. To tackle key problems of generating minimal inference rule set and selecting complex factors, Zhu et al. (2009) proposed a forecasting model integrating the rough set theory with the fuzzy inference technique to improve the medium and long-term forecast precision [31]. Five principal impact factors were recognized by Li et al. (2012) by means of GSA and the back-propagation arithmetic, and these were pivotal factors that make a great difference to runoff during the flood season in the Nenjiang River Basin [29]. Some input selection techniques (e.g., GT, FS) designed to reduce the number of input variables, were fed to an SVM model to predict the monthly streamflow, and the developed GT-SVM model was superior to the original SVM model [32]. As a multivariate statistical technique used to identify important factors, PCA has been proposed to reduce the number of variables by providing a better interpretation of variables involving large volumes of information, as well as reducing the computational dimension [33,34]. Moreover, the information from independent and linear compound input variables is capable of presenting us with the minimum losses by employing this method [35]. Thus, PCA is acknowledged to be pivotal towards reducing the complexity of input variables and has been widely adopted into simplifying forecasting factors.
Nevertheless, it is simply not stable to rely on a single forecasting model, such as multiple linear regression (MLR), back propagation neural network (BPNN), Elman neural network (ENN), and particle swarm optimization-support vector machine for regression (PSO-SVR), to predict annual runoff, for runoff time series tending to be nonlinear, nonstationary and, even, chaotic. In view of this, multi-model information fusion technology and residual error correction methods are introduced to acquire more accurate annual runoff, taking the advantages of different forecasting methods into account [36,37]. The main objective of this paper is to develop a modified coupling forecasting model to predict annual runoff time series. Firstly, key forecasting factors are screened from two conventional factors (i.e., rainfall and runoff) and 130 atmospheric circulation indexes, with the help of PCA. Then, annual runoff time series are predicted by using the MLR model, the BPNN model, the ENN model, and the PSO-SVR model, respectively. Subsequently, a coupling model is constructed to predict annual runoff according to the multi-model information fusion technique. Finally, the residual error correction method is employed to further modify annual runoff time series in the validation period.

Study Area
As the seventh largest branch of the Yangtze River and the longest river of the Poyang Lake water system, the Ganjiang River Basin is located within 113 • 42 -116 • 38 E and 24 • 30 -28 • 42 N, Jiangxi province, China, controlling a drainage area of 83,500 km 2 and reaching a total length of 766 km [38]. With a subtropical humid monsoon climate, the Ganjiang River Basin has an annual average temperature ranging from 17 to 20 • C, an annual rainfall ranging from 800 to 1700 mm, and an annual mean inflow varying from 1300 to 3700 m 3 /s. The Wan'an reservoir is located in the middle reaches of the Ganjiang River, 2 km upstream of the Wan'an County, with a drainage area of 36,900 km 2 . As the largest water conservancy project in the Jiangxi Province, the Wan'an Reservoir plays a significant role in power generation, flood control, waterway transport, and irrigation.
However, since the water resources shortage, contradictions between water supply and demand have become frequent in the dry season in the middle and lower reaches of the Ganjiang River Basin over the past years. Rational allocation of water resources is a key measure to solve above problems and minimize its adverse impact. Annual streamflow prediction is equally important for reservoir operation and water resource management in the Ganjiang River Basin, because it is the prerequisite and foundation for compiling a water resource allocation plan and carrying out reservoir operation. As an important control station in the upper reaches of the Ganjiang River Basin and a runoff monitoring station at the dam site of the Wan'an Reservoir, the Dongbei Hydrological Station is the basin outlet of the study area, controlling a drainage area of 40,231 km 2 [39]. Influenced by the plum rain, rainfall of the Dongbei Hydrological Station concentrates from March to June, accounting for about 54% of annual rainfall. Rainfall from July to September also occupies a large proportion of annual rainfall, due to the influence of the typhoon rain and the monsoon rain [40]. The location of the study area is shown in Figure 1. However, since the water resources shortage, contradictions between water supply and demand have become frequent in the dry season in the middle and lower reaches of the Ganjiang River Basin over the past years. Rational allocation of water resources is a key measure to solve above problems and minimize its adverse impact. Annual streamflow prediction is equally important for reservoir operation and water resource management in the Ganjiang River Basin, because it is the prerequisite and foundation for compiling a water resource allocation plan and carrying out reservoir operation. As an important control station in the upper reaches of the Ganjiang River Basin and a runoff monitoring station at the dam site of the Wan'an Reservoir, the Dongbei Hydrological Station is the basin outlet of the study area, controlling a drainage area of 40,231 km 2 [39]. Influenced by the plum rain, rainfall of the Dongbei Hydrological Station concentrates from March to June, accounting for about 54% of annual rainfall. Rainfall from July to September also occupies a large proportion of annual rainfall, due to the influence of the typhoon rain and the monsoon rain [40]. The location of the study area is shown in Figure 1.

Data Series
Runoff and rainfall data of the Dongbei Hydrological Station covering 1964 to 2015 were obtained from the government hydrologic database. 130 atmospheric circulation indexes including 88 atmospheric circulation indexes, 26 sea temperature indexes, and 16 other indexes were provided by the National Climate Center of China Meteorological Administration (https://cmdp.ncccma.net/Monitoring/cn_index_130.php). Runoff data from 1964 to 2002 were used to confirm the forecasting model, while those from 2003 to 2015 were used to verify the forecasting model.

Implementation of the Annual Runoff Forecasting
The annual runoff forecasting model presented in this paper consists of four parts: extraction of key forecasting factors based on the principal component analysis (PCA) method; comparison of the predicted runoff time series of four forecasting models, including the multiple linear regression (MLR) model, the back propagation neural network (BPNN) model, the Elman neural network (ENN) model, and the particle swarm optimization-regression support vector machine (PSO-SVR) model; coupling three forecasting models (i.e., BPNN, ENN, PSO-SVR) by means of multi-model information fusion from the aspects of wet year, normal year, and dry year, respectively; modification of annual runoff time series predicted by the coupling model based on the residual error correction technique. In the

Data Series
Runoff and rainfall data of the Dongbei Hydrological Station covering 1964 to 2015 were obtained from the government hydrologic database. 130 atmospheric circulation indexes including 88 atmospheric circulation indexes, 26 sea temperature indexes, and 16 other indexes were provided by the National Climate Center of China Meteorological Administration (https://cmdp.ncc-cma.net/ Monitoring/cn_index_130.php). Runoff data from 1964 to 2002 were used to confirm the forecasting model, while those from 2003 to 2015 were used to verify the forecasting model.

Implementation of the Annual Runoff Forecasting
The annual runoff forecasting model presented in this paper consists of four parts: extraction of key forecasting factors based on the principal component analysis (PCA) method; comparison of the predicted runoff time series of four forecasting models, including the multiple linear regression (MLR) model, the back propagation neural network (BPNN) model, the Elman neural network (ENN) model, and the particle swarm optimization-regression support vector machine (PSO-SVR) model; coupling three forecasting models (i.e., BPNN, ENN, PSO-SVR) by means of multi-model information fusion from the aspects of wet year, normal year, and dry year, respectively; modification of annual runoff time series predicted by the coupling model based on the residual error correction technique. In the following sections, these parts will be elaborated upon in detail. The detailed technique flow chart is introduced in Figure 2.
Water 2020, 12, x FOR PEER REVIEW 5 of 18 following sections, these parts will be elaborated upon in detail. The detailed technique flow chart is introduced in Figure 2.

Principal component analysis (PCA)
Principal component analysis (PCA) is a statistical technique aiming at reducing the dimensionality of a dataset with a large number of interrelated variables, while retaining most of the variability of the original datasets [41]. By calculating the cumulative contribution rate, PCA is applied to screen principal forecasting factors with an objective of losing as little information as possible. Given the initial forecasting factors 12 ,,, n x x x and the final forecasting factors  12 , , , ( ) 1 z , the variance of which is the second largest among all linear combinations. And so on.

Multiple Linear Regression (MLR)
Multiple linear regression (MLR) is a classical statistical tool to describe the complex inputoutput relationship [42]. The key goal of MLR is to find out an approximation linear function between a set of independent variables and the dependent variable. Without a loss of generality, the regression line of MLR can be described as follows:

Principal component analysis (PCA)
Principal component analysis (PCA) is a statistical technique aiming at reducing the dimensionality of a dataset with a large number of interrelated variables, while retaining most of the variability of the original datasets [41]. By calculating the cumulative contribution rate, PCA is applied to screen principal forecasting factors with an objective of losing as little information as possible. Given the initial forecasting factors x 1 , x 2 , · · · , x n and the final forecasting factors z 1 , z 2 , · · · , z m (m ≤ n), the equation of extracting forecasting factors using PCA can be defined as follows: z 1 = l 11 x 1 + l 12 x 2 + · · · + l 1n x n z 2 = l 21 x 1 + l 22 x 2 + · · · + l 2n x n · · · z m = l m1 x 1 + l m2 x 2 + · · · + l mn x n (1) where L is the load matrix composed of coefficient l. If i j, z i has nothing to do with z j . z 1 is the linear combination of x 1 , x 2 , · · · , x n , the variance of which is the largest among all linear combinations. In a similar way, z 2 is the linear combination of x 1 , x 2 , · · · , x n that is not related to z 1 , the variance of which is the second largest among all linear combinations. And so on.

Multiple Linear Regression (MLR)
Multiple linear regression (MLR) is a classical statistical tool to describe the complex input-output relationship [42]. The key goal of MLR is to find out an approximation linear function between a set of independent variables and the dependent variable. Without a loss of generality, the regression line of MLR can be described as follows: Water 2020, 12, 2086 where k is the number of independent variables, β j ( j = 1, 2, · · · , k) is the partial regression coefficient, x i is the i th independent variable, y is the dependent variable, and ε i is the error term corresponding to y i . Then, the equation for a set of samples mentioned above can be rewritten in a compact matrix form, which can be expressed as follows: According to the classical matrix calculation theory, the least-square method can be adopted to calculate the coefficient vector β, and the coefficient vector can be described as follows: In such a way, the coefficient vector β is known, and the obtained MLR model can be used to predict the possible dependent variable related to the newly input vector.

Back Propagation Neural Network (BPNN)
The back propagation neural network (BPNN) is one of the most widely used neural network models, and it is a multi-layer feedforward network trained based on the error back propagation algorithm. BPNN can learn and store a large number of input-output mode mapping relations, and users can obtain relatively satisfactory prediction results without having to understand the mathematical equations of this mapping relation in advance [8]. BPNN continuously adjusts the weights and thresholds of the network through back propagation to achieve the least sum of square error. BPNN consists of three parts, namely an input layer, a hidden layer and an output layer, and its structure is drawn in Figure 3: where k is the number of independent variables,   ( 1,2, , ) j jk is the partial regression coefficient, i x is the i th independent variable, y is the dependent variable, and  i is the error term corresponding to i y .
Then, the equation for a set of samples mentioned above can be rewritten in a compact matrix form, which can be expressed as follows: According to the classical matrix calculation theory, the least-square method can be adopted to calculate the coefficient vector  , and the coefficient vector can be described as follows: In such a way, the coefficient vector  is known, and the obtained MLR model can be used to predict the possible dependent variable related to the newly input vector.

Back Propagation Neural Network (BPNN)
The back propagation neural network (BPNN) is one of the most widely used neural network models, and it is a multi-layer feedforward network trained based on the error back propagation algorithm. BPNN can learn and store a large number of input-output mode mapping relations, and users can obtain relatively satisfactory prediction results without having to understand the mathematical equations of this mapping relation in advance [8]. BPNN continuously adjusts the weights and thresholds of the network through back propagation to achieve the least sum of square error. BPNN consists of three parts, namely an input layer, a hidden layer and an output layer, and its structure is drawn in Figure 3: where  where X = (x 1 , x 2 , · · · , x i , · · · , x n ), HI = (hi 1 , hi 2 , · · · , hi j , · · · , hi p ) and YI = (yi 1 , yi 2 , · · · , yi k , · · · , hi q ) are the input vector in the input layer, hidden layer, and output layer, respectively. n, p and q are the number of neurons in the input layer, hidden layer and output layer, respectively. HO = (ho 1 , ho 2 , · · · , ho j , · · · , ho p ) and YO = (yo 1 , yo 2 , · · · , yo k , · · · , ho q ) are the output vector in the hidden layer and output layer, respectively. DO = (do 1 , do 2 , · · · , do k , · · · , do q ) is the expected output vector in the output layer. w i,j and w j,k are the input layer-hidden layer connection weight and the hidden layer-output layer connection weight, respectively. BH = (bh 1 , bh 2 , · · · , bh j , · · · , bh p ) and BO = (bo 1 , bo 2 , · · · , bo k , · · · , bo q ) are the threshold values corresponding to each neuron in the hidden layer and output layer, respectively. Given the number of samples and the activation function, the error function e can be expressed as follows: When BPNN is applied for predicting annual runoff time series, the corresponding output value is the prediction result on the premise of the specific input factors being transferred to the model. The learning procedures of the BPNN model are summarized as follows: Step 1. Assign a random number to the connection weights between the layers, and determine the error function, as well as the given calculation error accuracy value and the maximum training time.
Step 2. Randomly select a sample as the input value and determine the expected output value.
Step 3. Calculate the input value and the output value of each neuron in the hidden layer.
Step 4. Calculate the partial derivative of the error function to each neuron in the output layer.
Step 5. Calculate the partial derivative of the error function to each neuron in the hidden layer.
Step 6. Correct the hidden layer-output layer connection weight.
Step 7. Correct the input layer-hidden layer connection weight.
Step 8. Calculate the global error and judge whether the model error meets the demand.

Elman Neural Network (ENN)
The Elman neural network (ENN) was firstly proposed by Elman (1990) to address the voice processing problem, and it is a typical dynamic recursive neural network. Based on the basic structure of BPNN, a context layer is added from the hidden layer to the input layer in the structure of ENN, and this context layer is taken as a one-step delay operator to record information from the last network iteration as input to the current iteration [43]. In addition, ENN enables the system to adapt to time-varying characteristics, as it enhances the global stability and has stronger computing power. A standard structure of ENN is drawn in Figure 4: When BPNN is applied for predicting annual runoff time series, the corresponding output value is the prediction result on the premise of the specific input factors being transferred to the model. The learning procedures of the BPNN model are summarized as follows: Step 1.
Assign a random number to the connection weights between the layers, and determine the error function, as well as the given calculation error accuracy value and the maximum training time.
Randomly select a sample as the input value and determine the expected output value.
Calculate the input value and the output value of each neuron in the hidden layer.
Calculate the partial derivative of the error function to each neuron in the output layer.
Calculate the partial derivative of the error function to each neuron in the hidden layer. Step 6.
Correct the hidden layer-output layer connection weight.
Correct the input layer-hidden layer connection weight.
Calculate the global error and judge whether the model error meets the demand.

Elman Neural Network (ENN)
The Elman neural network (ENN) was firstly proposed by Elman (1990) to address the voice processing problem, and it is a typical dynamic recursive neural network. Based on the basic structure of BPNN, a context layer is added from the hidden layer to the input layer in the structure of ENN, and this context layer is taken as a one-step delay operator to record information from the last network iteration as input to the current iteration [43]. In addition, ENN enables the system to adapt to time-varying characteristics, as it enhances the global stability and has stronger computing power. A standard structure of ENN is drawn in Figure 4:  where u is the input vector, y is the output vector, x is the output vector in the hidden layer, x c is the output vector in the context layer. w 1 , w 2 and w 3 denote the context layer-hidden layer connection weight, the input layer-hidden layer connection weight, and the hidden layer-output layer connection weight, respectively.
The state space of the ENN model can be expressed as follows: where g(·) is the transfer function of the neurons in the output layer, which is the linear combination of the hidden-layer output value. f (·) is the transfer function of the neurons in the hidden layer, which is commonly expressed by S function. To achieve the minimum mean square deviation between the actual output value and the expected output value, the least-square algorithm and the gradient search technique are adopted in the ENN model. Then, the learning procedures of the ENN model are summarized as follows: Step 1. Normalize the original sample data. Set the maximum training time, the minimum expected error, the learning efficiency, and the sample number. The mean square error (MSE) function, is taken as the error function to describe the relation between the expected output value and the actual output value, and its equation can be expressed as follows: where O t ex is the t th expected output value. O t ac is the t th actual output value, and that is the observed value of the runoff.
Step 2. Initialize the connection weights including w 1 , w 2 and w 3 . Train the ENN model for the first time when t = 1.
Step 3. Calculate each input sample to find the output value in the hidden layer, the output value in the output layer, the sample error value, and the sample weight correction value. Calculate the global error value based on the sample error value.
Step 4. Judge whether the global error is less than the specified accuracy. If it is, the ENN model ends its iteration and saves the current connection weight.
Step 5. Judge whether the iteration time t is less than the maximum iteration time. If not, the ENN model ends its iteration and saves the final connection weight. Step 6. Sum the sample weight correction value obtained in Step 3.
Step 7. Correct the connection weight to obtain a new connection weight, taking the sample weight correction value obtained in Step 6 mentioned above into account. Turn to Step 3 mentioned above to continue to iterate.

Particle Swarm Optimization-Regression Support Vector Machine (PSO-SVR)
Support vector machine for regression (SVR) is a regression algorithm based on the support vector machine (SVM), and this technique has gradually become a new research hotspot in the fields of water resources engineering and hydrology [44,45]. The basic idea of the SVR model is to represent the entire sample set through a small number of support vectors and to map the training set to another high-dimensional feature space by means of nonlinear mapping. Then, the nonlinear function estimation problem in the input space can be transformed into a linear function estimation problem in the high-dimensional feature space. In spite of the efficiency of SVR for modeling nonlinear and complicated runoff time series, it still suffers from drawbacks, such as the selection of the parameters C, ε, and σ making a great difference to the forecasting accuracy of an SVR model [46]. Considering that the particle swarm optimization (PSO) algorithm has the advantages of easy implementation, fast convergence speed, and strong global search ability, the PSO algorithm is employed to optimize the free parameters of SVR. Moreover, applying the PSO algorithm into the parameter optimization of the SVR model has certain advantages compared to the traditional grid search methods [47]. The forecasting process of the PSO-SVR model is shown in Figure 5.
Water 2020, 12, x FOR PEER REVIEW 9 of 18 of the SVR model has certain advantages compared to the traditional grid search methods [47]. The forecasting process of the PSO-SVR model is shown in Figure 5.

Residual Error Correction
Due to the high complexity and nonlinearity of runoff time series, as well as the instability of the forecasting model, it is not always ideal for the runoff forecasting results directly obtained by the forecasting model. Therefore, revising the forecasting results and improving the prediction accuracy is of necessity in the annual runoff forecasting. The simplest way to modify annual runoff sequences is to establish a regression error correction equation [36]. The specific steps of establishing a residual correction equation are as follows: Step 1.
This correction equation is established based on the predicted value and the corresponding residual value, and the residual sequences can be expressed as follows: where , ii yq and i x are the measured sequences, the predicted sequences, and the residual sequences, respectively. Step 2.
A residual equation between the predicted sequences and the residual sequences can be established as follows: where , ab are the regression coefficients, and n is the number of sequences.  i is the random error, and it obeys normal distribution Step 3.
Regression coefficients , ab are estimated by the least-square method, and the sum of the deviation square is calculated as follows:

Residual Error Correction
Due to the high complexity and nonlinearity of runoff time series, as well as the instability of the forecasting model, it is not always ideal for the runoff forecasting results directly obtained by the forecasting model. Therefore, revising the forecasting results and improving the prediction accuracy is of necessity in the annual runoff forecasting. The simplest way to modify annual runoff sequences is to establish a regression error correction equation [36]. The specific steps of establishing a residual correction equation are as follows: Step 1. This correction equation is established based on the predicted value and the corresponding residual value, and the residual sequences can be expressed as follows: where y i , q i and x i are the measured sequences, the predicted sequences, and the residual sequences, respectively. Step 2. A residual equation between the predicted sequences and the residual sequences can be established as follows: where a, b are the regression coefficients, and n is the number of sequences. ε i is the random error, and it obeys normal distribution N ∼ (0, σ 2 ).
Step 3. Regression coefficients a, b are estimated by the least-square method, and the sum of the deviation square is calculated as follows: In order to obtain the minimum S(a, b), take the derivations of S(a, b) to a, b, respectively, and define the derivative values as 0. The equations of the derivation process can be obtained as follows: The equations mentioned above are normal equations, and the solution is not the true values of the regression coefficients a, b but the estimated values. Then, replace the true values a, b with the estimated valuesâ,b, and above equations can be rewritten as follows: In this case, the regression coefficients have a unique solution: Calculate the regression equation between y and x: x i =ây i +b(i = 1, 2, · · · , n).
Step 4. Calculate the residual values corresponding to the predicted sequences.
Step 5. Calculate the modified prediction values by the predicted residual values:

Evaluation Index
In order to evaluate the performance of the annual runoff forecasting model, three main criteria including mean absolute relative error (MARE), absolute relative error (ARE), root mean squared error (RMSE), qualification rate (QR) and Nash-Sutcliffe efficiency coefficient (NS) are taken as evaluation indexes, and these criteria are suitable to describe the prediction accuracy [19]. MARE and ARE are employed for examining the error between the predicted data and the observed data [48]. RMSE is chosen as an evaluation index to test the differences between the predicted data and the observed data [49]. In the standard for hydrological information and hydrological forecasting, QR is an important index to distinguish the accuracy grade of runoff projects [50]. NS is a popular evaluation index for evaluating the performance of forecasting models [51].
(1) The equation of mean absolute relative error (MARE) and absolute relative error (ARE) can be expressed as follows: where Q sim,i is the predicted value of the i th sample, Q obs,i is the observed value of the i th sample, and k is the number of samples. (2) The equation of root mean squared error (RMSE) can be expressed as follows: (3) The equation of qualified rate (QR) can be expressed as follows: where m is the time of the qualified forecasts, and n is the time of the total forecasts. If the absolute value of the relative error between the predicted value and the measured value is within 20% the forecast is qualified. (4) The equation of Nash-Sutcliffe efficiency coefficient (NS) can be expressed as follows: (19) where Q obs denotes the average value of the observed annual runoff time series.

Determining Forecasting Factors
Based on annual runoff time series from 1964 to 2015 of the Dongbei Hydrological Station, annual rainfall time series from 1963 to 2015, and 130 monitoring indexes (i.e., 88 atmospheric circulation indexes, 26 sea temperature indexes, 16 other indexes) from 1963 to 2014, the forecasting factors are screened according to the principle of selecting principal components whose cumulative contribution rate is greater than 85%. When analyzing the correlation between the forecasting factors of the year before the forecast year and the runoff sequence of the forecast year, the cumulative rainfall of the year before the forecast year is also added as the forecasting factor by the PCA method for dimensionality reduction, considering rainfall sequence is a key factor affecting annual runoff variation. The variance contribution-rate ranking of components is shown in Table 1. As can be seen in Table 1, the variance contribution rate of the first principal component can reach 30.93%, indicating that it contains most of the information of the selected factors. The variance contribution rate of other principal components is getting smaller, which means that the information of the selected factors is less and fewer. The first seven principal components are determined as the forecasting factors, the cumulative variance contribution rate of which can reach 86.74%. According to equation (1), these seven principal components are the linear combinations of eleven initial forecasting factors, including the Eastern Pacific Subtropical High Northern Boundary Position Index (in June last year), the Pacific Subtropical High Northern Boundary Position Index (in June last year), the Atlantic Meridional Mode SST Index (in July last year), etc. Then, the score coefficient matrix of principal components is shown in Table 2.

Four Annual Runoff Forecasting Models
Seven forecasting factors determined by the PCA method are taken as input conditions of the MLR model, BPNN model, ENN model, and PSO-SVR model, respectively. In this paper, the trial and error method is adopted to compare the prediction results of different forecasting models and determine the optimal parameters used in the models. Taking the ENN model as an example, different combinations of the node number in the input layer and hidden layer are proposed to determine the optimal combination. As a result, the optimal node number of the input layer is seven, while the optimal that of the hidden layer is eight. Thus, the combination of node numbers in the ENN model is (seven, eight, one). Then, major structures of four forecasting models are listed in Table 3, such as the regression equation of the MLR model, as well as other parameter settings of the BPNN model (e.g., node number, maximum training time, learning rate), the ENN model (e.g., node number, maximum training time, learning rate), and the PSO-SVR model (e.g., population size, maximum iteration time, C, ε, σ), respectively. In order to compare the performances of the proposed MLR model, BPNN model, ENN model, and PSO-SVR model of the Dongbei Hydrological Station in the Ganjiang River Basin, the annual runoff time series predicted by four forecasting models are shown in Figure 6, and the comparison results of three evaluation indexes are illustrated in Table 4.
Water 2020, 12, x FOR PEER REVIEW 13 of 18 In order to compare the performances of the proposed MLR model, BPNN model, ENN model, and PSO-SVR model of the Dongbei Hydrological Station in the Ganjiang River Basin, the annual runoff time series predicted by four forecasting models are shown in Figure 6, and the comparison results of three evaluation indexes are illustrated in Table 4.    As shown in the figure above, it is evident that the MLR model has the worst prediction performance whether runoff time series is in the calibration period or in the validation period, except that MARE of the PSO-SVR model is larger than that of the MLR model in the calibration period. By contrast, the ENN model has the best prediction performance in the calibration period, because this model has the smallest MARE, the smallest RMSE, the largest QR, and the largest NS, in terms of evaluation indexes. Likewise, the PSO-SVR model with the best prediction performance is glaringly obvious in the validation period, MARE, RMSE, QR, and NS of which are 15.73%, 202.50, 84.62%, and 0.411, respectively. In addition to these, it is obvious that the prediction performance of the BPNN model is superior to that of the MLR model, expect that QR values of both models are 69.23%.

Coupling Annual Runoff Forecasting Model
Aimed at proposing a coupling forecasting model by considering three forecasting models (i.e., BPNN model, ENN model, PSO-SVR model) that have different prediction performances, the historical annual runoff time series are divided into three hydrological years (i.e., wet year, normal year, dry year), and three coupling multi-model parameter equations based on the least-square method are introduced to train annual runoff time series in the calibration period from the aspect of hydrological years. Then, these coupling multi-model parameter equations are adopted to verify the annual runoff time series in the validation period. Coupling multi-model parameter equations are shown in Table 5. Table 5. Coupling multi-model parameter equations corresponding to different hydrological years (note: y means the annual runoff predicted by the coupling model. x 1 , x 2 , x 3 mean the annual runoff predicted by the BPNN model, the ENN model, and the PSO-SVR model, respectively).

Hydrological Year Coupling Multi-Model Parameter Equation
Wet Year y = 0.0253x 1 + 0.7938x 2 − 0.3046x 3 + 765.3838 Normal Year y = 0.0382x 1 + 0.4586x 2 + 0.303x 3 + 155.1607 Dry Year Based on these coupling equations, the annual runoff time series predicted by the coupling model is shown in Figure 7, and the comparison results of three evaluation indexes are illustrated in Table 6. It is demonstrated that the prediction performance of annual runoff time series in the calibration period is better than that in the validation period, and the former predicted value is closer to the observed value than the latter one. As far as the calibration period is concerned, MARE, RMSE QR, and NS of the coupling model are 2.92%, 42.13, 100%, and 0.980, respectively, while that of the ENN model are 4.63%, 73.64, 97.44%, and 0.941, respectively. When it comes to the validation period, MARE, RMSE, QR, and NS of the coupling model are 8.06%, 157.90, 84.62%, and 0.642, respectively, while that of the PSO-SVR model are 15.73%, 202.50, 84.62%, and 0.411, respectively. Thus, it is concluded that this coupling model has better model performances than any single forecasting model (i.e., BPNN model, ENN model, PSO-SVR model) mentioned above, because this coupling model has the smaller MARE, the smaller RMSE, the larger QR, and the larger NS, in terms of evaluation indexes. However, it can be clearly seen that the least-square method is greatly affected by the disturbance of the outlier (the highest blue peak in the validation period in Figure 7) when modifying the predicted values, which is closely related to its correction strategy. The least-square method takes the distance as the measure, and the parameters of the fitting function are obtained by the least-square sum of errors, which will enlarge the influence of the large error to this method. MARE, RMSE, QR, and NS of the coupling model are 8.06%, 157.90, 84.62%, and 0.642, respectively, while that of the PSO-SVR model are 15.73%, 202.50, 84.62%, and 0.411, respectively. Thus, it is concluded that this coupling model has better model performances than any single forecasting model (i.e., BPNN model, ENN model, PSO-SVR model) mentioned above, because this coupling model has the smaller MARE, the smaller RMSE, the larger QR, and the larger NS, in terms of evaluation indexes. However, it can be clearly seen that the least-square method is greatly affected by the disturbance of the outlier (the highest blue peak in the validation period in Figure 7) when modifying the predicted values, which is closely related to its correction strategy. The least-square method takes the distance as the measure, and the parameters of the fitting function are obtained by the least-square sum of errors, which will enlarge the influence of the large error to this method.

Modified Coupling Annual Runoff Forecasting Model
Based on the observed values and the predicted values of the coupling model from 1964 to 2002, a modified function is proposed by means of residual error correction, and the residual correction function equation can be expressed as follows: Then, the prediction sequences of the coupling model y i are taken into the model correction function, and the residual error sequences x i can be obtained. Subsequently, modified annual runoff sequences from 2003 to 2015 can be calculated, and comparisons of annual runoff sequences predicted by the coupling forecasting model and the modified coupling forecasting model are illustrated in Table 7.
As shown in the table above, MARE of the coupling model is 8.06%, while that of the modified coupling model is 7.99%. RMSE of the coupling model is 157.90, while that of the modified coupling model is 156.69. Moreover, QR of the coupling model is 0.642, while that of the modified coupling model is 0.647. Therefore, the modified coupling model can further improve the prediction performance of annual runoff time series in the validation period, with the help of the residual error correction technique.

Conclusions and Discussions
For the sake of improving the forecasting accuracy of annual runoff time series, the principal component analysis (PCA) method is adopted to screen forecasting factors from rainfall, runoff, and 130 monitoring indexes. A modified coupling runoff forecasting model is proposed based on multiple linear regression (MLR), back propagation neural network (BPNN), Elman neural network (ENN), and particle swarm optimization-regression support vector machine (PSO-SVR), by means of multi-model information fusion and residual error correction in this paper. The main conclusions of this study are as follows: Firstly, seven principal components are screened as key forecasting factors from the Eastern Pacific Subtropical High Northern Boundary Position Index (in June last year), the Pacific Subtropical High Northern Boundary Position Index (in June last year), the Atlantic Meridional Mode SST Index (in July last year), and other monitoring indexes, as well as annual rainfall.
Then, compared to the MLR model, the BPNN model, the ENN model, and the PSO-SVR model provide better prediction performances involving predicting annual runoff. In terms of a single forecasting model, the PSO-SVR model has the best prediction performances in the validation period, while the ENN model has the best prediction performances in the calibration period.
Subsequently, from the point of hydrological years (i.e., wet year, normal year, dry year), a coupling model is proposed by means of multi-model parameter equations by taking the advantages of three forecasting models (i.e., BPNN Finally, the residual error correction technique is referenced to modify runoff sequences predicted by the coupling model in the validation period. Compared to the coupling model, the modified coupling model has the smaller MARE, the smaller RMSE, and the larger NS.
In conclusion, the modified coupling method proposed in this paper can be applied to the Ganjiang River Basin to improve prediction performances of annual runoff sequences. It would also play an important role in providing a significant improvement of runoff forecasting in other similar river basins, especially by means of multi-model information fusion techniques to combine the advantages of different forecasting models, which is an important innovation in this paper. Besides, the residual error correction method in the modified coupling model is another feature in this study that would further improve the performance of the predicted annual runoff in the validation period, based on the least-square sum of errors. Nevertheless, there is still some room for perfection in this study, for example the number of nodes in the input layer and the hidden layer is determined by the trial and error method, rather than the optimization method, which may reduce the calculation efficiency. Using optimization algorithms to determine the number of nodes and weights of neural networks may be the trend of future development. The least-square method is not always a good correction strategy when there are outliers in the predicted results, that is, the error between the predicted value and the observed value is large in the coupled model. Mean absolute error method and smoothed mean absolute error method might be alternative options for data fitting. Therefore, making full use of the advantages of different models to achieve the optimality of the coupled model remains a challenge in the forecasting runoff model, which will also be the focus of future research.
Author Contributions: P.S., C.W., and W.L. conducted the research and designed and wrote the paper; J.S., L.K., and Z.N. helped to revise the paper; X.L. and H.W. gave the comments. All authors have read and agreed to the published version of the manuscript.