Article

An Ensemble Flow Forecast Method Based on Autoregressive Model and Hydrological Uncertainty Processer

School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
*
Author to whom correspondence should be addressed.
Water 2020, 12(11), 3138; https://doi.org/10.3390/w12113138
Received: 7 October 2020 / Revised: 31 October 2020 / Accepted: 4 November 2020 / Published: 9 November 2020
(This article belongs to the Section Hydrology and Hydrogeology)

Abstract

Hydrological forecasting is subject to uncertainties in data input, model parameters, and model structure, so a deterministic forecast cannot provide useful risk information to decision-makers. The study of ensemble forecasting and the analysis of hydrological uncertainty are therefore of great significance for guiding the operation of reservoirs in the flood season. This study proposes a Bayesian ensemble forecast method comprising a Gaussian mixture model (GMM), a hydrological uncertainty processor (HUP), and an autoregressive (AR) model. First, the GMM is selected as the marginal distribution function to estimate the uncertainty of observed and modelled data. Next, the AR model is used to correct the forecast rainfall data. Then, a modified HUP is used to deal with the uncertainty of the hydrological model structure and the rainfall input data. Finally, the ensemble flow forecast is composed of the expected values of the posterior distributions obtained by the HUP under different rainfall conditions. Taking the Three Gorges Reservoir (TGR) as a case study, the ensemble flow forecast over the forecast period is calculated with this method. The results show that the proposed method improves the accuracy of runoff forecasts and reduces the uncertainty of the hydrological forecast.
Keywords: ensemble flow forecast; autoregressive model; hydrological uncertainty processer; Three Gorges Reservoir

1. Introduction

Flooding is the most common natural hazard and third most damaging globally after storms and earthquakes [1]. Flood forecasting can provide key information for disaster warning and flood control, which plays an important role in reducing the damage caused by flooding [2,3]. The common flood forecasting methods generally use a deterministic hydrological model to predict the future flood process [4,5,6,7]. However, the traditional deterministic hydrological forecast cannot provide the hydrological forecast value in the forecast period to meet the needs of the corresponding river basin management department [8].
Ensemble flow forecasting can provide more hydrological forecasting information for policymakers in order to better ensure the safety of life and property in downstream areas and improve the social and economic benefits [9,10]. Ensemble weather forecasting (EWF) plays an important role in ensemble flow forecasting [11]. First, EWF contains more information, which can provide more comprehensive meteorological data for a hydrological forecast in a future period [12]. Second, research shows that combining EWF with a hydrological forecast model can improve the reliability and accuracy of forecast results [13]. However, due to the diversity of atmospheric conditions and topography, the simplification of physical and thermodynamic processes, the uncertainties of model parameterization, and the limited spatial resolution, the standard EWF exhibits systematic deviations from the observations [14]. Thus, EWF should be corrected before being applied to the hydrological forecasting process. For example, Hamill et al. [15] used a logistic regression method to process the ensemble rainfall forecast and obtained the conditional distribution function of rainfall under given conditions. Sloughter et al. [16] used a logistic regression function and the Gamma distribution to describe the distribution of zero and non-zero rainfall events, respectively, then built a mixed distribution function to describe the distribution of rainfall events, and improved the Bayesian Model Averaging (BMA) method to make it applicable to the post-processing of ensemble rainfall forecasts. Wilks [17] improved the traditional logistic regression approach by introducing quantiles into the function as independent variables, thus providing a continuous probability distribution function to describe the distribution of rainfall. However, the above methods need to make assumptions about the distribution of rainfall in EWF, and the calculation process is tedious.
To overcome the above problems, the Autoregressive (AR) model, which is independent of the rainfall distribution, was used to post-process the precipitation value of EWF.
In recent years, great progress has been made in the research and theory of hydrological ensemble forecasting. Mylne [18] conducted a multi-model ensemble prediction test, showing that the information given by a multi-model ensemble prediction is more accurate than that provided by a single-model prediction in both the probabilistic sense (such as the probability density distribution) and the deterministic sense (such as the ensemble mean prediction). Brown et al. [19] used the National Centers for Environmental Prediction (NCEP) ensemble prediction model to drive a distributed hydrological model to obtain hydrological ensemble prediction results, and the results showed that the ensemble prediction could extend the effective flood warning period. Choubin et al. [20] proposed a new ensemble prediction method for flood susceptibility based on multivariate discriminant analysis (MDA) and classification and regression trees (CART), combined with a support vector machine (SVM). However, most of the above methods consider only the uncertainty of the hydrological model or only the uncertainty of the rainfall input data, and fail to reflect the combined impact of uncertainties from different sources on the prediction process and results. Therefore, this paper proposes a method that combines an AR model with a Bayesian forecast system (BFS). The proposed model, which jointly considers the uncertainties of rainfall input, model structure, and parameters, is designed to yield good ensemble flow forecast results.
Krzysztofowicz constructed the theoretical system of BFS for the first time in 1999 [21], and pointed out that the main source of uncertainty in short-term runoff forecasting is the future rainfall time series used as input to the hydrological model. BFS generates probabilistic forecasts of hydrological variables through deterministic hydrological models. Its basic principle is to divide the sources of hydrological uncertainty into two categories: one is the input uncertainty with rainfall as the core and the other is the structural uncertainty of the hydrological model. The two are quantified separately and then combined into the final probabilistic forecast. BFS can be combined with any deterministic hydrological model, regardless of whether the hydrological model structure is linear [22,23].
Marshall [24] realized the probabilistic forecast by combining Markov chain Monte Carlo (MCMC) with the Bayesian method, and compared it with a conceptual rainfall-runoff model to illustrate the advantages of probabilistic prediction. Subsequently, Marshall [25] used the Adaptive Metropolis Algorithm to calculate the posterior density of the Bayesian probability prediction and verified the feasibility of Bayesian probability prediction as a hydrological prediction model. Henry [26,27] proposed a random ensemble Bayesian probabilistic forecast system (REBFS) based on the BFS. The REBFS model can output multiple ensemble members, effectively reducing the scale of the ensemble weather prediction system and making it more feasible to apply the BFS to large-scale regions. However, the parameter estimation of the above methods is complicated, which makes their application inconvenient. Feng et al. [28] used the Gaussian mixture model (GMM) to fit the marginal distribution and applied it to the hydrological uncertainty processor (HUP), and concluded that the GMM had good applicability in the HUP. However, that study did not consider the effect of precipitation uncertainty. In order to solve the above two problems, a method combining the precipitation-dependent HUP (PD-HUP) and the GMM is proposed in this paper. The GMM function is used as the marginal distribution function in BFS, and the uncertainty of the precipitation input data is also considered.
In summary, the main objective of this paper is to propose a Bayesian ensemble forecast method considering uncertainty of hydrological models and precipitation. First, the Autoregressive (AR) model is used to correct the rainfall data of Ensemble Weather Forecasts (EWF). Then, a precipitation-dependent HUP (PD-HUP) based on GMM is proposed to deal with the uncertainty of both the input data and the hydrological model. Finally, the ensemble flow forecast results are generated from the results determined by the PD-HUP method.

2. Methods and Data

2.1. Postprocessing of Ensemble Weather Forecasts

The Autoregressive (AR) model was used to postprocess the precipitation series of the EWF. The AR model only reflects the influence and effect of related factors on the prediction target through its own modelling of historical observations of time series variables. It is not restricted by the assumptions that the model variables are independent of each other. The model constructed can eliminate the difficulties caused by independent variable selection and multicollinearity in general regression prediction methods. The mathematical expression of the AR model is shown below.
$$e_t^{j} = \alpha_1^{j} e_{t-1}^{j} + \alpha_2^{j} e_{t-2}^{j} + \cdots + \alpha_p^{j} e_{t-p}^{j} + \xi^{j},$$
$$w_c^{j} = w_{fore,t}^{j} - e_t^{j},$$
where $w_{fore,t}^{j}$ is the uncorrected rainfall forecast of the $j$th ensemble member; $e_t^{j}$ ($t = 1, 2, \ldots, N$) is the prediction error of the $j$th ensemble member at time $t$, that is, the difference between the EWF result and the observed precipitation; $N$ is the length of the precipitation sequence of each ensemble member; $e_{t-1}^{j}, e_{t-2}^{j}, \ldots, e_{t-p}^{j}$ are the prediction errors of the $j$th ensemble member in the $p$ periods before time $t$; $\alpha_1^{j}, \alpha_2^{j}, \ldots, \alpha_p^{j}$ are the regression coefficients of the $j$th ensemble member; $p$ is the order, which can be determined by the Akaike information criterion (AIC); $\xi^{j}$ is a white noise sequence with zero mean whose variance is determined in the parameter estimation; and $w_c^{j}$ is the corrected forecast of the $j$th ensemble member. The calculation steps of the AR model are as follows.
First, determine the order of p according to the AIC criterion. Letting p take different values, calculate the AIC value of each ensemble member, and select the p value with the minimum AIC value as the regression order of the AR model. AIC is expressed as follows:
$$\mathrm{AIC} = 2k + n \ln L,$$
where $k$ is the number of parameters (here, the order $p$ of the AR model), $n$ is the number of samples, and $L$ is the variance of the residuals.
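Second, let
$$Y = [e_{p+1}, e_{p+2}, \ldots, e_{N}]^{T},$$
and
$$\xi = [\xi_{p+1}, \xi_{p+2}, \ldots, \xi_{N}]^{T},$$
$$A = [\alpha_1, \alpha_2, \ldots, \alpha_p]^{T},$$
$$X = \begin{bmatrix} e_{p} & e_{p-1} & \cdots & e_{1} \\ e_{p+1} & e_{p} & \cdots & e_{2} \\ \vdots & \vdots & & \vdots \\ e_{N-1} & e_{N-2} & \cdots & e_{N-p} \end{bmatrix}.$$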
Then the p-order AR model can be expressed as:
$$Y = XA + \xi.$$
The estimation of model parameters can be obtained from the least-squares principle.
$$A = (X^{T}X)^{-1}X^{T}Y.$$
In the end, the AR models with the parameters determined above were used to correct the precipitation forecast errors.
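The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_ar`, `aic`, `select_order`, `correct_forecast`) are hypothetical, the normal equations are solved with a small hand-written Gaussian elimination so that only the standard library is needed, and the number of samples $n$ in the AIC is taken to be the number of rows of $X$ ($N - p$).

```python
import math

def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_ar(errors, p):
    """Least-squares AR(p) fit, A = (X^T X)^{-1} X^T Y.

    Returns (coefficients, residual variance)."""
    N = len(errors)
    # Each row of X holds (e_{t-1}, ..., e_{t-p}) for the target e_t.
    X = [[errors[t - k] for k in range(1, p + 1)] for t in range(p, N)]
    Y = [errors[t] for t in range(p, N)]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(p)]
    alpha = solve_linear(XtX, XtY)
    resid = [y - sum(a * x for a, x in zip(alpha, row)) for row, y in zip(X, Y)]
    variance = sum(r * r for r in resid) / len(resid)
    return alpha, variance

def aic(errors, p):
    """AIC = 2k + n ln(L), with k = p and L the residual variance."""
    _, variance = fit_ar(errors, p)
    n = len(errors) - p  # assumption: n is the number of fitted samples
    return 2 * p + n * math.log(variance)

def select_order(errors, max_p):
    """Return the order in 1..max_p with the smallest AIC."""
    return min(range(1, max_p + 1), key=lambda p: aic(errors, p))

def correct_forecast(w_fore, past_errors, alpha):
    """Subtract the AR-predicted error from the raw forecast.

    past_errors is ordered oldest to newest; the zero-mean noise term
    is dropped when predicting the next error."""
    predicted_error = sum(a * past_errors[-1 - k] for k, a in enumerate(alpha))
    return w_fore - predicted_error
```

In use, `select_order` would be run once per ensemble member on its historical error series, `fit_ar` would then give that member's coefficients, and `correct_forecast` would be applied to each new raw forecast.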

2.2. Establishment of the Ensemble Flow Forecast Model

The Bayesian probability forecast system (BFS) was first proposed by Roman Krzysztofowicz in 1999. As an important part of BFS, the hydrologic uncertainty processor (HUP) presents the hydrological prediction values in a probabilistic form through Bayesian analysis and calculation of the results of the deterministic hydrological prediction model.

2.2.1. The Hydrological Uncertainty Processor Methodology

Let $n$ ($n = 0, 1, 2, \ldots, N$) denote the forecast time. Suppose that $h_n$ represents the observed inflow to the Three Gorges Reservoir (TGR) and $s_n$ represents the reservoir inflow simulated by the hydrological model; $h_n$ and $s_n$ are realizations of the random variables $H_n$ and $S_n$, respectively. The prior probability density function (PDF) for each forecast time can be expressed as follows.
$$g_n(h_n \mid h_0) = \frac{\gamma(h_n)}{(1 - c^{2n})^{1/2}\, q\left(Q^{-1}(\Gamma(h_n))\right)} \times q\left(\frac{Q^{-1}(\Gamma(h_n)) - c^{n} Q^{-1}(\Gamma(h_0))}{(1 - c^{2n})^{1/2}}\right),$$
where $\gamma$ is the marginal PDF of the observed inflow, $q$ is the standard normal density function, $Q^{-1}$ is the standard normal quantile function, $\Gamma$ is the marginal cumulative distribution function (CDF) of $H_n$, and $c$ is the Pearson correlation coefficient obtained from the normal quantile transform.
The posterior PDF for each forecast time can be expressed as follows.
$$\varphi_n(h_n \mid s_n, h_0) = \frac{\gamma(h_n)}{T_n\, q\left(Q^{-1}(\Gamma(h_n))\right)} \times q\left(\frac{Q^{-1}(\Gamma(h_n)) - A_n Q^{-1}(\Lambda_n(s_n)) - D_n Q^{-1}(\Gamma(h_0)) - B_n}{T_n}\right),$$
where $\Lambda_n$ is the marginal distribution function of the reservoir inflow simulated by the hydrological model, and $A_n$, $B_n$, $D_n$, and $T_n$ are parameters calculated in the normal quantile transform space, with the following expressions.
$$A_n = \frac{a_n \tau_n^{2}}{a_n^{2} \tau_n^{2} + \sigma_n^{2}},$$
$$B_n = \frac{-a_n b_n \tau_n^{2}}{a_n^{2} \tau_n^{2} + \sigma_n^{2}},$$
$$D_n = \frac{c_n \sigma_n^{2} - a_n d_n \tau_n^{2}}{a_n^{2} \tau_n^{2} + \sigma_n^{2}},$$
$$T_n^{2} = \frac{\tau_n^{2} \sigma_n^{2}}{a_n^{2} \tau_n^{2} + \sigma_n^{2}}.$$
In the normal space, where $W_n = Q^{-1}(\Gamma(H_n))$ and $X_n = Q^{-1}(\Lambda_n(S_n))$ denote the normal quantile transforms of $H_n$ and $S_n$, the stochastic dependence between $W_n$ and $W_{n-1}$ is governed by a normal-linear equation.
$$W_n = c_n W_{n-1} + \Xi_n,$$
where $c_n$ is Pearson's correlation coefficient and $\Xi_n$ is a normally distributed variable, stochastically independent of $W_{n-1}$, with a mean of zero and a variance of $\tau_n^{2} = 1 - c_n^{2}$.
The stochastic dependence between $X_n$ and $(W_n, W_0)$ is governed by a normal-linear equation.
$$X_n = a_n W_n + d_n W_0 + b_n + \Theta_n,$$
where $a_n$, $b_n$, and $d_n$ are the regression coefficients, and $\Theta_n$ is a normally distributed variable, stochastically independent of $(W_n, W_0)$, with a mean of zero and a variance of $\sigma_n^{2}$.
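As an illustration of how the two normal-linear equations combine, the following sketch computes the posterior parameters for a one-step lead time ($W_{n-1} = W_0$) by the standard conjugate update for normal variables. The function names are hypothetical. Note one assumption in particular: the expression used for $D_n$ includes the prior term $c_n \sigma_n^{2}$, which is what the conjugate update of the two equations gives; it should be checked against the formulation actually adopted.

```python
import math

def posterior_parameters(a, b, d, c, sigma2):
    """Posterior parameters of W_n given X_n and W_0 in the normal space.

    Prior:      W_n = c * W_0 + Xi,                Var(Xi) = tau2 = 1 - c^2
    Likelihood: X_n = a * W_n + d * W_0 + b + Theta, Var(Theta) = sigma2
    Returns (A, B, D, T) such that W_n | x, w0 ~ N(A*x + D*w0 + B, T^2).
    """
    tau2 = 1.0 - c * c
    denom = a * a * tau2 + sigma2
    A = a * tau2 / denom
    B = -a * b * tau2 / denom
    D = (c * sigma2 - a * d * tau2) / denom  # assumption: includes the prior term
    T = math.sqrt(tau2 * sigma2 / denom)
    return A, B, D, T

def posterior_mean(x, w0, a, b, d, c, sigma2):
    """Posterior mean of W_n for a transformed simulation x and observation w0."""
    A, B, D, _ = posterior_parameters(a, b, d, c, sigma2)
    return A * x + D * w0 + B
```

A useful sanity check is that the posterior standard deviation `T` never exceeds the prior standard deviation $\tau_n$, since conditioning on the simulated flow can only reduce the variance in this linear-normal setting.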

2.2.2. Precipitation-Dependent Hydrological Uncertainty Processer Based on Gaussian Mixture Model

In the calculation process of the HUP, the selection of marginal distribution function directly affects the accuracy of calculation results and the complexity of parameter estimation. Feng et al. [28] proposed that the Gaussian mixture model has better performance in fitting the marginal probability distribution of the observed flow and simulated flow, which can improve the performance of the prediction method based on HUP.
The marginal density function and distribution function in the HUP are estimated as follows.
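$$\gamma(\cdot) \approx \mathrm{gmm}(\cdot \mid \theta_{\Gamma}),$$
$$\Gamma(\cdot) \approx \mathrm{GMM}(\cdot \mid \theta_{\Gamma}),$$
$$\Lambda_n(\cdot) \approx \mathrm{GMM}(\cdot \mid \theta_{\Lambda_n}),$$
where $\mathrm{gmm}(\cdot \mid \theta_{\Gamma})$ and $\mathrm{GMM}(\cdot \mid \theta_{\Gamma})$ approximate the PDF and CDF of $H_n$, $\mathrm{GMM}(\cdot \mid \theta_{\Lambda_n})$ approximates the CDF of $S_n$, and $\theta_{\Gamma}$ and $\theta_{\Lambda_n}$ are the distribution parameters to be estimated. More details about the GMM can be found in Feng et al. [28]. Then, conditional on the observed inflow $h_0$ at time $t_0$, the prior density of $H_n$ can be expressed as:
$$g_n(h_n \mid h_0) = \frac{\mathrm{gmm}(h_n \mid \theta_{\Gamma})}{(1 - c^{2n})^{1/2}\, q\left(Q^{-1}(\mathrm{GMM}(h_n \mid \theta_{\Gamma}))\right)} \times q\left(\frac{Q^{-1}(\mathrm{GMM}(h_n \mid \theta_{\Gamma})) - c^{n} Q^{-1}(\mathrm{GMM}(h_0 \mid \theta_{\Gamma}))}{(1 - c^{2n})^{1/2}}\right).$$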
Similarly, according to Equation (10), under the condition that the observed inflow at time t0 is h0 and the simulated flow is sn, the posterior probability density function of Hn is the following.
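$$\varphi_n(h_n \mid s_n, h_0) = \frac{\mathrm{gmm}(h_n \mid \theta_{\Gamma})}{T_n\, q\left(Q^{-1}(\mathrm{GMM}(h_n \mid \theta_{\Gamma}))\right)} \times q\left(\frac{Q^{-1}(\mathrm{GMM}(h_n \mid \theta_{\Gamma})) - A_n Q^{-1}(\mathrm{GMM}(s_n \mid \theta_{\Lambda_n})) - D_n Q^{-1}(\mathrm{GMM}(h_0 \mid \theta_{\Gamma})) - B_n}{T_n}\right).$$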
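The posterior density with GMM marginals can be evaluated numerically as in the sketch below. It is an illustration only: the function names, the parameter layout `(weights, means, sds)`, and the clipping constant `EPS` are assumptions introduced here, and the posterior parameters `A`, `B`, `D`, `T` are taken as given. Only Python's standard-library `statistics.NormalDist` is used for $q$, $Q$, and $Q^{-1}$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # provides q (pdf), Q (cdf) and Q^{-1} (inv_cdf)
EPS = 1e-12                # keeps CDF values strictly inside (0, 1)

def gmm_pdf(x, weights, means, sds):
    """Density of a Gaussian mixture at x."""
    return sum(w * NormalDist(m, s).pdf(x) for w, m, s in zip(weights, means, sds))

def gmm_cdf(x, weights, means, sds):
    """CDF of a Gaussian mixture at x, clipped so that inv_cdf is defined."""
    p = sum(w * NormalDist(m, s).cdf(x) for w, m, s in zip(weights, means, sds))
    return min(max(p, EPS), 1.0 - EPS)

def nqt(x, theta):
    """Normal quantile transform Q^{-1}(GMM(x | theta))."""
    return STD_NORMAL.inv_cdf(gmm_cdf(x, *theta))

def posterior_pdf(h, s, h0, theta_gamma, theta_lambda, A, B, D, T):
    """Posterior density of H_n at h, given simulated flow s and observed h0."""
    w = nqt(h, theta_gamma)
    mean = A * nqt(s, theta_lambda) + D * nqt(h0, theta_gamma) + B
    # gmm(h) / q(w) is the Jacobian dw/dh of the normal quantile transform.
    jacobian = gmm_pdf(h, *theta_gamma) / STD_NORMAL.pdf(w)
    return jacobian * STD_NORMAL.pdf((w - mean) / T) / T
```

Because the factor $\mathrm{gmm}(h_n)/q(\cdot)$ is exactly the derivative of the transform, the returned function integrates to one over $h_n$, which gives a quick numerical check of any implementation.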
In order to accurately describe the uncertainty of the rainfall input data, the parameters of the posterior probability need to be estimated separately according to the precipitation during the lead-time period. Krzysztofowicz [23] discussed the HUP model in the cases of rainfall occurrence and non-occurrence, and proposed the precipitation-dependent HUP (PD-HUP) model, which considers the uncertainty of rainfall. However, there are only a few periods during the flood season when no rainfall occurs, so it is difficult to accurately estimate the marginal distribution functions of observed and predicted flow in the case of no rainfall. Therefore, this paper establishes a PD-HUP model in which the precipitation in the lead-time period is divided into two types, effective rainfall and negligible rainfall, and then weights the two cases to form a PD-HUP based on the Gaussian mixture model (PD-HUP-GMM) that takes rainfall uncertainty into account.
Let the variable $V$ represent the rainfall condition and $W$ represent the precipitation in the lead-time period. $V = 1$ means effective rainfall ($w \ge 1$ mm) and $V = 0$ means negligible rainfall ($0 \le w < 1$ mm). Let $\gamma_{0v}(h_0) = p(h_0 \mid V = v)$ represent the marginal PDF of $H_0$ under the condition $V = v$, and let $\nu = P(V = 1)$ represent the probability of effective rainfall in the lead-time period. Then, the marginal PDF of $H_0$ is as follows.
$$\gamma_0(h_0) = \gamma_{00}(h_0)(1 - \nu) + \gamma_{01}(h_0)\,\nu.$$
Under the condition that the observed inflow is h0, the probability of effective and negligible rainfall is as follows.
$$P(V = 1 \mid H_0 = h_0) = \frac{\gamma_{01}(h_0)\,\nu}{\gamma_0(h_0)},$$
$$P(V = 0 \mid H_0 = h_0) = \frac{\gamma_{00}(h_0)(1 - \nu)}{\gamma_0(h_0)}.$$
Under the condition that the observed inflow is $h_0$, the PDF of $S_n$ is as follows.
$$k_n(s_n \mid h_0) = k_{n0}(s_n \mid h_0)\,\frac{\gamma_{00}(h_0)(1 - \nu)}{\gamma_0(h_0)} + k_{n1}(s_n \mid h_0)\,\frac{\gamma_{01}(h_0)\,\nu}{\gamma_0(h_0)}.$$
Under the condition that the observed inflow is h0 and simulated inflow is sn, the probability of effective and negligible rainfall is as follows.
$$P(V = 1 \mid S_n = s_n, H_0 = h_0) = \frac{k_{n1}(s_n \mid h_0)\,\gamma_{01}(h_0)\,\nu}{k_n(s_n \mid h_0)\,\gamma_0(h_0)},$$
$$P(V = 0 \mid S_n = s_n, H_0 = h_0) = \frac{k_{n0}(s_n \mid h_0)\,\gamma_{00}(h_0)(1 - \nu)}{k_n(s_n \mid h_0)\,\gamma_0(h_0)}.$$
Thus, finally, the posterior probability can be expressed as:
$$\varphi(h_n \mid s_n, h_0) = \varphi_{n0}(h_n \mid s_n, h_0)\,\frac{k_{n0}(s_n \mid h_0)\,\gamma_{00}(h_0)(1 - \nu)}{k_n(s_n \mid h_0)\,\gamma_0(h_0)} + \varphi_{n1}(h_n \mid s_n, h_0)\,\frac{k_{n1}(s_n \mid h_0)\,\gamma_{01}(h_0)\,\nu}{k_n(s_n \mid h_0)\,\gamma_0(h_0)}.$$
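The weighting of the two rainfall branches can be sketched as follows. The function and argument names are hypothetical; the inputs are assumed to be the already-evaluated density values for the negligible-rainfall branch (`k0`, `g00`, `phi0`) and the effective-rainfall branch (`k1`, `g01`, `phi1`), together with the prior probability of effective rainfall (`nu`). The normalising product of the simulated-flow density and the marginal density of the initial inflow is computed as the sum of the two branch terms, which follows from the mixture expression for the simulated-flow density.

```python
def rainfall_posterior_weights(k0, k1, g00, g01, nu):
    """Return (P(V=0 | s_n, h_0), P(V=1 | s_n, h_0)).

    k0, k1   : simulated-flow densities under V=0 and V=1
    g00, g01 : marginal densities of h_0 under V=0 and V=1
    nu       : prior probability of effective rainfall, P(V=1)
    """
    branch0 = k0 * g00 * (1.0 - nu)
    branch1 = k1 * g01 * nu
    total = branch0 + branch1  # equals k_n(s_n | h_0) * gamma_0(h_0)
    return branch0 / total, branch1 / total

def mixed_posterior(phi0, phi1, k0, k1, g00, g01, nu):
    """Posterior density of H_n as a weighted sum of the two branch posteriors."""
    p0, p1 = rainfall_posterior_weights(k0, k1, g00, g01, nu)
    return phi0 * p0 + phi1 * p1
```

When the two branches are equally likely to have produced the simulated and observed flows (`k0 == k1` and `g00 == g01`), the weights reduce to the prior probabilities `1 - nu` and `nu`.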

2.2.3. Generation of Ensemble Flow Forecasts

In order to convert the probability flow forecast result obtained by the PD-HUP-GMM method into the ensemble flow forecast result, the expected value of the probability flow forecast result under each rainfall condition is taken as the member of the flow forecast set to constitute the flow forecast ensemble.
First, the expected flow under each rainfall forecast is calculated as follows.
$$E(H_{n,t}) = \int_{-\infty}^{+\infty} h_{n,t}\, \varphi(h_{n,t} \mid s_{n,t}, h_0)\, \mathrm{d}h_{n,t},$$
where $H_{n,t}$ represents the simulated value of the $n$th flood ensemble forecast member at time $t$, and $E(H_{n,t})$ is the expected value of $H_{n,t}$. Then, the ensemble flow forecast result can be expressed as follows.
$$\begin{bmatrix} E(H_{1,1}) & E(H_{1,2}) & \cdots & E(H_{1,T}) \\ E(H_{2,1}) & E(H_{2,2}) & \cdots & E(H_{2,T}) \\ \vdots & \vdots & & \vdots \\ E(H_{N,1}) & E(H_{N,2}) & \cdots & E(H_{N,T}) \end{bmatrix},$$
where $T$ represents the number of periods of the simulated runoff sequence, and each row in the matrix represents a member of the flood ensemble forecast. The flow chart describing the Bayesian ensemble forecast method is shown in Figure 1.
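The conversion from posterior densities to ensemble members can be illustrated with a simple numerical integration. This sketch makes several assumptions that are not in the text: the infinite integration limits are replaced by a finite range `[lo, hi]` chosen wide enough to contain essentially all of the probability mass, the trapezoidal rule is used, and each posterior density is supplied as a Python callable. The names `expected_value` and `build_ensemble` are hypothetical.

```python
def expected_value(pdf, lo, hi, steps=4000):
    """Trapezoidal estimate of the integral of h * pdf(h) over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        h = lo + i * width
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * h * pdf(h)
    return total * width

def build_ensemble(posteriors, lo, hi):
    """Build the ensemble matrix of expected flows.

    posteriors[j][t] is the posterior density (a callable) obtained for
    ensemble member j at time step t; the result has the same shape, with
    one row per member and one column per time step.
    """
    return [[expected_value(pdf, lo, hi) for pdf in member] for member in posteriors]
```

Each row of the returned list corresponds to one rainfall forecast member, matching the row layout of the matrix above.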

2.3. Data

The Three Gorges Reservoir (TGR), located at Yichang on the Yangtze River in China, was selected as a case study. It is the world's largest hydroelectric project. Its geographical location is shown in Figure 2. The main functions of the TGR are flood control, power generation, and navigation. The basin area of the TGR is $1 \times 10^{6}$ km², the surface area of the reservoir is about 1080 km², and the average width is about 1100 m [29].
The historical daily inflow data of the TGR and the precipitation data during the flood seasons from 2006 to 2017 were used in this study. The EWF data come from the second edition of the Global Ensemble Forecast System (GEFS) developed by the National Centers for Environmental Prediction (NCEP), which is one of the most widely used numerical forecasting systems in the world. The data set contains 11 ensemble forecast members, and the spatial resolution in latitude and longitude is 1° × 1°. Data from January to September 2017 were used in this study. The Xin’an Jiang model was selected as the deterministic prediction model of the PD-HUP-GMM method, with the precipitation and evaporation in the study area used as the input. The forecast period was restricted to one day ahead.

3. Results and Discussion

3.1. Correction of Ensemble Weather Forecasts

First, the order p was determined according to the AIC. In this paper, p takes the values 1, 2, …, 10. The AIC value of each ensemble member was calculated for each p, and the results are shown in Figure 3. The order with the minimum AIC value (p = 8) was selected as the regression order of the AR model.
Then, the mean absolute error (MAE) and root-mean square error (RMSE) were used to evaluate the performance of the AR model. The MAE and RMSE are defined as follows.
$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|F_t - O_t\right|,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(F_t - O_t\right)^{2}},$$
where $F_t$ represents the forecast precipitation, and $O_t$ represents the observed precipitation. The MAE and RMSE values of each GEFS member before and after correction are shown in Table 1.
According to Table 1, after AR model correction, the MAE of the precipitation data of the GEFS members decreased by an average of 18.10 m³/s, while the RMSE decreased by an average of 16.05 m³/s. This shows that the AR model can effectively reduce the precipitation deviation of GEFS. However, some bias and uncertainty remain in the corrected GEFS precipitation data. Therefore, the PD-HUP model was adopted in this paper to further deal with the uncertainty of the hydrological forecast caused by rainfall forecast bias.
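For reference, the two error measures can be computed as in the sketch below, assuming the standard definitions (mean of absolute differences, and square root of the mean squared difference). The function names and the sample values in the usage note are illustrative only and are not taken from Table 1.

```python
import math

def mae(forecast, observed):
    """Mean absolute error between paired forecast and observed values."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)

def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed values."""
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(observed))
```

Applying both functions to each ensemble member before and after the AR correction, for example `mae(raw_member, observed)` against `mae(corrected_member, observed)`, gives the kind of comparison summarised in Table 1.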

3.2. Estimated Parameters of Ensemble Flow Forecast Model

3.2.1. Generation of Ensemble Flow Forecasts

The observed flow data in the flood season from 2006 to 2016 were used to fit the marginal distribution. In order to obtain the marginal distribution function of the simulated flow, the Xin’an Jiang model was used to calculate the simulated flow value in the same period as the observed flow at first, and the lead time n was set to one day. The number of days with effective rainfall is 1178, and the number of days with negligible rainfall is 164.
Following the ideas of Feng [28], the GMM with three Gaussian components was used to fit the observed inflow and simulated inflow data of the TGR. The Maximum Likelihood Estimate (MLE) was used to estimate the parameters of the marginal distributions under effective and negligible rainfall conditions, respectively. The estimated parameter values are given in Table 2 and Table 3. The fitting of the marginal CDF is shown in Figure 4. As can be seen from Figure 4, the GMM fits the marginal distributions of the observed and simulated flow well under both effective and negligible rainfall conditions.

3.2.2. Parameters of PD-HUP

The prior density, likelihood function, and posterior density are calculated according to Equations (15)–(18). The parameters of the prior density and likelihood function in the transformed space are shown in Table 4, and the parameters of the posterior distribution in the transformed space are shown in Table 5.

3.3. Ensemble Forecast Analysis

In order to verify the effectiveness of the method proposed in this paper, precipitation and inflow data of TGR in the flood season of 2017 were taken as model input data, and the above method was used to calculate the ensemble inflow forecast results of TGR. Three scenarios as shown in Table 6 were set up for comparative analysis. The Nash-Sutcliffe efficiency (NSE) and RMSE were selected to evaluate the accuracy of forecast results under different scenarios and the variance was used to evaluate the uncertainty of the Bayesian ensemble forecast result, as discussed in the following subsections.

3.3.1. Comparison of Forecast Accuracy

For the deterministic forecast, the NSE and RMSE were calculated from the observed inflow and the simulated result of the Xin’an Jiang model. For the ensemble forecast, the mean of the ensemble members and the observed inflow were used to calculate the NSE and RMSE. The forecast results of the different scenarios are shown in Figure 5. The NSE is calculated as follows, and the calculated indicators are shown in Table 7.
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(Q_{sim,i} - Q_{obs,i}\right)^{2}}{\sum_{i=1}^{N}\left(Q_{obs,i} - \bar{Q}_{obs}\right)^{2}},$$
where $Q_{sim,i}$ and $Q_{obs,i}$ are the simulated and observed inflows at time step $i$, $\bar{Q}_{obs}$ is the mean observed inflow, and $N$ is the number of time steps.
The simulation results show that, when the observed historical rainfall data are used as the input of the hydrological model, the NSE reaches 0.92 (Table 7), indicating that the Xin’an Jiang model can be used as a deterministic model for hydrological prediction in the study area. When GEFS data are used directly as input to the Xin’an Jiang model, the simulated flows are significantly larger than the observed runoff (NSE = −7.14). When the GEFS data are corrected with the AR model and the corrected data are used as the input of the Xin’an Jiang model, the simulation accuracy improves significantly. However, comparing the ensemble mean with the observed inflow process, the NSE improves to only 0.56, which does not meet the demand for prediction accuracy in practice. This is mainly because the uncertainty of the rainfall input data affects the accuracy of the hydrological model simulation. With the PD-HUP-GMM method proposed in this paper, the simulation results are further improved: the NSE of the ensemble mean reaches 0.91 and the RMSE decreases by 52.6% compared with the scenario that directly adopts the corrected GEFS data. The proposed method therefore greatly improves the accuracy of the inflow forecast of the TGR and is sufficient to meet the demand for hydrological forecast accuracy in practice.
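The behaviour of the NSE discussed above, including negative values such as the one obtained with uncorrected GEFS input, can be reproduced with a small sketch. It assumes the standard definition of the Nash-Sutcliffe efficiency, in which the denominator is the spread of the observations around their mean; the function name is hypothetical.

```python
def nse(simulated, observed):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    mean_obs = sum(observed) / len(observed)
    error = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / spread
```

A perfect simulation gives 1, a simulation no better than the observed mean gives 0, and a simulation whose squared error exceeds the variability of the observations gives a negative value.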

3.3.2. Uncertainty Evaluation of Ensemble Prediction Results

In order to analyze the uncertainty of hydrological model simulation results under different scenarios, the variances of ensemble members in different time periods and different scenarios are calculated. The formula for variance used is shown below.
$$\mathrm{Var}_t = \frac{\sum_{i=1}^{n}\left(Q_{i,t} - \bar{Q}_t\right)^{2}}{n - 1},$$
where $n$ is the number of ensemble members, $Q_{i,t}$ represents the prediction result of the $i$th ensemble member at time $t$, and $\bar{Q}_t$ is the mean value of the prediction results of all ensemble members at time $t$. The variance of ensemble members in each scenario changes over time as shown in Figure 6.
As can be seen from Figure 6, when different rainfall input data are used, the uncertainty of the hydrological model simulation results is significantly different. When GEFS rainfall data are used to directly drive the hydrological model, the variance of the ensemble members of the simulation results can reach $2.49 \times 10^{7}$, indicating that the simulation results are highly uncertain. When the adjusted GEFS rainfall data are used to drive the hydrological model, the variance of the ensemble members of the inflow prediction decreases significantly in the different periods, and so does the uncertainty of the prediction results. When the Bayesian ensemble forecast method proposed in this paper is adopted, the uncertainty of the hydrological prediction results is further reduced.
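The per-time-step ensemble variance used in this comparison can be computed as in the following sketch. It assumes the ensemble is stored as a list of member series of equal length, and it relies on the standard-library `statistics.variance`, which uses the same $n - 1$ denominator as the formula above; the function name is hypothetical.

```python
from statistics import variance

def ensemble_variance(members):
    """Sample variance across ensemble members at each time step.

    members[i][t] is the flow predicted by member i at time step t.
    Returns one variance per time step."""
    return [variance(values_at_t) for values_at_t in zip(*members)]
```

Plotting the returned list against time for each scenario would give curves of the kind shown in Figure 6.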

4. Conclusions

In this paper, the precipitation data of the Global Ensemble Forecast System (GEFS) was postprocessed by an Autoregressive (AR) model, and a precipitation-dependent hydrological uncertainty processor based on GMM (PD-HUP-GMM) was proposed to generate the ensemble flow forecast. Then, the Nash-Sutcliffe efficiency (NSE), root-mean square error (RMSE), and variances of ensemble members were used to evaluate the accuracy and uncertainty of the ensemble flow forecast. The TGR was selected as a case study. The main conclusions of this study are summarized as follows.
(1) The results of the AR model show that this model can effectively correct GEFS precipitation data, and it is simple and feasible. The use of AR models to correct the GEFS rainfall data also helps to improve the simulation accuracy of the deterministic hydrological model.
(2) The proposed method was compared with three different forecast methods. Results of the case study show that the PD-HUP-GMM method combined with the corrected GEFS data can significantly improve the NSE and RMSE of flow forecasting. From this point of view, the proposed method performs well.
(3) The proposed method can effectively deal with the uncertainty of precipitation input data and the hydrological model and, thus, improve the accuracy of the ensemble flow forecast average value to a useful level.
Future work will involve extending the forecast period to a longer horizon to further evaluate the potential of HUP in operational flow forecasting and improve its practical significance.

Author Contributions

Conceptualization, X.Y. and J.Z. Methodology, X.Y. Data curation, J.Z. Writing—original draft preparation, X.Y. Writing—review and editing, X.Y, W.F., Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation Key Project of China (No. 52039004) and National Natural Science Foundation of China (No. U1865202).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Chen, L.; Singh, V.P.; Guo, S.; Zhou, J.; Ye, L. Copula entropy coupled with artificial neural network for rainfall–runoff simulation. Stoch. Environ. Res. Risk Assess. 2014, 28, 1755–1767. [Google Scholar] [CrossRef]
  2. Yaseen, Z.M.; Naganna, S.R.; Sa’Adi, Z.; Samui, P.; Ghorbani, M.A.; Salih, S.Q.; Shahid, S. Hourly River Flow Forecasting: Application of Emotional Neural Network Versus Multiple Machine Learning Paradigms. Water Resour. Manag. 2020, 34, 1075–1091. [Google Scholar] [CrossRef]
  3. Ramaswamy, V.; Saleh, F. Ensemble Based Forecasting and Optimization Framework to Optimize Releases from Water Supply Reservoirs for Flood Control. Water Resour. Manag. 2020, 34, 989–1004. [Google Scholar] [CrossRef]
  4. Bashir, A.; Shehzad, M.A.; Hussain, I.; Rehmani, M.I.A.; Bhatti, S.H. Reservoir Inflow Prediction by Ensembling Wavelet and Bootstrap Techniques to Multiple Linear Regression Model. Water Resour. Manag. 2019, 33, 5121–5136. [Google Scholar] [CrossRef]
  5. Fu, J.-C.; Huang, H.-Y.; Jang, J.-H.; Huang, P.-H. River Stage Forecasting Using Multiple Additive Regression Trees. Water Resour. Manag. 2019, 33, 4491–4507. [Google Scholar] [CrossRef]
  6. Bai, Y.; Bezak, N.; Sapač, K.; Klun, M.; Zhang, J. Short-Term Streamflow Forecasting Using the Feature-Enhanced Regression Model. Water Resour. Manag. 2019, 33, 4783–4797. [Google Scholar] [CrossRef]
  7. Zhou, Q.; Chen, L.; Singh, V.P.; Zhou, J.; Chen, X.; Xiong, L. Rainfall-runoff simulation in karst dominated areas based on a coupled conceptual hydrological model. J. Hydrol. 2019, 573, 524–533. [Google Scholar] [CrossRef]
  8. Li, W.; Zhou, J.; Sun, H.; Feng, K.; Zhang, H.; Tayyab, M. Impact of Distribution Type in Bayes Probability Flood Forecasting. Water Resour. Manag. 2017, 31, 961–977. [Google Scholar] [CrossRef]
  9. Cloke, H.L.; Pappenberger, F. Ensemble flood forecasting: A review. J. Hydrol. 2009, 375, 613–626. [Google Scholar] [CrossRef]
  10. Wu, W.; Emerton, R.; Duan, Q.; Wood, A.W.; Wetterhall, F.; Robertson, D.E. Ensemble flood forecasting: Current status and future opportunities. Wiley Interdiscip. Rev. Water 2020, 7, e1432. [Google Scholar] [CrossRef]
  11. Li, X.-Q.; Chen, J.; Xu, C.-Y.; Li, L.; Chen, H. Performance of Post-Processed Methods in Hydrological Predictions Evaluated by Deterministic and Probabilistic Criteria. Water Resour. Manag. 2019, 33, 3289–3302. [Google Scholar] [CrossRef]
  12. Zhang, J.; Chen, J.; Li, X.; Chen, H.; Xie, P.; Li, W. Combining Postprocessed Ensemble Weather Forecasts and Multiple Hydrological Models for Ensemble Streamflow Predictions. J. Hydrol. Eng. 2020, 25, 1–17. [Google Scholar] [CrossRef]
  13. Boucher, M.-A.; Anctil, F.; Perreault, L.; Tremblay, D. A comparison between ensemble and deterministic hydrological forecasts in an operational context. Adv. Geosci. 2011, 29, 85–94. [Google Scholar] [CrossRef]
  14. Han, S.; Coulibaly, P. Probabilistic Flood Forecasting Using Hydrologic Uncertainty Processor with Ensemble Weather Forecasts. J. Hydrometeorol. 2019, 20, 1379–1398. [Google Scholar] [CrossRef]
  15. Hamill, T.M.; Whitaker, J.S.; Wei, X. Ensemble Reforecasting: Improving Medium-Range Forecast Skill Using Retrospective Forecasts. Mon. Weather. Rev. 2004, 132, 1434–1447. [Google Scholar] [CrossRef]
  16. Sloughter, J.M.L.; Raftery, A.E.; Gneiting, T.; Fraley, C. Probabilistic Quantitative Precipitation Forecasting Using Bayesian Model Averaging. Mon. Weather. Rev. 2007, 135, 3209–3220. [Google Scholar] [CrossRef]
  17. Wilks, D.S. Extending logistic regression to provide full-probability-distribution MOS forecasts. Meteorol. Appl. 2009, 16, 361–368. [Google Scholar] [CrossRef]
  18. Mylne, K.R. Decision making from probability forecasts using calculations of forecast value. Meteorol. Appl. 2001, 9, 307–315. [Google Scholar] [CrossRef]
  19. Brown, J.D.; Seo, D.-J. A Nonparametric Postprocessor for Bias Correction of Hydrometeorological and Hydrologic Ensemble Forecasts. J. Hydrometeorol. 2010, 11, 642–665. [Google Scholar] [CrossRef]
  20. Choubin, B.; Moradi, E.; Golshan, M.; Adamowski, J.; Sajedi-Hosseini, F.; Mosavi, A. An ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines. Sci. Total. Environ. 2019, 651, 2087–2096. [Google Scholar] [CrossRef]
  21. Krzysztofowicz, R. Bayesian theory of probabilistic forecasting via deterministic hydrologic model. Water Resour. Res. 1999, 35, 2739–2750. [Google Scholar] [CrossRef]
  22. Krzysztofowicz, R.; Kelly, K.S. Hydrologic uncertainty processor for probabilistic river stage forecasting. Water Resour. Res. 2000, 36, 3265–3277. [Google Scholar] [CrossRef]
  23. Krzysztofowicz, R.; Herr, H.D. Hydrologic uncertainty processor for probabilistic river stage forecasting: Precipitation-dependent model. J. Hydrol. 2001, 249, 46–68. [Google Scholar] [CrossRef]
  24. Marshall, L.; Nott, D.; Sharma, A. A comparative study of Markov chain Monte Carlo methods for conceptual rainfall-runoff modeling. Water Resour. Res. 2004, 40, 183–188. [Google Scholar] [CrossRef]
  25. Marshall, L.; Nott, D.; Sharma, A. Hydrological model selection: A Bayesian alternative. Water Resour. Res. 2005, 41. [Google Scholar] [CrossRef]
  26. Herr, H.D.; Krzysztofowicz, R. Ensemble Bayesian forecasting system Part I: Theory and algorithms. J. Hydrol. 2015, 524, 789–802. [Google Scholar] [CrossRef]
  27. Herr, H.D.; Krzysztofowicz, R. Ensemble Bayesian forecasting system Part II: Experiments and properties. J. Hydrol. 2019, 575, 1328–1344. [Google Scholar] [CrossRef]
  28. Feng, K.; Zhou, J.; Liu, Y.; Lu, C.; He, Z. Hydrological Uncertainty Processor (HUP) with Estimation of the Marginal Distribution by a Gaussian Mixture Model. Water Resour. Manag. 2019, 33, 2975–2990. [Google Scholar] [CrossRef]
  29. Zhou, Y.; Guo, S. Risk analysis for flood control operation of seasonal flood-limited water level incorporating inflow forecasting error. Hydrol. Sci. J. 2014, 59, 1006–1019. [Google Scholar] [CrossRef]
Figure 1. Flow of methods. The PD-HUP-GMM represents the precipitation-dependent hydrological uncertainty processor (PD-HUP) based on a Gaussian mixture model (GMM).
Figure 2. Schematic of the Three Gorges Reservoir (TGR) in China.
Figure 3. The AIC value of each ensemble member.
Figure 4. Fitting of the marginal CDF of observed value and simulated flood value. (a) Marginal CDF of observed value. (b) Marginal CDF of simulated value. GMM (v = 1) and GMM (v = 0), respectively, represent the fitting of the distribution function in the case of effective rainfall and negligible rainfall.
Figure 5. Simulated result of the four scenarios: (a) Observed data + Xin’an Jiang model, (b) GEFS data + Xin’an Jiang model, (c) Corrected GEFS data + Xin’an Jiang model, and (d) Bayesian ensemble forecast.
Figure 6. The variance of ensemble flow forecast members at different moments of the 2017 flood season under different scenarios.
Table 1. Comparison of MAE and RMSE before and after correction.
| GEFS Member | MAE Before Correction | MAE After Correction | RMSE Before Correction | RMSE After Correction |
| --- | --- | --- | --- | --- |
| 1 | 30.46 | 11.25 | 32.42 | 15.98 |
| 2 | 29.95 | 11.49 | 31.89 | 16.36 |
| 3 | 29.45 | 12.09 | 31.54 | 16.30 |
| 4 | 28.87 | 11.49 | 31.03 | 15.82 |
| 5 | 28.13 | 11.75 | 30.29 | 15.14 |
| 6 | 28.54 | 10.44 | 30.86 | 14.17 |
| 7 | 29.47 | 10.73 | 31.83 | 14.94 |
| 8 | 29.74 | 11.61 | 31.72 | 15.36 |
| 9 | 28.92 | 10.55 | 30.71 | 14.18 |
| 10 | 28.76 | 10.83 | 30.97 | 14.91 |
| 11 | 29.98 | 10.91 | 32.19 | 15.77 |
Table 2. Estimated parameters of GMM distributions under effective rainfall.
| Dataset | Component | Weight | Mean | Variance |
| --- | --- | --- | --- | --- |
| Observed value | 1 | 0.49 | 1.59 × 10⁴ | 1.16 × 10⁷ |
| Observed value | 2 | 0.17 | 3.61 × 10⁴ | 9.36 × 10⁷ |
| Observed value | 3 | 0.34 | 2.49 × 10⁴ | 2.20 × 10⁷ |
| Simulated value | 1 | 0.51 | 1.75 × 10⁴ | 1.26 × 10⁷ |
| Simulated value | 2 | 0.12 | 3.96 × 10⁴ | 9.19 × 10⁷ |
| Simulated value | 3 | 0.36 | 2.77 × 10⁴ | 2.18 × 10⁷ |
Table 3. Estimated parameters of GMM distributions under negligible rainfall.
| Dataset | Component | Weight | Mean | Variance |
| --- | --- | --- | --- | --- |
| Observed value | 1 | 0.46 | 1.63 × 10⁴ | 1.12 × 10⁷ |
| Observed value | 2 | 0.16 | 2.59 × 10⁴ | 6.03 × 10⁷ |
| Observed value | 3 | 0.38 | 1.06 × 10⁴ | 3.29 × 10⁷ |
| Simulated value | 1 | 0.57 | 1.25 × 10⁴ | 4.55 × 10⁷ |
| Simulated value | 2 | 0.12 | 2.82 × 10⁴ | 4.79 × 10⁷ |
| Simulated value | 3 | 0.31 | 1.92 × 10⁴ | 1.20 × 10⁷ |
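Tables 2 and 3 list the weight, mean, and variance of each Gaussian component. As an illustration of how such parameters define a marginal CDF, the following minimal sketch evaluates the mixture CDF for the observed series under effective rainfall (Table 2) and maps a flow value to a standard normal variate. The function names are hypothetical, and the use of a normal quantile transform to reach the transformed space is an assumption based on the usual HUP formulation rather than a reproduction of the authors' code. The weights are renormalised because the rounded table entries do not always sum exactly to one.

```python
from statistics import NormalDist

# (weight, mean, variance) of each component, copied from Table 2
# (observed value, effective rainfall).
OBSERVED_V1 = [
    (0.49, 1.59e4, 1.16e7),
    (0.17, 3.61e4, 9.36e7),
    (0.34, 2.49e4, 2.20e7),
]


def gmm_cdf(x, components):
    """Mixture CDF: weighted sum of the component normal CDFs."""
    total_weight = sum(w for w, _, _ in components)
    weighted = sum(
        w * NormalDist(mu, var ** 0.5).cdf(x) for w, mu, var in components
    )
    return weighted / total_weight


def normal_quantile_transform(x, components):
    """Map a flow value to a standard normal variate (assumed NQT step)."""
    return NormalDist().inv_cdf(gmm_cdf(x, components))


if __name__ == "__main__":
    for flow in (1.0e4, 2.0e4, 4.0e4):
        p = gmm_cdf(flow, OBSERVED_V1)
        w = normal_quantile_transform(flow, OBSERVED_V1)
        print(f"flow={flow:.0f}  F={p:.3f}  w={w:.3f}")
```

Because each component CDF is increasing, both the mixture CDF and the transformed variate increase with flow, which is what makes the transform invertible when mapping posterior quantiles back to the original space.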
Table 4. Parameters of prior density and likelihood function in the transformed space.
| V | c | a_n | d_n | b_n | σ_n | t_n |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.931 | 1.058 | −0.150 | −0.080 | 0.420 | 0.365 |
| 0 | 0.974 | 0.868 | 0.715 | 0.043 | 0.384 | 0.225 |
Table 5. Parameters of the posterior distribution in the transformed space.

| V | A_n | B_n | D_n | T_n |
| --- | --- | --- | --- | --- |
| 1 | 0.434 | 0.065 | 0.538 | 0.269 |
| 0 | 0.236 | −0.169 | 0.765 | 0.200 |
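The posterior parameters in Table 5 can be related to the prior and likelihood parameters in Table 4. The sketch below is a consistency check only. It assumes the standard normal-linear HUP relations A = a t² / (a² t² + σ²) and T² = t² σ² / (a² t² + σ²), which are not reproduced in this section, and the function name is hypothetical. It is limited to A_n and T_n, which depend only on a_n, σ_n, and t_n.

```python
import math


def hup_posterior_scale(a, sigma, t):
    """Return (A, T) under the assumed normal-linear HUP relations.

    a     : slope of the likelihood regression (a_n in Table 4)
    sigma : residual standard deviation of the likelihood (sigma_n)
    t     : standard deviation of the prior innovation (t_n)
    """
    denom = a * a * t * t + sigma * sigma
    A = a * t * t / denom
    T = math.sqrt(t * t * sigma * sigma / denom)
    return A, T


# a_n, sigma_n, t_n copied from Table 4 for V = 1 and V = 0.
TABLE4 = {
    1: {"a": 1.058, "sigma": 0.420, "t": 0.365},
    0: {"a": 0.868, "sigma": 0.384, "t": 0.225},
}

if __name__ == "__main__":
    for v, params in TABLE4.items():
        A, T = hup_posterior_scale(**params)
        print(f"V={v}: A_n={A:.3f}, T_n={T:.3f}")
```

Under these assumptions the computed values agree with the A_n and T_n columns of Table 5 to within about 0.002, a gap that is consistent with the three-decimal rounding of the inputs.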
Table 6. Brief descriptions of the four scenarios.
| Scenario | Description |
| --- | --- |
| Observed data + Xin’an Jiang model | Use observed data only to run the Xin’an Jiang model and obtain a deterministic forecast |
| GEFS data + Xin’an Jiang model | Use GEFS data to run the Xin’an Jiang model and obtain an ensemble forecast |
| Corrected GEFS data + Xin’an Jiang model | Use corrected GEFS data to run the Xin’an Jiang model and obtain an ensemble forecast |
| Bayesian ensemble forecast | Use HUP to postprocess the result of corrected GEFS data with the Xin’an Jiang model and obtain the ensemble result |
Table 7. NSE and RMSE of the four scenarios.
| Scenario | NSE | RMSE |
| --- | --- | --- |
| Observed data + Xin’an Jiang model | 0.92 | 1364.02 |
| GEFS data + Xin’an Jiang model | −7.14 | 13613.12 |
| Corrected GEFS data + Xin’an Jiang model | 0.56 | 3167.20 |
| Bayesian ensemble forecast | 0.91 | 1501.23 |
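For reference, the two scores in Table 7 can be computed as in the following minimal sketch, which uses the usual definitions of the Nash–Sutcliffe efficiency (NSE) and the root mean square error (RMSE). The function names and the short toy series are illustrative assumptions and are not data from the study.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    squared_errors = [(o - s) ** 2 for o, s in zip(obs, sim)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of obs about its mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


if __name__ == "__main__":
    observed = [10.0, 20.0, 30.0]   # toy values, not from the paper
    simulated = [12.0, 18.0, 33.0]  # toy values, not from the paper
    print(f"RMSE = {rmse(observed, simulated):.3f}")
    print(f"NSE  = {nse(observed, simulated):.3f}")
```

An NSE of 1 indicates a perfect fit, 0 means the simulation is no better than the mean of the observations, and a negative value, such as the one reported for the uncorrected GEFS scenario, means the simulation performs worse than that mean.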
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.