On the Operational Flood Forecasting Practices Using Low-Quality Data Input of a Distributed Hydrological Model

Abstract: Low-quality input data (such as sparse rainfall gauges and low-spatial-resolution soil type and land use maps) have limited the application of physically based distributed hydrological models in operational practice in many data-sparse regions. It is therefore necessary to quantify the uncertainty in the deterministic forecast results of distributed models. In this paper, the TOPographic Kinematic Approximation and Integration (TOPKAPI) distributed model was used for deterministic forecasts with low-quality input data, and the Hydrologic Uncertainty Processor (HUP) was then used to provide probabilistic forecast results for operational practice. Results showed that the deterministic forecasts by TOPKAPI performed poorly in some flood seasons, such as 1997, 2001 and 2008, although the overall accuracy for the whole study period (1996–2008) was acceptable and the model generally reproduced the hydrological behavior of the catchment (Lushi basin, China). The HUP model provides not only probabilistic forecasts (e.g., 90% predictive uncertainty bounds) but also deterministic forecasts in terms of the 50% percentiles. The 50% percentiles clearly improved the forecast accuracy of selected flood events at a lead time of one hour. However, the HUP performance decayed as the lead time increased (6 and 12 h). This work revealed that deterministic model outputs had large uncertainties in flood forecasts, and the HUP model may provide an alternative for operational flood forecasting in areas with low-quality data.


Introduction
Floods are the most frequent type of natural disaster and pose a huge threat to society [1][2][3]. Historically, people have been trying to understand floods, fight against them, and manage them [2][3][4][5].
In the continuing search for flood disaster prevention measures, people realized that hydrological forecasting can play an effective role in flood prevention and disaster reduction [6,7]. Hydrological forecasting is an important non-engineering measure for flood prevention and mitigation. It plays an important role in flood prevention and rescue, in reducing flood losses and in protecting water resources. It builds on the current understanding of the basic laws of the hydrological cycle and predicts future hydrological behavior using applied science and technology. With improvements in computing, mathematical hydrological forecasting techniques have made breakthroughs in theoretical research and have been applied in operational practice in many basins of the world [8][9][10][11][12][13]. Overall, in current operational flood forecasts, forecasting accuracy has improved and the lead time has been extended. The information provided by forecasting results has also been enriched. These advances have largely supported flood prevention decisions [14][15][16].
Hydrological models are an important tool for hydrological forecasting. The development of hydrological models has passed through several major stages: empirical models, black-box models, conceptual hydrological models and distributed hydrological models. Models that initially embodied only system concepts have gradually developed into models with physical foundations [14,17]. Theoretically, a distributed hydrological model can provide more reliable results than earlier model types when its parameters are physically determined from thematic maps, field measurements, literature and catchment characteristics or hydrological behavior analysis [6,14]. However, data in many regions of the world do not fully meet the requirements of distributed hydrological models, even though these models have been simplified to some extent. The lack of model input is particularly serious in some developing countries [18][19][20]. For example, high-resolution thematic maps of the digital elevation model (DEM), soil type, land cover and geology are difficult to obtain in many regions of China, and generally only coarse-resolution (low spatial resolution) thematic maps downloaded from the Internet can be used for the application of distributed hydrological models [21,22]. Besides, even when distributed hydrological models are applied, only river discharge observations are available for model testing [23]. Such cases do not reflect the value and advantages of applying distributed hydrological models. Therefore, in many cases in China, the accuracy of distributed hydrological models is not significantly better than that of lumped models [24,25].
The hydrological process is extremely complicated, and human understanding of hydrological laws is limited. The development of hydrological models inevitably involves uncertainties in model inputs, structures and parameters. These uncertainties inevitably lead to uncertain flood forecast results [18][19][20][26,27]. Deterministic forecasting is often applied in current operational flood forecasting. With deterministic forecasts alone, it is impossible to quantify the uncertainty of the forecast results, and therefore it is not possible to make an objective assessment of the possible risks of flood control decisions. Probabilistic flood forecasting, which quantitatively describes the uncertainty of the forecasting process, has therefore gradually become a trend in operational forecasts [13,18,19].
In this study, we used a physically based rainfall-runoff model, the TOPographic Kinematic Approximation and Integration (TOPKAPI) model, with coarse (low spatial resolution) thematic maps and discharge data as inputs for deterministic flood forecasting in a mountainous basin of the middle Yellow River. Furthermore, the TOPKAPI model was coupled with a general uncertainty analysis framework, the Hydrologic Uncertainty Processor (HUP), to conduct probabilistic flood forecasting that accounts for the uncertainties in the deterministic forecasting results. This study could provide a reference for deterministic and probabilistic operational flood forecasting practices in similar basins. More importantly, although the models used (i.e., TOPKAPI and HUP) have been frequently applied, this study contributes to research on operational flood forecasting with low-quality data in data-sparse regions.

Study Basin
The drainage area above the Lushi hydrological station (Lushi basin) has a spatial extension of 4482 km² and could be viewed as the source area of the Yiluohe River (Figure 1). The Yiluohe River is a first tributary in the middle reaches of the Yellow River. The topography of the Lushi basin is complex, with landforms dominated by mountains and hills. Precipitation in the basin is characterized by uneven spatial and temporal distribution. On the spatial scale, mountainous areas are rainy zones, while river valleys and hills are less rainy zones. On the time scale, rainfall events are mainly concentrated in the flood season (June-October), accounting for more than 60% of the rainfall in the whole year, and occur in the form of heavy rain, which is characterized by high flood peaks, large floods and sharp rises and falls.
There are 13 rainfall stations and one air temperature station distributed within the basin. In this study, several rainfall stations outside the basin were also used in precipitation interpolation for providing reliable spatial estimates of precipitation. The Lushi hydrological station located at the basin outlet controls the streamflow process from the whole drainage basin.

Thematic Maps
A DEM with a spatial resolution of 78 × 78 m and a land use map of the study area were provided by the Yellow River Conservancy Commission of the Ministry of Water Resources, P. R. China (YRCC-MWR). The source area of the Yiluohe River is located in a mountainous area with steep slopes, and its elevations range, approximately, from 551 m to 2636 m. The catchment is characterized by a central valley where the main river flow is receiving water from side tributaries flowing on the mountainside.
According to the land use map, the study basin is characterized by nine different types of land use (Figure 2a). The dominating land use of the Lushi basin is deciduous broadleaf forest with a percentage of 32%, followed by shrubland, irrigated cropland and pasture and cropland/grassland mosaic, which account for 18%, 17% and 15%, respectively. The remaining five types of land use have a smaller distribution area in the catchment, and the area percentages are all less than 10%.
The soil type map at a 1:5,000,000 scale was obtained from the Food and Agriculture Organization (FAO) of the United Nations. Only three soil types in the basin could be obtained from the data set (Figure 2b). The number of soil types is small, and the main type is soil I-Bc-2c with a percentage of 78%. Soil I-Be-2c mainly characterizes the mountains upstream at high elevations.
For the purpose of operational flood forecasting practice, the hydrological model was run at a spatial resolution of 1 × 1 km in the study basin. Thus, the thematic maps of DEM, land use and soil type were resampled to a 1 × 1 km resolution accordingly.

Hydrometeorological Data
The hydrometeorological database contains rainfall, discharge at Lushi hydrological station and air temperature data for the period of 1996 to 2008. The time steps of the original rainfall data are unequally spaced: in the flood season they mainly range from one hour to several hours, while in the non-flood season they are mostly 1 day. The flood forecasts were conducted using hourly time steps in this study. Thus, observed rainfall measurements at other time steps were resampled to 1 h time steps using equal values. Point rainfall data have good quality control and were interpolated to each grid cell using the Inverse Squared Distance (ISD) technique. Many popular rainfall interpolation methods do not explicitly consider the influence of elevation, so their interpolation performance in mountainous areas is not as good as in plain areas. Previous studies [28][29][30][31][32] have shown that some spatial interpolation techniques (such as ISD, Kriging and others) may be superior under certain conditions (e.g., depending on data conditions and climatic conditions). The ISD method was chosen because our comparison confirmed that its interpolation accuracy was better than that of the Block Kriging (BK) method in the period when data from some rainfall stations were missing, and comparable to that of the BK method in the rainfall data-rich period.
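The ISD weighting described above can be illustrated with a minimal sketch. The function name `isd_interpolate`, the tuple layout of the gauge records and the sample coordinates are hypothetical and are not taken from the study; the only element drawn from the text is the inverse squared distance weight, and skipping gauges with missing data is an assumption about how gaps might be handled.

```python
import math

def isd_interpolate(stations, target, power=2.0):
    """Inverse-distance-weighted rainfall at `target`.

    stations: list of (x_km, y_km, rain_mm); rain_mm is None when missing.
    target:   (x_km, y_km) centre of the grid cell.
    power:    distance exponent; 2 gives inverse squared distance weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, rain in stations:
        if rain is None:
            continue  # assumption: gauges without data are simply skipped
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return rain  # cell centre coincides with a gauge
        weight = 1.0 / dist ** power
        weighted_sum += weight * rain
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no gauge with valid data")
    return weighted_sum / weight_total

# Hypothetical gauges: two with data, one with a missing record.
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 4.0, None)]
print(isd_interpolate(gauges, (1.0, 0.0)))  # equidistant gauges -> 15.0
```

Because the weights are renormalized over the gauges that actually report, the estimate remains defined when some stations are missing, which is the situation in which the text reports ISD performing better than BK.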
Air temperature at Lushi station was collected and interpolated with a lapse rate of -0.6 ℃/100 m estimated using the 0.5 Degree Gridded Monthly China Surface Precipitation and Air Temperature Dataset (Version 2) [33]. This data product is developed by the Climate Data Center, China Meteorological Administration based on the data collected at 2472 climate stations and the Thin Plate Spline interpolation technique. The data quality is strictly controlled and can be used directly. The historical air temperature data were recorded at 6-h time steps (02:00, 08:00, 14:00, 20:00) and they include daily minimum and maximum values. The original temperature data were interpolated to hourly time steps for operational flood forecasting purposes.
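As an illustration of the two steps above, the sketch below applies the stated lapse rate of -0.6 °C per 100 m to move a station temperature to a grid-cell elevation and then fills hourly values between 6-h records. The function names and the elevations are hypothetical, and linear interpolation in time is an assumption, since the text does not state which temporal interpolation was used.

```python
LAPSE_RATE_C_PER_M = -0.6 / 100.0  # -0.6 °C per 100 m, as stated in the text

def temperature_at_cell(station_temp_c, station_elev_m, cell_elev_m,
                        lapse_rate=LAPSE_RATE_C_PER_M):
    """Shift a station temperature to the elevation of a grid cell."""
    return station_temp_c + lapse_rate * (cell_elev_m - station_elev_m)

def to_hourly(times_h, temps_c):
    """Linearly interpolate records at integer hours to hourly values.

    Linear interpolation is an assumption made for this sketch.
    """
    hourly = []
    for i in range(len(times_h) - 1):
        t0, t1 = times_h[i], times_h[i + 1]
        v0, v1 = temps_c[i], temps_c[i + 1]
        for t in range(t0, t1):
            hourly.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    hourly.append(temps_c[-1])
    return hourly

# Hypothetical example: station at 600 m, grid cell at 1600 m.
print(temperature_at_cell(20.0, 600.0, 1600.0))
# Records at 02:00, 08:00, 14:00 and 20:00 (hypothetical values).
print(to_hourly([2, 8, 14, 20], [10.0, 16.0, 22.0, 16.0]))
```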
Hourly discharge data were only available during the flood season. During the non-flood season, some discharge data are missing. In the non-flood season, when daily discharge was reported at Lushi hydrological station, the daily discharge was assumed constant over 24 h.

The Distributed Hydrological Model
A physically based rainfall-runoff model, TOPKAPI (TOPographic Kinematic Approximation and Integration) [21,34,35], was used for deterministic flood forecasts in the study basin. TOPKAPI uses three 'structurally-similar', zero-dimensional, non-linear reservoir equations to represent the drainage in the soil, the overland flow and the channel flow. The TOPKAPI structure relies on model lumping, which preserves the non-linearities of the processes and retains physically meaningful parameters. This enhances the model's practicality in basins whose low-quality data cannot meet the requirements of fully distributed hydrological models, e.g., the Systeme Hydrologique Europeen (SHE) model [36,37]. The TOPKAPI parameters are reported as scale-independent and could be determined from digital elevation maps, vegetation or land use maps and soil type maps in terms of slopes, soil permeability, topography and surface roughness. The runoff production and routing processes are described at the grid scale. The study basin is divided into square grid cells and runoff generation is calculated at each grid cell. Thus, the spatial variability of meteorological inputs (e.g., precipitation and air temperature) and model parameters can be considered. TOPKAPI is based on the saturation-excess runoff mechanism and includes five basic components, namely evapotranspiration, snowmelt, surface flow, interflow and channel flow (Figure 3). The model assumes that soil water percolation towards the deeper subsoil layers does not contribute to the basin discharge. This is acceptable for flood forecasting because the response time of deep aquifer flow is very long relative to a single storm event in a catchment. The percolation rate from the upper soil layer is assumed to increase as a function of the soil water content according to an experimentally determined power law [21].
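The zero-dimensional non-linear reservoir idea can be sketched generically as dV/dt = inflow - b·V^c, where V is the stored volume. The coefficients `b` and `c`, the function name and the explicit Euler sub-stepping are assumptions chosen for illustration; they are not the numerical scheme or the parameter values of the TOPKAPI implementation used in the study.

```python
def step_reservoir(volume, inflow, b, c, dt, substeps=100):
    """Advance dV/dt = inflow - b * V**c over dt with explicit Euler sub-steps.

    b and c are hypothetical lumped coefficients; in a physically based
    model they would follow from slope, roughness or conductivity.
    """
    h = dt / substeps
    for _ in range(substeps):
        outflow = b * volume ** c
        volume = max(volume + h * (inflow - outflow), 0.0)  # keep storage non-negative
    return volume

# With constant inflow the storage approaches (inflow / b) ** (1 / c).
v = 0.0
for _ in range(50):
    v = step_reservoir(v, inflow=2.0, b=0.5, c=2.0, dt=1.0)
print(v)
```

The same structure can be reused for the soil, overland and channel stores by changing only the coefficients, which is what 'structurally-similar' refers to in the text.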
In the calculation of interflow, a saturated hydraulic conductivity ($K_s$) that is constant with depth in the unsaturated soil zone is used. This implies the assumption that the horizontal flux obtained by integrating along the vertical profile with constant $K_s$ does not differ strongly from the horizontal flux evaluated from the integration of the vertical soil moisture content profile. The total soil water content is obtained from the continuity equation and the dynamic equation and is described by a non-linear reservoir equation. Surface flow only occurs in a grid cell when the soil is already saturated. In addition, water in the soil can exfiltrate to the surface as return flow due to a sudden change in hill slope or soil properties. Surface flow routing is described similarly to the soil component, according to the kinematic approach, in which the momentum equation is approximated by means of Manning's formula. The interflow and the surface flow together feed the channel along the drainage network. The channel flow routing is described using the kinematic non-linear reservoir, and two types of river reach, namely rectangular and triangular cross sections, could be considered.
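To make the link between Manning's formula and a non-linear reservoir concrete, the sketch below computes sheet-flow discharge per unit width and then lumps it over a square cell of side X. With storage V = X²·h, the cell outflow X·q(h) becomes b·V^(5/3) with b = √S / (n·X^(7/3)). The function names and the sample numbers are hypothetical; this is an illustrative derivation under the kinematic assumption (friction slope equal to surface slope), not the study's code.

```python
import math

def manning_unit_discharge(depth_m, slope, n_manning):
    """Manning sheet-flow discharge per unit width (m^2/s)."""
    return math.sqrt(slope) * depth_m ** (5.0 / 3.0) / n_manning

def overland_reservoir_coeff(slope, n_manning, cell_size_m):
    """Coefficient b such that cell outflow Q = b * V**(5/3), with V in m^3."""
    return math.sqrt(slope) / (n_manning * cell_size_m ** (7.0 / 3.0))

def cell_outflow(volume_m3, slope, n_manning, cell_size_m):
    """Outflow (m^3/s) from a square cell treated as a non-linear reservoir."""
    b = overland_reservoir_coeff(slope, n_manning, cell_size_m)
    return b * volume_m3 ** (5.0 / 3.0)

# Hypothetical 1 km cell with 1 cm of surface water on a 1% slope.
X, h, S, n = 1000.0, 0.01, 0.01, 0.05
print(X * manning_unit_discharge(h, S, n))  # distributed form
print(cell_outflow(X * X * h, S, n, X))     # lumped reservoir form
```

Both print statements evaluate the same physical quantity, which shows why the lumped reservoir keeps the physically meaningful parameters (slope and roughness) of the distributed formulation.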
The TOPKAPI model uses a simplified method to calculate evapotranspiration starting from air temperature and from other topographic, geographic and climatic information. Potential evapotranspiration can be computed for a given grid cell using the Thornthwaite equation [38] with the input data of air temperature and is further corrected as a function of the actual soil moisture content to obtain the actual evapotranspiration. A radiation estimate based upon air temperature is developed to represent the snowmelt module. In practice, the inputs to the module are precipitation, air temperature and the same radiation approximation which was used in the evapotranspiration module.
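For orientation, the classic monthly Thornthwaite relation can be sketched as follows. This is the textbook form without the day-length and month-length correction, and it is offered as an assumption about the kind of calculation involved; the TOPKAPI implementation applies its own simplifications, which are not reproduced here. The function name and the sample temperatures are hypothetical.

```python
def thornthwaite_pet(monthly_temps_c):
    """Unadjusted Thornthwaite potential evapotranspiration (mm/month).

    monthly_temps_c: twelve mean monthly air temperatures in °C.
    Months with a mean temperature at or below 0 °C contribute no PET.
    """
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temps_c if t > 0.0)
    exponent = (6.75e-7 * heat_index ** 3 - 7.71e-5 * heat_index ** 2
                + 1.792e-2 * heat_index + 0.49239)
    return [16.0 * (10.0 * t / heat_index) ** exponent if t > 0.0 else 0.0
            for t in monthly_temps_c]

# Hypothetical monthly temperatures for a temperate mountainous basin.
temps = [-2.0, 1.0, 7.0, 14.0, 19.0, 23.0, 25.0, 24.0, 19.0, 13.0, 6.0, 0.0]
for month, pet in enumerate(thornthwaite_pet(temps), start=1):
    print(month, round(pet, 1))
```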

Hydrologic Uncertainty Processor
The Hydrologic Uncertainty Processor (HUP) is a main sub-model within the Bayesian Forecasting System (BFS) proposed by R. Krzysztofowicz [39]. BFS is one of the most representative probabilistic forecasting models. It provides a general theoretical framework for probabilistic forecasting of hydrological variables and combines the effects of various random factors on the forecasting results of hydrological models. It can be coupled with any hydrological model or forecasting scheme to obtain probabilistic forecasting results. BFS has been used in different river basins around the world [40][41][42][43][44][45]. BFS includes three sub-models: the precipitation uncertainty processor (PUP), the HUP and the integrator (INT). Within the BFS framework, the total uncertainty of hydrological forecasting is decomposed into two kinds of uncertainty, input uncertainty and hydrological uncertainty, which are processed by the PUP and the HUP, respectively. BFS does not directly deal with the uncertainty of the model structure and model parameters. Instead, it works on the model output and can therefore be coupled with any form of deterministic hydrological model. Because the data available in our case are too limited for a precipitation uncertainty analysis, we do not consider the precipitation uncertainty; we consider only the hydrological uncertainty and use the HUP model to produce probabilistic flood forecasts.
In the HUP model, $H_0$ is defined as the measured flow that is known at the time of forecasting. The variables $H_n$ and $S_n$ are the measured flow process and the flow process predicted by the deterministic hydrological model, respectively, and $N$ is the lead time. The measured value of $H_n$ and the estimated value of $S_n$ are represented by $h_n$ and $s_n$, respectively. For any time $n$ and the observed value $H_0 = h_0$, the prior density function $g_n$ and the likelihood function $f_n$ could be synthesized using the total probability rule. According to Bayes' theorem, the posterior density function of $H_n$ under the condition of $S_n = s_n$ is [46]:

$$\phi_n(h_n \mid s_n, h_0) = \frac{f_n(s_n \mid h_n, h_0)\, g_n(h_n \mid h_0)}{\int_{-\infty}^{\infty} f_n(s_n \mid h_n, h_0)\, g_n(h_n \mid h_0)\, \mathrm{d}h_n} \qquad (1)$$

In this study, the marginal distribution functions of the measured discharge and of the calculated discharge at the lead times of 1, 6 and 12 h were described by the log-Weibull distribution [39]. The parameters of the distribution could be estimated by the method of moments. Using Equation (1), the probabilistic forecasting results could be obtained in the study basin.
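A common way to make the Bayesian update tractable is to transform the discharges to standard normal variates and to assume linear relationships with normal residuals in the transformed space, in which case the posterior is again normal and has a closed form. The sketch below shows only that normal-linear update. The regression parameters `a`, `b`, `d`, `sigma`, `c` and `t`, the function name and the sample values are hypothetical, and the transformation to and from discharge through the log-Weibull marginals is not shown.

```python
from statistics import NormalDist

def hup_posterior(x_n, w_0, a, b, d, sigma, c, t):
    """Posterior of the transformed flow W_n given forecast x_n and w_0.

    Assumed prior:      W_n | w_0      ~ N(c * w_0, t**2)
    Assumed likelihood: X_n | w_n, w_0 ~ N(a * w_n + d * w_0 + b, sigma**2)
    Both relations are assumptions of this sketch.
    """
    denom = a * a * t * t + sigma * sigma
    mean = (c * w_0 * sigma ** 2 + a * t ** 2 * (x_n - d * w_0 - b)) / denom
    std = (t * t * sigma * sigma / denom) ** 0.5
    return NormalDist(mean, std)

# Hypothetical parameters and values in the transformed (normal) space.
post = hup_posterior(x_n=1.2, w_0=0.8, a=0.9, b=0.05, d=0.1,
                     sigma=0.3, c=0.85, t=0.5)
lower, median, upper = (post.inv_cdf(p) for p in (0.05, 0.50, 0.95))
print(lower, median, upper)  # 90% bounds and 50% percentile in normal space
```

Mapping `lower`, `median` and `upper` back through the inverse transform would give the 90% predictive uncertainty bounds and the 50% percentile in discharge units.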

Model Combination of TOPKAPI and HUP
The TOPKAPI model is a deterministic hydrological simulation tool and was used for flood forecasting in this study to provide deterministic results of river discharge. HUP is a "model-free" uncertainty analysis framework that can be coupled with any deterministic model. The coupling between HUP and TOPKAPI is loose and does not involve the internal structure of either model. Therefore, HUP only needs the output of the TOPKAPI model (such as the forecast flow process) as its input. The river discharge output of HUP is probabilistic (and can be further analyzed to provide a deterministic form).
In this study, therefore, TOPKAPI independently provides deterministic results, while HUP provides both deterministic and probabilistic results on the basis of the TOPKAPI outputs.

Performance Metrics
In the model simulations, $O_i$ and $S_i$ are defined as the observed and simulated discharges at the i-th time step, respectively, and $n$ is the sample size. The variables $\bar{O}$ and $\bar{S}$ represent the mean values of the n-sized observed and simulated data series, respectively. For deterministic forecasts, four indices were used to evaluate the model's performance in consecutive hydrological simulation or multi-flood-event simulation, including:
• Mean Absolute Error (MAE)
• Nash-Sutcliffe Coefficient Efficiency (NSCE)
• Index of Agreement (IOA)
For the model performance evaluation of a specific flood event, the relative bias of flood peak (RBP) and the relative bias of flood volume (RBV) were also used, where $S_p$ and $O_p$ are the values of peak discharge for simulated and observed samples, respectively, while $S_v$ and $O_v$ represent the volumes of a specific flood event for simulated and observed samples, respectively.
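The indices named above can be sketched with their commonly used definitions. These are assumptions about the exact formulas: MAE as the mean of |S_i - O_i|, NSCE as one minus the ratio of squared errors to the variance of the observations about their mean, IOA in Willmott's original form, and the relative biases as (simulated - observed) / observed in percent. The function names and sample series are hypothetical.

```python
def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)

def nsce(obs, sim):
    """Nash-Sutcliffe efficiency (1 is perfect, below 0 is worse than the mean)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def ioa(obs, sim):
    """Index of agreement, assuming Willmott's original definition."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2
              for o, s in zip(obs, sim))
    return 1.0 - num / den

def relative_bias(sim_value, obs_value):
    """Relative bias in percent; positive values mean overestimation."""
    return 100.0 * (sim_value - obs_value) / obs_value

# Hypothetical hourly discharges (m^3/s) for one flood event.
obs = [10.0, 20.0, 30.0]
sim = [12.0, 18.0, 33.0]
print(mae(obs, sim), nsce(obs, sim), ioa(obs, sim))
print(relative_bias(max(sim), max(obs)))  # RBP
print(relative_bias(sum(sim), sum(obs)))  # RBV (equal time steps cancel out)
```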
In the evaluation of probabilistic flood forecasts, the statistics of contained percent (CP) and dispersion index (DI) could be used [13,18,19]. The CP index measures the percentage of observations that are contained within the predictive uncertainty (PU) bounds at a given Confidence Interval (CI). The DI measures the average width of the predicted interval at a given CI after eliminating the influence of magnitude. In these statistics, $n_c$ is the number of observed discharge samples enveloped by the PU bounds, and $Q_i^u$ and $Q_i^d$ represent the upper and lower PU bounds at the i-th time step. Theoretically, an ideal result of probabilistic forecasting should have a CP value at a given CI close to the value of the CI, and the value of DI should be as small as possible to ensure that the PU bounds are not too wide [13]. In this study, the PU bounds from 5% to 95% were calculated for flood events.
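The two probabilistic statistics can be sketched as follows, assuming CP = 100·n_c / n and DI as the mean of the interval width divided by the observed discharge. The normalization by the observation is an assumption about how the influence of magnitude is removed; the function names and sample values are hypothetical.

```python
def contained_percent(obs, lower, upper):
    """Percentage of observations that fall inside the PU bounds."""
    inside = sum(1 for o, lo, up in zip(obs, lower, upper) if lo <= o <= up)
    return 100.0 * inside / len(obs)

def dispersion_index(obs, lower, upper):
    """Mean PU-band width relative to the observed discharge (assumed form)."""
    return sum((up - lo) / o for o, lo, up in zip(obs, lower, upper)) / len(obs)

# Hypothetical observations and 5%-95% bounds (m^3/s).
obs_q = [10.0, 20.0, 30.0, 40.0]
q_low = [8.0, 15.0, 31.0, 30.0]
q_up = [12.0, 25.0, 40.0, 50.0]
print(contained_percent(obs_q, q_low, q_up))  # 75.0: the third value is outside
print(dispersion_index(obs_q, q_low, q_up))
```

For 90% bounds, a CP close to 90 together with a small DI would indicate bounds that are both reliable and sharp.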

Model Calibration and Validation
The TOPKAPI model was calibrated in the Lushi basin using the hydrometeorological data for the period of 1996 to 2005; the data for the period of 2006 to 2008 were then used as a validation test. For flood forecasting purposes, hourly time steps were used. Since the TOPKAPI model is a physically based distributed model at the theoretical level, the model parameters can be obtained from the thematic maps (DEM, land use and soil type) and then adjusted by calibration.
TOPKAPI calculates potential and actual evapotranspiration at each grid cell for each time step based on the soil water content. The minimum soil water content for evapotranspiration to occur was fixed at 20% of the soil saturation water content, while above 80% of soil saturation the actual evapotranspiration was set equal to the potential evapotranspiration.
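The two thresholds above can be turned into a simple correction function. The 20% and 80% limits come from the text; the linear transition between them is an assumption made for this sketch, as the text does not say how the actual evapotranspiration varies between the two thresholds. The function name is hypothetical.

```python
def actual_et(potential_et, saturation, lower=0.2, upper=0.8):
    """Actual evapotranspiration from potential ET and relative soil saturation.

    saturation: soil water content divided by the saturation water content.
    Below `lower` no evapotranspiration occurs; above `upper` it equals the
    potential rate; in between a linear ramp is assumed.
    """
    if saturation <= lower:
        return 0.0
    if saturation >= upper:
        return potential_et
    return potential_et * (saturation - lower) / (upper - lower)

for s in (0.1, 0.5, 0.9):
    print(s, actual_et(4.0, s))
```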

Soil Type Parameters
Hydrological behavior of the soil at each grid cell was characterized by hydro-geological parameters. The parameters for each soil type were obtained based on FAO soil classification, and the initial values of these hydro-geological parameters were retrieved from a previous TOPKAPI application study [35]. These initial values were used for model calibration in this study. Table 1 shows the final calibrated values of the soil type parameters.

Land Use Parameters
The values of surface roughness and crop factors were obtained from literature according to the land use classification [35]. These parameter values were considered as the initial values and would be modified during the calibration procedure. Tables 2 and 3 show the calibrated values of Manning's coefficient for superficial roughness and crop factors for each month of the year.

Other Parameters
The channel reaches of the Lushi basin were represented as triangular river sections, and the Manning coefficient was 0.042 s/m^(1/3). The riverbed side angle was fixed at 4.5 for the mainstem and 3.7 for the tributaries.
In the snow component of the TOPKAPI model, the air temperature threshold for accumulating or melting snow was fixed at 2 °C.

Model Test
Figure 4 shows the comparison of calculated and measured hourly discharges at the Lushi station for the calibration period 1996–2005. The observed discharge data were available only during the flood season (June–October) and, in general, were not continuous, with many missing data. In addition, the hydrological behavior changed considerably among these flood seasons. Table 4 shows the statistics of the model performance indices. From the perspective of a single flood season, some flood seasons, such as 1997 and 2001, had the worst model accuracy, with an NSCE statistic less than zero. In general, the model was able to capture changes in the flow process during the flood season in the Lushi basin. Considering the lack of some hydrometeorological data and underlying surface data, there should be a large uncertainty in the model input. Therefore, the model accuracy in this case was basically acceptable.
Overall, the model performance in the validation period (2006–2008) was better than in the calibration period (1996–2005) in terms of all four evaluation indices (Figure 5). Although the overall accuracy of the validation period was high, the model seriously overestimated discharge, with a negative NSCE value, during the flood season of 2008. Because the detailed input data needed by the TOPKAPI model were missing or of low quality, parameter calibration was very difficult in the Lushi basin. For example, soil type characteristics were derived from only eight FAO soil types, and this number of soil types was probably not enough to describe the variability of soil characteristics in the catchment. Such low-quality input data inevitably cause uncertainty in the model output results.

Flood Event Simulations
We also present the model performance for eight flood events in Figure 6. The deterministic results for these flood events were produced by the TOPKAPI model, as shown in Figures 4 and 5. The HUP model was used to analyze the total forecast uncertainty of the deterministic results at a leading time of one hour. In the deterministic forecast results, the relative bias of flood peak (RBP) satisfied the requirement of |RBP| ≤ 20% in five flood events, and the average absolute value of RBP over all eight floods was 25.4%. In terms of the relative bias of flood volume (RBV), the simulation results of three floods satisfied the accuracy requirement (|RBV| ≤ 20%), while the NSCE values of the selected flood events were greater than 0.75, except for the 2006 flood season.
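The two event-based indices and the 20% requirement can be illustrated with a short Python sketch. It assumes that both biases are computed as (simulated − observed) / observed and expressed in percent, which is a common convention but is not restated in this section. The function names and the event hydrograph are hypothetical.

```python
import numpy as np

def relative_bias_peak(obs, sim):
    """RBP in percent: bias of the simulated flood peak against the observed peak."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 100.0 * (sim.max() - obs.max()) / obs.max()

def relative_bias_volume(obs, sim):
    """RBV in percent: bias of the total flood volume.
    With a constant time step, the step length cancels out of the ratio."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 100.0 * (sim.sum() - obs.sum()) / obs.sum()

def meets_requirement(bias_percent, limit=20.0):
    """True when the absolute bias is within the limit (20% in the text)."""
    return bool(abs(bias_percent) <= limit)

# Hypothetical hourly discharges (m^3/s) for one flood event.
obs = np.array([50.0, 120.0, 400.0, 260.0, 90.0])
sim = np.array([60.0, 150.0, 460.0, 330.0, 120.0])

rbp = relative_bias_peak(obs, sim)     # peak bias: (460 - 400) / 400
rbv = relative_bias_volume(obs, sim)   # volume bias: (1120 - 920) / 920

print("RBP (%):", rbp, "within limit:", meets_requirement(rbp))
print("RBV (%):", rbv, "within limit:", meets_requirement(rbv))
```

The toy event passes the peak requirement but fails the volume requirement, which shows why the two indices are reported separately: a simulation can match the peak reasonably well and still misjudge the total volume.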

The HUP provided the 90% PU bounds and 50% percentiles in the flood simulations, and the latter can be used as the deterministic results.
The 50% percentile simulations had very high accuracy in terms of the RBP and RBV indices. The absolute value of RBP was less than 6% for all floods, while |RBV| was less than 2%. Besides, the 50% percentiles of all flood events matched the observations very well, with NSCE values greater than 0.9. Compared with the TOPKAPI model, the deterministic forecast provided by the 50% percentiles was clearly more accurate. This was because the observations were also used in the calculation of the HUP model. The 90% PU bounds could also provide useful information for operational flood forecasting practices. With a leading time of one hour, more than 95% of observations were contained within the 90% PU bounds for all flood events. The DI statistic ranged from 0.54 to 0.65, with an average of 0.61 among these floods. Overall, the HUP model could provide rich probabilistic forecast information at a given CI and even better deterministic results than distributed models in terms of the 50% percentiles. However, this does not mean that the HUP model could run independently, because its input includes the deterministic model's results. The main characteristic of the HUP model is that it can further improve the results generated by deterministic models.
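The two probabilistic indices can be sketched in Python as follows. CP is taken as the percentage of observations that fall inside the predictive uncertainty bounds, as described in the text. The DI formula is an assumption: it is treated here as the mean width of the bounds relative to the observed discharge, since its definition is not restated in this section. Function names and all numbers are hypothetical.

```python
import numpy as np

def containing_percentage(obs, lower, upper):
    """CP: percentage of observations enclosed by the uncertainty bounds."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    inside = (obs >= lower) & (obs <= upper)
    return 100.0 * inside.mean()

def dispersion_index(obs, lower, upper):
    """DI (assumed form): mean band width divided by observed discharge.
    Smaller values indicate narrower, more informative bounds."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    return float(np.mean((upper - lower) / obs))

# Hypothetical observed discharges (m^3/s) and two sets of 90% bounds.
obs = np.array([100.0, 200.0, 400.0, 300.0, 150.0])

# Case A: narrow bounds that enclose every observation.
lower_a = np.array([80.0, 170.0, 330.0, 260.0, 120.0])
upper_a = np.array([130.0, 240.0, 470.0, 350.0, 190.0])

# Case B: wider bounds that nevertheless miss the flood peak.
lower_b = np.array([60.0, 150.0, 180.0, 240.0, 110.0])
upper_b = np.array([140.0, 260.0, 380.0, 420.0, 260.0])

cp_a = containing_percentage(obs, lower_a, upper_a)
di_a = dispersion_index(obs, lower_a, upper_a)
cp_b = containing_percentage(obs, lower_b, upper_b)
di_b = dispersion_index(obs, lower_b, upper_b)

print("case A: CP =", cp_a, "DI =", di_a)
print("case B: CP =", cp_b, "DI =", di_b)
```

The two indices are read together: a high CP is only informative when DI stays small, because very wide bounds would enclose the observations trivially. Case B shows the opposite and less favorable combination, with wider bounds and a lower CP.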

Model Accuracy Changes with Leading Time Increasing
The above comparison between TOPKAPI and HUP showed that the HUP model could reproduce the observed hydrograph very well. Such a highly accurate result is essentially a real-time correction of flood forecasts using observed data. The leading time used by the HUP model was one hour. In operational practices, a one-hour leading time may not be sufficient for flood forecasts. Meanwhile, the observed discharge data may not be updated in time for use in the HUP model. Thus, longer leading times should be used. In this work, we analyzed how the accuracy of the HUP model changed as the leading time increased (from one hour to 6 h and to 12 h). Figure 7 shows the changes of the CP and DI statistics with the increase in leading time (1, 6 and 12 h). When the forecast leading time increased, the 90% PU bounds became wider than those at a leading time of 1 h. That is to say, the DI value increased with the leading time. Although the 90% PU bounds had become wider, the number of observations enveloped by the bounds had decreased, manifested as a decrease in the CP value. Very low CP values (62% and 51%) were found for the seventh flood event (June 2006) at leading times of 6 and 12 h, respectively. As the HUP model could also provide deterministic forecasts in terms of the 50% percentiles of PU, the forecast accuracy of the 50% percentiles of the selected flood events was also analyzed at different leading times (Figure 8). All three performance indices showed a consistent decrease in accuracy as the forecast leading time increased. When the leading time increased from one hour to 6 h (or 12 h), the overall absolute values of the relative bias increased. Accordingly, the NSCE statistic decreased substantially as the leading time increased. Therefore, the accuracy of probabilistic flood forecasts provided by the HUP model decayed as the leading time increased.
This suggests that rolling forecasts with short leading times are very important not only in deterministic forecasts, but also in probabilistic forecasts.

Conclusions
Although many application cases of physically based distributed hydrological models have been reported in the literature, conceptual or lumped hydrological models are still the main ones in operational flood forecasting practices, especially in data-sparse regions. The main reason is that hydrometeorological or underlying surface data in many areas are of low quality or even missing. In the localization of distributed models, therefore, large uncertainties are inevitably introduced to the final model forecasting results. In this study, we presented a case study of a distributed hydrological model (TOPKAPI) in flood forecasting with low-quality input data, and the forecast uncertainty was evaluated with a general uncertainty analysis framework (HUP model). In the study area (Lushi basin), the existing rainfall gauges cannot capture the spatial distribution of precipitation well, so rainfall is easily overestimated or underestimated. Soil type data may also introduce uncertainties to the application of TOPKAPI, as only eight FAO soil types are available. This number of soil types is not enough to describe the variability of soil characteristics in the study basin. In addition, observed discharge data are only available in the flood season. In operational flood forecasting practices, such uncertainties in TOPKAPI forecast results are very detrimental to flood control decisions. Therefore, probabilistic forecasting with the HUP model can provide new auxiliary information for flood prevention decisions.
The TOPKAPI model was calibrated using hourly hydrometeorological data (e.g., precipitation, air temperature and discharge) and underlying surface data (soil type and land use maps) for the period 1996–2005, and was then validated using data for the period 2006–2008. The average NSCE values for the calibration and validation periods were 0.78 and 0.82, respectively, and the average IOA values were 0.93 and 0.95, respectively. This suggested that the overall model accuracy was acceptable and that the model generally reproduced the hydrological behaviors of the study basin. However, the accuracy was low in some flood seasons, such as those of 1997, 2001 and 2008. The HUP model was then used to output probabilistic forecasts for several flood events at different leading times. At a leading time of one hour, the performance of the deterministic forecasts provided by the 50% percentiles was largely improved, with NSCE values greater than 0.9 for all flood events. Besides, more than 95% of observations were enveloped by the 90% PU bounds, and the average DI statistic was as small as 0.61 for all floods. This suggested that the HUP model could provide much information (both deterministic and probabilistic) for operational flood forecasting practices. In terms of both probabilistic (CP and DI) and deterministic (RBP, RBV and NSCE) performance indices, the accuracy of the HUP model at a leading time of 6 h or 12 h clearly decreased compared with that at a leading time of one hour. This showed that the performance of HUP decays as the leading time increases. This study could provide a useful reference for operational flood forecasting practices when only low-quality data are available for physically based distributed hydrological models.