Flood Routing Model with Particle Filter-Based Data Assimilation for Flash Flood Forecasting in the Micro-Model of Lower Yellow River , China

Reliable real-time flood forecasting is a challenging prerequisite for successful flood protection. This study developed a flood routing model combined with a particle filter-based assimilation model and a one-dimensional hydrodynamic model. This model was applied to an indoor micro-model, using the Lower Yellow River (LYR) as prototype. Real-time observations of the water level from the micro-model were used for data assimilation. The results show that, compared to the traditional hydrodynamic model, the assimilation model could effectively update water level, flow discharge, and roughness coefficient in real time, thus yielding improved results. The mean water levels of the particle posterior distribution are closer to the observed values than before assimilation, even when water levels change greatly. In addition, the calculation results for different lead times indicate that the root mean square error of the forecasting water level gradually increases with increasing lead time. This is because the roughness value changes greatly in response to unsteady water flow, and the incurring error accumulates with the predicted period. The results show that the assimilation model can simulate water level changes in the micro-model and provide both research method and technical support for real flood forecasting in the LYR.


Introduction
Floods are among the most severe hazards in the world, causing loss of life and excessive damage to property.Thus, if floods can be reliably forecast, proper actions can be initiated to mitigate the damages during the flood event.The ability to provide reliable forecast of river water levels for a short period following a storm with significant precipitation is very important for both flood control and river regulation [1].This study focused on the development of a flood routing model for real time flood forecasting.
In general, flood routing can be classified into two categories: the hydrologic method and the hydraulic method [2].However, the hydrologic method for a hydrologic system cannot achieve the desirable degree of accuracy and lacks water level and detailed flow information [3].In contrast, the hydraulic method can be widely applied to special problems, e.g., to forecast discharges and stages at cross-sections along rivers [4][5][6][7].However, due to input uncertainty, model structure uncertainty, and parameter uncertainty in hydraulic models, flood forecast reliability tends to decrease with increased forecast lead period, so that the simulated results rarely fit the measurements.This is even worse in real-time flood forecasting.Hence, the hydraulic model needs to be improved to enable real-time flood forecasting.
The data assimilation algorithm continuously integrates real-time observational data into numerical models and adjusts both the motion state and parameters, which greatly improves the prediction accuracy of the models.Compared to other assimilation algorithms, the particle filter (PF) assimilation algorithm is suitable for both nonlinear and non-Gaussian systems.It can effectively estimate the uncertainties of the model and improve the prediction accuracy of the model simulation [8][9][10].Han and Li [11] and Pan [12] compared the differences between the PF algorithm and the Kalman filter (KF) algorithm, and found that the PF algorithm can better solve the problem of a nonlinear non-Gaussian error in the model.In the nonlinear Lorenz system, the PF algorithm achieves better stability and accuracy than the KF algorithm [13][14][15][16].Moradkhani et al. [17] used the PF algorithm to estimate the model parameters and state variables in the hydrological model, and found that the results fitted the observed values well.These studies showed that the PF algorithm is a very efficient method to solve problems of nonlinear systems such as flood routing systems.However, it is still rarely applied to real-time flood forecasting.
The purpose of this study was to verify and evaluate the accuracy of the assimilation model system for real-time flood prediction in a laboratory micro-model.Although this model can also be applied to river prototypes, due to the difficulty and cost for obtaining real-time hydrological data, a micro-model was established based on the Lower Yellow River (LYR) for flood experiments.An automatic flow control system and a real-time water level monitoring system were installed on the model to control the upstream flow discharge and the acquisition of real-time water level data of the observation points.The PF algorithm was applied to increase the quality of the predicted water levels.The practical motivation of this study was to improve the application of indoor data assimilation systems, and apply it to real-time forecasting of real floods in the LYR in future.This will improve forecasting accuracy, increase the lead time, and provide sufficient response time for flood control and disaster relief prior to the flood event.

Study Area and LYR Micro-Model
The study area is located in the lower reach of the Yellow River, in central China, stretching from the Xiaolangdi Reservoir to the Yellow River estuary, and comprising a distance of approximately 800 km (Figure 1a).The bed slope varies from 0.010% to 0.012%.The drainage area of the LYR is 23,000 km 2 .The width between the levees varies from 1 to 4 km; however, the width of the main channel is only about 500 m.Most of the cross-sections are compound sections.The wide flood plains between dykes are extensively cultivated.Here, more than 2000 natural villages had formed by 1900 and nearly 1.8 million people live on these flood plains.The total area of the flood plains is 3956 km 2 , accounting for 85% of the total area between the levees.
It is possible to use the real LYR as a study case for the verification of the validity of the assimilation model.However, it is expensive to establish a stable data acquisition and transmission system on such a long river.To verify the performance of the assimilation model under various hydrological conditions, an extended data acquisition duration is required.Thus, a micro-model was constructed using the LYR as a prototype in our laboratory to obtain real-time data more conveniently.This model was first designed in reference to digital elevation model (DEM) data of the LYR and then corrected by the cross-section data measured in 2013 before the flood season.The model has a total length of 27.38 m with a longitudinal ratio 1:28,000 and a total of 57 cross-sections along the river, including Huayuankou, Jiahetan, Gaocun, Shilou, Aishan, Luokou, and Lijin (Figure 1b).Two automatic control systems are assumed for the model, one at the inlet that controls the inlet flow and one at the outlet that controls the downstream water level.There is also an automatic observation system at the water level observation point (see in Figure 1b), which obtains the water level in real time.

Experimental Setup
All measuring equipment was calibrated prior to the experiment.Designed sequential processes of inlet flow discharge and outlet water levels were input into the controlling computer.Once the flood started, the water level observation system began to continually collect water level data at an interval of 10 s.
To verify the performance of the assimilation model under different hydrological conditions, several test events were designed in reference to the peak flow and flood duration in the prototype.Nine experiments were conducted for this study.The first three experiments constituted a calibration group which was used to calibrate the roughness coefficient of the hydrodynamic model only, while the further six experiments were used for the verification of the assimilation model.These six verification experiments were labeled Experiment 1-6 and were divided into two groups with three typical flow processes each.The first group was used to verify the prediction accuracy of the assimilation model for flood events without floodplain flooding, with one floodplain flooding, and with multi-floodplain flooding.The ranges of flow discharge were 3.6-7.2m 3 h −1 , 3-7.92 m 3 h −1 and 7.2-11.88m 3 h −1 , respectively.The second group was used to verify the prediction accuracy of the assimilation model during flood events with different numbers of flood peaks due to the periodically changing flood that may occur in the proto-river.All flow discharges in this group remained within 7.2-16.2m 3 h −1 .

Hydrodynamic Assimilation System
The hydrodynamic assimilation system mainly consists of a hydrodynamic mathematical model, an assimilation model, and a real-time data acquisition system.Details of the mathematical model principle and the model coupling method are described below.

Hydrodynamic Mathematical Model
The one-dimensional hydrodynamic model has been widely used in flood forecasting due to its advantages of simple execution, short calculation time, and high real-time efficiency [18,19]; consequently, it was also used herein to meet the timeliness requirement of flood forecasting.The unsteady one-dimensional free surface flow in an open channel was described by the Saint-Venant equations [4].The four-point implicit finite-difference scheme was used to discretize the governing equations.The equations were then solved by successively applying the double sweep Thomas algorithm under given boundary conditions and initial conditions [7].The upstream conditions are inflow discharges and the downstream boundary conditions are outflow water levels.
The Manning coefficient of roughness reflects the roughness of the inner boundary of the channel and characterizes the resistance to water flow.The roughness value not only depends on the physical properties of the channel, but also on the state of the water flow.The overall roughness coefficient of a given cross-section is a composite value that includes the entire influence of water flow resistance.Typically, the roughness coefficient is determined by the measured hydrological observation data.Considering the change of channel characteristics along the river and the unsteadiness of the flooding flow, the spatiotemporal variation of the roughness coefficient should be considered in a flood forecasting hydrodynamic model.Therefore, it is the most important parameter that needs to be continually adjusted in the assimilation model.

PF-Based Assimilation Model
The PF algorithm is a method that combines the Monte Carlo method with Bayesian estimation.It uses a set of random particles with correlated weights to approximate the probability density function and obtain the optimal probabilistic interval estimation.When the number of particles is sufficient, the posterior probability distribution of the state variables can approximate the true probability distribution [20,21].Supposing a set of random weighted samples can be independently obtained from the posterior probability distribution of state variables, the posterior probability distribution of state variables can be expressed by the following formula: where p (X t |X 1:t ) represents the posterior probability distribution of state variables at time t; X i,t and w i,t represent the ith particle and its weight, respectively, at time t; X t represents the data observation at time t; N represents the number of particles; and δ (•) represents the Dirac delta function.
The water level, flow discharge, and roughness coefficient of each cross-section along the river constitute a basic particle in the PF algorithm.Each basic particle characterizes a possible river flood state.The probability distribution of the river flood state can be represented by the particle set.During the assimilation process, the water level observation data of each station is used to evaluate the accuracy of the prior predicted water level value at the calculation section.The fractional likelihood weight of each station is then computed as: where w i,j,t and Z i,j,t represent the fractal weight and the prior predicted water level value of the ith particle at the jth station at time t, Y j,t represents the observed water level at the jth station at time t, and σ t represents the standard deviation of the likelihood function at the jth station at time t, which can be reduced to constants for different stations and at different times [22].However, the main deficit of the PF is its degradation.It is possible that most particles carry less weight and only few particles are effective after updating the particle weights, which is called "particle degeneracy" [23].Typically, whether particles in the filter are degraded is assessed through effective number of particles [24]: where N eff represents the effective number of particles.When N eff is below a specified value, resampling is required.Resampling as proposed by Gordon et al. [9] can solve this problem.The aim of such resampling is the elimination of less-weighted particles and copy more heavily-weighted particles.The resampling algorithm samples the set of particles {x i,t , w i,t } N i=1 by the probability w i,t for N times and obtains a new set of particles {x j,t , 1/N} N j=1 after sampling.Finally, the posterior probability distribution of the state is obtained by a weighting calculation.
Random noise is added to the resampled particle set to properly perturb the data to ensure particle diversity.The proposed method mainly disturbed the roughness parameter:

.3. Coupling of Hydrodynamic Model and Assimilation Model
The hydrodynamic assimilation system is built by integrating the one-dimensional hydrodynamic model and the PF algorithm.Therefore, the real-time flood routing model of the river channel is mainly divided into two phases.The first phase is the hydrodynamic model forecasting phase.The flood routing process in each particle state is calculated according to known future time boundary conditions.The second phase is the PF correction phase, which updates the particle set at the moment when the observation data becomes available.Furthermore, prior distributions of particles are corrected to the posterior distributions whose mean value are closer to the real process.Both phases are interdigitated and coupled.The specific coupling procedure is shown in Figure 2 and is described in the following: (1) At the initial time, a random sample from the initial distribution of the water level, flow discharge, and roughness of each cross-section along the river as well as a set of N p particles with equal weights are obtained.(2) The hydrodynamic model is used to calculate the water level and flow discharge at the present time step for each particle and a set of calculation results is obtained for all particles.(3) The assimilation module is used for present time step if the water level observation data is available.The weight of the particle set is updated using Equation ( 2) and referring to the observed water level data.The particles with prior water level values closer to the measured water level values obtain a larger weight calculated by a likelihood function.If the updated particles are degraded, resampling is initiated using the method proposed by Gordon et al. [9] and the roughness coefficient is perturbed using Equation ( 4).(4) The assimilation module is exited and the procedure starts at Step (2), continuing the hydrodynamic model calculation for the next time step.
Finally, the water level, flow discharge, and roughness calculated at the present time step are completely different from those of the hydrodynamic model without PF.The state of the water flow and the drag coefficient are corrected, which in turn affects all subsequent calculations.Through continuous model coupling and correction, the prediction results are closer to the true value.

Numerical Model Setup
The topographical data used for the numerical model were consistent with the micro-model and had a total of 57 computational cross-sections.The boundary condition was set to the measured water discharge at the inlet and a synchronous water level at outlet.The initial conditions were set to the discharge and water level of each section determined by the boundary condition at the first moment.The time step was set to 1 s to satisfy Courant stability conditions.The initial roughness coefficient of the river was calibrated using the experiments of the calibration group and was set to the calibrated result of 0.05.
For the assimilation model, the initial discharge, water level, and roughness of each particle were derived by perturbing the initial values of each cross-section along the river.Furthermore, the perturbing noises were 0.01 Q 0 , 0.001 mm and 0.002, respectively.The perturbation of the boundary conditions was similar.

Model Performance Evaluation Index
The root mean square error (RMSE) and the mean relative error (MARE) were used to quantitatively describe the accuracy of the particle set mean.The specific definition is shown in the following: where M represents the number of observations, z m represents the mth observation value, and xm represents the mean value of the particle set corresponding to the mth observation time.Smaller calculation results of RMSE and MARE indicate higher accuracy of the simulation results.The quantile-quantile plot (Q-Q plot) was also used as a visual method with which to examine the reliability of the probability distribution.In the Q-Q plot, the x-axis is a theoretic quantile of a uniform distribution and the y-axis is the actual quantile of the observation.If the plot follows a 1:1 line, the probabilistic prediction is perfectly reliable.A detailed description of the Q-Q plot can be found in Laio and Tamea [25].The Q-Q plot has a corresponding quantitative indicator-reliability α: where y m represents the normalized observation quantile.The reliability α ranges between 0 and 1.
A larger α value indicates a better degree of coincidence of the quantile-quantile curve with the 1:1 line, and better reliability of the probability distribution interval of the particle set.

Determination of the Number of Particles
In the assimilation model, a larger number of particles indicates better correction performance.However, with increasing number of particles, the computation load also increases greatly.Sensitivity analysis on particle number was conducted by choosing one of the experiments.Figure 3 shows the relationship between the number of particles and the water level root mean square errors.The water level RMSEs decreased with increasing number of particles.Especially, the RMSEs decreased sharply when the number of particles was below 50.To ensure the simulation accuracy of the model and short calculation time, the number of particles was set to 100.

Correction of Water Level and Roughness Coefficient
The observed water levels of the last six verification experiments were chosen to verify the correction performance of the assimilation model.The total run time of each experiment was 1 h and the assimilation correction time step was consistent with the water level acquisition time step (both 10 s). Figure 4 shows a comparison between observed and predicted water level processes both with PF and without PF (only traditional hydrodynamic model) for the experiments.Figure 4 shows that the predicted water level without PF deviates greatly from the observed curve with increasing time, while the level predicted with PF is closer to the observed value.This is because the assimilation model incorporates the observed value in real-time, and parameters of the model are constantly optimized to let the result approach the actual value.In addition, many values fall within the 90% confidence intervals.Table 1 shows the RMSEs and MAREs of water levels predicted with and without PF for different experiments.The RMSE of each experiment after assimilation was close and small.Compared to the results without PF, the RMSEs of the predicted value with PF decreased by 75%, 80%, 81%, 90%, 87% and 86%, respectively.The MAREs of the predicted value with PF also decreased by 73%, 82%, 89%, 81%, 86% and 86%, respectively.The reliability of the predicted results was also calculated and found to be above 0.8.More than 90% of the observed values fall into the 90% confidence interval, which illustrates that the assimilated results are reliable.The results shown in Figure 4 and Table 1 indicate that the assimilation model can effectively improve the obtained prediction accuracy.The probability distribution of the water level simulated at each time step was also analyzed.Since the water level changes greatly at t = 40 min, the prior (before assimilation) and posterior (after assimilation) distributions of the water level were compared at this time step to access the performance of the assimilation model at this critical time.The correction result of the water level at this particular time step can better represent the performance of the model.Figure 5 shows histograms of the prior and posterior distributions of the water level of these six verification experiments.
Figure 5 illustrates the differences between the mean value of the water level for both the prior and posterior distribution of particles.The prior distribution of the particle set is more scattered, without obvious aggregation near the observed value, while the posterior distribution of the particle set is concentrated near the observed data.Compared to the prior distribution of particles, the posterior distribution of particles is closer to the observed values.This result indicates that the PF method can effectively utilize the observation data and maintain more particles near the observation data.This also proves that this method achieves good correction performance for the water level.
Figure 6 shows the roughness retrieved from the observed data over time.The curves indicate the real-time correction processes of roughness, and the straight line indicates the fixed value used in the hydrodynamic mathematical model.Figure 6 shows that the changing process of the corrected roughness is rather complicated.Different experiments show that the roughness continuously fluctuated but changed gently in the increasing phase of the flow process.However, during the decreasing phase of the flow process, the roughness changed sharply and increased continuously.This phenomenon is more apparent in Experiments 4-6.This characteristic is consistent with previous studies on the analysis of the change in the resistance characteristics of the channel in the LYR [26].Within a certain flow range, the roughness coefficient is inversely proportional to the flow rate.This is mainly due to the form resistance of both the side walls and the riverbed.

Model Prediction for Different Lead Times
As state above, the correction performance of the PF-based assimilation algorithm is good.Therefore, the corrected flow discharge, water level, and roughness were used as initial conditions for the flood prediction at the next time step.The boundary conditions of the future moments should have been predicted via the hydrological model; however, since this study focused on flood routing forecasting and its uncertainty, the historical observation data at the boundary section was appropriately disturbed as the boundary conditions.The lead times were set to 10 s, 20 s, 30 s, and 60 s.
Figure 7 shows the relationship between the lead times and the RMSEs of water level predictions for different experiments.The following results were obtained: (1) Under different inflow conditions, the RMSEs of the predicted water level increases significantly with increasing lead time.(2) Comparison between the experiments of both verification groups showed that there is little difference of the RMSEs when the lead time remained below 30 s; however, the RMSEs of the second group was clearly larger than that of the first group for a lead time above 30 s.The main reason is that the roughness value used to predict the flooding process in the future is the current roughness value, which changed constantly during the actual flood process.The water flow fluctuations are more complicated in the second group and the magnitude of the change of the incoming water flow exceeds that of the first group.The uncertainty of the roughness coefficient obviously increased with increasing lead time.Figure 8 shows the Q-Q plots of water level predictions for different experiments in response to different lead times.The Q-Q plots are close to the 1:1 uniform line for lead time of 10 s.The fitting effect of the quantile of observed values and the 1:1 uniform line is deteriorating with increasing lead times.In general, with increasing lead time, the reliability of the probability prediction interval of the water level gradually decreases; however, it can still represent the uncertainty of the water level forecast result well.

Influence of the Flooding Lead Time
Theoretically, the hydrodynamic model itself does not have a foreseeable period, and the flood propagation process can only be calculated boundary conditions.For the hydrodynamic model to predict the future flood propagation process, the boundary conditions need to be predicted in advance.These boundary conditions should be predicted using both the weather forecasting model and the hydrological model.However, even if the boundary conditions are correctly chosen, the predicted results of the hydrodynamic model are not necessarily satisfactory, as can be seen in the simulated results of the hydrodynamic model without PF in Figure 4.This is because the given roughness (or the corrected roughness at the previous time step) and the initial field of water flow (the result of the previous time step) cannot adequately represent the hydraulic conditions during the following period of time.When the lead time is small, the model parameters and the initial field of water flow exert less impact on the future water level forecast.However, when the lead time is sufficiently large, the given roughness and flow conditions are quite different from the actual conditions; consequently, the simulated water level results will greatly deviate.
At the same time, even if the lead time remains identical, the PF-based assimilation model achieves different degrees of improvement for the prediction results of both designed sets of experiments, compared to the hydrodynamic model (as shown in Figure 4).The correction performance of the PF algorithm is not affected by complex changes of the water conditions, which shows that the assimilation model has good applicability for water level prediction under different scenarios.It can be inferred that, under the same accuracy requirements, the PF-based assimilation model can increase the flooding lead time, thus gaining time for both flood prevention and emergency evacuation.With reference to the designed scale of the geometry, slope, and roughness of the LYR micro-model, the lead time of 30 s is equivalent to about 7.4 h in the prototype of the Yellow River.Therefore, the improvement of the flood forecasting accuracy and the increase of flooding lead time achieved by the assimilation model are of great significance for flood control and disaster relief.

Application to Real Rivers
It should be emphasized that the flood routing model presented here is universal and not regionally limited.It can not only be used for indoor physical models but also for various practical river channels.However, the premise is that a real-time data automatic acquisition system needs to be established on the actual river channel, and the obtained data have to be continuously transmitted to the hydrodynamic assimilation system for the prediction.The upstream and downstream boundary conditions of the future of the river also need to be specified correctly.The hydrodynamic model developed here only simulates the flow routing without considering other forcing data.To truly improve the accuracy of flood prediction for longer lead times in an actual river channel, a complete flood forecasting system has to be established.In addition to the model presented in this paper, two other basic elements are required: (i) a rainfall forecasting model; and (ii) a rainfall-runoff forecasting model [2].Further coupling of these models requires future research.In addition, the hydrodynamic model developed in this study needs to be further improved against actual river conditions, e.g., considering over-bank transport, curved circulation, and dam scheduling.

Conclusions
This study verified and evaluated the accuracy of an assimilation model system for real-time flood prediction at micro-model scale in the laboratory.A PF-based assimilation model was coupled with a one-dimensional hydrodynamic model to simulate the water level under different boundary conditions.Real-time water level observations at observation points along the channel were assimilated into the modeling system to achieve probabilistic forecasting.
We used the assimilation model system to predict the flood process against six different experiments.All results indicate that the RMSEs and MAREs of the water level predicted by the assimilation model were much lower than the results of the hydrodynamic model without PF.These results prove the precision and accuracy of water level forecasting with the assimilation model.With increasing lead time, the RMSEs of the predicted water level increased significantly.However, the RMSEs of the results predicted by the assimilation model were much smaller than those predicted by the hydrodynamic model without PF, indicating that the assimilation model can enlarge the flooding lead time.Of course, further coupling of hydrological and weather forecasting models will increase the applicability of the assimilation model.The application of this PF-based assimilation model to flood forecasting is of great significance toward improving the accuracy of real flood forecasting in the LYR.It enables more accurate decisions for future floods without delay, thus reducing property losses on both sides of the Yellow River.

Figure 1 .
Figure 1.Map of the Lower Yellow River (LYR): (a) the location of the LYR; and (b) plan view of the micro-model in the laboratory.

Figure 2 .
Figure 2. Flow chart of the hydrodynamic assimilation system.

Figure 3 .
Figure 3. Relationship between the number of particles and the water level RMSE.

6 Figure 4 .
Figure 4. Comparison between observed and predicted water level processes.The dots represent the observed values.The black line represents the mean of the particles simulated by assimilation model.The dotted line represents the water level calculated by the hydrodynamic model.The gray shaded area represents the 90% confidence interval.

6 Figure 5 .
Figure 5. Histograms of prior (gray) and posterior (colorless) distributions of water levels in different experiments.The observed value is marked with a black triangle.The black dotted line indicates the mean value of the prior distribution of particles and the black solid line indicates the mean value of the posterior distribution of particles.

6 Figure 6 .
Figure 6.Dynamic variation process of roughness.The straight line indicates the fixed value used in the hydrodynamic model.

Figure 7 .
Figure 7. RMSEs of water level prediction for different lead times in different experiments.

Table 1 .
Correction performance for different experiments.