A Statistical Vertically Mixed Runo ﬀ Model for Regions Featured by Complex Runo ﬀ Generation Process

: Hydrological models for regions characterized by complex runo ﬀ generation process been su ﬀ er from a great weakness. A delicate hydrological balance triggered by prolonged wet or dry underlying condition and variable extreme rainfall makes the rainfall-runo ﬀ process di ﬃ cult to simulate with traditional models. To this end, this study develops a novel vertically mixed model for complex runo ﬀ estimation that considers both the runo ﬀ generation in excess of inﬁltration at soil surface and that on excess of storage capacity at subsurface. Di ﬀ erent from traditional models, the model is ﬁrst coupled through a statistical approach proposed in this study, which considers the spatial heterogeneity of water transport and runo ﬀ generation. The model has the advantage of distributed model to describe spatial heterogeneity and the merits of lumped conceptual model to conveniently and accurately forecast ﬂood. The model is tested through comparison with other four models in three catchments in China. The Nash–Sutcli ﬀ e e ﬃ ciency coe ﬃ cient and the ratio of qualiﬁed results increase obviously. Results show that the model performs well in simulating various ﬂoods, providing a beneﬁcial means to simulate ﬂoods in regions with complex runo ﬀ generation process.


Introduction
Rainfall-runoff models are basic and important tools for flood forecasting [1][2][3]. Reliable and accurate flood forecasts by rainfall-runoff models are important for efficient reservoir operation, river management, flood control and warning [4][5][6][7][8]. In the early 20th century, basic runoff generation theories were creatively classified as runoff formation in excess of infiltration and on repletion of storage-based on the hypothesis that the unsaturated zone soil is homogeneous [9]. Subsequently, the discovery of interflow on the interface of adjacent soil layers with different soil properties and the saturated surface runoff has further improved the basic runoff theory [10].
In the past decades, a variety of hydrological models have been developed based on the runoff theory. For instance, the semidistributed Xinanjiang (XAJ) model developed by Zhao [11] is widely used in humid areas in China and other regions worldwide [12][13][14][15]. The main feature of XAJ model is the concept of saturation-excess, which means that runoff is not produced until the soil moisture content of the unsaturated zone reaches field capacity [3,11]. The semidistributed TOPMODEL (topography based hydrological model) was developed by using a variable source area concept that assumes

Methodology
In this part, the conceptual rainfall-runoff model is described. The model is based on conservation of mass. It is characterized with several lumped parameters and physical parameters. It can be used for event-based watershed flood simulation. The computational time step is one hour. The generalization of the runoff generation process is shown in Section 2.1. The calculation of runoff is shown in Sections 2. 2-2.4. Moreover, the measures of performance assessment for the model are shown in Section 2.5.
The XAJ [11], SBM [19], VMM [34] and VLRM [3] models are chosen as typical models of flood forecasting to compare with the Statistical Vertically mixed Hydrological Model (SVHM) model developed in this work. XAJ, SBM, VMM and VLRM are selected as representative of saturation-excess based models, infiltration-excess based models, mixed runoff models and partial saturation-excess based models, respectively. The VLRM model is developed using dual distribution curves for variable soil water storage capacity over basin. It is a model for semi-humid and semi-arid basins, the core of which is the partial saturation-excess concept. Figure 1a shows a generalization of the runoff generation process in a vertical element of unsaturated zone. Generally, the precipitation interception (mostly canopy interception) is the first step during a rainfall-runoff event. In this work, the precipitation interception is not explicitly considered. The canopy interception is implicitly considered through evapotranspiration. During a rainfall event, evapotranspiration (E) is first subtracted and the runoff generation is estimated by considering both the soil storage capacities and the infiltration capacities. The soil surface largely is the first point of contact by rainfall, herein the rainfall rate excessing the infiltration capacity of surface soil becomes surface runoff (RS) and the remaining rainfall keeps infiltrating downward (denoted as I). If the rainfall rate does not excess the infiltration capacity of surface soil, the total rainfall directly infiltrates into the soil. After the tension storage deficit of the unsaturated zone is satisfied (i.e., reaches the field capacity), subsurface runoff (R sub ) emerges, which includes interflow (RI) and groundwater runoff (RG). RS, RI and RG collectively constitute the total runoff.

The Generalization of the Runoff Generation Process
What is needed to be pointed out is that infiltration (I) is the input of the latter subsurface runoff process, instead of rainfall. Hereby, it can be inferred that the two runoff generation parts are closely interactive in terms of the vertical water transmission, i.e., the infiltration (I) from surface to subsurface.
Water 2020, 12, 2324 4 of 26 In addition, infiltration (I) is spatial heterogeneous. As it can be seen in Figure 1b, the infiltration-excess runoff generation process at surface and the saturation-excess runoff generation process at subsurface are actually connected by a spatial uneven infiltration amount (I j ,j = 1, 2, . . . , n, n is the total number of the vertical element), which is determined by rainfall (P) and infiltration capacity at a point of surface ( f ).
In a word, the whole runoff generation process is concluded as two parts: (1) surface runoff generation in excess of infiltration capacity of surface soil and (2) subsurface runoff generation on repletion of storage capacity of unsaturated zone.
Water 2019, 11, x FOR PEER REVIEW 4 of 27 interactive in terms of the vertical water transmission, i.e., the infiltration (I) from surface to subsurface. In addition, infiltration (I) is spatial heterogeneous. As it can be seen in Figure 1b, the infiltration-excess runoff generation process at surface and the saturation-excess runoff generation process at subsurface are actually connected by a spatial uneven infiltration amount ( j I , 1, 2,..., j n  , n is the total number of the vertical element), which is determined by rainfall ( P ) and infiltration capacity at a point of surface ( f ).
In a word, the whole runoff generation process is concluded as two parts: (1) surface runoff generation in excess of infiltration capacity of surface soil and (2) subsurface runoff generation on repletion of storage capacity of unsaturated zone.

The Estimation of Surface Runoff
The surface runoff depends on the relationship between rainfall intensity and infiltration capacity according to the Horton theory [9]. The rainfall rate excessing the infiltration capacity of surface soil becomes surface runoff. Therefore, the key to surface runoff calculation is to estimate the inhomogeneous infiltration capacity given the observed rainfall rate. To provide for a spatial distribution of infiltration Water 2020, 12, 2324 5 of 26 capacity throughout the basin, a new infiltration capacity curve ( Figure 2) is developed [34] in this study. The novelty of the spatial distribution of infiltration capacity lies in that the curve is separated into two parts by α bound , which can be used to calculate the spatial uneven infiltration (the details are shown in Section 2.3). In Figure 2, α represents the proportion of pervious area of the basin whose infiltration capacity is less than or equal to the value of the ordinate f . The infiltration capacity at a point ( f ), varies from zero to a maximum f mm according to the relationship: where BF is a parameter.

The Estimation of Surface Runoff
The surface runoff depends on the relationship between rainfall intensity and infiltration capacity according to the Horton theory [9]. The rainfall rate excessing the infiltration capacity of surface soil becomes surface runoff. Therefore, the key to surface runoff calculation is to estimate the inhomogeneous infiltration capacity given the observed rainfall rate. To provide for a spatial distribution of infiltration capacity throughout the basin, a new infiltration capacity curve ( Figure 2) is developed [34] in this study. The novelty of the spatial distribution of infiltration capacity lies in that the curve is separated into two parts by bound  , which can be used to calculate the spatial uneven infiltration (the details are shown in Section 2.3). In Figure 2,  represents the proportion of pervious area of the basin whose infiltration capacity is less than or equal to the value of the ordinate f . The infiltration capacity at a point ( f ), varies from zero to a maximum mm f according to the relationship: where is a parameter.
the areal mean infiltration capacity, FM , can be estimated by using an infiltration curve modified from the Green-Ampt infiltration curve [34], which is expressed as follows: The areal mean infiltration capacity, FM, constitutes an alternative value to the maximum value f mm . They are related through the parameter BF. From Equation (1), by integration, it is easy to show that the areal mean infiltration capacity, FM, can be estimated by using an infiltration curve modified from the Green-Ampt infiltration curve [34], which is expressed as follows: where FC is the stable infiltration rate (mm/day or mm/h), KF is osmotic coefficient, WM is the areal mean tension water storage capacity (mm) and W is the areal mean tension water storage (mm). By integration, surface runoff RS (the area to the left and above the curve in Figure 2) can be successfully calculated [34]: where PE represents net rainfall or precipitation excess (i.e., precipitation P minuses evapotranspiration E), FM represents the total area below the curve.

The Estimation of Infiltration (I)
The infiltration amount (I) is nonuniform over the basin owing to the spatial heterogeneity of infiltration capacity of surface soil. Therefore, the infiltration amount (I) is assumed to be a spatial random variable in this work.
As it can be seen in Figure 2, the infiltration amount (I) is different over the basin. In some areas, the infiltration at a point (I 1 ) depends on the infiltration capacity of that point when rainfall rate is larger than infiltration capacity. These points (or areas) are lumped to the left of α bound . On the other hand, the infiltration (I 2 ) is the net rainfall PE (i.e., precipitation P minuses evapotranspiration E) at some points with great perviousness. These points (or areas) are lumped to the right of α bound .
If PE < f mm , α bound < 1 if PE ≥ f mm , α bound = 1 where α bound is the demarcation point of area with PE exceeding f and area with f exceeding PE, To date, the spatial variable infiltration (I) at each point is successfully estimated, the statistical method used in which is originally developed in this work.

The Calculation of Subsurface Runoff
Subsurface runoff at a spatial point generates on repletion of storage capacity of unsaturated zone at that point. To provide for a nonuniform distribution of water storage capacity throughout the basin, a water storage capacity curve ( Figure 3) used in the XAJ model [11] is introduced in this study, which can be expressed as follows: where β represents the proportion of the pervious area of the basin whose tension water capacity is less than or equal to the value of the ordinate W . The water storage capacity varies from zero to a maximum WMM. B is a parameter. Equation (7) can also be written as: According to the concept of runoff formation on repletion of storage, the discriminant of runoff generation can be expressed as: then, the amount of runoff generation at a point (R sub ) ( Figure 1) can be described as: where j = 1, 2, . . . , n, n is the total number of the vertical element.
then, the amount of runoff generation at a point ( sub R ) (Figure 1) can be described as: where 1, 2,..., j n  , n is the total number of the vertical element. It is difficult to calculate the downward water transport (i.e., j I ) at a specified point of surface layer owing to that the infiltration capacity over the basin is described by a statistical distribution curve. It is the same for the storage capacity at a specified point ( ' j W  It is difficult to calculate the downward water transport (i.e., I j ) at a specified point of surface layer owing to that the infiltration capacity over the basin is described by a statistical distribution curve. It is the same for the storage capacity at a specified point (W j ). Hence, it seems complicated to calculate the subsurface runoff (i.e., R sub = I j − W j ) at a specified point thereby at the whole basin by using this kind of lumped models because of the lack of point-to-point values.
To consider the spatial variability of runoff generation, a statistical method is constructed in this work to estimate the spatial subsurface runoff. R sub can be deemed as a spatial random variable owing to that I and W are spatial random variables. According to the probability theory, it can be derived that the probability of event {I ≥ W } is the proportion of contributing area of subsurface runoff to the whole area and the probability P(R sub ≤ r sub ) for event {R sub ≤ r sub } is just the probability distribution F(r sub ). According to the statistical theory for distribution of random variable, the probability distribution of R sub can be expressed as: where f (i, w ) is the joint probability density for variables I and W , Ω is the integral domain. Generally, there is not enough evidence to deem that there is a direct relationship between I and W . Thus, it could be assumed that the two variables I and W are independent, namely f (i, w ) = f i (i) · f w (w ). Then, the above formula can be written as: the expectation E(R sub ) can be derived according to Equations (12) and (13), which is the subsurface runoff generation over the basin. Hence, the calculation of subsurface runoff R sub successfully converts to the issue of estimating the probabilistic distribution F(r sub ), and the expectation of which is just the subsurface runoff. Unfortunately, it is impracticable to obtain the integral of the above formula Equation (12). Thus, a more convenient and feasible numeric method instead to derive F(r sub ) and E(R sub ) is originally constructed in this work according to the following steps.
Step1: to conduct a random sampling (e.g., 1000 times) for infiltration I and water storage capacity W from the domain, respectively. In details, α 1 is uniformly sampled 1000 times and the corresponding infiltration I 1 (1000 values) are obtained according to the relationship (Equation (5) or (6)). Similarly, the 1000 random W can be obtained through 1000 samplings from β according to Equation (7).
Step2: to obtain the spatial variable R sub based on the concept of runoff formation on repletion of storage (i.e., R sub = I − W ). Negative values can be omitted and replaced by zero values. Subsequently, the spatial runoff generation can be represented by a nonnegative variable R sub .
Step3: to calculate the areal mean subsurface runoff generation R sub . What needs to be pointed out is that R sub is the runoff generated at the points whose infiltration capacities are less than net rainfall PE (i.e., those points located between 0 and α bound in Figure 2). Thereby, the areal mean runoff generation in these areas is R sub1 × α bound , denoted as R sub1 × α bound . The average water content W t can be calculated using the following equation: where the subscript '1' denotes the area mentioned above, i.e., I 1 is the areal mean infiltration in the area mentioned above. In other word, the above procedure is used to calculate runoff generation for those points whose infiltration capacity is less than net rainfall PE. While for the areas or points whose infiltration capacity is larger than net rainfall (PE ≤ f , I 2 = PE), the runoff generation can be estimated by the following formula [11]: where the subscript '2' denotes the area whose point infiltration capacity is larger than net rainfall PE ≤ f, I 2 = PE. Where W is the areal mean water content, WM is the areal mean tension water storage capacity, A is the ordinate corresponding to the initial (or the last time) soil water content W. WMM is the maximum point storage capacity. For the whole area of a basin, the average subsurface runoff generation R sub can be estimated using the following full probability formula: The R sub becomes an addition, ∆S, to the free water storage S, which contributes RI laterally to inflow and RG vertically to ground water, according to the following relations.
where KI and KG are parameters that need to be calibrated in model. Further details about the separations of runoff components can be referred to Zhao [11]. The RS, RI, and RG are both routed through linear reservoirs approach. If the runoff generation estimation is implemented at subbasin scale, the flow routing from the subbasin outlets to the total basin outlet can be achieved by applying the Muskingum method to successive subreaches. The Muskingum method is a channel flow algorithm based on Tank storage equation and water balance equation. It was widely used in practice because of its convenience and high precision. If the runoff is calculated at basin scale, the flood routing for the whole river can be succinctly implemented by using Muskingum at the river channel, which plays a Water 2020, 12, 2324 9 of 26 role of flood regulation for the whole basin. In the case study below, the runoff is calculated at basin scale owing to that the basin is not large and there is less data for model calibration at subbasin scale. Evapotranspiration (E) is related to potential evapotranspiration through a three-layer soil moisture model depending on four parameters KC (ratio of potential evapotranspiration to pan evaporation), UM (tension moisture capacity of upper layer), LM (tension moisture capacity of down layer) and C (coefficient of deep evapotranspiration). For more details about the three-layer evapotranspiration approach, refer to Zhao [11].

Measures of Performance Assessment
The relative error of flood peak (REP), the error of peak appearance time (ET), the relative error of total runoff depth (RER) and the Nash-Sutcliffe efficiency coefficient (NSE) are employed to collectively demonstrate the performance of the model. In addition, the ratio of qualified total runoff (RQR), ratio of qualified flood peak (RQP), ratio of qualified appearance time for flow peak (RQT) and ratio of qualified results (RQ) are used to assess the level or grade of simulation performance [36,37]. "Qualified" here means the results are within the permissible error, which refers to the note below Tables 1-3.
where N R , N P and N T are the number of qualified simulated total runoff, the number of qualified simulated flood peak and the number of qualified simulated appearance time of flood peak, respectively. Q s is the simulated discharge, Q o is the observed discharge, Q o is the mean value of observed discharge. M is the number of hours in a flood event (the time step in this study is hour). N is the number of flood events. RQ ≥ 85%, 70% ≤ RQ < 85% and 60% ≤ RQ < 70% can be classified as "A" grade, "B" grade and "C" grade forecast, respectively [36].

Case Study
In this study, the capability of the statistical vertically mixed hydrological model above (denoted as SVHM) to improve the accuracy of runoff estimation is tested in three typical semi-humid and semi-arid basins in China.

Study Areas and Data
To test the performance of model for flood forecasts in regions characterized by complex runoff process, the Zijingguan Basin, Xiuwu Basin and Chutoulang Basin ( Figure 4) are selected for streamflow simulation. Zijingguan Basin (1760 km 2 ) is a subbasin of the Daqinghe River Basin (43,060 km 2 ) located within the Haihe River Basin (HRB), China. The basin is in the upstream of Juma River where a series of floods happened during history periods, threatening the downstream Beijing City, the capital of China. The elevation of Zijingguan Basin ranges from 511 to 2157 m above sea level. The annual mean precipitation is approximately 543 mm and annual mean temperature is 9.6 • C [38]. The multiyear averaged concentration time of the catchment is 7 h. The annual streamflow of Zijingguan Basin decreases significantly because of the impact of human activity (e.g., over extraction of groundwater) and climate change after 1990s [39], leading to great changes in runoff generation conditions and flow regimes [30]. In some dry period, the river even runs out of discharge. There is no reservoir in the basin. The land use changes with almost 26% reduction of grassland, 17% reduction of urban and rural construction land, 90% reduction of water area and 24% increase of forest. Xiuwu Basin (1287 km 2 ) is located at the southern part of the HRB, the elevation of which ranges from 76 to 1366 m ASL The annual mean precipitation is approximately 608 mm and annual mean air temperature is 14 • C. More than 50% of the total annual precipitation happens in July and August and many flood events occur over this basin. The runoff coefficients in Xiuwu Basin range from 0.05 to 0.23. Chutoulang Basin (3009 km 2 ) is a subbasin of the Laohahe River Basin (18,599 km 2 ) located within the Liao River Basin (one of the top 7 large river basins of China). The elevation of the basin ranges from 675 to 2054 m ASL The annual mean precipitation is approximately 370 mm. The runoff coefficients range from 0.01 to 0.30, most of which range from 0.05 to 0.15, which indicates the existence of complex runoff and it is drier than the Zijingguan and Xiuwu Basins.
Data are compiled from some historical floods including the hourly precipitation, hourly flow data and daily observed evapotranspiration via evaporation pan, the latter of which is evenly interpolated into hourly data.

Calibration and Validation
The hourly data during flood season are used to calibrate and validate the model. The data were compiled from the local Bureau of Hydrology and Water Resources, China. The data were addressed through four steps, including textual research, establishment of the relationship between two hydrological elements, tabulation and rationality check.
For Zijingguan Basin, totally 13 typical flood events are used, of which 8 for calibration and the rest for validation. For Xiuwu Basin, 30 flood events are used, of which 20 for calibration and 10 for validation. For Chutoulang Basin, 9 flood events are used for calibration and 5 for validation. The shuffled complex evolution method (SCE-UA) [40][41][42] technique is used to calibrate the state variables ( WU , WL and WD ) and other model parameters in each flood event. The SCE-UA does not rely on gradients and hence is largely insensitive to microscale roughness, while the stochastic nature helps to avoid entrapment in local optima [40]. WU , WL and WD are the soil tension water in upper layer, down layer and deep layer (Figure 1), respectively. In real-time flood forecasting, these

Calibration and Validation
The hourly data during flood season are used to calibrate and validate the model. The data were compiled from the local Bureau of Hydrology and Water Resources, China. The data were addressed through four steps, including textual research, establishment of the relationship between two hydrological elements, tabulation and rationality check.
For Zijingguan Basin, totally 13 typical flood events are used, of which 8 for calibration and the rest for validation. For Xiuwu Basin, 30 flood events are used, of which 20 for calibration and 10 for validation. For Chutoulang Basin, 9 flood events are used for calibration and 5 for validation. The shuffled complex evolution method (SCE-UA) [40][41][42] technique is used to calibrate the state variables (WU, WL and WD) and other model parameters in each flood event. The SCE-UA does not rely on gradients and hence is largely insensitive to microscale roughness, while the stochastic nature helps to avoid entrapment in local optima [40]. WU, WL and WD are the soil tension water in upper layer, down layer and deep layer (Figure 1), respectively. In real-time flood forecasting, these initial conditions are commonly assigned values based on the continuous running results. In this work, for flood events simulation, they are preset and adjusted.
The parameters are divided into four categories (evapotranspiration parameters (KC, UM, LM, C), runoff generation parameters (WM, KF, BF, B, FC), water source partition parameters (KI, KG) and flow concentration parameters (CI, CS, CG, KE and XE)) according to the physical mechanism. The evapotranspiration parameters (KC, UM, LM, C) intervals are assigned based on experience. They are not sensitive and significant for the event flood simulation. The intervals of runoff generation parameters (WM, KF, BF, B) are assigned according to respective physical meanings. The hydrograph and flood peak are not sensitive to these parameters. The rest parameters play the key role in controlling the flood shape (total volume and flood peak). These parameters including FC, KI, KG, CI, CS, CG, KE and XE are emphasized during the calibration. The runoff volume can be adjusted by the value of KG. A larger (less) KG may lead to a less (larger) runoff volume. KI+KG indicates the flow speed of free water, which is generally stable at a value of 0.7 or 0.8. Hence, KI+KG = 0.8 is assigned and only one of them is calibrated. It can help reduce uncertainty to some degree because that KI and KG are related to each other. CS, CI and CG collectively control the shape of hydrograph. A larger value of CS leads to a lower flood peak and vice versa. CI and CG influence the falling limb. There is no need to adjust the intervals of KE and XE during calibration. FC, KE and XE are calibrated given a reasonable value range according to their physical meanings. Narrow value ranges of the sensitive parameters (KI, KG, CI, CS, CG, KE and XE) are assigned according to corresponding physical meanings to reduce the uncertainties and the probability of equifinality issue. The model is calibrated according to the above rules in terms of human-computer interactions.
The optimized parameters are shown in Tables A1-A3. The objective function of the optimization algorithm is a combination of peak flow error and hydrograph error: where QE denotes the absolute value of the average error of flood peaks and NSE averaged Nash-Sutcliffe efficiency coefficient. For each flood event, NSE and the relative error of flood peak are calculated, which are then transformed to averaged values. The averaged values (NSE and QE) are used for calculating "mu". A lower value of mu indicates a better simulation. The formula above is beneficial to make the model generate more satisfying peak flow and hydrograph. Generally, the accurate forecast of the rising limb of the flood wave is of paramount importance in real-time flood forecasting. On the other hand, it is generally more difficult to directly quantify the rising limb when calibrating a model. Some factors that influence the rising limb are expected to be used to obtain a qualified rising limb. The factors include the start point of rising, peak flow, the appearance time of peak and some others. In practice, all these factors could not be taken into account together during calibration because they are often competitive. Hence, key factors should be identified and included to build the objective function of optimization algorithm. In this study, QE and NSE are used in terms of that lower QE and higher NSE benefit to generating satisfied peak flow and hydrograph, which do benefit to getting a better rising limb. In addition, QE and NSE are usually quantifiable and deemed as more important factors than the others in flood forecasts. QE is the most important indicator for disaster prevention and mitigation, thus it is weighted twofold Equation (26).

Results
The simulation results for calibration and validation periods are both shown in Tables 1-3. The performance of simulation is collectively tested by using the relative error of total runoff depth (RER), the relative error of flood peak (REP), the error of peak appearance time (ET) and NSE.
For Zijingguan Basin, as it can be seen in Table 1, all the flood peaks are simulated well by the model (SVHM) in terms of that REP are found within the permissible error (i.e., qualified) in flood forecast [36]. The permissible error are described in the note under Table 1. The ratio of qualified flood peak and the ratio of qualified peak appearance time are both 100%. The total runoff depth (flood volume) is not simulated as well as peak flow in terms of that the ratio of qualified runoff depth (RQR) is 84.62%. It is probably resulted from that the objective function of optimization algorithm (Equation (26)) contributes less to convergence of flood volume error. Note: For total runoff: the permissible error (pe) is ± 20% of the observation. If pe is larger than 20 mm, pe is set to 20 mm. If pe is less than 3 mm, pe is set to 3 mm. For flood peak, pe is ± 20% of the observation. For appearance time of flow peak, pe is 30% of the period from precipitation peak to flood peak. If pe is less than 3 h, pe is set to 3 h. If pe is less than 1 h, pe is set to 1 h [36]. The simulation that exceeds pe is denoted as italic and bold. The flood events number denotes the occurrence time of flood (e.g., "710814" denotes 14 August 1971, "120721" denotes 21 July 2012).
For Xiuwu Basin, almost all the flood peaks (except one event "760817") and the total runoff depths (flood volume) (except two events "670710" and "000714") are simulated well by the model (SVHM) in terms of that most of the RER and REP are found within the permissible error (i.e., qualified) in flood forecast [36] ( Table 2). The ratio of qualified flood peak, the ratio of qualified total runoff depth and the ratio of qualified peak appearance time (RQT) are 96.67%, 93.33% and 43.33%, respectively.
For Chutoulang Basin, the flood peaks (except two events "710723" and "070716") are simulated well by the model (SVHM) ( Table 3). The total runoff depth is not simulated as well as peak flow. The ratio of qualified flood peak, the ratio of qualified total runoff depth and the ratio of qualified peak appearance time (RQT) are 85.71%, 50.00% and 92.86%, respectively.
According to the standard for hydrological information and hydrological forecasting, China [36], the mean ratio of qualified forecast results (RQ) in Zijingguan Basin is 94.87% (i.e., the mean value of 100% for peak flow, 84.62% for total runoff and 100% for peak appearance time), reaching 'A' grade forecast [36] (see the note under Equation (25)). RQ in Xiuwu Basin is 77.78% and that in Chutoulang is 76.19%, both reaching 'B' grade forecast (see the note under Equation (25)).
On the other hand, the NSE values for Zijingguan Basin shown in Table 1 indicate that not all the hydrographs are simulated well. The mean value of NSE for all flood events is 0.71. It is probably owing to that the objective function used for parameter optimization is more beneficial to promote the simulation of peak flow than to promote hydrograph to some degree. The mean value of NSE for all floods in Xiuwu Basin is 0.81 (Table 2) and that in Chutoulang is 0.63 (Table 3).  In addition, the results of simulations by XAJ [11], SBM [19], VMM [34] and VLRM [3] models are used to compare with the SVHM model, which are shown in Figures 5-7. The performance measures for simulation of all flood events are quantified and shown in Figures 8-13. the simulation of peak flow than to promote hydrograph to some degree. The mean value of NSE for all floods in Xiuwu Basin is 0.81 (Table 2) and that in Chutoulang is 0.63 (Table 3).
In addition, the results of simulations by XAJ [11], SBM [19], VMM [34] and VLRM [3] models are used to compare with the SVHM model, which are shown in Figures 5-7. The performance measures for simulation of all flood events are quantified and shown in Figures 8-13.     For Zijingguan Basin (Figures 8 and 9), it can be seen that the simulation by XAJ has the lowest RQP (69.23%), and the simulation by SBM has the lowest RQR (46.15%). It seems that the VMM combines the merit of XAJ for flood volume simulation and the advantage of SBM for flood peak simulation, leading to more qualified results (RQ = 76.92%) than those by XAJ (RQ = 71.79%) and SBM (RQ = 74.36%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. Compared with VMM, the ratio of qualified results (RQ = 94.87%) increases 17.95% in Zijingguan Basin by using SVHM, which is probably owing to the tight coupling of the runoff processes at surface and subsurface through the advanced statistical technique. The RQ by VLRM is 80.00%, which is higher than that by VMM, XAJ and SBM, but lower than that by SVHM (RQ = 94.87%). On the other hand, the NSE by SVHM (mean value of NSE is 0.71) is higher than those by XAJ (mean value is 0.60), SBM (mean value is 0.46) and VMM (mean value is 0.51) ( Figure 9) and a little lower than that by VLRM (mean value is 0.75). Moreover, the simulated errors by SVHM (averaged RER, REP and ET) are lower than those by XAJ, SBM and VMM, except that the averaged REP (5.38%) by SVHM is a litter larger than that by SBM (5.33%). The averaged REP and ET by SVHM are lower than that by VLRM and the averaged RER by SVHM is higher than that by VLRM.  For Zijingguan Basin (Figures 8 and 9), it can be seen that the simulation by XAJ has the lowest RQP (69.23%), and the simulation by SBM has the lowest RQR (46.15%). It seems that the VMM combines the merit of XAJ for flood volume simulation and the advantage of SBM for flood peak simulation, leading to more qualified results (RQ = 76.92%) than those by XAJ (RQ = 71.79%) and SBM (RQ = 74.36%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. Compared with VMM, the ratio of qualified results (RQ = 94.87%) increases 17.95% in Zijingguan Basin by using SVHM, which is probably owing to the tight coupling of the runoff processes at surface and subsurface through the advanced statistical technique. The RQ by VLRM is 80.00%, which is higher than that by VMM, XAJ and SBM, but lower than that by SVHM (RQ = 94.87%). On the other hand, the NSE by SVHM (mean value of NSE is 0.71) is higher than those by XAJ (mean value is 0.60), SBM (mean value is 0.46) and VMM (mean value is 0.51) ( Figure 9) and a little lower than that by VLRM (mean value is 0.75). Moreover, the simulated errors by SVHM (averaged RER, REP and ET) are lower than those by XAJ, SBM and VMM, except that the averaged REP (5.38%) by SVHM is a litter larger than that by SBM (5.33%). The averaged REP and ET by SVHM are lower than that by VLRM and the averaged RER by SVHM is higher than that by VLRM. For Zijingguan Basin (Figures 8 and 9), it can be seen that the simulation by XAJ has the lowest RQP (69.23%), and the simulation by SBM has the lowest RQR (46.15%). It seems that the VMM combines the merit of XAJ for flood volume simulation and the advantage of SBM for flood peak simulation, leading to more qualified results (RQ = 76.92%) than those by XAJ (RQ = 71.79%) and SBM (RQ = 74.36%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. Compared with VMM, the ratio of qualified results (RQ = 94.87%) increases 17.95% in Zijingguan Basin by using SVHM, which is probably owing to the tight coupling of the runoff processes at surface and subsurface through the advanced statistical technique. The RQ by VLRM is 80.00%, which is higher than that by VMM, XAJ and SBM, but lower than that by SVHM (RQ = 94.87%). On the other hand, the NSE by SVHM (mean value of NSE is 0.71) is higher than those by XAJ (mean value is 0.60), SBM (mean value is 0.46) and VMM (mean value is 0.51) ( Figure 9) and a little lower than that by VLRM (mean value is 0.75). Moreover, the simulated errors by SVHM (averaged RER, REP and ET) are lower than those by XAJ, SBM and VMM, except that the averaged REP (5.38%) by SVHM is a litter larger than that by SBM (5.33%). The averaged REP and ET by SVHM are lower than that by VLRM and the averaged RER by SVHM is higher than that by VLRM.  For Xiuwu Basin (Figures 10 and 11), it can be seen that the simulation by SBM has the lowest RQR (50.00%) and RQ (52.22%) among the five models. VMM has more qualified results (RQ = 68.89%) than those by XAJ (RQ = 67.78%) and SBM (RQ = 52.22%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. RQ by SVHM (77.78%) increases 12.90% compared with that by VMM (68.89%). SVHM has more qualified results (RQ = 77.78%) than VLRM (RQ = 75.56%) as well. On the other hand, SBM model has the lowest NSE and highest simulated errors. VMM performs better than XAJ in terms of NSE and simulated errors. Compared with VMM, the simulated errors by SVHM decreases and NSE (0.81) increases 12.50%. VLRM has the same NSE with SVHM, lower RER, higher REP and higher ET than SVHM.  For Xiuwu Basin (Figures 10 and 11), it can be seen that the simulation by SBM has the lowest RQR (50.00%) and RQ (52.22%) among the five models. VMM has more qualified results (RQ = 68.89%) than those by XAJ (RQ = 67.78%) and SBM (RQ = 52.22%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. RQ by SVHM (77.78%) increases 12.90% compared with that by VMM (68.89%). SVHM has more qualified results (RQ = 77.78%) than VLRM (RQ = 75.56%) as well. On the other hand, SBM model has the lowest NSE and highest simulated errors. VMM performs better than XAJ in terms of NSE and simulated errors. Compared with VMM, the simulated errors by SVHM decreases and NSE (0.81) increases 12.50%. VLRM has the same NSE with SVHM, lower RER, higher REP and higher ET than SVHM. For Xiuwu Basin (Figures 10 and 11), it can be seen that the simulation by SBM has the lowest RQR (50.00%) and RQ (52.22%) among the five models. VMM has more qualified results (RQ = 68.89%) than those by XAJ (RQ = 67.78%) and SBM (RQ = 52.22%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. RQ by SVHM (77.78%) increases 12.90% compared with that by VMM (68.89%). SVHM has more qualified results (RQ = 77.78%) than VLRM (RQ = 75.56%) as well. On the other hand, SBM model has the lowest NSE and highest simulated errors. VMM performs better than XAJ in terms of NSE and simulated errors. Compared with VMM, the simulated errors by SVHM decreases and NSE (0.81) increases 12.50%. VLRM has the same NSE with SVHM, lower RER, higher REP and higher ET than SVHM.   flood peak and ratio of qualified appearance time for flood peak, respectively. RQ is the mean of RQR, RQP and RQT. RQ can be classified by A grade: 85-100%, B grade: 70-85% and C grade: 60-70% [36]. Figure 11. Comparisons of performance measures for different models in simulating all flood events in Xiuwu Basin (Note: REP denotes the relative error of flood peak, RER denotes the relative error of total runoff depth, ET denotes the error of peak appearance time and NSE denotes the Nash-Sutcliffe efficiency coefficient. REP and RER range from 0 to 1, NSE ranges from −0.5 to 1 and ET ranges from 0 to 3).
For Chutoulang Basin (Figures 12 and 13), it can be seen that the simulation by XAJ has the lowest RQP (0.00%) and RQ (38.10%) among the five models. VMM has more qualified results (RQ = 59.52%) than those by XAJ (RQ = 38.10%) and SBM (RQ = 57.14%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. RQ by SVHM (76.19%) increases 28.00% compared with that by VMM (59.52%). SVHM has more qualified results (RQ = 76.19%) than VLRM (RQ = 50.00%) as well. On the other hand, XAJ model has the lowest NSE and highest simulated errors. VMM performs better than XAJ in terms of NSE and all simulated errors and also better than SBM in terms of NSE and RER and ET. Compared with VMM, the simulated error of flood peak (REP) by SVHM decreases obviously (i.e., from 25.46% to 7.48%) and NSE by SVHM increases 16.67% (i.e., from 0.54 to 0.63). VLRM has the same NSE with VMM, slightly higher simulated errors than VMM. SVHM outperforms VLRM in all the simulated errors and NSE. Figure 11. Comparisons of performance measures for different models in simulating all flood events in Xiuwu Basin (Note: REP denotes the relative error of flood peak, RER denotes the relative error of total runoff depth, ET denotes the error of peak appearance time and NSE denotes the Nash-Sutcliffe efficiency coefficient. REP and RER range from 0 to 1, NSE ranges from −0.5 to 1 and ET ranges from 0 to 3).
For Chutoulang Basin (Figures 12 and 13), it can be seen that the simulation by XAJ has the lowest RQP (0.00%) and RQ (38.10%) among the five models. VMM has more qualified results (RQ = 59.52%) than those by XAJ (RQ = 38.10%) and SBM (RQ = 57.14%). SVHM is more advanced than VMM in terms of RQR, RQP and RQT. RQ by SVHM (76.19%) increases 28.00% compared with that by VMM (59.52%). SVHM has more qualified results (RQ = 76.19%) than VLRM (RQ = 50.00%) as well. On the other hand, XAJ model has the lowest NSE and highest simulated errors. VMM performs better than XAJ in terms of NSE and all simulated errors and also better than SBM in terms of NSE and RER and ET. Compared with VMM, the simulated error of flood peak (REP) by SVHM decreases obviously (i.e., from 25.46% to 7.48%) and NSE by SVHM increases 16.67% (i.e., from 0.54 to 0.63). VLRM has the same NSE with VMM, slightly higher simulated errors than VMM. SVHM outperforms VLRM in all the simulated errors and NSE.   Comparisons of performance measures for different models in simulating all flood events in Chutoulang Basin (Note: RQR, RQP and RQT denote the ratio of qualified total runoff, ratio of qualified flood peak and ratio of qualified appearance time for flood peak, respectively. RQ is the mean of RQR, RQP and RQT. RQ can be classified by A grade: 85-100%, B grade: 70-85% and C grade: 60-70% [36]. Figure 13. Comparisons of performance measures for different models in simulating all flood events in Chutoulang Basin (Note: REP denotes the relative error of flood peak, RER denotes the relative error of total runoff depth, ET denotes the error of peak appearance time and NSE denotes the Nash-Sutcliffe efficiency coefficient. REP, RER and NSE range from 0 to 1 and ET ranges from 0 to 2).

Discussions
It is known that rainfall in semi-humid and semi-arid regions tends to be more variable in both space and time than in humid regions [27], leading to a more complex runoff generation mechanism in this region. Rainfall exceeding infiltration capacity and exceeding storage capacity of soil may appear alternately or synchronous over time according to the comparison between rainfall (intensity) and soil hydraulic properties (infiltration and storage capacity). A prolonged wet or dry condition owing to climate change and a series of prominent Anthropocene drivers (e.g., groundwater Figure 13. Comparisons of performance measures for different models in simulating all flood events in Chutoulang Basin (Note: REP denotes the relative error of flood peak, RER denotes the relative error of total runoff depth, ET denotes the error of peak appearance time and NSE denotes the Nash-Sutcliffe efficiency coefficient. REP, RER and NSE range from 0 to 1 and ET ranges from 0 to 2).

Discussions
It is known that rainfall in semi-humid and semi-arid regions tends to be more variable in both space and time than in humid regions [27], leading to a more complex runoff generation mechanism in this region. Rainfall exceeding infiltration capacity and exceeding storage capacity of soil may appear alternately or synchronous over time according to the comparison between rainfall (intensity) and soil hydraulic properties (infiltration and storage capacity). A prolonged wet or dry condition owing to climate change and a series of prominent Anthropocene drivers (e.g., groundwater overdraft, land use/land cover change, etc.) will change the runoff generation process [43,44], making the currently available models suffer from a weakness in hydrologic estimation [25].
Actually, the peak flow in semi-humid and semi-arid regions consists of both the runoff exceeding infiltration of surface soil and the runoff exceeding storage. Table 4 lists some typical types of hydrological models based on different runoff theory. The XAJ, SCS [45,46] and TOPMODEL are constructed based on the concept of saturation-excess and they both well consider the partial runoff issue through considering the spatial heterogeneity of the underlying surface. The infiltration-excess surface runoff is not included in these models, leading to a great unsuitability of the models (e.g., the poor performance of XAJ model for flood peak and total runoff depth simulation shown in Figure 13). In contrast, the Shanbei model (SBM) is able to capture the infiltration-excess surface runoff caused by heavy precipitation with high intensity but neglects the subsurface runoff that often appears as well, leading to a poor performance for flood volume simulation (Figures 9, 11 and 13). The VLRM simulates rainfall-runoff process by using the spatial distribution of relative soil water storage capacity, omitting the infiltration-excess surface runoff [3], which is the reason of large simulated error for flood peak in the dry Chutoulang Basin (Figure 13). The SVHM developed in this work is different from VLRM. Compared with VLRM based on single runoff generation concept, SVHM is a mixed model including both the infiltration-excess and the saturation-excess runoff. SVHM is more similar to VMM in terms of that they share the similar vertical mixed structure. SVHM is developed based on VMM. What is different is that the coupling of the infiltration-excess and saturation-excess runoff in SVHM is more advanced as it considers the spatial nonuniformity. The novelty and originality lies in the vertical-mixed structure and the addressment of spatial inhomogeneous water transmission from surface to subsurface. Compared with VMM, the ratio of qualified results by SVHM increases 17.95% in Zijingguan Basin, 12.90% in Xiuwu Basin and 28.00% in Chutoulang Basin, respectively. It is probably owing to the tight coupling of the runoff processes at surface and subsurface through the advanced statistical technique. Moreover, SVHM performs almost equally to VLRM in Zijingguan and Xiuwu Basin and better in Chutoulang Basin. It may be due to that the VLRM does not has the infiltration-excess surface runoff, which is an important runoff component in Chutoulang Basin. It is drier than Zijingguan and Xiuwu Basin. Though the SVHM outperforms the other four models in Chutoulang Basin, the NSE in the basin is still not very satisfying. Nevertheless, the "B" grade forecast indicates that the model can be used in practice flood forecast. The effectiveness can be improved in practice forecasting with the help of real-time correction technique.
The SAC takes both the infiltration-excess and saturation-excess processes into account but does not well consider the spatial inhomogeneity of runoff. With respect to the vertical mixed model (VMM), it performs better theoretically in describing the partial runoff than SAC but does not couple the two runoff concepts. These two mixed models have disadvantages in capturing the complex runoff more or less. In contrast, the SVHM in this study is well coupled through a statistical approach. The model comprehensively incorporates spatial heterogeneity of three key elements in runoff process (i.e., infiltration capacity, water storage capacity and water transmission from surface to subsurface), it is more reasonable in reality than the traditional mixed models (e.g., SAC, VMM). Spatially uneven water movement from surface to subsurface is considered to address the spatial partial runoff issue, which is more advanced and reasonable than the spatial average value used in VMM. The model has the ability to capture the complex flow that may consist of different runoff components.
On the other hand, the physically based distributed models by using Richard equation together with a distributed grid modeling structure provides us a new way to simulate streamflow and its pathways and to simulate the transport of sediments or contaminants [47,48]. It also helps the prediction of the impact of land use and other changes [49,50]. However, as Beven (2002) pointed out, these physically based distributed models have their disadvantage that the descriptive equations for each process (such as the Richard equation for runoff) require, in all cases, certain simplifying assumptions and the suitability of these equations for basin scale is likely to be questionable [16].
In addition, there has also been a widespread use of the types of lumped conceptual model for discharge prediction owing to their relative better performance than distributed models and computation convenience. Relative to the physically based distributed model, the proposed model in this study can be deemed as a semidistributed model through a lumped conceptual way. The probabilistic coupling method provides a new and feasible way to conduct a tight coupling of the runoff generation processes in a lumped model. The well coupled mixed model combines the merits of distributed model to describe spatial heterogeneity and the advantages of lumped conceptual model to conveniently and accurately forecast flood. These are probably the main reasons for that the complex streamflow in the case studies are simulated well.

Conclusions
In this paper, a new framework for estimation of runoff generation both in excess of infiltration at soil surface and on repletion of water storage at subsurface is proposed to characterize the behaviors of complex runoff formation. The spatial nonuniformity of the soil hydraulic properties including the infiltration capacity and storage capacity over the catchment is reflected by the dual distribution curves of infiltration capacity and water storage capacity. Through deeming the infiltration amount and storage capacity as spatial variables, the infiltration-excess runoff at surface and the saturation-excess runoff at subsurface are successfully coupled through a probabilistic way. This new concept of runoff generation provides a beneficial and forthrightly feasible means to capture the delicate and flexible runoff process.
Based on the vertically mixed structure and probabilistic coupling way, a hydrological model is constructed. It is a new vertically mixed model that comprehensively incorporates spatial heterogeneity of three key elements (i.e., infiltration capacity, water storage capacity and water transmission from surface to subsurface) in runoff process. It is a semidistributed model combining the merit of distributed model to describe spatial heterogeneity and the advantages of lumped conceptual model to conveniently and accurately forecast flood. The model performance is tested in three basins, which are the most typical semi-humid and semi-arid basins in China. A couple of performance measures are used to collectively show the model efficiency. Results show that the model performs well in simulating floods, indicating a feasibility of the statistical vertically mixed runoff concept. The mean ratio of qualified forecast results (RQ) reaches 'A' grade forecast in Zijingguan Basin, 'B' grade forecast in Xiuwu Basin and Chutoulang Basin. The technique can be widely used to simulate and forecast floods in other semi-humid and semi-arid regions.
Nevertheless, there are still some improvements need to be made in the future. The types of probability distributions for infiltration capacity and water storage capacity need to be further investigated and established. Additionally, the reasonability of the sampling number in coupling the surface and subsurface processes needs to be further investigated in our future work. Moreover, the NSE in the regions with more complex runoff process (e.g., the dry Chutoulang Basin) is still not very satisfying, which calls more efforts in our future studies.
Author Contributions: Conceptualization, P.S. and T.Y.; methodology, P.L., P.S. and Z.L.; software, P.L., Z.L.; validation, T.Y., C.-Y.X.; formal analysis, Z.L. and X.W.; investigation, Z.L. and X.W.; resources, T.Y.; data curation, Z.L.; writing-original draft preparation, P.L.; writing-review and editing, P.L. and P.S.; visualization, X.W.; supervision, T.Y. and C.-Y.X.; project administration, P.S. and T.Y.; funding acquisition, P.S. and T.Y. All authors have read and agreed to the published version of the manuscript. Acknowledgments: Authors are thankful to Hohai University for giving space and funding to carry out the research work. All the authors are highly thankful to the reviewers for their fruitful comments to improve the quality of the study.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript or in the decision to publish the results.