Development of an Event-Based Water Quality Model for Sparsely Gauged Catchments

: This paper describes an event-based water quality model for sparsely gauged catchments. The model was cultivated in a robust way to cope with practical issues, such as limited available data and error propagation. A simpliﬁed model structure and fewer input parameters are the most appealing features of this model. All model components are coupled and controlled within an Excel Spreadsheet Macro as an operational tool. Herein, the geomorphological instantaneous unit hydrograph (GIUH), the simpliﬁed process erosion and sedimentation component, the loading function, and the river routing from different existing modeling systems are adopted and linked together. Furthermore, an add-on Monte Carlo simulation tool is provided to deliver an uncertainty analysis for calibration of the output obtained from the model results. The model was successfully applied to simulate nutrient dynamics for small catchment scales during ﬂood events in Vietnam. The success of the model application shows the ability of our model, which can adapt the model complexity to the data availability, i.e., the dominant processes in the system should be captured, whereas the minor processes may be neglected or treated in a less complex manner. methodology, H.Q.N., G.M., and V.T.N.; software, H.Q.N.; supervision, G.M. and V.T.N.; writing—review and editing, V.T.N.


Introduction
Water pollution is a critical water-related problem, since it threatens human beings, as well as ecosystems. The excess of nutrients in the aquatic environment is a typical example of water pollution caused by various anthropogenic factors from industrial and urban wastewater, and agricultural runoff. Eutrophication can be seen as a consequence of excessive nutrients, which can lead to serious environmental problems such as limited water supply, severe hypoxia, and polluted aquatic ecosystems. Through the food chain, water pollution can cause serious waterborne diseases to human [1][2][3].
Catchment water quality modeling was shown as a useful tool for water quality management. It is widely applied in several water quality management programs, such as the total maximum daily load (TMDL) estimation in the United States of America (USA) [4], or in the water framework directive in the European community [4]. The physical, chemical, and biological processes occurring in the catchment, as well as the anthropogenic factors, can be included in the model so that various management schemes can be tested and implemented. Therefore, it can be used to assist water managers in giving effective decisions for water management and protection.

Erosion Module
In this module, soil erosion and sediment transport from sources (upland) to the nearest water body are simulated at an hourly time step. Given the objective of model application for the datascarce areas, the erosion/sediment component was developed based on a simplified process (SP) model for sediment yield, which was basically adapted from Hartley [31]. The model was tested, and it proved to be suitable for modeling the erosion during (extreme) flood events at hourly time step [32,33]. The model attempts to minimize both data inputs and computational effort while maintaining a relatively high degree of similarity to both hydrologic and hydraulic processes [31]. The erosion model aims to compromise between simple empirical modeling (e.g., Universal Soil Loss Equation (USLE) [34]) and complex physically based approaches (e.g., KNIEROS) [31].
As shown in Figure 1, the sediment supply is basically the sum of sediments detached by runoff and rainfall. The runoff is the only source for the transport of sediment which is represented by the sediment transport capacity. The runoff is calculated from the hydrological component. After comparing between the sediment supply and sediment transport capacity, the smaller value is taken as the sediment yield obtained from the catchment. Results from this module are input into the river routing module.

Hydrology Module
The hydrological module was partly adopted from the previous works of the authors [20], in which the model was verified and applied for several flood events in Vietnam in 2005. In this module, the runoff after storm events is modeled as a total flow discharge of the catchment, then its results are later used to convert from nutrient loads into the concentration at the outlet. Moreover, the runoff generated by this hydrological module is used for both erosion and nutrient components. The geomorphology instantaneous unit hydrograph (GIUH) was first initiated by Rodríguez-Iturbe et al. [21], and restated by Gupta et al. [22]. The GIUH is an empirical event-based model approach that combines easily observable (surface) geomorphologic catchment characteristics with a simple regression analysis. This approach is particularly applicable in data-scarce areas, and the model parameterization relies on GIS-based digital elevation model (DEM) processing [23]. The concept of this approach was improved and successfully implemented as an event-based hydrological model to simulate the rainfall-runoff relationship, and to forecast floods [20,[24][25][26]. The GIUH is superior in comparison to the traditional IUH, because the GIUH takes into account nonlinear and linear characteristics and is able to deal with ungagged catchments [21]. Simulation results showed that this approach is a very promising tool to estimate event discharges, even for ungauged catchments [20,27].
Rodríguez-Iturbe and Valdez [21] defined simple empirical relationships for the time to peak (t p ) and the instantaneous hydrograph peak (q p ) of the GIUH dependent on geomorphologic parameters, as follows: where L Ω is the length of the highest order stream (km), v is the expected stream flow velocity (m/s), R B is the bifurcation ratio, R A is the area ratio, and R L is the length ratio (see Appendix Table A4).
In Equations (1) and (2), the geomorphologic parameters (R B , R A , and R L ) can be easily extracted from the topological characteristics of the catchment using the GIS tool (see, e.g., Integrated Land and Water Information System, ILWIS) [28]. The flow velocity has to be defined by physical reasoning, where an average velocity must be related to an average flow length (i.e., travel path) and the travel time. Travel time is calculated based on estimated flow velocity and geomorphologic parameters (i.e., drainage density).
The response function of the GIUH is characterized as an "impulse response function" [29,30]. The runoff can be obtained from where i(τ) is the effective rainfall intensity at time τ, and it is distributed uniformly over entire basin. The amount of input entering the system between τ and τ + dτ is i(τ) dτ. The effective (excess) rainfall is computed according to the Soil Conservation Service (SCS) runoff method [29,30]. Furthermore, u(t) is the GIUH, which is the impulse response function of the system, and it is determined as follows: where Prob() is the probability of the set given in parentheses, T B is the travel time to the catchment outlet, T Si is the travel time in a particular path, Prob(S i ) is the probability of a drop which will travel all possible paths S i to the outlet, and Prob(T Si ) is the probability density function (pdf) of the total path travel time T Si .

Erosion Module
In this module, soil erosion and sediment transport from sources (upland) to the nearest water body are simulated at an hourly time step. Given the objective of model application for the data-scarce areas, the erosion/sediment component was developed based on a simplified process (SP) model for sediment yield, which was basically adapted from Hartley [31]. The model was tested, and it proved to be suitable for modeling the erosion during (extreme) flood events at hourly time step [32,33]. The model attempts to minimize both data inputs and computational effort while maintaining a relatively high degree of similarity to both hydrologic and hydraulic processes [31]. The erosion model aims to compromise between simple empirical modeling (e.g., Universal Soil Loss Equation (USLE) [34]) and complex physically based approaches (e.g., KNIEROS) [31].
As shown in Figure 1, the sediment supply is basically the sum of sediments detached by runoff and rainfall. The runoff is the only source for the transport of sediment which is represented by the sediment transport capacity. The runoff is calculated from the hydrological component. After comparing between the sediment supply and sediment transport capacity, the smaller value is taken as the sediment yield obtained from the catchment. Results from this module are input into the river routing module.

Sediment Supply
The potential sediment supply caused by rainfall is calculated based on the rainfall energy rate which is used in the USLE method, whereas the sediment caused by runoff is based on the stream power equation.

Sediment Caused by Rainfall
Sediment supply by rainfall [31] is calculated as follows: where G rf is the rate of soil detachment due to rainfall (mass rate of detachment per unit area by rainfall) (kg/m 2 /h), E rf is the rate of rainfall energy (rainfall power per unit area) (J/m 2 h), GC and CF are the ground cover and canopy factors, respectively, which are dependent on the land cover (its values are taken from Table A1 in the Appendix), and K is the soil erodibility factor (kg/J), and its value is obtained from the USLE method [35] (see Table A2 in the Appendix).
The rate of rainfall energy is determined as where i is the rainfall intensity (mm/h).

Sediment Caused by Runoff
Sediment supply by runoff [31] is calculated as follows: where G ro is the rate of soil detachment due to the runoff (mass rate of detachment per unit area by runoff) (kg/m 2 /h), E ro is the runoff power rate of energy input to the soil by the flow (J/m 2 h), and K is the soil erodibility factor (kg/J) (its value is taken from Table A2 in the Appendix). The rate of runoff energy is determined as where K f is the overland flow resistance, which is dependent on ground cover density, and it is calculated based on Harley [31]; K f = 60 + 3140 × GC 1.65 , where GC is the ground cover (e.g., if GC = 0.1 for bare ground as shown in Table A1 of the Appendix, then K f = 130.3); γ is the water specific weight, γ = 9800 (kN/m 3 ); Q L is the unit flow discharge (m 2 /h), and S 0 is the element slope. The unit flow discharge is calculated according to the algorithm presented in Chow et al. [30] (p. 156).
where r is the runoff (m), L 0 = 1 2D is used as the overland flow length in this model, D is the drainage density (m/km), and θ is the slope angle.
Thus, the potential sediment supply (Y S ) within time duration (∆t) is determined by

Sediment Transport Capacity
The transport capacity is based on a sediment concentration ratio estimated with a shear stress relationship between the dominant flow shear on the soil and the critical stress based on the Shields criteria (Simons and Senturk (1976) [31]).
The sediment transport capacity (Y C ) is determined as follows: where ρ is the density of water (kg/m 3 ), c is the sediment concentration ratio, and r is the runoff (m). The sediment concentration [31] ratio is determined by where τ s is the shear stress on the soil (kg/m/h 2 ), and τ C is the critical shear stress (kg/m/h 2 ). The shear stress τ S varies both in space and time during a runoff event on the surface. For the sake of simplicity, it is proposed to define a single, mean or "dominant" shear stress for an entire runoff event from a given surface [31]; the mean τ S is equal to the "dominant" shear stress τ D as follows: where β is the power in the depth-discharge relationship parameter (5/3), and h L is the time average runoff flow depth (mm), which is calculated based on the kinematic approximation: where α = 8gS 0 0.0074K f ν 0.25 0.57 is the coefficient in the depth-discharge relationship, ν is the kinematic viscosity of water (ν = 10 −6 , m 2 /s), and g is the gravity coefficient (m/s 2 ).
The critical shear stress is obtained from following formula: where σ (kN/m 3 ) is the specific weight of sediment (Table A2 in the Appendix), D 50 (mm) is the median size of soil particles (Table A2 in the Appendix), and ϕ is Shields sediment function given by where R* is the Shields criterion Reynolds number, defined below.

Nutrient Loading Module
In this module, total nutrient loads for each land-use type due to rainfall and runoff forcing are calculated, and then lumped at the sub-catchment outlet. Then, the results from this module are input into the river routing module. A physically based simulation of nutrient transport at a catchment scale is complex, especially due to its diffuse sources. A detailed description of involved processes over the whole land surface is hardly possible. One common technique used to estimate the nutrient loading from land areas receiving water is the loading function. This method is applied in some popular models, from simple ones, such as Agricultural Non-Point Source (ANGPS) [36], the Cornell simulation model (CNS) [37,38], and generalized watershed loading functions (GWLF) [39], to complex ones, such as chemicals, runoff, and erosion from agricultural management systems (CREAMS), groundwater loading effects of agricultural management systems (GLEAMS) [40,41], SWAT [42], and ANSWERS [43,44]. Novotny and Olem [45] considered screening models to crudely estimate pollutant loads. However, they can be integrated with other interacting processes such as hydrology and erosion. In this study, an algorithm applied in the CNS loading model [37,38] was adopted, since it is applicable for areas with limited data. There are two main components of the model as follows: where LD kt and LS kt are the dissolved and solid-phase losses of a pollutant (kg/ha), Cd kt and Cs kt are the pollutant concentrations in dissolved and solid-phase forms (mg/L), Q kt is the runoff (cm), X kt is the soil loss (ton/ha), and TD k = 1 and TS k = 2.5 d k (−0.36) are transport factors which indicate the fractions of dissolved and solid-phase pollutants, which move from the edge of the source area to the catchment outlet. If dissolved pollutants are conserved, then all edge-of-field losses will reach the catchment outlet. The term d k is the down-slope distance from the center of the source area k to the nearest identifiable drainage channel (m). It is assumed that d k is equal to the overland flow length.
The most difficult parameters to quantify in the above equations are the pollutant concentrations in dissolved and solid-phase forms (Cd kt and Cs kt ). While the latter can be vaguely estimated based on soil sampling before the event or from literature, the first one is trickier since the collection of water samples from the surface during storm events may be dangerous and rarely feasible, if fixed measuring equipment such as a cable way is missing, and field service personnel cannot be activated in time. Therefore, in this study, the chemical availability for runoff (Cd kt ) was calculated based on the method in GLEAMS [40] as follows: where C (g/kg) is the chemical concentration in dissolved form, and its final value is calculated in Equation (22); C 1 (g/kg) is the chemical concentration or chemical mass/soil mass equal to Cs kt in Equation (19), F (cm) is the total storm infiltration (or rainfall minus runoff), POR is the porosity of surface layer, and ABST (cm) is an initial abstraction from rainfall. In the GLEAMS modeling approach, the abstraction is modeled continuously, and it is related to soil information which is not available for the study area. Thus, this formula was modified with the assumption that the total storm runoff infiltration minus the initial abstraction (the numerator in the exponential function) is equal to the continuing abstraction (F a ) (see Figure 2).
In doing so, Equation (20) becomes where P is the total rainfall, P e is the rainfall excess, I a is the initial abstraction, and F a is the continuing abstraction.
where P is the total rainfall, Pe is the rainfall excess, Ia is the initial abstraction, and Fa is the continuing abstraction.  The final dissolved concentration was calculated based on the GLEAMS approach [40], which is also applied in the ANSWERS model [44].
where C and Cd kt are indicated in Equations (20) and (18), respectively, β is an extraction coefficient, and k is the partition coefficient.
Nitrogen nitrate (N-NO 3 ) is not attached to the soil particles, and it is always in solution (e.g., in overland flow, infiltrating, percolating water). For nitrate, the partitioning coefficient was set to zero as nitrate exists only in the dissolved phase, and the extraction coefficient was set to 0.5. For nitrogen ammonium (N-NH 4 ) and phosphorus phosphate (P-PO 4 ), the partitioning coefficient was larger than 10 and, thus, the extraction coefficient was set to 0.1 [44].

River Routing and Point Sources
The transport of nutrient and sediment through a stream/river is routed in the flow routing module. In addition, this module should take into account the contribution of point sources (i.e., wastewater discharged from industrial areas). The river routing component was adopted from the diffuse-pollution-load (D-POL) model [46]. The model was applied to simulate dissolved nutrients in small Mediterranean catchments during flood events. The D-POL model is an integrated system including catchment pollutant loads driven by the rainfall and river routing. The D-POL model is based on a rainfall-load semi-distributed approach, and it was successfully calibrated against a 20-flood dataset, validated against a 10-flood dataset, and applied to many flash flood events. However, only the river routing was adopted since other model components of D-POL do not include particulate pollutants caused by erosion/sedimentation processes. The most important assumption in this river routing is the pollutant concentration to be conserved along the river reaches during storm events.
The river is discretized into reaches and each reach is discretized into a number of reservoirs depending on the length of each reach. Figure 3 (top) shows how pollutant sources, sinks, and changes Equations for every reach and reservoir are based on the mass conservation law, where the change of a storage is equal to the differences between the output and input, presented as follows: The changes of storage at the ith reach and first reservoir at time t, The changes of storage at the ith reach and jth reservoir at time t (1 < j < n(i)), The changes of storage at the ith reach and nth reservoir (last reservoir of reach i) at time t, where SR(i,j,t) is the stock of pollutants in the jth reservoir of the ith reach (kg), n(i) is the total number of reservoirs of the ith reach, L(i) is the length of the ith reach, PS(i,j,t) is the point sources input (kg/h) to the jth reservoir of the ith reach, OC(ci,t) is the pollutant input from the related sub-catchment ci (kg/h), OR(i,t)/or(i,t) is the pollutant output from the ith reach (kg/h), τ is the lag-time of the river reservoirs (h) (to be fitted by calibration), Lb is the basic length (e.g., 1000 m), and T is the transport parameter (L/h).  Equations for every reach and reservoir are based on the mass conservation law, where the change of a storage is equal to the differences between the output and input, presented as follows:

Model Uncertainty
The changes of storage at the ith reach and first reservoir at time t, The changes of storage at the ith reach and jth reservoir at time t (1 < j < n(i)), The changes of storage at the ith reach and nth reservoir (last reservoir of reach i) at time t, where SR(i, j, t) is the stock of pollutants in the jth reservoir of the ith reach (kg), n(i) is the total number of reservoirs of the ith reach, L(i) is the length of the ith reach, PS(i, j, t) is the point sources input (kg/h) to the jth reservoir of the ith reach, OC(c i , t) is the pollutant input from the related sub-catchment c i (kg/h), OR(i, t)/or(i, t) is the pollutant output from the ith reach (kg/h), τ is the lag-time of the river reservoirs (h) (to be fitted by calibration), Lb is the basic length (e.g., 1000 m), and T is the transport parameter (L/h).

Model Uncertainty
The Monte Carlo simulation method was utilized for an uncertainty analysis. The Monte Carlo simulation technique involves random sampling of model input and/or model parameters to produce hundreds or thousands of scenarios (i.e., outputs; see Reference [47]). The model results are stored and then evaluated statistically. In this manner, an uncertainty in model input and/or model parameters, presented as probability distributions, will propagate through simulation systems. In this study, the Monte Carlo simulation was combined with the Latin hypercube sampling method [47]. Every simulation was run for 1000 time steps within a Microsoft Excel file. The software was assisted by an add-in program called RiskAMP Monte Carlo Add-in for Excel ® (https://www.riskamp.com). For each parameter and input, the uniform distribution was applied.
A full water quality model including all modules merged in a spreadsheet is shown in Figure 4. The Monte Carlo simulation method was utilized for an uncertainty analysis. The Monte Carlo simulation technique involves random sampling of model input and/or model parameters to produce hundreds or thousands of scenarios (i.e., outputs; see Reference [47]). The model results are stored and then evaluated statistically. In this manner, an uncertainty in model input and/or model parameters, presented as probability distributions, will propagate through simulation systems. In this study, the Monte Carlo simulation was combined with the Latin hypercube sampling method [47]. Every simulation was run for 1000 time steps within a Microsoft Excel file. The software was assisted by an add-in program called RiskAMP Monte Carlo Add-in for Excel® (https://www.riskamp.com). For each parameter and input, the uniform distribution was applied.
A full water quality model including all modules merged in a spreadsheet is shown in Figure 4.

Study Area
The study area was the Tra Phi catchment located within the Dong Nai river basin ( Figure 5); this study area was considered as a pilot area because it has the typical hydrologic condition of Dong Nai river basin, a combination of various land uses, and industrial areas containing point sources. The Dong Nai river basin is the largest national river basin in southern Vietnam, and it is considered as a hot spot of water pollution in Vietnam [48] (p. 78).
The study area is affected by the tropical monsoon climate; therefore, it has two distinguished seasons: the dry season (from December to April) and the rainy season (from May to November). Extreme rainfall events up to 180 mm/h often occur in the area during the rainy season [49]. The catchment is characterized by a heterogeneous topography ranging from 2-30 m above sea level (a.s.l.) in low-land areas to about 1000 m a.s.l. at the water head. The catchment area is about 21 km 2 , including agriculture (66%), forest (11%), wetland (3%), semi-urban areas (13%), and urban areas (7%). Gray soil (Acrisol) is major soil type in the catchment [49]. The only identified point source of wastewater comes from a factory which produces tapioca starch from cassava (tapioca) roots. The company produces on average 30 tons of tapioca starch per day, which generates about 360-600 m 3 of wastewater per day [50]. An analysis of the water quality data of the main Tay Ninh river can be found in Nguyen and Meon [3].

Study Area
The study area was the Tra Phi catchment located within the Dong Nai river basin ( Figure 5); this study area was considered as a pilot area because it has the typical hydrologic condition of Dong Nai river basin, a combination of various land uses, and industrial areas containing point sources. The Dong Nai river basin is the largest national river basin in southern Vietnam, and it is considered as a hot spot of water pollution in Vietnam [48] (p. 78).
The study area is affected by the tropical monsoon climate; therefore, it has two distinguished seasons: the dry season (from December to April) and the rainy season (from May to November). Extreme rainfall events up to 180 mm/h often occur in the area during the rainy season [49]. The catchment is characterized by a heterogeneous topography ranging from 2-30 m above sea level (a.s.l.) in low-land areas to about 1000 m a.s.l. at the water head. The catchment area is about 21 km 2 , including agriculture (66%), forest (11%), wetland (3%), semi-urban areas (13%), and urban areas (7%). Gray soil (Acrisol) is major soil type in the catchment [49]. The only identified point source of wastewater comes from a factory which produces tapioca starch from cassava (tapioca) roots.
The company produces on average 30 tons of tapioca starch per day, which generates about 360-600 m 3 of wastewater per day [50]. An analysis of the water quality data of the main Tay Ninh river can be found in Nguyen and Meon [3].

Model Set-Up
The Tra Phi catchment is a third-order catchment with a drainage density of about 1250 m/km 2 . There is an irrigation canal system through the catchment transferring water from Dau Tieng reservoir for irrigation purposes ( Figure 6). This system is independent of the river network in terms of direct contribution to river flows. In this study, it was assumed that the contribution of irrigated water to the river network was negligible.

Model Set-Up
The Tra Phi catchment is a third-order catchment with a drainage density of about 1250 m/km 2 . There is an irrigation canal system through the catchment transferring water from Dau Tieng reservoir for irrigation purposes ( Figure 6). This system is independent of the river network in terms of direct contribution to river flows. In this study, it was assumed that the contribution of irrigated water to the river network was negligible.

Hydrologic Calculations
Model parameters of the GIUH include the Horton's ratios, hillslope, and stream flow velocity. Horton's statistics including R A , R L , and R B parameters were calculated using a new functionality in ILWIS called "Horton statistics" in module "statistical parameter extraction" (in "DEM-Hydro-processing") [51]. The procedure included three calculation processes as follows: (i) calculating the number of streams, the average stream length (km), and the average area of catchments (km 2 ) for all streams (represented by C1_N, C1_L, and C1_A in Figure 7); (ii) calculating the expected values of the number of streams, the average stream length (km), and the average area of catchments (km 2 ) by means of a least-squares fit (represented by C1_N_LSq, C1_L_LSq, and C1_A_LSq in C1_N_LSq, C1_L_LSq, and C1_A_LSq in Table 1); (iii) calculating the area ratio (R A ), length ratio (R L ), and bifurcation ratio (R B ) from the slopes of each fitted line connecting the expected values shown in Figure 7 (results shown in Table 1). Sustainability 2019, 11, x FOR PEER REVIEW 12 of 29

Hydrologic Calculations
Model parameters of the GIUH include the Horton's ratios, hillslope, and stream flow velocity. Horton's statistics including RA, RL, and RB parameters were calculated using a new functionality in ILWIS called "Horton statistics" in module "statistical parameter extraction" (in "DEM-Hydroprocessing") [51]. The procedure included three calculation processes as follows: (i) calculating the number of streams, the average stream length (km), and the average area of catchments (km 2 ) for all streams (represented by C1_N, C1_L, and C1_A in Figure 7); (ii) calculating the expected values of the number of streams, the average stream length (km), and the average area of catchments (km 2 ) by means of a least-squares fit (represented by C1_N_LSq, C1_L_LSq, and C1_A_LSq in C1_N_LSq, C1_L_LSq, and C1_A_LSq in Table 1); (iii) calculating the area ratio (RA), length ratio (RL), and bifurcation ratio (RB) from the slopes of each fitted line connecting the expected values shown in Figure 7 (results shown in Table 1).
The obtained values and the least-squares fit were visualized using a Horton plot to inspect the regularity of the extracted stream network and to serve as a quality control indicator for the entire stream network extraction process. It is expected that the number of streams shows a decrease for subsequent higher-order Strahler numbers and longer streams, and that the catchment areas show an increase for subsequent higher-order Strahler numbers [52].
From the Horton plot ( Figure 7) and Table 1, it can be assessed that the drainage network is well extracted, and that the Horton numbers are representative and fall within the expected range; thus. they could be used without any calibration. The extracted river network is presented in Figure 6.   The above section "GIUH development" showed how to derive the GIUH, whereas the calculation of the surface discharge for each event is described below.
The effective (excess) rainfall was computed according to the Soil Conservation Service (SCS) runoff method (see Reference [29] for the original, and Reference [30] (p. 147) for the latest one). To calculate the curve number (CN) value, the land-use and soil maps used were based on the SCS table.
The CN values of each map unit were aggregated for the whole catchment by means of GIS to get an average CN value (e.g., see Reference [53]). Then, the effective rainfall was calculated by the SCS curve number method.
where Pe is the effective rainfall or direct runoff expressed as a depth (mm), P is the total observed  The obtained values and the least-squares fit were visualized using a Horton plot to inspect the regularity of the extracted stream network and to serve as a quality control indicator for the entire stream network extraction process. It is expected that the number of streams shows a decrease for subsequent higher-order Strahler numbers and longer streams, and that the catchment areas show an increase for subsequent higher-order Strahler numbers [52].
From the Horton plot ( Figure 7) and Table 1, it can be assessed that the drainage network is well extracted, and that the Horton numbers are representative and fall within the expected range; thus, they could be used without any calibration. The extracted river network is presented in Figure 6.
The above section "GIUH development" showed how to derive the GIUH, whereas the calculation of the surface discharge for each event is described below.
The effective (excess) rainfall was computed according to the Soil Conservation Service (SCS) runoff method (see Reference [29] for the original, and Reference [30] (p. 147) for the latest one). To calculate the curve number (CN) value, the land-use and soil maps used were based on the SCS table. The CN values of each map unit were aggregated for the whole catchment by means of GIS to get an average CN value (e.g., see Reference [53]). Then, the effective rainfall was calculated by the SCS curve number method.
where P e is the effective rainfall or direct runoff expressed as a depth (mm), P is the total observed rainfall (mm), and S is the potential maximum retention (mm), calculated as After having the effective rainfall, Horton's statistics values were used to derive the GIUH, the surface runoff was calculated (see Reference [30] (p. 211)), and the discharge was determined by taking into account the catchment area. The catchment flow was composed of three components: (1) surface flow (overland flow), (2) subsurface runoff or interflow, and (3) base flow or groundwater flow. In the GIUH approach, the model can basically simulate the overland flow plus parts of the interflow which leave the unsaturated zone and arrive as surface flow at the river. Therefore, from the observed hydrograph, the contribution of base flow needs to be subtracted. In this study, the base flow was separated manually using the constant slope method [54].
The CN value, overland flow velocity (Vo), and the stream velocity (Vs) need to be calibrated against the observed discharges. Final values obtained from the calibration are shown in Table 10.

Erosion and Sediment Calculation
The development of the erosion/sediment yield module includes four main steps: (1) extracting runoff values from hydrologic calculation; (2) calculating soil detachment by runoff and soil detachment by rainfall; (3) calculating sediment transport capacity; and (4) calculating sediment yield. According to Section 2.3, several model parameters related to soil, catchment characteristics, rainfall, and runoff are needed. In particular, these are (1) soil data: median size of soil particle (D 50 ), specific weight of sediment (σ), and soil erodibility factor (D); (2) land-use data: ground cover factor (GC) and canopy factor (CF); and (3) physical characteristics of catchment: slope (S o ), catchment areas, and overland flow length (L).
Model parameters included constants and calculated or estimated parameters. The first ones were adopted from literature, as shown in Table 2, while the latter ones were calculated for each land use and soil type within a catchment, and then aggregated for each sub-catchment. The catchment parameters were calculated based on GIS/DEM processing, and they were assigned for each sub-catchment (Table 3). Since the catchment was mostly distributed by gray soil (Acrisol), the soil parameters were kept as unique values while other parameters were aggregated according to different land-use types ( Table 4). The predicted sediment in the model, therefore, should be clay. In this approach, the clay fraction (Fcl) in sediment calculated based on the approach presented in Hartley [31] was adapted as follows: where Fcl is the clay fraction in detached sediment, and Ocl is the clay fraction in matrix soil (0.04 for Acrisol).

Nutrient Loading Calculation
Data requirements for the nutrient loading model are rather simple. Hourly input data for the nutrient loading component (i.e., runoff (Q kt ) and soil loss (X kt )) were obtained from the hydrology and erosion modules. Other parameters, such as soil porosity and solid-phased concentration, were estimated from the literature or soil sampling. Model outputs were constituent loadings (hourly) at each outlet of the sub-catchments which were later used as inputs for the flow routing module.
The nutrient parameters were calculated for each land-use type, and they were aggregated for each sub-catchment; the results are shown in Table 5. Since the entire catchment was mostly distributed by gray soil (Acrisol), the soil porosity of 0.6 obtained from the field survey was applied for the whole catchment. The overland flow length was similar to that applied in the erosion/sedimentation module.
The calibrated values of Cd kt and Cs kt were smaller than their values observed by experiment (e.g., in Reference [17]). The reason could be due to the retention effects within the catchment, as well as because of the aggregation technique in GIS processing.

River Routing Calculation
The Tra Phi catchment is discretized into five sub-catchments corresponding to five reaches. In general, discretization is oriented to the hydrological similarity (soil, topology) within a sub-catchment. In addition, each reach is discretized into a number of virtual reservoirs depending on their lengths (e.g., about 1000 m for each reservoir; Figure 8 and Table 6) each sub-catchment; the results are shown in Table 5. Since the entire catchment was mostly distributed by gray soil (Acrisol), the soil porosity of 0.6 obtained from the field survey was applied for the whole catchment. The overland flow length was similar to that applied in the erosion/sedimentation module.
The calibrated values of Cdkt and Cskt were smaller than their values observed by experiment (e.g., in Reference [17]). The reason could be due to the retention effects within the catchment, as well as because of the aggregation technique in GIS processing. Table 5. Aggregated values of nutrient parameters.

River Routing Calculation
The Tra Phi catchment is discretized into five sub-catchments corresponding to five reaches. In general, discretization is oriented to the hydrological similarity (soil, topology) within a subcatchment. In addition, each reach is discretized into a number of virtual reservoirs depending on their lengths (e.g., about 1000 m for each reservoir; Figure 8 and Table 6)   The module simulates flood events on an hourly time step. Data input for this flow routing included diffuse sources, as well as the point source, reach length, average velocity, and initial conditions. The diffuse sources obtained from the sediment and nutrient module were sediment and nutrient loadings from the sub-catchment. The point source was wastewater loading from the identified company at a specified location for each simulated constituent (in Figure 8, it is located at Reach 2, Reservoir 5).
Constituent loadings can be read at every reservoir, along with the reach (as output). Concentrations can only be produced at catchment outlets where the flow discharge is available for converting the loading to concentration. Information on the Tra Phi river reaches can be obtained by GIS processing, and the data are shown in Tables 6 and 7.

Sensitivity Analysis
Model parameters were analyzed for the four main modules (i.e., hydrology, erosion, nutrient loading, and river routing). Table 8 shows the parameters and their variations applied in this analysis. The sensitivity was calculated based on perturbing the model parameters within the permitted physical ranges or observed extreme ranges, and observing the variation of model results (e.g., total flow discharge or total loadings for the whole event). An example resulting from the sensitivity analysis for nutrients is shown in Figure 9. It can be summarized in four points. Firstly, within the hydrological components, the CN value and the rainfall input are the most sensitive factors to model results. The CN value is more sensitive than the rainfall input to the variation of flow discharge volume. The velocity parameters (Vo, Vs) do not influence the total volume, but highly influence the shape of hydrograph as the velocity parameters are used to calculate the travel time of the rain drop in the catchment. Secondly, the velocity parameters V from the river routing module and (Vo, Vs) from hydrological module are the next most sensitive parameters. The effects from the velocity on the shape of hydrograph will significantly contribute to total loadings, since the loading is a function of flow discharge and concentration. Thirdly, the effects of soil parameters K, D 50 , and POR on the final results of sediment and nutrient loadings are very small in comparison to hydrological parameters, because the sediment and nutrient loads are strongly driven by hydrological processes (e.g., rainfall and runoff) in flood events. In addition, the nutrient loading parameters (i.e., Cd kt (N-NO 3 ), Cs kt (P-PO 4 , N-NH 4 )) can only show clear variations when increasing up to 500% (Figure 9). Fourthly, the changes of point sources also significantly affect the model results as they contain a large amount of pollution loaded directly into the rivers. The variation is about 35% when the perturbation is less than 50%. However, once increasing the perturbation to five times (500%), the variation is about 280%. This aspect is essential once dealing with illegal wastewater dumped directly from factories into the rivers as mentioned in Section 1. hydrological components, the CN value and the rainfall input are the most sensitive factors to model results. The CN value is more sensitive than the rainfall input to the variation of flow discharge volume. The velocity parameters (Vo, Vs) do not influence the total volume, but highly influence the shape of hydrograph as the velocity parameters are used to calculate the travel time of the rain drop in the catchment. Secondly, the velocity parameters V from the river routing module and (Vo, Vs) from hydrological module are the next most sensitive parameters. The effects from the velocity on the shape of hydrograph will significantly contribute to total loadings, since the loading is a function of flow discharge and concentration. Thirdly, the effects of soil parameters K, D50, and POR on the final results of sediment and nutrient loadings are very small in comparison to hydrological parameters, because the sediment and nutrient loads are strongly driven by hydrological processes (e.g., rainfall and runoff) in flood events. In addition, the nutrient loading parameters (i.e., Cdkt (N-NO3), Cskt (P-PO4, N-NH4)) can only show clear variations when increasing up to 500% (Figure 9). Fourthly, the changes of point sources also significantly affect the model results as they contain a large amount of pollution loaded directly into the rivers. The variation is about 35% when the perturbation is less than 50%. However, once increasing the perturbation to five times (500%), the variation is about 280%. This aspect is essential once dealing with illegal wastewater dumped directly from factories into the rivers as mentioned in Section 1.

Model Results
Model results of an event from 25-27 July 2008 (for calibration) are described in this section. The results of another event from 14-15 August 2008 (for validation) are presented in the Appendix.
The hydrological parameters and inputs (e.g., CN, rainfall) significantly affect the results of other model components (TSS and nutrients). Therefore, these parameters need to be carefully calibrated before considering other parameters. Table 9 shows the evaluated parameters for flow, total suspended solid (TSS), phosphate (P-PO 4 ), ammonium (N-NH 4 ), and nitrate nitrogen (N-NO 3 ), which are also illustrated in Figures 10-13. The model results are plotted together with observed data whose error bars were calculated using Harmel's method [55]. The cumulative measured data errors due to sampling, preservation, and analysis of total suspended solid, nitrogen ammonium, nitrogen nitrate, and phosphorus phosphate were 28%, 18%, 18%, and 50%, respectively. The results can be summarized in five points. Firstly, the runoff flow discharge is well simulated, including the curve, peak, and time to peak. Furthermore, based on the model evaluation statistics [56], the runoff flow simulation is in very good agreement with the observation due to its Nash-Sutcliffe efficient (NSE), index of agreement (d), coefficient of determination (R 2 ), and percent bias (PBIAS) being 0.9, 0.97, 0.90, and 0.26, respectively. Secondly, regarding water quality parameters, based on the PBIAS values (see Table 4 of [57]), the parameters of P-PO 4 , N-NH 4 , and N-NO 3 are very good (PBIAS < ±25). Only TSS (PBIAS = 29.97) is a bit worse than other parameters in the comparison between simulation and observation; however, this value is still in the good rating range (±25 < PBIAS < ±40). Furthermore, the d index shows a good agreement between simulated and observed results for runoff flow discharge, TSS, and nutrients. Thirdly, prediction of N-NH 4 and N-NO 3 is better than prediction of P-PO 4 . The reason may be that the nitrogen parameters are mainly related to the runoff processes, while P-PO 4 is dependent on the erosion process adding into the runoff processes. In addition, as shown above, the possible errors of P-PO 4 are also very high (50%), as they can accumulate errors, leading to a worse performance of this parameter. Fourthly, the model should be further improved by implementing more certain monitoring techniques, and a higher frequency of measurement. Fifthly, the model did not adequately capture the receding curve. The reason could be the retention effects of agricultural fields, especially rice fields, which were controlled by farmers (e.g., releasing water after rainfall events to ensure that the field is not too inundated). In addition, the omission of nutrient and sedimentation transformation processes in the river routing is another factor that may lead to model underestimation.      The uncertain boundary is rather large, especially during the peak flows, whereby, upon comparing to the mean values, it can be more than 100%. However, the simulation results show that, when including the boundary (confidence interval of 90%) in Monte Carlo simulations, a very good agreement between model simulation and observation data is obtained. The variation in model output is also highly affected by the change of point sources, especially when extreme disposal occurs. This was also shown in the sensitivity analysis, where a fivefold increase of point source compared to normal can change the results up to 280%.

Conclusion
In this study, a water quality model was developed to simulate nutrient dynamics for sparsely gauged catchments during flood events. Most of the model input data were deduced from various methods, which were applied for data-scarce regions. The hydrological component was based on the GIUH [21], using parameters obtained from GIS processing, which was verified and applied for several flood events in Vietnam [20]. The erosion/sediment component was based on Hartley's approach [31], which was verified by Hartley [32] and Leon et al. [59]. In the nutrient loading component, the loading function method was applied, whereby the nutrient loads could be integrated from the outputs of hydrological and erosion processes. The loading function method was successfully applied by Haith and Tubbs [37] and Haith et al. [38,39]. The river routing component was adapted from the D-POL model [46], in which the river was simply discretized into reaches, followed by each reach being discretized into reservoirs; the pollutant was integrated over nutrient loads and sediments obtained from the nutrient loading component, and loadings from point sources were also added into this component. The simplified model structure and limited model parameters are the most appealing features of the model. All model components are coupled and controlled within one Excel Spreadsheet Macro. The running environment is within an Excel Spreadsheet Macro, and this feature makes the model user-friendly and robust, with easy installation. Similar trends are obtained for the validated event from 14-15 August 2008, as shown in Table A5 and Figures A1-A3 in the Appendix.

Model Uncertainty Analysis
Every simulation was run for 1000 time steps within a Microsoft Excel file. The software was assisted by an add-in program called RiskAMP (RiskAMP Monte Carlo Add-in for Excel®). For each parameter and input, uniform distribution was applied. Data for the Monte Carlo simulations are shown in Table 10.
The sensitivity analysis showed that the CN, flow velocity (river routing) parameters, and rainfall (input) are the most sensitive factors for the model results. Thus, firstly, these three parameters and the input were calibrated so that the Nash-Sutcliffe efficiency (NSE) ranges within 0.85-0.95 were considered as acceptable (e.g., as in [58]). Since then, different scenarios were developed in order to see how these parameters propagate to model results. Results are shown in Table 11 and Figure 13 (for TSS and P-PO 4 ).
Due to the high effects of the CN parameter, rainfall, and flow velocity, the propagation of other parameters is not clearly seen (Table 11). The uncertainty boundary is greatly reduced if these parameters are excluded in the uncertainty analysis. The results are also consistent with those obtained from the sensitivity analysis.
The uncertain boundary is rather large, especially during the peak flows, whereby, upon comparing to the mean values, it can be more than 100%. However, the simulation results show that, when including the boundary (confidence interval of 90%) in Monte Carlo simulations, a very good agreement between model simulation and observation data is obtained. The variation in model output is also highly affected by the change of point sources, especially when extreme disposal occurs. This was also shown in the sensitivity analysis, where a fivefold increase of point source compared to normal can change the results up to 280%.

Conclusions
In this study, a water quality model was developed to simulate nutrient dynamics for sparsely gauged catchments during flood events. Most of the model input data were deduced from various methods, which were applied for data-scarce regions. The hydrological component was based on the GIUH [21], using parameters obtained from GIS processing, which was verified and applied for several flood events in Vietnam [20]. The erosion/sediment component was based on Hartley's approach [31], which was verified by Hartley [32] and Leon et al. [59]. In the nutrient loading component, the loading function method was applied, whereby the nutrient loads could be integrated from the outputs of hydrological and erosion processes. The loading function method was successfully applied by Haith and Tubbs [37] and Haith et al. [38,39]. The river routing component was adapted from the D-POL model [46], in which the river was simply discretized into reaches, followed by each reach being discretized into reservoirs; the pollutant was integrated over nutrient loads and sediments obtained from the nutrient loading component, and loadings from point sources were also added into this component. The simplified model structure and limited model parameters are the most appealing features of the model. All model components are coupled and controlled within one Excel Spreadsheet Macro. The running environment is within an Excel Spreadsheet Macro, and this feature makes the model user-friendly and robust, with easy installation. Furthermore, an add-in Monte Carlo simulation tool was implemented in this modeling system to provide a reasonable uncertainty analysis tool for the users to calibrate the input parameters and to validate model results. Implementations of different model components can be easily accessed and monitored under Excel Spreadsheet functions. This aspect is especially useful when implementing the sensitivity and uncertainty analysis. The coupled system can represent the system dynamics, such as flow discharge, suspended solid, and nutrients during flood events. In addition, the point source disposal is also included. In addition, because of the capacity of the model in utilizing GIS and remote sensing data, many model parameters can easily be extracted. This characteristic is very important when dealing with data-scarce regions (ungauged catchments). Therefore, the application of the model confirmed the necessity to adapt the model complexity to the data availability of the investigated area. The success of the developed model proved the importance of selecting suitable model structures for specific regions and applications. There are various available models such as SWAT and HSPF; however, they require intensive input data and, consequently, it is hard to deal with sparsely gauged and ungauged catchments. Our approach was based on findings of previous studies (e.g., [60,61]), whereby all the dominant processes in the system should be captured in the model, whereas processes of minor importance should be neglected or treated in a less complex manner.
Based on the uncertainty simulation results, possible scenarios can be explored when dealing with many uncertain sources (a similar conclusion was drawn in [62]). A further study should focus on reducing the uncertainty caused by input data (e.g., obtaining more reliable data by improving monitoring data in time and space). For example, a high impact on model results was also found in rainfall data. Given only one observation station in the catchment covering an area of extreme topographic variation, the uncertainty propagation due to incorrect rainfall data is a critical issue. Therefore, improving rainfall observation is highly recommended.  Table A1. Land-class schemes [59].  HSC: soil group; Stext: soil texture; K: K factor; D 50 : median particle size, σ: sediment specific weight. Table A3. List of evaluation parameters used in the manuscript.

Ratios Formula Notes
where N i and N i+1 are the number of streams in order i and i + 1. Let Ω represent the highest stream order in catchment, i = 1, 2, . . . , Ω.
Length R L = L i+1 L i _ L i is the average length of channels of order I, A i is the mean area of the contributing sub-catchment to streams of order i, A j,i , where A i,j represents the total area that drains into the jth stream of order i