Rainfall Runoff Balance Enhanced Model Applied to Tropical Hydrology

: The integrative and comprehensive analysis considering the spatial and temporal representation of the hydrological process, such as the distribution of rainfall, land cover and land use, is a challenge for the water resources management. In tropical areas, energy availability throughout the year deﬁnes the rainfall distribution and evapotranspiration rate according to vegetation heterogeneity. To quantify water balance in tropical areas including these heterogeneities in the soil-vegetation-atmosphere relationship, we developed a fully distributed hydrological model called the Rainfall Runoff Balance Enhanced Model (RUBEM). The model was developed under a physics-based process structure, using remote sensing data to represent soil-water balance patterns, such as evapotranspiration, interception, baseﬂow, lateral ﬂow, recharge, and runoff. The calibration procedure was based on nine global parameters. RUBEM could represent the spatio-temporal heterogeneities (soil, land use and land cover (LULC), topography, vegetation, and climate) in three basins in a tropical area. The results showed good adherence between the processes governing the soil-vegetation-atmosphere relationship according to the humidity indicator and the runoff coefﬁcient. Overall, RUBEM can be used to help improve the management and planning of integrated water resources under climate, land use, and land cover changes in tropical regions.


Introduction
Understanding and quantifying hydrological patterns at the basin level under climate, land use, and land cover changes is a complex challenge [1][2][3]. Hydrological models play an important role in integrating water resource management (IWRM). These models have been developed since the 1960s [4]. The effects of the scientific community have led to the development of semi-and fully distributed hydrological models, such as VIC [5], SWAT [6], WetSpass [7], MGB-IPH [8], WEAP [9], SPHY [10], PCR-GLOBWB [11,12], MIKE-SHE [13][14][15][16][17], HIMS [18], IHDM [19], SLURP [20], the Xinanjiang Model [21], InHM [22], and PIHM [23], among others. The number of existing hydrological models is probably in the tens of thousands [24]. The US National Weather Service (NWS) has tested some of these models in the Distributed Model Intercomparison Project (DMIP) [25,26]. The DMIP showed that the performance of these models varies widely from good to very good and allows the application of current data input, such as radar rainfall and remote sensing data. In addition to the number of existing hydrological models, different conceptions and structural models contribute to understanding hydrological similarities and reducing the Hydrological models based on physical process structures have similar descriptions of formulation, scale, and solution technology. Physical processes rely on a set of variables and parameters organized into equations to simulate the response of hydrological patterns. These models are classified according to the level of detail in basin heterogeneity. The combined models do not consider spatial variability variables within the basin. Semidistributed models reflect the spatial variability variables in sub-areas. Fully distributed models process spatial variability using grid cells [45].
In this paper, we present the RUBEM as a fully distributed hydrological model that integrates classical rainfall runoff processes [46]. Water balance formulation was based on [9,10,47]. Figure 1 shows the structure of the hydrological processes used in the model. In tropical regions, evapotranspiration is one of the most important factors affecting the water balance physical process [36,48]. Evapotranspiration refers to the transfer of water from the soil-plant system to the atmosphere. Actual evapotranspiration is calculated based on the sum of evapotranspiration values for each land cover type using the concept of potential evapotranspiration [47]. The potential evapotranspiration ( ) of the vegetated plot is calculated using the Penman-Montheith method [49]. To use the methodology described in [49], the crop coefficient ( ) and soil moisture reduction coefficient ( ) are calculated according to the (Normalized Difference Vegetation Index) and soil 'moisture content (Equations (1)-(5)). In tropical regions, evapotranspiration is one of the most important factors affecting the water balance physical process [36,48]. Evapotranspiration refers to the transfer of water from the soil-plant system to the atmosphere. Actual evapotranspiration is calculated based on the sum of evapotranspiration values for each land cover type using the concept of potential evapotranspiration [47]. The potential evapotranspiration (ET p ) of the vegetated plot is calculated using the Penman-Montheith method [49]. To use the methodology described in [49], the crop coefficient (kc) and soil moisture reduction coefficient (ks) are calculated according to the NDV I (Normalized Difference Vegetation Index) and soil moisture content (Equations (1)-(5)).
ET R,V = ET P · kc · ks (2) kc = kc min + (kc max − kc min ) · NDV I − NDV I min NDV I max − NDV I min (3) i f NDV I ≤ 1.1 · NDV I min then kc = kc min (4) ET R,S = ET P · kc min · ks (6) i f TU R < TU PM then ks = 0 (7) where: ET REAL -real evapotranspiration (mm); α V -vegetated fraction of cell area (%); α S -bare soil fraction of cell area (%); α W -water fraction of cell area (%); α I -impermeable fraction of cell area (%); ET R,V -real evapotranspiration of the vegetated area (mm); ET R,Sreal evapotranspiration of the bare soil area (mm); ET R,W -real evapotranspiration of the water area (mm); ET R,I -real evapotranspiration of the impermeable area (mm); ET Ppotential evapotranspiration (mm); kc-crop coefficient (-); kc max and kc min -maximum and minimum possible values for crop coefficient; ks-soil moisture reduction coefficient (-); NDV I max and NDV I min -maximum and minimum values of the standardized vegetation index obtained based on the historical NDVI series for each cell; TU R -root zone moisture content (mm); TU PM -wilting point moisture content in the root zone (mm); TU CC -field capacity moisture content in the root zone (mm); kp-water evaporation coefficient (-); B-Class A Tank Border Width [between from 20 to 30 m] (m); U 2 -average wind speed at 2 m above ground surface (m/s); UR-relative humidity (%); and I-Interception [from 1 to 3 mm].
The coverage fractions and cell roughness coefficient were defined according to the coverage identification obtained from a previously classified raster image [50]. The value of the cover fraction is empirical and does not change during the analysis period; however, it can be modified according to the experience perception of the modeler. In the Supplementary Table S1 summarizes the values of cover fraction adopted in the model.
Crop water demand is a complex biological process that depends on soil moisture and the evaporative demand of the atmosphere [51,52]. The crop coefficient (kc) depends on the maximum and minimum values of vegetation and the NDVI under the following conditions: if NDVI ≤ 1.1 NDVImin, kc = kc min . The soil moisture reduction coefficient (ks) depends on the moisture content. If the moisture content of the root area (TU R ) is lower than the moisture content of the soil at the wilting point (TU PM ), the coefficient of moisture reduction (ks) is zero. ET R,W depends on the water evaporation coefficient (Equation (8)), ET R,S is calculated using kc min . ET R,w is calculated using the evaporation coefficient kp (Equation (9)), and ET R,I is based on the losses caused by interception (Equation (10)). Incorporating the processes of evapotranspiration and soil moisture content in the water balance process is important to represent heterogeneous sources, such as climate, rain, terrain, soil, and vegetation cover [34,35].
The water balance of soil is calculated in the non-saturated (root zone) and saturated zones (Equation (11)). The moisture content in the unsaturated zone depends on the previous moisture content, effective rainfall, and output (Equation (13)). Effective rainfall is the difference between the rainfall and interception in each cell (Equation (12)). The moisture of the saturated zone depends on the previous moisture, base flow, and groundwater recharge (Equation (S3)). A cell fully occupied by water with moisture in the root zone is considered saturated, that is, if α W = 1 then TU R = TU S .
In the root zone of the soil, surface runoff relies on soil moisture conditions based on lateral flow, recharge, and evapotranspiration. The saturated zone of the soil affects the calculation of base flow and recharge parameters. The water balance equation adopted in the RUBEM allows calculation of the total superficial flow.
where TU R -root zone moisture content (mm); TU R,t−1 -root zone moisture content at the previous time step (mm); P E -effective precipitation (mm); SR-surface runoff (mm); LF-lateral flow (mm); REC-groundwater recharge (mm); ET REAL -total real evapotranspiration (mm); P m -total monthly precipitation (mm); I-total interception (mm); TU S -saturated zone moisture content (mm); TU S,t−1 -saturated zone moisture content at the previous time step (mm); and BF-base flow (mm).

Model Development
The model was written in Python using the PC Raster framework [53] to support the spatial calculation of water balance at the grid cell level. RUBEM is a continuous hydrological model with a formulation that represents the flow process between the soil and atmosphere layers. The model (i) integrates the main hydrological processes; (ii) has the flexibility to be applied for various applications, including climate and LULC change impact, irrigation planning, and drought; (iii) can be used for different spatial resolutions (pixel size options), applied to small and large watersheds, as well as at the farm and country levels; (iv) allow adequate representation of changes in land cover and related processes through monthly temporal resolutions; and (v) can be implemented easily. The implementation is open source, and the input and output maps can be directly used in Geographic Information System (GIS) environments; it uses remote sensing data, has a reduced number of parameters, and uses a configuration file that allows changing parameters and choosing the output. The RUBEM has six sets of data input: hydrological, climate, soil, ground elevation, LULC, and NVDI. Table 1 summarizes the model input data and their descriptions. Supplementary Figure S1 illustrates the water balance variables. To assist the user in model application, we developed a plug-in for QGIS entitled RUBEM Hydrological, which supports the loading of input data, configures parameters, runs the model, and visualizes the results of the simulation. The plug-in does not have any preprocessing tools. However, the documentation and scripts for this purpose are openly available in the Supplementary Materials.

Model Aplication
The RUBEM was applied to three Brazilian basins, two of which are located in wet regions (Piracicaba River Basin (PRB) and Upper Iguaçu River Basin (UIRB)) whereas the third is located in the Brazilian semiarid region (Ipojuca River Basin (IRB)). The IRB has two hydrological zones with half of the region having a dry climate and the other half having a wet climate. The three basins present different characteristics of LULC, soil, evaporation, and hydrological regions (see details in the Supplementary Materials). Figure 2 shows the locations of the model application areas. The basins have different demands for water use, with primary needs for public supplies, industry, and agriculture, as shown in Table S6. The water abstraction and transposition system represents a challenge Water 2022, 14, 1958 6 of 25 in the model calibration process under uncertainties from the observed gauge station and the natural runoff calculated using the applied methodology.
third is located in the Brazilian semiarid region (Ipojuca River Basin (IRB)). The IRB has two hydrological zones with half of the region having a dry climate and the other half having a wet climate. The three basins present different characteristics of LULC, soil, evaporation, and hydrological regions (see details in the supplementary document). Figure 2 shows the locations of the model application areas. The basins have different demands for water use, with primary needs for public supplies, industry, and agriculture, as shown in Table S6. The water abstraction and transposition system represents a challenge in the model calibration process under uncertainties from the observed gauge station and the natural runoff calculated using the applied methodology.

Calibration and Validation
A differential evolution algorithm [55] was used in the model calibration process. This method executes three operations: mutation, crossing and selection. The objective function was set to obtain a Nash -Sutcliffe efficiency (NSE) average value, for all station considered, close to ̅̅̅̅̅̅ = 1. Over 27,000 combinations of the nine parameters were evaluated. For instance, the time required to choose solution in PRJ reached 120 h of processing. The selection criteria of the stations included a series with flaws, location in the

Calibration and Validation
A differential evolution algorithm [55] was used in the model calibration process. This method executes three operations: mutation, crossing and selection. The objective function was set to obtain a Nash -Sutcliffe efficiency (NSE) average value, for all station considered, close to NSE = 1. Over 27,000 combinations of the nine parameters were evaluated. For instance, the time required to choose solution in PRJ reached 120 h of processing. The selection criteria of the stations included a series with flaws, location in the basin, LULC, and rainfall regimes. The RUBEM calibration relies on nine global parameters, as listed in Table 2. Their range values were considered to represent the physical features of the watershed (soil, coverage, and rainfall). The values were based on [10,18,47], and verified on validation test in a subbasin of PRB.
From the 27,000 evaluation for each basin were selected the better sets of parameters, which resulted in the best NSE [56] according to Equations (S37) and (S38) (see Supplementary Materials), to verify the magnitudes of the water balance patterns. Thus, we evaluated whether the parameters of evapotranspiration and recharge flows in the river were consistent with the proportions reported in [10,47,57,58]. The set that results in the best combination of NSE and water balance proportions consistency were select as final calibrated parameters.
In addition to the calibration process assessment, we also evaluated the root mean square error (RMSE) using Equation (S36) [44]; number of times the variability of the observations was greater than the mean error-n t nd relative bias-RB using Equation (S39) [59]; and the asynchronous regression method that determines the function F(Q s ) = u · F(Q o ) according to the percentiles of q o and q s of the distribution of F(Q o ) and F(Q s ) at each probability level using equations from Equations (S40)-(S42) [60]. Table S4 presents the criteria used for evaluating the performance indicators as described in [56,61]. Table 2. Calibration parameters list, descriptions and restrictions.

Parameter Description Restriction
Interception Parameter (α) The interception parameter value. It represents the daily interception threshold that depends on land use. 0.01 ≤ α ≤ 10 Parameter related to Soil Moisture (b) Exponent value that represents the effect of the condition of moisture in the soil.
Land Use Factor Weight (w1) Weight of land-use factor. It measures the effect of the land use on the potential runoff produced. w1+ w2+ w3 = 1 Moisture Soil Factor Weight (w2) Weight of moisture soil factor in the permanent wilting point. It measures the effect of the soil classes on the potential surface runoff produced.
Slope Factor Weight (w3) Weight of slope factor. It measures the effect of the slope on the potential runoff produced. w1+w2+ w3 = 1 Regional Consecutive Dryness Level (RCD) Regional Consecutive Dryness level incorporates the intensity of rain and the number of consecutive days in runoff calculation.
Flow Direction Factor (f) It is used to partition the flow out of the root zone between interflow and flow to the saturated zone.
Decimal value refers to the recession coefficient of the baseflow. The lower values show areas that react slowly to groundwater drainage, while the higher values show areas that react rapidly.
Flow recession coefficient value incorporates a flow delay in the accumulated amount of water that flows out of the cell into its neighboring downstream cell (0 means that during a month with no rainfall, there will be no surface runoff).

Calibration and Validation
The modeled results were evaluated for the calibration period (January 2000 to December 2009) and the validation period (January 2010 to December 2018). At least five stream gauges were considered for each basin in the calibration process. Global parameters obtained during the calibration process were applied during the validation period. Table 3 shows the values for each of these parameters in the three basins. Figure 3 shows the hydrographs for the calibration and validation periods.     Table 4 lists the model efficiency indices (RMSE, NSE, and RB) for the three basins of the model application. The RMSE was acceptable or good at most of the gauge stations in the three basins during the two periods analyzed. Ref. [62] assumed that RMSE values below half of the standard deviation (SD) for the observed flows can be considered low. This criterion was met at all gauge stations during both analysis periods. The number of times (n t ) when the variability of the observations exceeds the average error allows us to assess the representativeness of the range covered by the simulated values [56]. Thus, the performance of the model does not change linearly with the NSE. The NSE was unsatisfactory for the IRB during calibration in two gauge stations (15.7% of the drainage area) at UIRB and in five gauge stations (80.91%) at the PRB. During the validation period, the NSE was acceptable or good in three gauge stations (71.3% of the area) at the IPR, five (72.2%) at the PRB, and three (84.3%) at the UIRB.
The RB measures the relationship between the average error and average flow observed [59]. The RB was unsatisfactory in the IRB, resulting in higher calibration values. In the PRB, bias was equal to or greater than acceptable at three gauge stations in calibration and at four validation stations. In the UIRB, good or very good levels were observed in both periods, and a slight negative bias prevailed ( Table 4).
The hydrographs are presented at the most significant drainage areas for each of the basins of the model application. The IRB (Figure 3a) shows that the model underestimated the highest discharges and adjusted the lowest discharges during the calibration period, resulting in RB = −0.397. The model was thus suitable for both high and low discharges (RB = 0.007). In the PRB (Figure 3b), the model overestimated the low discharges and adjusted them to high discharges in both periods, yielding RB = 0.468 (calibration) and RB = 0.348 (validation). In the UIRB (Figure 3c), all index performances for both periods were considered satisfactory.
Efficiency indices use simultaneous observations and simulations of flow data. Small sample sizes limit the representativeness of efficiency indicators [56,59]. Ref. [60] proposed a regression of one time-varying quantity against another without requiring simultaneous knowledge of both. Asynchronous regression was performed with the observed and simulated discharges related to the percentile of their accumulated distribution, as shown in Figure 4. The proximity of cumulative distribution indicated the model performance. The model did not reproduce the highest discharge in the three basins. The asynchronous regression results R 2 > 0.89 for all basins, but the modified regression coefficient (r m ) considering the sum of squared deviations from the respective averages are low [63]. The model tended to underestimate flows in the IRB (r m = 0.30) and UIRB (r m = 0.46) basins and to underestimate in the PRB (r m = 0.49). The fitted line was effective for comparing the 1:1 ratio between the observed and simulated discharges. Fisher's test of the two samples, considering the indifference of the mean and variance between the samples, indicated no significant difference (99% confidence) between the observed and simulated flows in the three basins. Ref. [64] considered a satisfactory fit for R 2 = 0.72 in simulations of the Brazilian basin. Ref. [65] show the good correlation between modelled and observed base flow index (R 2 > 0.87).
The results showed a good fit for calculating the average monthly discharge for all periods (from January 2000 to December 2018). RUBEM underestimated the median discharges by 23.5% in the wet season and 9.3% in the dry season in the IRB. In the PRB, the median discharge was overestimated by 33.7% and 51% during the wet and dry seasons, respectively. In the UIRB, the model underestimated the median discharge by 8.1% during the wet season and 4.3% during the dry season.

Soil and LULC Heterogeneities Analysis
The entire output parameter has spatial-temporal results as a raster file. Figure 5 shows lower evapotranspiration in areas with low NDVI values, more significant moisture in the soil texture of loam and clay, and higher impervious areas and surface runoff in urban land use. All spatially calculated variables are shown in Figures S11-S13 (Supplementary Materials).
With a suitable performance of the model application results, water balance patterns during the calibration and validation periods could be evaluated according to the observed rainfall and LULC through spatial-temporal changes. Three cells were selected to evaluate the LULC modifications throughout the simulation period ( Figure 6). The evaluated patterns were compared with the total rainfall to compute the ratio of each component in the three cells. Cell 1 (−8.33 lat, −35.20 long) presents a pasture area that turns into a forest area in the IRB. Cell 2 (−22.58 lat, −47.68 long) shows an area without LULC changes in the PRB, and Cell 3 (−25.54 lat, −49.40 long) shows an area that began as a forest, changed to a pasture, back to a forest, followed by cropland, and changed into an urban area in the UIRB. The forest cover in the IRB and UIRB constitutes the Atlantic Forest ecosystem. The NDVI (from 0.23 to 0.83) and kc (from 1.14 to 1.8) extensions were greater in UIRB than in IRB (NDVI from 0.53 to 0.89 and kc from 1.46 to 1.8).
The area with no alterations in the LULC in the simulation period resulted in a comprehensive pattern of the soil-water balance as expected, regarding the antecedent soil moisture conditions, meaning that the pattern had a standard pattern for each component. However, altered LULC resulted in significant changes in the water cycle patterns. Figure 6 shows the water cycle patterns for the selected cell. For instance, in the IRB sample, interception of the coverage area with the forest was significantly higher and evapotranspiration was slightly lower than that in the period in which the cell was cropped. In the PRB, evapotranspiration was significantly higher, with 65% of the rainfall occurring in 2014. A change in the LULC from the crops to urban areas in the last UIRB period resulted in an increase in surface runoff from 40% to 77% of the rainfall, and evapotranspiration from 62% with forest cover to 47% with crop cover and 7% with urban cover. The cell shows that the difference in evapotranspiration corresponds to nearly 60% of precipitation in 2000 (forest cover), approximately 10% between 2016 and 2018 (urban cover), and approximately 10% from 2016 to 2018 (urban cover), while the runoff was 30% and 80%, respectively. Water 2022, 14, x FOR PEER REVIEW 12 of 25    The correlation between the moisture index (P/ETp) and runoff coefficient (SR/P) was verified for a set of cells with different coverages [66], randomly selected in the three basins (IRB = 320; PRB = 1000 and UIRB = 250) in a month representing the wet (Figure 7a) and dry seasons (Figure 7b). The correlation coefficients for the wet (r = 0.71) and dry (r = 0.66) seasons indicate a satisfactory approach [65,67] for the spatial (500 m grid) and temporal (monthly) scales to represent the relationship between water availability and runoff. The correlations between P/ETp and SR/P in the cells sampled with equal coverage in the basins showed a strong correlation in both seasons (Table 5).

Calibration and Validation
We understand that changes in soil saturation significantly impact the patterns of base flow and runoff, supported by the minimum and maximum annual total rainfall Higher specific rainfall events did not have a high impact on recharge, evapotranspiration, or base flow, which were directly related to the soil-water balance layer. Generally application of the model is considered acceptable. However, Figure 3 shows that RUBEM was unable to simulate the highest mean monthly runoff, possibly because the highest

Calibration and Validation
We understand that changes in soil saturation significantly impact the patterns of base flow and runoff, supported by the minimum and maximum annual total rainfall. Higher specific rainfall events did not have a high impact on recharge, evapotranspiration, or base flow, which were directly related to the soil-water balance layer. Generally, application of the model is considered acceptable. However, Figure 3 shows that RUBEM was unable to simulate the highest mean monthly runoff, possibly because the highest rainfall event could modify the mean monthly runoff but could not change the soil balance according to the temporal scale. Furthermore, the poorer quality flow record and anthropogenic act amplify the differences between the series. However, the index performances show that the model were considered satisfactory, according to [64,65,67].
The interception parameter was significantly higher in the UIRB (α = 9.77), indicating greater interception of rainfall in this basin. The RCD and w2 values were higher in the UIRB (RCD = 8.34; w2 = 0.43) and PRB (RCD = 7.96; w2 = 0.35). These parameters contribute to increasing the potential runoff in permeable areas. In the IRB, the parameters RCD = 5.38, w2 = 0.12, and low soil moisture factors (b = 0.08) reduced surface runoff. The largest area of the IRB in the semiarid region explains this parameter value.
The highest value of the factor for the partition of flow directions in the PRB (f = 0.77) and UIRB (f = 0.83) favored lateral flow and did not favor recharge in these basins. Recharge plays a vital role in forming the basic flow, particularly during long dry months [10,47,61]. The higher value of the basic flow recession factor observed in the IRB (α gw = 0.92) disfavors basic flow, a feature of arid regions. Higher values of the damping coefficient increased the importance of the discharge generated in the previous month and reduced the discharge generated in each cell in the current month. This coefficient was the highest in the IRB (x = 0.31), resulting in a more significant damping of discharge into the basin.
The same set of parameters controlled all drainage processes during the analysis period. This condition facilitates representation of soil, climate, cover heterogeneities, and calibration of the best parameter sets. Manual calibration is also possible [10,47,58], but is very laborious. The differential evolutionary algorithm adopted in this study was proven to be effective. However, high levels of heterogeneity may hinder parameter calibration. As most IRBs are in dry regions and the other regions are wet, RUBEM performance was poor.
We considered the representative parameter sets to simulate the discharge in each basin for calibration and validation. RMSE, NSE, RB, and asynchronous regression showed acceptable results. Even though the NSE obtained in the calibration was unsatisfactory for the IRB, it was acceptable or good in the largest basin area during the validation period. The same behavior was observed for the UIRB and PRB. Considering the downstream gauge station, simulation discharges were lower than those observed 60% of the time in the IRB and UIRB and were higher than those observed 92% of the time in the PRB. However, the regression coefficients between the observed and simulated discharges were satisfactory for all three basins.
Asynchronous regressions between cumulative distributions of observed and simulation discharges can be used to fill the missing data in observed datasets, hindcast studies, as well as forecast discharges based on rainfall estimates from regional climate models for ungauged areas [68][69][70].
The NSE was acceptable for UIRB and unsatisfactory for IRB and PRB during the calibration period [71]. Some characteristics of the study areas were determinants of the results obtained. The basins supply water to cities and have an important water demand for different uses. The PRB and UIRB present reservoir systems for supply systems. In the PRB, four of the seven-gauge stations used for calibration are affected by a considerable transfer of water to a neighboring basin. The difference between the observed and natural simulated discharges contributed to the reduction in NSE during the calibration. The results of the gauges located in the upper basins with low anthropogenic interference (Supplementary Figure S8) showed better NSE and index results (acceptable according to Supplementary Table S4). According to [72], global calibration is limited to homogeneous hydrological regimes and implies a diminution in performance, specifically in estimating high flows in small catchments. The two opposite rainfall regimes at the IRB (semiarid and wetland) suggest that it is necessary to consider two sets of different parameters for calibration. During the validation period, NSE was acceptable or good in all three basins. The RB was very good for the UIRB during calibration and for the IRB during validation. Discharges were overestimated in the PRB and underestimated in the IRB and UIRB.
Asynchronous regression represents a linear regression between the observed and simulated flow duration curves. The hydrograph of the observed and simulated monthly discharge shows a good recession shape. The median, mean, maximum, and minimum monthly discharges were also evaluated in the results and discussion.
RUBEM showed different results in the smaller drainage areas of the three basins. The worst performance was found at a 672 km 2 gauge station in the IRB basin. Wang et al. 2018 achieved a moderate performance in basins smaller than 400 km 2 because of the solid nonlinear relationship between rainfall and flow. Ref. [47] found that the WetSpass model failed in small basins because some processes involved had timescales shorter than a month.
In the three basin simulations, the grid resolution was set at 500 m. RUBEM was tested in a 55.6 km 2 PRB sub-basin with a grid resolution of 30 m and showed satisfactory performance. A greater spatial resolution provides more details on the variables influencing runoff. However, this increases the number of variables, simulation computational burden, and difficulty in calibrating the parameters. Therefore, the spatial scale is defined based on the degree of variability and uncertainty to be represented. Refs. [4,35,62,73] pointed out that the model does not represent the heterogeneity of microscale hydrological processes, even at a large spatial resolution.
The physical characteristics of the soil in each basin also varied, resulting in uncertainty. The values adopted for each basin represent the average conditions according to the classification adopted in Brazil [54]. As in the VIC model [73], RUBEM also demonstrated sensitivity in adjusting minimum flows under specific conditions of long-term dry periods.
Under these conditions, better model adaptability can be achieved by improving the physical and hydrological properties of soil.
RUBEM was developed to improve the management and planning of integrated water resources under climate, land use, and land cover changes in tropical areas. The time step to represent hydrological process according to the model formulation at a grid basis level was selected according to the time-scale response of the soil recharge and evapotranspiration processes [74], including the hydrometric gauge station quality in Brazil [64,75]. Furthermore, we noted the successful application of the SWAT model (semidistributed) in Brazilian watersheds, with very good calibration results on a monthly basis rather than on a daily basis [76]. However, RUBEM has been developed using a physics-based process description structure [28,77,78] and remote sensing data [3,50] to improve the soil-water balance (evapotranspiration, interception, baseflow, and recharge) in tropical regions. Different conceptions and structural models contribute to understanding hydrological similarity and reducing the main source of uncertainties introduced by the model structure and specific pattern representation at the basin level [27][28][29][30][31]. Additionally, it is important to note the uncertainties and accuracy of the input dataset, especially the hydrometrical gauge stations, which have one of the most important roles in the calibration and validation modelling assessment.

Soil and LULC Heterogeneities
The volume of moisture available in the root layer is influenced by depth, thickness, bulk density, and moisture content. The depth of soil to be explored by the root system of vegetation is defined by the physical properties of the soil class. This ensures that the water capacity of the soil is not limited by the depth of the root system. Although the different vegetation types present different root system characteristics, removal of water from the soil by the plant is estimated by ET REAL , which depends on the type of physio-morphological characteristics of the vegetation in each cell. The kc and ks define the removal of water from soil. According to [49], the ET of a species can be obtained using empirical values of kc, which should include all the differences in the physiological morphology of the crop in its real state and the potential conditions (soil-water availability, leaf mass, and full physiological activity).
In distributed hydrological models, different responses are expected at different temporal and spatial scales. The correlation between the climatological and physical characteristics of the basin helps understand the prevailing controls on the behavior of the basin at a particular scale [66]. The correlation between the moisture index (P/ETp) and runoff coefficient (SR/P) at specific spatial and temporal scales was found to assess the representativeness of the analysis. Cell sampling during the dry season resulted in a very low average P/ETp in the UIRB, indicating that the month sampled (July 2018) was very dry, resulting in a low SR/P. Located in the same hydrological region, the PRB showed an average P/ETp 11 times higher than that obtained in the UIRB in the sampled month (June 2013), resulting in an SR/P twice as high. The antecedent storage of water in the soil at the UIRB may have been the cause of this disproportionation. The P/ETp anomaly during the dry month of the UIRB contributed to a reduction in the correlation coefficient between the indices. The IRB presented P/ETp in the wet season that was three times greater than that in the dry season (February 2014), resulting in a proportional reduction of SR/P. This basin is located at a higher latitude, with a high atmospheric evaporative demand in the season and shallower soil. The correlations for crops, agriculture, and pasture were lower due to differences in responses in the UIRB and PRB. At the UIRB, the species showed lower water demand (kc = 0.66), and many species had a short development and production cycle in the wet season, including temperate vegetable and fruit crops between sprouting and harvesting, resulting in lower evapotranspiration. In the PRB, the species demanded more water (kc = 0.69). The predominant species in this basin is sugarcane, which has a longer development period and greater water consumption. Ref. [57] found P/ET = 2.27 in a sub-basin of the PRB in 35.5% of the forests and 31% of the pastures. Refs. [58,61] achieved a ratio of 1.14 in Africa and 1.63 in a basin in western São Paulo, respectively.
RUBEM estimates ET REAL using the average ETp data, kc based on NDVI, and ks using soil-water balance. Figures 4 and 5 show the consistency in estimating these variables. The magnitudes of the hydrological components of the cycle were compared with the results presented in literature. Thus, evapotranspiration and soil moisture values were estimated to correspond to the results of multiple studies that used remote sensing in Brazil [79][80][81].
The linear regressions between kc and NDVI for the cells shown in Figure 8 showed r > 0.99 for all vegetation cover. The higher NDVI for the forest was consistent with the kc values in relation with other coverages. [82,83] observed a strong kc-NDVI correlation for crops and pastures at different stages of development. [83] observed a lack of sensitivity of NDVI at high Leaf Area Index (LAI) values. This means that both plant transpiration and light absorption increase at roughly the same rate at low LAI and then saturate. Consequently, kc adjusts better for high NDVI and underestimates for low NDVI. The ETREAL formulation recognizes changes in vegetation over time through the NDVI, but does not consider stomatal resistance, and does not distinguish physiological differences between plant species. However, the formulation adopted for estimating the LAI, kc, and moisture content in the root zone was considered satisfactory for calculating the soil-water balance and ETREAL under different climatic conditions and vegetation cover. The soil layer water balance is the most important representation of water cycle patterns to understand the impact of LULC changes because of anthropic land use and The highest kc for forests indicates greater consumption of water than that observed for crops and pastures. Ref. [84] observed an LAI of 5.8 to 7.8 m 2 m −2 for the Atlantic Forest in Brazil. Although transpiration is positively correlated with LAI, radiation attenuation from the top of the canopy to the lower parts brings the global conductance of several forest species very close for LAI > 3 m 2 m −2 . However, an increase in LAI implies greater interception and consequent evaporation of water accumulated in the canopy. During periods of water scarcity and lower irradiance, absorption of water from the soil occurs at greater depths. The root system of Atlantic Forest species can reach up to 5 m [84]. According to [49], stomatal resistance increases with plant growth and maturation, under water stress conditions, and restriction of water availability in the soil, thus limiting the ET REAL . Ref. [85] observed that water stress conditions increased stomatal resistance, reaching triple its value in tropical forests.
The ET REAL formulation recognizes changes in vegetation over time through the NDVI, but does not consider stomatal resistance, and does not distinguish physiological differences between plant species. However, the formulation adopted for estimating the LAI, kc, and moisture content in the root zone was considered satisfactory for calculating the soil-water balance and ET REAL under different climatic conditions and vegetation cover. The soil layer water balance is the most important representation of water cycle patterns to understand the impact of LULC changes because of anthropic land use and planning.
Sensitivity analyses of input variables in the WetSpass model show that the critical factors are precipitation, LAI, and potential evapotranspiration [47]. RUBEM satisfactorily represented the changes in LULC. In two cells with and in one without coverage changes, the model calculated the variables proportionally according to the changes over time.
The model source code was written in the Python programming language, and the PCRaster library dynamic model framework was used to perform spatial calculations. The LULC simulation used the NDVI time series to change over time. The root depth remained constant throughout the simulation period, and there were no snow-melting equations.

Conclusions
RUBEM input data and formulation are intended to represent variables that impact the physical processes of the hydrological cycle (on which RUBEM is based). Thus, NDVI from remote sensing data tends to cover the characteristics of plant species, such as stomatal resistance. The model also captures a constant pattern of recharge, evapotranspiration, and runoff in a covered area without LULC changes, with respect to precipitation changes that could affect the soil moisture conditions.
The RUBEM formulation has a specific physics-based structure to represent changes in water balance patterns represented by spatio-temporal heterogeneity variables. Overall, RUBEM can be used to help improve the management and planning of integrated water resources under climate, land use, and land cover changes in tropical regions.