2.1. Study Area
The Zamora Huayco (ZH) river basin (3806 ha) is located in the inter-Andean region of the Loja province in southern Ecuador between the geographic coordinates 3°59′42″–4°04′03″ S and 79°11′54″–79°07′35″ W (Figure 1). The elevation of the basin ranges between 2120 and 3420 m asl, and its average slope is 0.65 m/m.
The basin’s climate is cold temperate mesothermal [25], characterized by an average annual temperature between 12 and 18 °C and average annual precipitation of 1047 mm. The wet season occurs from December to May and the dry season from June to November [26].
The basin is predominantly forested. Since 1976, the forests of the basin have decreased by 19.3% [27]. The upper zone of the ZH basin overlaps the buffer zone of the Podocarpus National Park (PNP) [24]. Since the 1960s, natural vegetation near the PNP has been extensively removed to create pastures and farmland [27,28]. The main productive activities in the basin are agriculture and cattle raising [24,29]. The ZH basin contains two water intakes for drinking-water treatment that supply approximately 50% of the demand of the city of Loja, at 450 L/s [30].
2.2. Land Use/Land Cover Change (LUCC)
Satellite images were processed with remote sensing techniques to analyze LUCC. Three ortho-rectified images were obtained: Landsat7 ETM (11 September 2009), Landsat8 OLI_TIRS (20 September 2017), and Sentinel 2B Level 1C (31 July 2019) [31,32]. The geometric agreement of the images was verified at two temporally invariant control points, taking as reference an aerial photograph obtained from the SigTierras portal [33].
For the Landsat7 ETM and Landsat8 OLI_TIRS images, the data were converted to top-of-atmosphere (TOA) reflectance and to brightness temperature according to L. Congedo (2016). For the Sentinel 2B Level 1C image, the conversion was not performed, as it is already scaled to TOA reflectance [32]. The Dark Object Subtraction (DOS) atmospheric correction was applied to the three images [34]. A topographic shading correction (CTS) was also applied to the three images, using the Lambertian cosine model proposed by Teillet [35], for which it was necessary to determine the angle of incidence according to [36].
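As an illustration of the Lambertian cosine correction, the following minimal Python sketch computes the cosine of the local incidence angle from slope and aspect and rescales a reflectance value accordingly. The function names, the per-pixel scalar interface, and the lower bound on cos(i) (`min_cos_i`) are assumptions made for this example; they are not taken from the processing chain used in the study.

```python
import math


def cos_incidence(sun_zenith_deg, sun_azimuth_deg, slope_deg, aspect_deg):
    """Cosine of the local solar incidence angle on a tilted surface."""
    z = math.radians(sun_zenith_deg)
    s = math.radians(slope_deg)
    d_az = math.radians(sun_azimuth_deg - aspect_deg)
    return math.cos(z) * math.cos(s) + math.sin(z) * math.sin(s) * math.cos(d_az)


def cosine_correction(reflectance, sun_zenith_deg, cos_i, min_cos_i=0.1):
    """Lambertian cosine correction: rho_h = rho_t * cos(zenith) / cos(i).

    min_cos_i is a hypothetical safeguard that avoids over-correction on
    weakly illuminated slopes, where cos(i) approaches zero.
    """
    cos_i = max(cos_i, min_cos_i)
    return reflectance * math.cos(math.radians(sun_zenith_deg)) / cos_i


# On flat terrain (slope = 0) the correction leaves the value unchanged.
flat = cos_incidence(30.0, 120.0, 0.0, 0.0)
corrected_flat = cosine_correction(0.2, 30.0, flat)

# A slope facing away from the sun is darker, so it is brightened.
shaded = cos_incidence(30.0, 120.0, 20.0, 300.0)
corrected_shaded = cosine_correction(0.2, 30.0, shaded)
```

In a raster workflow the same expressions would be applied per pixel, with slope and aspect derived from the DEM.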
The CTS was carried out with Radiometrically Terrain Corrected (RTC) data from DEM ALOS PALSAR, acquired from the Alaska Satellite Facility (ASF), Fairbanks, with a spatial resolution of 12.5 m, geocoded and radiometrically corrected over its geographical extent [37]. LULC was obtained by supervised classification of the three previously processed images, using the maximum likelihood algorithm based on Bayes’ theorem. Three anthropogenic LULC categories and three natural-physical categories were classified.
For the LUCC analysis, the procedure was analogous to that proposed by [38]. First, the land covers obtained for 2009 and 2017 were analyzed, and the quantitative change was obtained and represented graphically in terms of gains and losses by category.
As explanatory variables, maps were used showing the exchanges between grassland and shrub vegetation, forest and bare soil, crops and grassland, and forest and crops. Additionally, a map of the spatial trend of change from forest to shrub vegetation was made in order to generalize the pattern of change between these categories [39]. A ninth-order polynomial was used.
Changes below 600 cells were ignored in order to limit the transition model to 9 sub-transitions. The sub-transitions selected were: shrub vegetation to forest, bare soil to shrub vegetation, bare soil to grassland, bare soil to forest, grassland to shrub vegetation, grassland to forest, forest to shrub vegetation, forest to grassland, and grassland to crops.
The objective of the transition model is to create potential transition maps with a high degree of precision to execute the change model [38]. The transitions were grouped into a single sub-model, and 6 explanatory variables were included.
The first explanatory variable was the shrub vegetation cover contained in the 2009 map. This variable was included as a dynamic component that is recalculated over time during the course of the prediction. The other explanatory variables were included as static components, corresponding to the spatial trend maps and the exchanges between categories previously generated.
Physical restrictions of any kind were not considered, because they are not necessary in a simple prediction [38]. For the transition model, the Multi-Layer Perceptron (MLP) algorithm was applied, based on the Backpropagation (BP) algorithm, which is a supervised training algorithm [40]. For the change model, Markov chains were applied, generating a matrix with the probability that each land cover category will change to any other category [38]. Validation was performed with the map generated for the year 2019.
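The Markov transition matrix described above can be illustrated with a minimal Python sketch that cross-tabulates two classified maps and normalizes each row to probabilities. The flattened-list representation of the maps, the function name, and the sample data are assumptions for illustration only; the study itself relied on dedicated land change modeling software.

```python
from collections import Counter


def transition_matrix(map_t0, map_t1, classes):
    """Row-normalized matrix P[i][j] = P(class j at t1 | class i at t0).

    map_t0 and map_t1 are equally long sequences of class labels
    (one label per cell). A class absent at t0 is assumed to persist.
    """
    counts = Counter(zip(map_t0, map_t1))
    matrix = []
    for origin in classes:
        row_total = sum(counts[(origin, dest)] for dest in classes)
        if row_total == 0:
            # Hypothetical convention: no evidence of change, keep the class.
            row = [1.0 if origin == dest else 0.0 for dest in classes]
        else:
            row = [counts[(origin, dest)] / row_total for dest in classes]
        matrix.append(row)
    return matrix


# Invented six-cell example, not data from the study.
classes = ["forest", "shrub", "grassland"]
cover_2009 = ["forest", "forest", "forest", "forest", "grassland", "grassland"]
cover_2017 = ["forest", "forest", "forest", "shrub", "grassland", "shrub"]
P = transition_matrix(cover_2009, cover_2017, classes)
```

Each row of `P` sums to one, so the first row reads as the probability that a forest cell remains forest or changes to each other category.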
To verify the map projected to 2019, a confusion matrix was constructed and the kappa coefficient was determined. The confusion matrix relates the reference data (2019 classified map) to the data being evaluated (2019 projected map) class by class, and the overall accuracy of the projection is the number of points correctly assigned divided by the total [36]. The kappa coefficient evaluates the degree of agreement between categorical variables and takes into account coincidences due to chance and to decision criteria; that is, it shows the degree of agreement that exists beyond chance [41].
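A minimal Python sketch of this validation step is given below. It builds the confusion matrix from paired class labels and computes the overall accuracy and Cohen's kappa. The function names and the two-class example counts are illustrative assumptions, not results from the study.

```python
def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix


def overall_accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(k)
    ) / n ** 2
    # Note: undefined when expected == 1 (a single class in both maps).
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Invented counts: 100 validation points, two classes.
example = [[40, 10],
           [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

For these invented counts the observed agreement is 0.85, the chance agreement is 0.50, and kappa is 0.70.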
Similar to [7,42], and following the same conditions used in the 2009–2017 projection, the land use map for the year 2029 was generated from the 2009 and 2019 maps. The 2019 classified map containing LULC is called Scenario (1), and the 2029 projected map containing LULC is called Scenario (2).
2.3. Hydric Recharge Estimation
The multi-year average hydric recharge of the ZH basin was estimated following the methodology set forth by [8] and the infiltration criteria of [10], taking into account multi-year average precipitation, evapotranspiration (ET), basic soil infiltration rate, LULC, and terrain relief. The multi-year average rainfall was determined using data from La Argelia meteorological station over a 30-year period (1985 to 2015). Potential evapotranspiration (ETP) was determined with the Hargreaves and Samani method [43] over the same 30-year period.
As there is a correlation between the crop coefficient (Kc) and the NDVI [44,45,46,47], Kc was derived from the NDVI by applying the expression given in [44]. NDVI was obtained from the Landsat8 OLI_TIRS image (20 September 2017). The multi-year average value of the calculated ETP was multiplied by the Kc value to estimate the actual ET [48].
A slope map was generated using the DEM ALOS PALSAR RTC. This map was classified into 6 categories (class limits of <15, 15, 30, 50, 70, and >70%), and a coefficient was assigned to each. This coefficient, known as the surface runoff coefficient, indicates the fraction that infiltrates due to the effect of slope [9]. Its values were taken from the table proposed by [8].
Each land use scenario (2019 and 2029) was assigned a land use coefficient, which represents the fraction that infiltrates due to the effect of the vegetation cover [9]. Its values were extracted from the table proposed by [8].
The infiltration fraction related to soil texture was estimated using the expression given in [10], as a function of the soil basic infiltration rate in mm/d. This rate was estimated using the Green-Ampt infiltration model [49], from soil data sampled in the ZH basin by [27].
According to [27], the most common soil texture in the study area is loam, with a bulk density from 0.07 to 1.09 g/cm³, an organic carbon percentage from 1.66 to 5.98%, a field capacity of around 25.34%, and a saturated hydraulic conductivity that varies between 4.6 and 8.9 mm/h.
The multi-year average recharge for LULC Scenarios (1) and (2) was determined using the expression given in [8,10], in which precipitation is the multi-year average precipitation at La Argelia station and potential evapotranspiration is the value determined through Hargreaves and Samani.
2.4. Flash Flood Risk Assessment
Precipitation records of 51 years (from 1964 to 2015) were compiled, and the data were quality-controlled, corrected, and revised; the maximum 24 h precipitation for each year was ranked. Different distribution functions were evaluated, and their precision was estimated through accumulated error, the Nash–Sutcliffe efficiency coefficient (NSE), and the root mean square error (RMSE). Using GIS, physical parameters of the basin were determined, such as area, average slope, channel length, and channel slope. The time of concentration was estimated with the Kirpich, Clark, and Temez methods [50,51].
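As an example of one of these estimates, the Kirpich time of concentration can be sketched in Python as shown below. The sketch assumes the widely used metric form of the equation (channel length in m, slope in m/m, result in minutes); the source does not state which variant was applied, and the function name and input values are illustrative only.

```python
def kirpich_tc_minutes(channel_length_m, channel_slope):
    """Kirpich time of concentration in minutes (assumed metric form).

    tc = 0.0195 * L**0.77 * S**(-0.385)
    with L the main channel length (m) and S its mean slope (m/m).
    """
    if channel_length_m <= 0 or channel_slope <= 0:
        raise ValueError("length and slope must be positive")
    return 0.0195 * channel_length_m ** 0.77 * channel_slope ** -0.385


# Invented channel, not the ZH main channel.
tc_gentle = kirpich_tc_minutes(1000.0, 0.01)
tc_steep = kirpich_tc_minutes(1000.0, 0.10)
```

A steeper channel of the same length gives a shorter time of concentration, which in turn shortens the design storm duration and raises the design intensity.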
For subsequent calculations, the design storm duration was set equal to the time of concentration. The maximum intensity was determined for this duration, using the intensity–duration–frequency equations based on the maximum 24 h rainfall at La Argelia station [52].
The hydrological response to extreme events was evaluated through synthetic unit hydrographs from the lumped models of Ven Te Chow, Snyder, Triangular SCS, and Temez. The mean and standard deviation of these results were calculated to assess dispersion and to choose a model whose response is close to the mean. Curve numbers (CN) for scenarios (1) and (2) were determined through GIS using their respective land cover maps, taking into account the soil data provided by [27].
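To show how a curve number translates land cover into runoff, the following Python sketch applies the standard SCS curve number relation in metric units. The initial abstraction ratio of 0.2 is the conventional default and is an assumption here, as are the function name and the example CN values; they are not the values derived for scenarios (1) and (2).

```python
def scs_runoff_mm(precip_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from the SCS curve number method.

    S  = 25400 / CN - 254          potential maximum retention (mm)
    Ia = ia_ratio * S              initial abstraction (mm)
    Q  = (P - Ia)**2 / (P - Ia + S)   for P > Ia, otherwise 0
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if precip_mm <= initial_abstraction:
        return 0.0
    excess = precip_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Invented storm depth and curve numbers.
q_forest = scs_runoff_mm(100.0, 70)
q_pasture = scs_runoff_mm(100.0, 80)
```

For the same storm, the higher curve number yields more runoff, which is the mechanism by which a change in land cover alters the flood response.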
Information corresponding to paths and flows was input to the HEC-RAS hydraulic model, and the outputs obtained were returned to GIS [11,53]. For the hydraulic evaluation, the main channel was divided into 29 cross sections. The roughness coefficients of the main channel and the banks were determined with the procedure explained by the USGS [54,55].
2.5. Meteorological Forecast
Meteorological information was taken from La Argelia station records, analogously to the method of [56]. The forecast was developed with the annual mean temperature values; in addition, the maximum 24 h precipitation of each month was obtained, and these monthly values were averaged to obtain an annual mean. These parameters are referred to hereafter as T_average and Pmax_average.
The monthly maximum 24 h precipitation data were used because, as indicated by [21], one of the near-term effects of rising temperatures in high Andean basins is an increase in high-intensity rainfall. These two annual series were forecast for the year 2029 using the Autoregressive Integrated Moving Average (ARIMA) model, the Holt exponential smoothing model (double exponential), and the Holt–Winters model (triple exponential). These models seek to describe the future behavior of the variables in relation to their past values [18].
The ARIMA model is used to forecast series of a single variable. It is best suited to data series without seasonal variation, so it is widely used for annual series and meteorological variables [18,19,20]. In the ARIMA (p, d, q) model, p is the order of the autoregressive component, which relates current values to their immediate past values; d is the order of differencing applied to the series so that it becomes stationary; and q is the order of the moving average component [17,18,57].
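The role of the differencing order d can be illustrated with a short Python sketch. Each pass replaces the series by the differences between consecutive values, which removes a polynomial trend of matching degree. The function name and the example series are assumptions for illustration and are not station data.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing to a series."""
    values = list(series)
    for _ in range(d):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


# A quadratic trend needs two rounds before it becomes constant.
quadratic = [1, 4, 9, 16, 25]
once = difference(quadratic, d=1)
twice = difference(quadratic, d=2)
```

Each round shortens the series by one value, so a high d leaves fewer points on which to fit the autoregressive and moving average terms.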
The double exponential model (Holt) considers an exponential smoothing, an estimation of trend, and a forecast. It involves adjusting the trend of the series at the end of each period. Triple Exponential Smoothing (Holt–Winters) adds a seasonality component to the Holt model [58].
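A minimal Python sketch of Holt's double exponential smoothing is given below to make the level and trend updates explicit. The initialization (first value as level, first difference as trend), the function name, and the smoothing parameters are assumptions chosen for this example; statistical packages typically estimate these by optimization.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing.

    level_t = alpha * y_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
    trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
    forecast_{t+h} = level_t + h * trend_t
    """
    if len(series) < 2:
        raise ValueError("at least two observations are required")
    level = series[0]
    trend = series[1] - series[0]  # assumed simple initialization
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# A perfectly linear series is extrapolated along the same line.
forecast = holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.5, beta=0.3, horizon=3)
```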
The series were made temporally consistent, covering 1985 to 2015. For the ARIMA model and the exponential smoothing models, the parameters (autoregressive components, seasonality, moving average, and functional transformations) were estimated, and those giving the best fit were selected [17,19,56,57]. To verify the precision of each model and its chosen parameters, statistical measures such as the mean absolute error (MAE) and the root mean square error (RMSE) were determined.
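The two error measures can be computed as in the following Python sketch. The function names and the three-point example are illustrative assumptions.

```python
import math


def mean_absolute_error(observed, predicted):
    """MAE: mean of the absolute differences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """RMSE: square root of the mean squared difference."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Invented values: a single error of 2 units in three points.
obs = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 5.0]
mae = mean_absolute_error(obs, pred)
rmse = root_mean_square_error(obs, pred)
```

Because RMSE squares the residuals, it penalizes the single large error more strongly than MAE does, so reporting both gives an indication of how unevenly the errors are distributed.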
2.6. Water Availability Estimation
To estimate water availability, semi-distributed hydrological modeling was carried out with SWAT for LULC scenarios (1) and (2). From the simulated flow values, flow duration curves (FDC) were determined for each scenario. For the modeling, the DEM ALOS PALSAR RTC was used and weather information was collected from La Argelia station. The model worked with 24 h precipitation and daily average values of maximum temperature, minimum temperature, relative humidity, and wind speed. Solar radiation was estimated by applying the Hargreaves and Samani equations [43].
Monthly climatic parameters required by SWAT were calculated using the SWAT Weather Database tool [59], similar to [14]. Missing values were entered as −99.0, so that SWAT could estimate data for those days [60]. A soil map was generated from the soil sampling data provided by [27]; the data were supplemented and adapted to SWAT requirements, and the soil erodibility variable (USLE_K) was determined by applying the Williams equation [60]. As these soils have unique characteristics, they were added to the SWAT user-soil database table, analogously to [61].
The LULC maps of 2019 and 2029, contained in scenarios (1) and (2) respectively, were used for the analysis. Coverages of forest, shrub vegetation, grassland, bare soil, agriculture, and urban areas were matched to SWAT database land covers based on similarity of characteristics and parameters, similar to [16].
The slope ranges used to define the hydrological response units (HRU) were selected based on those established by [8], which adapt the considerations of [9] on the ease of water infiltration into the ground as a function of slope, similar to [62].
The simulation was run at a daily time step. The number of years to skip (NYSKIP), which is needed to warm up the model, was set to 5 years. As a general rule, a minimum NYSKIP of 3 years is an acceptable warm-up period [1]. The simulated flow begins in 1991 and ends in 2015.
In order to analyze the occurrence and frequency of flow at the basin outlet, and to be able to predict its availability, FDC were generated for each scenario [63,64]. These theoretical curves were determined from modeled daily average data.
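A flow duration curve of this kind can be built from a daily flow series as in the following Python sketch. The Weibull plotting position, the linear interpolation between points, the function names, and the five-value example are assumptions for illustration; the source does not state which plotting position was used.

```python
def flow_duration_curve(daily_flows):
    """Return (exceedance probability in %, flow) pairs, highest flow first.

    Exceedance probability uses the Weibull plotting position
    100 * rank / (n + 1), an assumed convention.
    """
    ranked = sorted(daily_flows, reverse=True)
    n = len(ranked)
    return [(100.0 * rank / (n + 1), flow) for rank, flow in enumerate(ranked, start=1)]


def flow_at_exceedance(fdc, probability):
    """Flow equalled or exceeded `probability` percent of the time."""
    if probability <= fdc[0][0]:
        return fdc[0][1]
    if probability >= fdc[-1][0]:
        return fdc[-1][1]
    for (p0, q0), (p1, q1) in zip(fdc, fdc[1:]):
        if p0 <= probability <= p1:
            # Linear interpolation between neighbouring points.
            return q0 + (q1 - q0) * (probability - p0) / (p1 - p0)


# Invented flows in m3/s, not SWAT output.
fdc = flow_duration_curve([5.0, 1.0, 3.0, 2.0, 4.0])
q50 = flow_at_exceedance(fdc, 50.0)
```

Comparing a low-flow index such as the flow exceeded 90% of the time between scenarios (1) and (2) is one way to read the effect of land cover change on water availability from the curves.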