Identifying a Minimum Time Period of Streamflow Recession Records to Analyze the Behavior of Groundwater Storage Systems: A Study in Heterogeneous Chilean Watersheds

Abstract: Aquifers are complex systems that present significant challenges in terms of characterization due to the lack or absence of watershed-scale hydrogeological information. An alternative to address the need to characterize watershed-scale aquifer behavior is recession flow analysis. Recession flows are flows sustained by groundwater release from the aquifer. Aquifer behavior can be characterized using recession flow records available from gauging stations, and therefore an indirect measure of aquifer behavior is obtained through watershed-scale recession flow records and analysis. This study seeks to identify the minimum time period necessary to characterize the behavior of groundwater storage systems in watersheds with different geological, morphological, and hydrological characteristics. To this end, various watersheds in south-central Chile underwent recession flow analysis, with eight time periods considered (2, 3, 4, 5, 10, 15, 20, and 25 years). The results indicate that 25 years of records are sufficient for the characterization of watershed-scale aquifer behavior, along with the representation of the groundwater storage-release (S-Q) process in watersheds with different geological, morphological, and hydrological characteristics. Additionally, the results show that an initial characterization of the groundwater system behavior in watersheds with different geological characteristics can be carried out with two years of records. This information could be important for practical engineering and the study of groundwater systems in watersheds with limited hydrological and hydrogeological information.


Introduction
Characterization of groundwater systems is often a complex task because watershed-scale hydrogeological information is limited due to data scarcity, limitations of monitoring techniques, or access difficulties due to geological and/or topographic factors [1,2].
Meanwhile, to characterize the temporality of hydrological processes or the hydrogeological functioning of watersheds, complete hydrological or hydrogeological data series are required (e.g., spatially distributed groundwater levels in a watershed). However, spatially distributed hydrogeological time series for these studies are generally limited or nonexistent, and when they are available, they tend to present discontinuities [3].
The selected watersheds present different geological formations (volcanic, plutonic, metamorphic, sedimentary, and mixed) as well as different hydrogeological properties (fractures, porosity, permeability, etc.). These characteristics favor the ability of the watersheds to conduct and transmit water, facilitating interactions among groundwater infiltration, storage, and discharge processes [33,34]. The studied watersheds are monitored on the western slope of the Andes Mountains, the Central Valley, and the Coastal Range.
The period selected for the recession flow analysis comprises 30 years of daily mean flows (1990-2019). The daily mean flow records were obtained from the Catchment Attributes and Meteorology for Large Sample Studies-Chile Dataset (CAMELS-CL) platform, presented by Alvarez-Garreton et al. [35]. This platform includes meteorological (e.g., precipitation and evapotranspiration) and hydrological (e.g., streamflow) data from throughout Chile for each watershed; therefore, post-processing such data (i.e., interpolation) was not necessary. Appendix A presents a table with information on the studied watersheds, while the flow data can be downloaded from the CAMELS-CL platform (https://camels.cr2.cl/, accessed on 15 July 2023).

Recession Flow Analysis
The method proposed by Brutsaert and Nieber [25] was used for the recession flow analysis. The snowmelt periods (October-December) of each year of records were removed to obtain periods in which the river streamflow is generated entirely by a groundwater storage-discharge process.
The recession events were extracted from the hydrograph of each basin as periods in which dQ/dt remained below zero for at least 5 consecutive days, ending when it exceeded zero. To avoid the influence of rapid precipitation-runoff processes, the beginning of the recession period was considered one day after the peak flow.
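The event-selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `recession_events`, the use of a simple one-day difference for the sign of dQ/dt, and the inclusive end index are all assumptions made for the example.

```python
import numpy as np

def recession_events(q, min_days=5):
    """Return (start, end) index pairs of recession events in a daily flow series.

    An event is a run of at least `min_days` consecutive days with dQ/dt < 0
    (approximated here by the one-day difference). Each event starts one day
    after the peak flow; `end` is inclusive.
    """
    q = np.asarray(q, dtype=float)
    dq = np.diff(q)              # dq[i] = q[i + 1] - q[i]
    falling = dq < 0             # NaN gaps compare as False and break events
    events = []
    i, n = 0, len(dq)
    while i < n:
        if falling[i]:
            j = i
            while j < n and falling[j]:
                j += 1
            # q[i] is the peak; flow declines through index j
            if j - i >= min_days:
                events.append((i + 1, j))   # skip the peak day itself
            i = j
        else:
            i += 1
    return events
```

With a series such as `[1, 5, 4.5, 4, 3.6, 3.3, 3.1, 3.0, 6, 5, 4, 7]`, only the first decline is long enough to qualify; the short two-day decline after the second peak is discarded.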
To determine recession slope b (Equation (1)), the logarithm of the rate of change in flow (dQ/dt) was plotted against the logarithm of average discharge ($(Q_i + Q_{i-1})/2$) during the same period. The rate of change (dQ/dt) was calculated using the exponential time-step method (ETS) proposed by Roques et al. [36]. The ETS removes artifacts (errors) resulting from the computation of the rate of change in discharge (dQ/dt). This method differs from the traditional method, in which the time step is constant ($\Delta t = t_i - t_{i-1}$), since the time step increases exponentially within each recession event [37]. The time step ($\Delta t$) is obtained with an increment m, which is calculated by fitting an exponential function to the recession period. The m value is the number of values in an interval ($\Delta t = t_{i+m} - t_i$). The change rate (dQ/dt) is calculated using the linear least-squares method over the interval ($Q_i, Q_{i+m}$; $t_i, t_{i+m}$) [36]. Slope b was calculated by the log-log linear least-squares method applied to grouped recession data (data bins). Each bin was obtained by averaging log(dQ/dt) and log(Q) over intervals containing 5% of the recession data.
Parameter b represents the recession regime (or the rate of the decrease in flow over time). A high b value (>1.5) represents a rapid decrease in flows following a precipitation event (fast recession), while a low b value (<1.5) represents a gradual decrease in flows after a precipitation event (slow recession). In other words, in watersheds with low b values, a steady groundwater contribution to rivers would predominate, dampening the flow decrease. Meanwhile, in watersheds with high b values, the groundwater contribution would be released rapidly and would decline quickly.
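The binning and log-log fit can be illustrated with a short sketch. Note that it uses the traditional constant time step rather than the ETS method of Roques et al. [36], purely to keep the example small; the function names, the base-10 logarithm, and the equal-count bins are assumptions for illustration.

```python
import numpy as np

def constant_step_rates(q_event, dt=1.0):
    """Mean discharge and -dQ/dt for one event, using a constant time step.

    This is the traditional scheme, NOT the exponential time step (ETS).
    """
    q = np.asarray(q_event, dtype=float)
    q_mean = (q[1:] + q[:-1]) / 2.0
    neg_dqdt = -(q[1:] - q[:-1]) / dt
    return q_mean, neg_dqdt

def recession_slope(q_mean, neg_dqdt, bin_fraction=0.05):
    """Fit log(-dQ/dt) = log(a) + b*log(Q) to binned recession data.

    Points are sorted by discharge and grouped into bins that each hold
    roughly `bin_fraction` of the data; the bin means are then fitted by
    linear least squares. Returns (b, log10_a).
    """
    q_mean = np.asarray(q_mean, dtype=float)
    neg_dqdt = np.asarray(neg_dqdt, dtype=float)
    ok = (q_mean > 0) & (neg_dqdt > 0)          # logarithms need positive values
    x = np.log10(q_mean[ok])
    y = np.log10(neg_dqdt[ok])
    order = np.argsort(x)
    x, y = x[order], y[order]
    n_bins = max(2, int(round(1.0 / bin_fraction)))
    xb = np.array([c.mean() for c in np.array_split(x, n_bins) if c.size])
    yb = np.array([c.mean() for c in np.array_split(y, n_bins) if c.size])
    b, log_a = np.polyfit(xb, yb, 1)            # slope first, then intercept
    return b, log_a
```

Averaging within bins before fitting reduces the weight of densely sampled low-flow points, which is one plausible reason for fitting the bin means rather than the raw point cloud.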
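For reference, the role of b can be written out explicitly. The expressions below assume that Equation (1) takes the standard power-law form of Brutsaert and Nieber [25]; the coefficient a is not discussed in this section and is included only to complete the relation.

```latex
-\frac{dQ}{dt} = a\,Q^{b}
\quad\Longrightarrow\quad
\log\!\left(-\frac{dQ}{dt}\right) = \log a + b\,\log Q
```

Under this assumption, b is the slope of the recession point cloud in log-log space, so the threshold b = 1.5 separates steeper (fast-draining) from flatter (slow-draining) point clouds.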

Clusters
This study used the 8 watershed clusters presented by Parra et al. [28]. These groups were obtained using principal component analysis (PCA) and K-means methods, which were programmed in MATLAB. With the PCA, the coefficients were obtained through covariance matrix decomposition to identify the principal components that capture the greatest data variability. The clusters were obtained via the K-means method, which requires as an input parameter the number of groups (K) to cluster the data. The parameter K sets the number of centroids, which are initially assigned at random, one per cluster. Via an iterative process, the distances between each centroid and the data are calculated using the squared Euclidean distance. In each iteration, the centroids are recalculated, and all observations are assigned to the nearest cluster or centroid until the assigned data and centroid of each cluster stabilize. The clustering characteristics were the degree of permeability, the mean slope, and the aridity index of each watershed. The degree of permeability was obtained using geological formations and the drainage network of each basin. A permeability value was assigned to each stream/river based on the type of geological formation (5.0 for formations with high permeability, such as fractured geology, and 1.0 for formations with low permeability, such as plutonic geology). Finally, the value of the degree of permeability for the entire watershed was obtained as an average of each stream/river section. The aridity index was obtained as the ratio between mean annual precipitation and mean annual evapotranspiration in each watershed.
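The two steps described above (PCA through covariance decomposition, then K-means with squared Euclidean distance) can be sketched in a few lines. The original work was programmed in MATLAB; the NumPy version below is only an illustrative equivalent, and the function names, the random initialization from data points, and the convergence check are assumptions.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project X (n_samples, n_features) onto its leading principal components."""
    Xc = X - X.mean(axis=0)                      # center each attribute
    cov = np.cov(Xc, rowvar=False)               # covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)         # symmetric eigendecomposition
    order = np.argsort(eigval)[::-1][:n_components]
    return Xc @ eigvec[:, order], eigval[order]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means with squared Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Initial centroids: k distinct observations picked at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)               # nearest centroid per observation
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):          # assignments have stabilized
            break
        centroids = new
    return labels, centroids
```

In a setting like the one described, each row of `X` would hold the degree of permeability, mean slope, and aridity index of one watershed. Whether the attributes were standardized before clustering is not stated here, so any scaling step is left to the caller.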
Figure 2 shows boxplots with the hydrogeological, morphological, and climate characteristics of the basins for the 8 clusters obtained by Parra et al. [28] (mean elevation, mean slope, degree of permeability, and aridity index). The basins that compose clusters C1 and C6 are located in the Andes Mountains and foothills, with mean elevations ranging from 650 to 1900 m.a.s.l., slopes over 13°, and medium to high permeability (volcanic, plutonic, and mixed geological formations) (see Figure 2a-c). Meanwhile, C3 and C5 are composed of basins monitored in the Central Valley, with mean elevations below 600 m.a.s.l., slopes between 0.7 and 7°, and variable permeability (low, medium, and high) (see Figure 2a-c). Clusters C2, C4, and C8 are composed of a broad group of basins monitored in the Central Valley and Coastal Range (heterogeneous clusters). Finally, cluster C7 is composed of basins monitored in the Coastal Range, with mean elevations between 200 and 800 m.a.s.l., slopes between 4 and 11° (Figure 2a,b), and medium-low permeability associated with sedimentary to mixed characteristics.

Estimation of b as a Function of Flow Record Length
To determine the value of b (S-Q process behavior) as a function of flow record length, eight sizes of moving time windows of clustered recession events (point clouds) were analyzed. Time windows of 2 years (w1), 3 years (w2), 4 years (w3), 5 years (w4), 10 years (w5), 15 years (w6), 20 years (w7), and 25 years (w8) were selected to determine a minimum or adequate analysis period, covering long-term processes such as climate variability.
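As a small illustration of the window scheme, the sketch below enumerates moving windows over the 1990-2019 record for each of the eight sizes. The one-year shift between successive windows is an assumption, since the step is not stated, and the helper name `moving_windows` is hypothetical.

```python
def moving_windows(first_year, last_year, size):
    """List every window of `size` consecutive years, shifted one year at a time."""
    return [
        tuple(range(start, start + size))
        for start in range(first_year, last_year - size + 2)
    ]

# Window labels and sizes (in years) as defined in the text
WINDOW_SIZES = {"w1": 2, "w2": 3, "w3": 4, "w4": 5,
                "w5": 10, "w6": 15, "w7": 20, "w8": 25}

windows = {name: moving_windows(1990, 2019, n) for name, n in WINDOW_SIZES.items()}
```

Under this assumption a 30-year record yields 29 two-year windows but only 6 twenty-five-year windows, so the boxplots for the longest windows summarize far fewer (and heavily overlapping) samples than those for the shortest ones.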

Results
Figure 3 shows the results (box-plots) obtained from the recession flow analysis, along with a comparison of results for each time window and cluster, while Figure 4 presents the median slope b obtained from the analysis of each time window.

The results demonstrate a relationship between the recession slope b and the geological and morphological characteristics of the watersheds, since the variability of slope b is greater in mountain basins (clusters 1 and 6) than in basins monitored in the Central Valley (clusters 3 and 5 in Figure 3). In general, Figure 3 shows that the variation (i.e., the size of the box-plot) in the estimation of recession slope b slightly decreases as the size of the time window w increases. Regarding the variation of parameter b, it is observed that in clusters 1, 6, and 7 (formed by watersheds of the Andes Mountains and Coastal Range with a predominance of volcanic and granitic geology), the highest proportion of b values (many b values approximately above the 3rd quartile; see Figure 3) is greater than 1.5. Meanwhile, groups C2, C3, C4, C5, and C8 (formed by watersheds with predominantly sedimentary and mixed geology) present the highest proportion of b values below 1.5 (many values of b below the 3rd quartile; see Figure 3).
Figure 4 shows the median of the boxplot for each moving window. The median b values in groups C1, C6, and C7 are over 1.5, while in the other groups (C2, C3, C4, C5, and C8) the median b value is below 1.5. Therefore, if the median of the boxplot is considered as the representative slope (b) for each cluster, the size of the window of analysis does not seem to affect the identified hydrogeological behavior: the volcanic and mixed watersheds present a quick recession (rapid drainage), while the sedimentary and mixed watersheds have a slow recession (slow drainage). In addition, Figure 5 shows the correlation matrix analysis between the median slope b of each watershed in this study obtained from the 8 moving windows. A good fit between the medians of the time windows is observed (r > 0.98), which indicates that the groundwater behavior may be determined based on the median slope using limited streamflow recession data records (e.g., two-year records).
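A correlation matrix of the kind shown in Figure 5 can be computed as sketched below. The sketch assumes that r denotes the Pearson coefficient and that the medians are arranged with one row per watershed and one column per time window; the array in the usage lines is synthetic and does not reproduce any result from this study.

```python
import numpy as np

def window_correlation(median_b):
    """Pearson correlation between time windows.

    median_b has shape (n_watersheds, n_windows); the result is an
    (n_windows, n_windows) symmetric matrix with ones on the diagonal.
    """
    return np.corrcoef(np.asarray(median_b, dtype=float), rowvar=False)

# Synthetic stand-in: 72 watersheds, 8 windows, small window-to-window noise
rng = np.random.default_rng(0)
base = rng.uniform(1.0, 2.5, size=72)
synthetic_medians = base[:, None] + rng.normal(0.0, 0.02, size=(72, 8))
r = window_correlation(synthetic_medians)
```

A high off-diagonal value between the w1 and w8 columns would mean that watersheds rank similarly by median b whether two or twenty-five years of records are used, which is the reading given to Figure 5 in the text.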

Discussion
In this study, the recession parameter b results obtained from different time windows and watershed clusters were analyzed. In general, if the median of the estimated b values is considered as a representative value of each cluster, the results show that independent of the size of the time window, the watersheds with fast (b > 1.5) or slow drainage hydrogeological behavior (b < 1.5) are consistently identified; however, the smaller the record time window size, the greater the uncertainty in the estimation of recession parameter b (i.e., wider box-plots).
The two clusters that show the greatest variability in parameter b are related to watersheds located in the Andes Mountains (C1 and C6), at elevations over 600 m.a.s.l. and with steep slopes above 13° (see Figures 2 and 3). This behavior is linked to the hydrogeological (lava with a high degree of fracturing) and morphological characteristics (steep slopes) of such watersheds (Figure 5). These characteristics favor the interaction of surface water and groundwater [38,39] and, therefore, the behavior of storage-release process dynamics [32]. A similar behavior is observed in the watersheds monitored in the Coastal Range (C7), with mean elevations between 200 and 800 m.a.s.l. and slopes between 4° and 11°, as they also present relatively high variability of slope b. The watersheds located in the Coastal Range present a greater proportion of granitic geology [33] with a high level of soil degradation. This soil presents low permeability [40], which influences runoff generation and the infiltration process [41] and, therefore, the groundwater storage and release relationship represented by parameter b.
The remaining clusters (C2, C3, C4, C5, and C8) show less recession slope variability (narrower boxplots). These watersheds present a major influence of sedimentary and mixed formations, which result in greater stability of flows associated with the aquifer and therefore less b variability. For example, the watersheds of C3 are monitored in the Central Valley of Chile; therefore, they present a greater sedimentary influence. These watersheds present stable behavior, as the median b value remains relatively constant at each time window (see yellow line in Figure 4). This indicates that there is a greater groundwater contribution, maintaining stable flows in rivers in watersheds with sedimentary and mixed geology. The behavior observed in these watersheds is in line with the results presented by Parra et al. [17], who mention that watersheds with sedimentary geology influence residence time, while watersheds with mixed geology present a transitory behavior (fast and slow drainage).
Figure 3 shows that the size of the box decreases as the size of the window under analysis increases, suggesting that the results tend to be more robust with a longer time window under analysis. In addition, if the median of the boxplot is considered as a representative value for the aquifer characterization, a two-year (w1) time window appears to be sufficient for almost all clusters (an exception is seen in C1, where more data seem to be necessary to better describe that cluster).
Although the watershed clusters present parameter b variability for the different time windows analyzed, the average hydrogeological behavior (median in Figure 4) is maintained, independent of the size of the time window.
To further analyze whether a two-year time window is enough for watershed hydrogeological characterization, the median slope b obtained from 250 random combinations of two (non-consecutive) years of streamflow recession was plotted (Figure 6). For each random two-year combination, the median was calculated from the slope b values of the watersheds that comprise the cluster. Figure 6 shows that in over 95% of the results, the median hydrogeological behavior for clusters C2, C3, C4, C6, and C8 can be properly obtained from two years of records. On the other hand, clusters C1, C5, and C7, mainly basins with fractured and granitic geology, would need more data records to estimate the hydrogeological behavior of the watersheds. These results indicate that an initial characterization of groundwater system behavior or regime (fast or slow drainage) in watersheds with different geological characteristics (volcanic, sedimentary, plutonic, or mixed) can be achieved based on the median of the flow recession curves recorded in two years (not necessarily consecutive). However, watersheds with fractured or granitic geology might need longer records.
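The resampling experiment can be sketched as follows. The text describes the two years both as "non-consecutive" and as "not necessarily consecutive", so the sketch exposes this as a flag and, as an assumption, excludes consecutive pairs by default. The function names are hypothetical, repeated pairs are allowed, and the consistency measure is one simple way to express the "over 95% of the results" criterion rather than the authors' exact calculation.

```python
import numpy as np

def random_year_pairs(first_year, last_year, n_draws=250,
                      allow_consecutive=False, seed=0):
    """Draw `n_draws` random pairs of distinct years from the record."""
    rng = np.random.default_rng(seed)
    years = np.arange(first_year, last_year + 1)
    pairs = []
    while len(pairs) < n_draws:
        y1, y2 = sorted(rng.choice(years, size=2, replace=False))
        if allow_consecutive or y2 - y1 > 1:
            pairs.append((int(y1), int(y2)))
    return pairs

def fraction_consistent(cluster_medians, threshold=1.5):
    """Share of draws whose cluster median falls on the majority side of b = 1.5."""
    m = np.asarray(cluster_medians, dtype=float)
    above = np.mean(m > threshold)
    return max(above, 1.0 - above)
```

For each pair, the recession events of the two selected years would be pooled, slope b fitted per watershed, and the cluster median taken; a cluster whose `fraction_consistent` value exceeds 0.95 would then be classified the same way in more than 95% of the draws.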
Determining hydrogeological behavior via parameter b and its connection to physiographical, geological, and hydrological characteristics allows us to ascertain which watersheds likely have a continuous groundwater contribution to rivers and thus are possibly resilient to climate variability and change [42]. Therefore, recession flow analysis is an adequate tool for providing preliminary results and facilitating the selection of complementary tools suitable for future studies or research on aquifer characterization. In addition, the identification of a minimum time window for groundwater characterization optimizes data collection, decreasing the time and costs associated with groundwater system characterization.
The results obtained from 72 watersheds grouped into eight clusters allow an understanding of the hydrogeological behavior of a group of watersheds with similar geomorphological characteristics. Using clusters allows for more robust conclusions since the findings are related not to a single watershed but rather to a group of watersheds with similar geomorphological characteristics. Small catchments and large basins (100-20,500 km²) were analyzed in this study, covering a wide range of geomorphological features. Therefore, the results could be extrapolated to other areas with similar characteristics. However, validation is recommended, as the clusters obtained are valid for the attributes considered in this study.
The findings of this study conducted in the diverse watersheds of central Chile hold significant importance for understanding the impact of the 15-year drought that has afflicted the region [43]. Lee and Ajami [44] have previously outlined how extended drought periods affect baseflow in rivers across the USA. They elucidate that the decrease in groundwater discharge, stemming from diminished recharge, directly impacts aquifer systems. According to their research, rivers require a long period beyond the duration of the drought to fully recover. Consequently, using a two-year window enables a nuanced understanding of aquifer response to prolonged drought, allowing analysis of changes in recession constants, as demonstrated by Trotter et al. [45]. This observation is particularly noteworthy, especially given recent publications expressing concerns regarding groundwater exploitation in central Chile [46][47][48].

Conclusions
This study analyzed recession parameter b based on flow records with different time windows. To this end, watershed clusters with diverse hydrological, geological, and morphological characteristics were used to identify a minimum time period for characterizing the (average) behavior of groundwater storage systems.
The results obtained in this study provide relevant information on the hydrogeological behavior of diverse watersheds, highlighting the importance of considering diverse predominant characteristics and a minimum record length for an adequate characterization. The results show that a 25-year time window is sufficient to estimate parameter b (hydrogeological regime or behavior) in watersheds with different geological, morphological, and hydrological characteristics. Additionally, the results show that an initial characterization of the groundwater system behavior in watersheds of volcanic, sedimentary, plutonic, or mixed geology can be carried out with 2 years of records.
Finally, understanding the behavior of groundwater systems (aquifers) in watersheds with limited hydrogeological records is fundamental for identifying watersheds resilient to climate variability and change. Therefore, the results of this study contribute to practical engineering, as they broaden the spectrum of potential data that can be used to study groundwater systems in watersheds without extensive hydrological and hydrogeological records.

Figure 1. Locations of the watersheds used in the study area and their hydrological and geomorphological characteristics. The predominant slope map (a), degree of permeability (b), and aridity index (c) of each watershed are shown.


Figure 2. Cluster characteristics. The mean elevation (a), mean slope (b), degree of permeability (c), and aridity index (d) of each cluster are shown.

Water 2024, 16, x FOR PEER REVIEW

Figure 5. The correlation matrix between each watershed's median slope b values from the 8 time windows.


Figure 6. Median slope b values of two-year combinations. The dashed line represents the limit value of b for fast drainage (b > 1.5) and slow drainage processes (b < 1.5).
