Assessing Reservoir Performance under Climate Change. When Is It Going to Be Too Late If Current Water Management Is Not Changed?

: Climate change is modifying the way we design and operate water infrastructure, including reservoirs. A particular issue is that current infrastructure and reservoir management rules will likely operate under changing conditions different to those used in their design. Thus, there is a big need to identify the obsolescence of current operation rules under climate change, without compromising the proper treatment of uncertainty. Acknowledging that decision making beneﬁts from the scientiﬁc knowledge, mainly when presented in a simple and easy-to-understand manner, such identiﬁcation— and the corresponding uncertainty—must be clearly described and communicated. This paper presents a methodology to identify, in a simple and useful way, the time when current reservoir operation rules fail under changing climate by properly treating and presenting its aleatory and epistemic uncertainties and showing its deep uncertainty. For this purpose, we use a reliability– resilience–vulnerability framework with a General Circulation Models (GCM) ensemble under the four Representative Concentration Pathways (RCP) scenarios to compare the historical and future long-term reservoir system performances under its current operation rule in the Limar í basin, Chile, as a case study. The results include percentiles that deﬁne the uncertainty range, showing that during the 21st century there are signiﬁcant changes at the time-based reliability by the 2030s, resilience between the 2030s and 2040s, volume-based reliability by the 2080s, and the maximum failure by the 2070s. Overall, this approach allows the identiﬁcation of the timing of systematic failures in the performance of water systems given a certain performance threshold, which contributes to the planning, prioritization and implementation timing of adaptation alternatives.


Introduction
Reservoirs are fundamental infrastructure for water supply and irrigation, especially in semiarid and arid regions, such as Central Chile, characterized by a marked seasonality of the hydrological cycle. Furthermore, considering that agriculture is responsible for about 70% of global water withdrawals [1], and that climate change-related water reductions have affected arid and semi-arid regions of the world [2], reservoir management in these The Paloma reservoir system located in the basin supplies water to~70,000 ha of irrigated land and drinking water to Ovalle city (110,000 inhabitants). The system is composed of the Paloma, Cogotí, and Recoleta reservoirs, whose capacities are 750, 150, and 100 Mm 3 , respectively (Figure 1), and according to Chilean water regulation [92] is expected to have a time-based reliability of an 85%. The total capacity largely exceeds the average annual system inflow (i.e., 400 Mm 3 ), which allows coping with the high inter-annual variability (i.e., between several years) [41] as presented in Table 2, which shows values of the coefficient of variation ranging between 0.70 and 1. 27. Moreover, the intra-annual streamflow variability (i.e., between months of a year) is naturally regulated by snow accumulation and melting, which drives the spring and summer peak flows. Because future spring and summer flows are expected to occur earlier in the season due to climate change [40,41], the reservoir system is expected to participate more actively in regulating future flows. As shown in Figure 1, the Recoleta and Paloma reservoirs operate in parallel, while the Cogotí reservoir located in the upper basin is in series with Paloma. Nevertheless, because the Paloma reservoir is five times larger than the Cogotí and given that the last mainly supplies the agricultural area above Paloma reservoir, the entire system can be simulated as three reservoirs working in parallel.
The current operation rule of the system was developed by Ferrer et al. [93], who used precipitation and streamflow data from 1944 to 1976 to simulate the inter-annual variability and various allocation scenarios. Ferrer et al. [93], estimated volumes of 138 and 220 Mm 3 for the 3-and 4-year moving average of the annual inflows to the system with 85% exceedance probability, respectively. The moving average inflows with 85% exceedance probability are important because they give information about the maximum amount of water that can be allocated, while keeping a time-based reliability of 85%, which restricts the maximum amount of water rights that can be allocated in Chile [92]. After analyzing the historical hydrology, Ferrer et al. [93] concluded that a wet year is expected to follow three dry years. Hence, a period of three years was considered critical for the long-term operation of the reservoir system, leading to a single annual allocation decision [93]. Based on this analysis Ferrer et al. [93] determined the operation rule and the amount of water to be allocated from each reservoir that has been operating ever since (Equation (5), depicted at the end of this section).
which restricts the maximum amount of water rights that can be allocated in Chile [92]. After analyzing the historical hydrology, Ferrer et al. [93] concluded that a wet year is expected to follow three dry years. Hence, a period of three years was considered critical for the long-term operation of the reservoir system, leading to a single annual allocation decision [93]. Based on this analysis Ferrer et al. [93] determined the operation rule and the amount of water to be allocated from each reservoir that has been operating ever since (Equation (5), depicted at the end of this section).   More formally, the system operation is expressed as follows. The stored volume S t+1 j (m 3 ) in reservoir j at the beginning of year t + 1 is where O, I and Sp are the outflow, inflow, and spilled water, respectively (m 3 ). For this study, and as depicted in Section 3, inflows (I) are obtained from the WEAP hydrological model, which simulates synthetic streamflows using future climate projections. E is the net evaporation from the reservoir (m 3 ): where e is the evaporation rate (m) estimated from meteorological data, P is the precipitation (m), and A is the surface area (m 2 ) for reservoir j during year t, associated with the current storage elevation, which in turn is also related to the water storage volume. Given that the stored water S t j cannot exceed the maximum storage capacity MS j , the spilled water (Sp t j ) from Equation (1) is excess water that the reservoir is not able to store (Equation (4)). Moreover, usable stored water S t j is restricted by the dead storage DS (m 3 ), which is the storage under the lowest discharge outlet, and hence cannot be used: Although the relation between surface area and water stored in the reservoirs may change through time, for simplification we assume the original design for each reservoir (i.e., dead storage, maximum storage and the relationship between the surface area and the stored volume) to remain constant.
The water allocated in year t + 1 (O t+1 j ) is a function of the stored water in the system If S t T overpasses a threshold or restrain bound (RB), a fixed amount α j is allocated from reservoir j. Otherwise, the allocated water is a fraction r of the storage.
Ferrer et al. [93] determined values of α = 240, 40 and 40 Mm 3 for the Paloma, Recoleta, and Cogotí reservoirs, respectively, as well as values of RB = 500 Mm 3 and r = 0.5. Thus, if the system storage exceeds 500 Mm 3 , the maximum allowed annual water allocation is 320 Mm 3 . Otherwise, half of the stored water is allocated. Note that, although it may be uncommon for other places, the reservoir operation is based on an annual decision, which has been successfully used for more than four decades. As we aim to evaluate when climate change may significantly impair their performance, identifying the timing for adaptation strategies to be implemented, we need to assess the current reservoir operation rule instead of a theoretical or optimized new operation that could be defined at a different time scale (monthly or other). Furthermore, changing water allocation goals might not be simple in Chile, as according to the 1981 Water Code water use rights are legally treated as real state private property, which are granted to perpetuity. Hence, modifications in the Water Code may be needed prior to changes in the allocation.

Methodology
Several methods and tools are used in this study to analyze the relation between climate, hydrology, the reservoir system, and its performance. An already developed hydrological climate-driven model implemented in the Water Evaluation and Planning (WEAP) system [94,95] was updated and calibrated using historical monthly runoff data. This model is utilized to generate streamflow scenarios for the operation of the reservoirs' system using synthetic local climate projections as input. For the generation of these projections, precipitation, and temperature data for the period 1971-2100 from 49 runs of GCMs (Appendix A) and the four RCPs (i.e., 149 climate projections) were downscaled with a weather generator. Finally, the performance of the reservoirs' system under this set of streamflow scenarios was characterized through performance indexes. Figure 2 depicts the methodology, with a special focus on the implementation of Equations (1)-(5) to simulate the reservoirs' operation, particularly the water allocation. Details of the methodology are presented in the subsections below.
tions, precipitation, and temperature data for the period 1971-2100 from 49 runs of GCMs (Appendix A) and the four RCPs (i.e., 149 climate projections) were downscaled with a weather generator. Finally, the performance of the reservoirs' system under this set of streamflow scenarios was characterized through performance indexes. Figure 2 depicts the methodology, with a special focus on the implementation of Equations (1)-(5) to simulate the reservoirs' operation, particularly the water allocation. Details of the methodology are presented in the subsections below.

Hydrological Modeling
WEAP uses climate information as input to generate streamflow following a semidistributed approach. In the model, elevation bands are used as hydrological units where climate, soil, topography, and land use characteristics are specified. A WEAP model already set up for the Limarí Basin by Vicuña et al. [40,41] was re-calibrated and used in this study. The re-calibration used the most recent years , leaving the period 1969-1984 for validation. Because this approach allows obtaining calibration parameter from a more recent period, the time lag between the periods of calibration and simulation using

Hydrological Modeling
WEAP uses climate information as input to generate streamflow following a semidistributed approach. In the model, elevation bands are used as hydrological units where climate, soil, topography, and land use characteristics are specified. A WEAP model already set up for the Limarí Basin by Vicuña et al. [40,41] was re-calibrated and used in this study. The re-calibration used the most recent years (1985-2011), leaving the period 1969-1984 for validation. Because this approach allows obtaining calibration parameter from a more recent period, the time lag between the periods of calibration and simulation using to future climate projections is reduced, and more reliable hydrologic simulations are obtained [96]. Overall, the simulated and observed hydrographs are similar, and satisfactory Nash-Sutcliffe efficiency (NSE) coefficient values [97] were obtained for the different streamflow gauges (Table 2); furthermore, observed and simulated average annual flows are very similar for the calibration period. The model tends to underestimate the annual flow for the validation period, while the coefficient of variation (CV) is underestimated for the entire historic period mainly due to the underestimation of the extreme discharge values during very humid months. Given that extreme discharges tend to produce water spills in the reservoir operation, the underestimation of them does not affect the performance of the simulation of the operation rule. In the validation period the NSE value improves, compared to the calibration period for the San Agustín and Ojos de Agua gauges, is maintained for Cuestecita gauge and decreases for Las Ramadas, Desembocadura, and Fragüita gauges ( Table 2). Flow discharges simulated at the San Agustin gauge are the least satisfactory ( Figure 3). This gauge receives contributions from the highest elevations in the basin (>5000 m) where reliable meteorological measurements are scarce. Nonetheless, even at this gauge low flows (i.e., the most relevant flows for the long-term simulation of the basin) are well simulated. Moreover, Table 2 shows good simulations associated with the main tributaries contributing to the system (i.e., Las Ramadas, Fragüita, and Cuestecita). These contributions explain most of the inflows to Cogotí and Paloma reservoirs, which represent 90% of the system storage volume. For a more detailed description of the implemented WEAP model, readers are referred to the references by Vicuña et al. [40,41].
are very similar for the calibration period. The model tends to underestimate the annual flow for the validation period, while the coefficient of variation (CV) is underestimated for the entire historic period mainly due to the underestimation of the extreme discharge values during very humid months. Given that extreme discharges tend to produce water spills in the reservoir operation, the underestimation of them does not affect the performance of the simulation of the operation rule. In the validation period the NSE value improves, compared to the calibration period for the San Agustín and Ojos de Agua gauges, is maintained for Cuestecita gauge and decreases for Las Ramadas, Desembocadura, and Fragüita gauges ( Table 2). Flow discharges simulated at the San Agustin gauge are the least satisfactory ( Figure 3). This gauge receives contributions from the highest elevations in the basin (>5000 m) where reliable meteorological measurements are scarce. Nonetheless, even at this gauge low flows (i.e., the most relevant flows for the long-term simulation of the basin) are well simulated. Moreover, Table 2 shows good simulations associated with the main tributaries contributing to the system (i.e., Las Ramadas, Fragüita, and Cuestecita). These contributions explain most of the inflows to Cogotí and Paloma reservoirs, which represent 90% of the system storage volume. For a more detailed description of the implemented WEAP model, readers are referred to the references by Vicuña et al. [40,41].

Development of Climate Change Scenarios
Under the premise that many GCMs are needed to characterize the uncertainty when analyzing climate change impacts, this study considered 49 GCMs realizations (Appendix A) and the RCPs 2.6, 4.5, 6.0, and 8.5 [98,99]. GCMs' projected precipitation and temperature series were downscaled using a weather generator downscaling method, obtaining 200 synthetic time series of climates, under each RCP scenario, for the period 1970-2100. In the first step, GCM outputs are interpolated using inverse square distance to the ground weather station locations.

Development of Climate Change Scenarios
Under the premise that many GCMs are needed to characterize the uncertainty when analyzing climate change impacts, this study considered 49 GCMs realizations (Appendix A) and the RCPs 2.6, 4.5, 6.0, and 8.5 [98,99]. GCMs' projected precipitation and temperature series were downscaled using a weather generator downscaling method, obtaining 200 synthetic time series of climates, under each RCP scenario, for the period 1970-2100. In the first step, GCM outputs are interpolated using inverse square distance to the ground weather station locations.
The Chadwick et al. [90] annual weather generator downscaling method was adopted, which considers (1) the extraction of long-term trends from the changes of the mean and the standard deviation from a GCM group, and (2) the generation of annual climate (i.e., precipitation and temperature) time series based on these trends, and the historical observed climate between 1971 and 2005. The trends are extracted from the changes of a GCM group for each RCP. Changes in the precipitation mean from the GCM projection G are obtained by the ratio between the moving averages of the GCM precipitation (MAP t,G ) and the average from the GCM control period (AP t o ,G ), where t and t o are the last year of the moving window and the control period, respectively. This ratio is called normalized moving average (NMAP t,G ). The trends are built using all the NMAP t,G from each RCP.
For each year t an empirical cumulative distribution function (CDF) of all the NMAP t,G is fitted. Finally, the trend percentile with a non-exceedance probability p 1 , NMAP t,p1 , is obtained from the CDFs of each year. Note that NMAP t,p 1 corresponds to a statistical mapping of the changes from a GCM group, and several trend percentiles can be used to map the dispersion among the group of GCM results.
An analogous process is undertaken for the standard deviation of the precipitation. The trend percentile of the standard deviation (NMSDP t,p2 ) with a non-exceedance probability p 2 is randomly generated considering the average correlation with the trend percentile p 1 of the mean, which is chosen. For temperature, a similar process for extracting the trends is applied, but the normalized moving difference between moving average of temperature and the average of the control period of the GCM is used instead. Moreover, the non-exceedance probability p 3 and p 4 for the change of the mean and the standard deviation of the temperature, are also randomly generated considering their average correlation with the trend percentile p 1 of NMAP t,p 1 .
For the generation of annual climate series data, in each station of interest probability distribution functions (PDFs) f Y (y, θ) are fitted to the observed annual records of the variable Y (temperature or precipitation) through estimation of the parameter set θ using the mean µ, and standard deviation σ. These PDFs are combined with the GCMs trend percentiles, case in which the parameter set θ change in time according to the GCMs. Hence, this is a GCM ensemble that incorporates the natural variability. Under this approach, the value of the climatic variable at any time for a given p and RCP is the value obtained from the PDF, but with mean µ * and standard deviation σ * that change through time according to the trends: where u is a random uniform number [0,1], and Y t,i,p 1 is the i th annual precipitation value randomly generated, using the trend percentile p 1 of the mean, for a non-stationary climate. The value of µ * and σ * in Equation (6) at any particular year t for precipitation are calculated from the historical mean (µ) and standard deviation (σ) and the multiplicative normalized change rates: For temperature, a similar equation is used, but µ * and σ * are obtained using the additive changes rates, and probabilities p 3 and p 4 : Finally, to disaggregate the annual data generated with the weather generator into monthly data, we used the k-Nearest Neighbor (k-NN) method, similar to the one utilized by [100]. Following the heuristic approach adopted elsewhere [101,102], a value of k = √ L ≈ 6 was used in the implementation of the k-NN method, in which L is the length of the historical record (i.e., 35 years). For the disaggregation, we have the annual synthetic generated climate for several climate stations (precipitation and temperature) and we also have the historical observed monthly climate at the same stations. To disaggregate one year of the synthetic climate, we sort in ascending order, the historical years of climate, according to the Euclidian distance between the annual data of each historical year and the synthetic year. For this purpose, equal weighting factors were used for each station, given that we previously standardized the synthetic and observed climate data by subtracting the historical mean and dividing by the historical standard deviation. Then a historical year is chosen by using the k-NN method [101], which is then used to disaggregate annual synthetic climate into monthly climate. The process is repeated for each synthetic year to disaggregate the whole synthetic series. Given that the annual value of the historical year and the synthetic year to be disaggregated not necessarily match, there is a post adjustment by multiplication and addition in the case of precipitation and temperature, respectively.
As explained above, using trend percentiles allows mapping the dispersion of the group of GCM runs for each RCP and the local climate information. Ten equally spaced trend percentiles for the changes in the precipitation mean are adopted, as recommended by Chadwick et al. [90], while the trend percentiles of the precipitation standard deviation, temperature mean and standard deviation, are randomly selected considering the correlation among them [90]. For each trend percentile 20 synthetic realizations are generated, adding the total of 200 synthetic realizations for each RCP scenario.

Performance Indexes
First, satisfactory and unsatisfactory states must be defined to assess the system performance. A satisfactory state is that when the total annual demand (D) is met, whereas the system falls into an unsatisfactory state or failure when D is not satisfied. Hence, the Paloma system is on failure when the long-term expected annual water allocation is under D = ∑ M j=1 α j = 320 Mm 3 (see Equation (5)). The performance criterion used are the RRV indexes proposed by Hashimoto et al. [27], and widely used in the literature [28,29,31,46,55,[103][104][105][106][107][108].
Time-based Reliability (Rel T ) measures how often the system fails. This index is calculated as the percentage of time (0-1 range) that the system can meet D in the n years under evaluation: where Z t counts the number of years at failure: where W t equals 1 each time step in which the system passes from failure to success and 0 if it stays on failure. Hence, which ensures a Res range between 0 (no recovery from failure or always in failure) to 1 (immediate recovery from failure or never in failure). The Volume-based Reliability (Rel V ) [55,109] was also used in the analysis: hence, Rel V ranges between 0 and 1 and measures the mean percentage of water allocated by the system compared to D. Following the approach by [106,107], two indexes for vulnerability were used: (1) the maximum vulnerability or maximum water deficit (MaxV) as used by Moy et al. [32], and (2) the average vulnerability or average water deficit (AvgV). Both indexes range between 0 and 1, and are where In Equations (14) and (16), vulnerability values are standardized by D.

Climate Change Hydrological Impacts
Future annual precipitation and reservoir inflow projections are presented in Figures 4 and 5 for the la Paloma reservoir. Both figures show the 25th, 50th, and 75th percentiles of the time series averaged over a 40-year moving window. Precipitation starts with 117 mm (50th percentile) by year 2010 (Figure 4) and decline in time to 107 mm, 99 mm, 97 mm, and 88 mm, by the end of the century under RCPs 2.6, 4.5, 6.0, and 8.5, respectively; these changes correspond to reductions of −8.5%, −15.4%, −17.1%, and −24.7%, respectively. Furthermore, larger declines are observed for higher percentiles, regardless the RCP; for example, under RCP 2.6 reductions of 9, 10, and 19 mm are obtained for the 25th, 50th, and 75th percentiles, respectively. Hence the climate is projected to become drier, with larger precipitation reductions for the rainy years.
In Equations (14) and (16), vulnerability values are standardized by D.

Climate Change Hydrological Impacts
Future annual precipitation and reservoir inflow projections are presented in Figures 4 and 5 for the la Paloma reservoir. Both figures show the 25th, 50th, and 75th percentiles of the time series averaged over a 40-year moving window. Precipitation starts with 117 mm (50th percentile) by year 2010 (Figure 4) and decline in time to 107 mm, 99 mm, 97 mm, and 88 mm, by the end of the century under RCPs 2.6, 4.5, 6.0, and 8.5, respectively; these changes correspond to reductions of −8.5%, −15.4%, −17.1%, and −24.7%, respectively. Furthermore, larger declines are observed for higher percentiles, regardless the RCP; for example, under RCP 2.6 reductions of 9, 10, and 19 mm are obtained for the 25th, 50th, and 75th percentiles, respectively. Hence the climate is projected to become drier, with larger precipitation reductions for the rainy years.  The impacts of these precipitation changes over the incoming streamflow to the Paloma reservoir is magnified ( Figure 5). The 50th percentile of the inflow goes from 245 Mm 3 by 2010, to 218, 179, 179, and 147 Mm 3 by 2100 under RCPs 2.6, 4.5, 6.0, and 8.5, respectively; these changes correspond to reductions of −11.4%, −27.2%, −25.1%, and −40.7%, respectively. Just as for precipitation, the absolute decline is greater for higher percentiles. For example, under RCP 2.6 the inflows to the Paloma are projected to diminish in 9, 28 and 82 Mm 3 for the 25th, 50th, and 75th percentile between years 2010 and 2100.
Similarly, inflows to the Cogotí and Recoleta reduce through time, as shown in Figures S1 and S2  Overall, the basin is expected to have a drier future, as concluded by previous studies in the basin [40,41,90]. Interestingly, projected reductions in reservoirs inflow surpass those expected for precipitation, especially in the upper basin. Moreover, reductions in precipitation and streamflow are greater for higher percentiles, which implies fewer very humid years in the future. This decline is quite relevant because the reservoirs are filled precisely during these very humid years.
Overall, the basin is expected to have a drier future, as concluded by previous studies in the basin [40,41,90]. Interestingly, projected reductions in reservoirs inflow surpass those expected for precipitation, especially in the upper basin. Moreover, reductions in precipitation and streamflow are greater for higher percentiles, which implies fewer very humid years in the future. This decline is quite relevant because the reservoirs are filled precisely during these very humid years.

Long-Term Impacts over Reservoir Operation
Changes in the RRV performance are analyzed using a 40-year moving window during the 1971-2100 period. The objective is to show in a simple, yet comprehensive manner, how and when future reservoir performance is expected to considerably worsen, and the corresponding uncertainty around this estimation. Thus, the performance associated with any particular year considers the last 40 years of operation. The dynamics of the positive performance metrics (i.e., Rel T , Res and Rel V ) are shown in Figures 6-8, whereas that of the negative performance metrics (i.e., MaxV and AveV) are shown in Figures 9 and 10. To illustrate the use of thresholds to identify the time at which sever impacts in the system's operation occur, we adopted in these figures two referential values that are intuitive and easy to follow in decision-making. The first threshold corresponds to the historical reference RRV performance calculated from the first 40 years of operation between 1971 and 2010 (thick dashed lines). The second threshold indicates a 10% worsening of the reference historical RRV performance (thick dotted line). Note however that any threshold value may be used, which can reflect for example a decision by the regulator or operator, a local regulation, or a hypothetical scenario. Following the recommendations from McMillan et al. [87] the uncertainty through time associated with the variety of GCMs is presented in each figure using the 25th, 50th, and 75th percentiles of the corresponding performance indexes (thin dotted, solid, and dashed lines, respectively). To analyze the potential errors of using just 200 synthetic time series per RCP, a bootstrapping analysis was performed as explained in Text S1. The results of the bootstrapping analysis are presented in Figures S3-S12 in the Supplementary Material. Note that the possible errors that may be produced by using only 200 synthetic time series per RCP are quite small (Figures S3-S12).
value may be used, which can reflect for example a decision by the regulator or operator, a local regulation, or a hypothetical scenario. Following the recommendations from McMillan et al. [87] the uncertainty through time associated with the variety of GCMs is presented in each figure using the 25th, 50th, and 75th percentiles of the corresponding performance indexes (thin dotted, solid, and dashed lines, respectively). To analyze the potential errors of using just 200 synthetic time series per RCP, a bootstrapping analysis was performed as explained in Text S1. The results of the bootstrapping analysis are presented in Figures S3-S12 in the Supplementary Material. Note that the possible errors that may be produced by using only 200 synthetic time series per RCP are quite small ( Figures  S3-S12).          Given that reductions for the 75th percentile of both precipitation and inflows to the reservoirs are larger than for the 25th percentile (Figures 4 and 5, Figures S1 and S2), the interquartile range for these variables reduces throughout the century. Nevertheless, the interquartile range for both the time-based and volume-based reliabilities increases (Figures 6 and 8). This interesting result illustrate the importance of using reservoir performance indexes to understand the impacts of climate change, as the changes and uncertainties in the raw hydro-climatological variables are not necessarily the same, or transferable, as those for the operational variables defining the performance of water systems in general, and reservoirs in particular.

Negative Performance Metrics
According to Figure 9, the maximum vulnerability increases under the RCP 2.6 scenario, beyond the historical reference value of 0.68, but without surpassing the "Reference + 0.1" threshold (thick dotted black line, Figure 9a). For the other RCPs, the maximum vulnerability worsens in the future beyond this threshold. Nevertheless, this severity has not been an issue yet, as reservoir design in Chile relays mostly on ensuring the time-based reliability. The future maximum vulnerability reaches values larger than 0.80 for the 50th percentile under RCPs 4.5, 6.0, and 8.5, which indicates that severe droughts are projected during the following decades. While a worse future is clearly predicted in terms of the maximum vulnerability index (Figure 9), the average vulnerability behaves differently ( Figure 10). Under the four RCPs, the average vulnerability clearly improves by mid-century (2060), with a reduction for the 50th percentile from 0.45 (historical reference), to 0.36, 0.37, 0.38, and 0.36 by year 2060 for RCPs 2.6, 4.5, 6.0, and 8.5, respectively. By the end of the century, the average vulnerability continues improving for RCP 2.6, while for RCPs 4.5 and 6.0 it slightly worsens, although not beyond the reference value of 0.45. For RCP 8.5, the 50th percentile average vulnerability by the end of the century equals the reference value. It is noteworthy that, although the number of failures will likely increase in time (Figure 6), the severities of these failures are expected to reduce by the middle of the century ( Figure 10). Most likely the average vulnerability improves because the time-based reliability ( Figure 6) worsens more than the volume-base reliability (Figure 8), hence the failures become more recurrent, but less severe. Nevertheless, the maximum failure increases (Figure 9), so even if the average failure ( Figure 10) is not as bad, the worst drought is expected to be worse. Note, that all these results depend on the reservoir operation rule; thus, if the current reservoir operation rule is modified, the expected consequences will need to be re-evaluated.

The Timing of the Changes in Performance: When Current Management Will Become Obsolete?
To identify the time when the current reservoir operation rule is no longer able to maintain the historical reference performance (i.e., time of emergence, ToE), we compute the first year in which the 50th percentile of each performance index exceeds a 10% worsethan-the-reference value, without going back (Table 3). Table 3 also shows in parenthesis the 80% confidence range for the 50th percentile emergence (i.e., 10% and 90% of error), estimated with the bootstrapping (Figures S3-S12 in the Supplementary Material). Under RCP 2.6 the ToE for the time-based reliability and resilience indexes is identified within the century, while for the other indexes it is not. Moreover, the ToE for the average vulnerability does not take place in this century for any of the four RCPs. Under RCPs 4.5 and above, the ToE detected for each index are clearly different: the time-based reliability overpasses the threshold between years 2028 and 2033, the resilience diverges from the reference between 2034 and 2042, whereas the volume-based reliability does so between 2072 and 2083, and the maximum vulnerability index overpass the reference threshold between 2068 and 2079. Overall, under RCPs 4.5 and above, Table 3 shows that the 10% worse-than-the-reference value is exceeded during the 2030s in terms of the time-based reliability, the 2030s and 2040s in terms of the resilience, the 2080s in terms of volume-based reliability, and the 2070s in terms of maximum vulnerability. Table 3. First year in which, for each RCP and performance index, the 50th percentile reservoir performance exceeds a 10% worse-than-the-reference value without going back. In parenthesis are the same result, but for the 10% and 90% of error estimated with the bootstrapping (Supplementary Material).

Time Reliability Resilience Volume Reliability MaxV AvgV
If the 25th and 75th percentiles are used instead the 50th percentile to analyze the ToE, we can appreciate the uncertainty associated with the results. For example, the time-based reliability under RCP 2.6, overpass the threshold on years 2016 and 2067 for the 25th and 75th percentile (Figure 6a), respectively. Under RCP 8.5 the years are now 2013 and 2057 for the same percentiles (Figure 6d). For the volume-based reliability (Figure 8) under RCP 8.5, the ToE for the 25th percentile is year 2047, but for the 75th percentile it happens after 2100. The maximum vulnerability threshold for the 25th percentile ( Figure 9) is not overpassed by any RCP during this century. Overall, the large time range detected when using the 25th and 75th percentile illustrates how difficult it is to identify a single year in which the reservoir performance will stop performing as expected historically, rather only a range of many years can be identified. This issue is further discussed by Chadwick et al. [83].
A comparison of the values in Table 3 against the 50th percentile Local ToE estimates for precipitation and temperature in the same basin [83] shows that deviations from the historical behavior of the reservoir performance, measured using time-based reliability or resilience, occur slightly before than for precipitation. On contrary, ToE takes place after when considering other performance indexes. For example, the time-based reliability performance for the 50th percentile ( Figure 6) worsens before 2040 for RCPs 4.5 and 8.5, while changes in precipitation in the basin happen around 2040 [83]. Such difference is likely caused in part by the early emergence of changes in temperature. Interestingly, the ToE for the volume-based reliability under RCP 4.5 and above ( Figure 8) happens during the 2080s for the 50th percentile, while changes in precipitation emerge earlier [83]. Nevertheless, further analysis is needed, given the larger uncertainty in the timing of changes in reservoir performance, as there are more steps involved in the cascade of uncertainty associated with the calculations [42]. Moreover, the results from our analysis allows detecting ToE that differ among the different performance indexes. In our case for example, time-based reliability ( Figure 6) is affected before the volume-based one (Figure 8), and thus adaptation measurements should be taken before the occurrence of significant changes in volume-based water allocation.
Overall, our approach and results are presented as a replicable method to inform decision makers-in a clear, simple, and useful manner-about the impacts of climate change through time over the performance of a reservoir operation rule. In particular, by defining an easy-to-follow and intuitive percentage worse-than-the-reference value to evaluate the performance, one can identify the time of obsolescence of the current operation rule under a climate change scenario (i.e., a local ToE of the reservoir performance). Furthermore, our approach allows transmitting the uncertainty to water managers and practitioners using uncertainty bands or alike, something recommended in the literature in recent years [87].

Adaptation Strategies
Although the main objective of this article is to depict an approach to identify the time at which the performance will drop beyond a defined threshold due to climate change under the current operation rule (i.e., ToE of reservoir performance), we now explore in a simple manner the implications of possible adaptations strategies. It is not our goal to optimize the operation rule, but to illustrate how the approach allows understanding the effect of varying the operation rule (i.e., reduction of water allocation goals) to cope with the increasing water scarcity of the basin. Given that the most severe changes and earlier emergences are expected under RCP 8.5, we use this scenario for the analysis. In particular, we explore the effects of implementing different reductions in water allocation goals in year 2033, when the first emergence is detected for this RCP. For example, Figure 11 compares the performance indexes obtained when the water allocation goals are reduced to 205, 35, and 35 Mm 3 , for the Paloma, Cogotí, and Recoleta reservoirs, respectively (green lines), against the reference case in which the operation rule remains the same (red lines). For this case, the impacts of the change in the rule differ depending on the index. Resilience is barely affected, while both vulnerability metrics are reduced to the point that the 50th percentile MaxV and AvgV remain respectively within the ±0.1 reference and the reference thresholds by 2100.
Water 2021, 13, x FOR PEER REVIEW 19 of 28 first year in which the 50th percentile reservoir performance indexes values exceed a 10% worse-than-the-reference value without going back, under RCP 8.5, for different adaptation strategies (i.e., reductions in annual water allocation goals) as shown in columns 1-4. As expected, for each index the year in which the threshold is exceeded delays with larger reductions in water allocation goals. Nevertheless, in order for the system to perform as originally expected throughout the 21st century, reductions of 40-50% of the original system allocation goal are needed. Such reductions may be extremally difficult to implement in the basin, and thus the eventual adaptation process may require the strong involvement of stakeholders and the compromising of current expectations.  Similar results to those shown in Figure 11 can be calculated for other reductions in the allocation goals (not shown here). To summarize these results, Table 4 presents the first year in which the 50th percentile reservoir performance indexes values exceed a 10% worse-than-the-reference value without going back, under RCP 8.5, for different adaptation strategies (i.e., reductions in annual water allocation goals) as shown in columns 1-4. As expected, for each index the year in which the threshold is exceeded delays with larger reductions in water allocation goals. Nevertheless, in order for the system to perform as originally expected throughout the 21st century, reductions of 40-50% of the original system allocation goal are needed. Such reductions may be extremally difficult to implement in the basin, and thus the eventual adaptation process may require the strong involvement of stakeholders and the compromising of current expectations. Table 4. First year in which, for different allocation goal reductions, the 50th percentile reservoir performance exceeds a 10% worse-than-the-reference value with no turning back under RCP 8.5. Columns 1 and 2 show annual allocation goals for each reservoir, and columns 3 and 4 show the corresponding system allocation goal as volume and as a percentage of the original 320 Mm 3 .

Water Allocation Goal
Performance Indexes

Conclusions
This study assessed the operation of a reservoir system using the RRV performance indexes during the 21st century under climate change. Considering the ongoing water scientist' challenge of communicating climate change results in a usable way for water managers, the results focus on providing a clear response to the decision-making question: when current operation rules will significantly impair their performance due to climate change impacts?
For this purpose, the future performance of the Paloma reservoir system in the Limarí River basin, Chile, was evaluated using a 40-year moving window for the period 1971-2100. Using a weather generator, GCMs outcomes were downscaled to produce climate series that served as input to an already set up WEAP hydrological model, which was utilized to simulate streamflows. These flows were used to model the performance of the historical and future operation of the reservoir system under the current operation rule, which was characterized using percentiles of the RRV indexes. For illustrative purposes, this performance was defined to be deficient when the RRV indexes worsen more than a 10% as compared to the historical values; however, any other threshold can be selected by the operator or the decision maker. The approach adopted in this study allows estimating the impacts of climate change while accounting for three sources of uncertainty: (a) aleatory uncertainty, i.e., the natural variability represented by intra-GCM uncertainties, (b) epistemic uncertainty, i.e., as represented by the spread of multi-GCM ensembles, and (c) deep uncertainty, i.e., as represented by the choice of RCP scenario. Note that the methodology here presented treats aleatory and epistemic uncertainties in a risk analysis, while deep uncertainty is properly presented. Thus, we aimed to clearly disentangle and communicate the impacts of climate change from the very considerable noise of these three sources of uncertainty. Our main conclusions are the following: -Precipitation reductions are projected in the basin due to climate change. While inflows to the reservoirs are also projected to diminish, there is an amplified response of streamflows, which will reduce even more than precipitation, especially in the upper basin. - The time-based reliability of the Paloma system will deteriorate significantly in the future, even for an optimistic percentile (75th), under all the RCP scenarios. The volume-based reliability will also reduce, except for the RCP 2.6 scenario. Thus, mitigation measures become key to ensure an adequate reservoir system's performance during the next decades. -Counterintuitively, although the projected time-based and volume-based reliability worsen in time, the average vulnerability stays around the historical value by the end of the century. In fact, the average vulnerability improves by mid-century (2060) regardless the RCP scenario, and tends to go back to the reference for RCPs 4.5 and above. - The range of GCMs uncertainty increases for the time-based and volume-based reliability, both with time and RCP scenario (i.e., RCP uncertainty decreases from RCP 8.5 to RCP 2.6). On the other hand, this range decreases with time for precipitation and reservoir inflows, while staying quite constant with the RCP. - The emergence was detected much earlier for the time-based reliability than for the volume-based one. Hence a new operation rule will be needed earlier if the timebased reliability is considered as the predominant performance metric. Indeed, this is the case in Chile, and thus the current operation rule should be changed in the 2030s. Nevertheless, the best timing to change the operation rule still depends on the RCP scenario, as well as the performance threshold, which should be discussed and defined by decision makers. -Deviations from the historical behavior of the reservoir performance measured using time-based reliability or resilience, occur slightly before than precipitation changes. On contrary, these deviations take place after when considering other performance indexes (i.e., volume-based reliability and maximum vulnerability). Hence, the identification of the ToE of variables other than the climatic ones (i.e., precipitation and temperature) that are more directly related to water resources availability becomes relevant. Nonetheless, such identification is more uncertain as more steps in the cascade of uncertainty are involved. -For the Paloma system to perform as originally expected throughout the 21st century, reductions of 40-50% of the original system allocation goal are needed under RCP 8.5. Given the difficulty of implementing such reductions in the basin, any eventual adaptation process will require the strong involvement of stakeholders and the compromising of current expectations.
We believe the approach followed in this study allows decision maker to easily understand climate change risks and impacts, and to identify the critical timing to propose and implement adaptation measures affecting current operation rules, infrastructure, and water allocation goals, especially in cases were implementing those changes are difficult or timely to implement. Although this work does not explore adaptation alternatives, their performance through time, as well as their prioritization, can also be assessed within this framework. On the other hand, in a direct and relevant application, this work can be extended nationwide, allowing the prioritization of resources where adaptation is or will be needed sooner. Finally, future studies considering other downscaling methods and different hydrological modeling approaches should be carried out to quantify their effects on the reservoir's performance assessment. Moreover, other dynamics in the basin and the reservoirs' system may be incorporated, such as land-uses changes and changes in the reservoirs capacity and dead storages due to sedimentation [110,111].  6.0 (c), and 8.5 (d). The 25th (dotted line), 50th (solid line), 75th (dashed line) of annual inflows to the Recoleta reservoir are averaged over a 40-year moving window, with the horizontal axis being the last year of the window; Text S1: Bootstrapping Method; Figure S3: Time-based reliability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 5a,b, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S4: Time-based reliability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 5c,d, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S5: Resilience under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 6a,b, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S6: Resilience under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 6c,d, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S7: Volume-based reliability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 7a,b, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Volume-based reliability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 7c,d, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S9: Maximum vulnerability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 8a,b, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S10: Maximum vulnerability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 8a,b, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S11: Average vulnerability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 9a,b, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1; Figure S12: Average vulnerability under RCP 2.6 (a), 4.5 (b), 6.0 (c), and 8.5 (d) for a 40-year moving window, with the horizontal axis being the last year of the window. The thick dashed and dotted lines represent the reference historical performance and 10% worse-than-the-reference value, respectively. The red lines are the same results from Figure 9c,d, with the thin blue and yellow lines being the 10% and 90% of error, obtained with the bootstrapping method explained in Text S1.
Author Contributions: All authors contributed extensively to the work here presented. S.V. contributed with the hydrological modelling. F.M. and C.C. conceptualized the experiment. C.C., J.G., and P.B. contributed to data analysis. C.C., P.B., S.V. and J.G. contributed to literature review. C.C. and P.B. prepared the original draft. C.C., P.B., J.G., F.M., and S.V. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.