Simple and Low-Cost Model for Monthly and Yearly Streamflow Forecasts during the Current Hydrological Year

Forecasting streamflow accurately is essential to achieve an efficient integrated water resources management strategy and provide consistent support to water decision-makers. We present a simple, low-cost and robust approach for forecasting monthly and yearly streamflows during the hydrological year in course, applicable to headwater catchments. It combines the use of regression analysis techniques, the two-parameter Gamma continuous cumulative probability distribution function and the Monte Carlo method. It is based on a probabilistic comparison of the progression of the current hydrological year with the historic observed series. The methodology has been successfully applied to two headwater reservoirs within the Guadalquivir River Basin in southern Spain. The root-mean-square error and correlation coefficient were used to measure the accuracy of the model and the results showed good levels of reliability. The outputs are the probabilistic monthly and yearly streamflows and 80% confidence interval. Further reductions in prediction errors may be achieved from increasing the number of observed years. These risk-based predictions are of great value, especially, before the intensive irrigation campaign starts (usually in April), when Water Authorities are to take responsible management decisions about the best allocation of the available water volume between the different water users and environmental needs.


Introduction
Nowadays, water authorities and decision-makers are facing considerable challenges in achieving a sustainable and integrated water resources management system, especially, in water-stressed areas.They must make responsible management decisions on the optimum allocation of the available water volume from a wide range of possible sources (regulated or non-regulated rivers, groundwater resources, water re-use schemes, desalination plants, etc.) between the demands of, usually, multi-sectoral water users (urban, agriculture, industry, tourism, energy, etc.).
Not only must these decisions meet water demand, but they must protect this natural and finite resource, the environmental needs, the environmental impacts, the increase of legal requirements on water quality, the social equity, the costs, and the promotion of economic growth.Additionally, Water 2018, 10, 1038 3 of 26 simple error models and thus their results present more accurate inflow forecasts into hydropower reservoirs several hours ahead.
Currently, the use of seasonal climate predictions is becoming more commonplace.For example, Khedun et al. [19] presented a copula-based precipitation-forecasting model that produces a notable improvement in simulating negative precipitation anomalies during "La Niña" and negative Pacific Decadal Oscillation.Also, Liu et al. [20] presented a probabilistic wavelet-support vector regression model for streamflow forecasting with rainfall and climate information input that consistently generates more reliable predictions for daily and monthly forecasts as compared with the best single-member wavelet-support vector models and the adaptive neuro-fuzzy inference system.
Nevertheless, in Spain, the level of accuracy achieved by the seasonal climate forecasts provided by the Spanish Meteorological Agency (AEMET or Agencia Estatal de Metereología) [21] for the year of 2017-2018 has not been as good as expected.For example, in February 2018, AEMET predicted that, for the south-eastern quarter of Spain, March, April, and May would be much drier than average, whereas March 2018 has been the wettest month registered to date.
Additionally, the Guadalquivir River Basin Authority (who is responsible for making the strategic decisions on controlled released outflows from reservoirs) still does not have an established common streamflow forecast model or procedure to predict reservoir inflows.These decisions are based on their vast knowledge and experience managing water resources in this area, especially during drought and periods of water scarcity when critical decisions and actions must be taken [22].Usually, these decision-makers are not familiar with very complex models or specialised software, so they typically require simple, but sufficiently robust and reliable, tools.
This research contributes towards the development of a user-guided, novel, simple, low cost, and robust methodology to forecast streamflows within the current hydrological year.This is applicable to headwater systems and can provide support to strategic water management decisions.The methodology has been successfully applied to two headwater reservoirs located at the upper area of the Genil River (Guadalquivir River Basin), namely, Quéntar and Canales.
The model forecast outputs are the mean monthly and yearly streamflows along with the 10th and 90th percentiles (although the model offers flexibility to easily modify the percentiles if needed).The model was first put into operation in October 2016 and the model output forecasts achieved satisfactory results, with a relative error varying from 8% (Canales) to 18% (Quéntar).These risk-based forecasts are not only useful for water reservoir operators and authorities, but also to all stakeholders involved in the planning, management, and decision-making processes.The model also has the ability to incorporate seasonal climate predictions and climate change effects.
The aim of this work was to develop a simple and low cost statistical model (data-driven) to forecast monthly and annual streamflows during the current hydrologic year.To achieve this, the specifications were set to:

•
Use free hydrological data sources available in the public domain (to be downloaded in an easy, free, and quick manner from a reliable, official online resource where possible); • minimise the number of hydrological variables; • integrate easily and regularly new observed hydrological data in the model (an automatic process where possible); • minimise computational and simulation running times (instantaneously where possible); and • offer a user-friendly interface.
This paper is organised as follows: Section 2 describes the methodology, Section 3 presents the results for the two cases of study, including the description, findings, model performance tests using reliability metrics, and the discussion, and Section 4 provides the concluding remarks.

Overview
The proposed methodology is to predict the probabilistic monthly and annual streamflows during the current hydrological year.It innovatively combines the use of well-known regression analysis techniques (between relevant hydrological variables), the two-parameter gamma continuous cumulative probability distribution function and the Monte Carlo simulation.It is based on a probabilistic comparison of the progression of the current hydrological year (in terms of cumulative monthly precipitation or cumulative monthly streamflow) with the historical data series.
Figure 1 shows the flowchart of this methodology (described in detail in the following sections and applied to two cases of study detailed in Section 3).
Water 2018, 10, x FOR PEER REVIEW 4 of 26 cumulative probability distribution function and the Monte Carlo simulation.It is based on a probabilistic comparison of the progression of the current hydrological year (in terms of cumulative monthly precipitation or cumulative monthly streamflow) with the historical data series.
Figure 1 shows the flowchart of this methodology (described in detail in the following sections and applied to two cases of study detailed in Section 3).

Allocation of probabilities and
application of the Monte Carlo method (10,000 simulations) to obtain the final probabilistic outputs.

Model Time Step, Data Sources, and Treatment
A monthly time step has been considered adequate for this work since the results are to support strategic and management decisions associated with annual or quarterly cycles.Therefore, the predictions from the model might not be applicable to other reservoir operations (for example, hydropower plants or flood forecasting) with lower time scale data requirements (hours, days).
It is also important to understand the phenomena that will be picked up within the selected monthly time step for each particular catchment area (for example, snow storage and melting processes, subterranean inflows, etc.).

Model Time
Step, Data Sources, and Treatment A monthly time step has been considered adequate for this work since the results are to support strategic and management decisions associated with annual or quarterly cycles.Therefore, the predictions Water 2018, 10, 1038 5 of 26 from the model might not be applicable to other reservoir operations (for example, hydropower plants or flood forecasting) with lower time scale data requirements (hours, days).
It is also important to understand the phenomena that will be picked up within the selected monthly time step for each particular catchment area (for example, snow storage and melting processes, subterranean inflows, etc.).
Free hydrological data sources available in the public domain have been used to inform this work (as detailed in Section 3.2).The data can be downloaded in an easy, free, and fast manner from a reliable and official online resource.
The data requirements were the historical precipitation and streamflow monthly data sets for each particular catchment of study.It is recommended to apply the methodology when there is good quality input data and at least 30 years of observed data (although 40 years would be desirable to increase the forecasts' accuracy).The minimum number of observed years should be greater in those catchments with a greater standard deviation.
From the observed data series, the cumulative precipitation and streamflow monthly time series were calculated for the first three observed months (Oct to Dec), four observed months (Oct to Jan), five observed months (Oct to Feb), six observed months (Oct to March), seven observed months (Oct to April), and eight observed months (Oct to May).The reason why the assessment starts with the first three cumulative observed months is to ensure that sufficiently good correlation coefficient values were achieved in the regression analysis.
With this input data, the model provides monthly and yearly streamflow forecasts for the rest of the current hydrological year monthly, starting the forecasts from January, as shown in Table 1.

Simple Regression Analysis and Best Fit Predictor
For the proposed methodology, we used simple regression analysis techniques.We sought to predict the total annual streamflow (hereinafter referred to as A annual ) from relevant hydrological descriptors.We selected the cumulative monthly precipitation (hereinafter referred to as P cum ) and the cumulative monthly streamflow (hereinafter referred to as A cum ) as the best predictors.
The analysis assumed that the hydrological cycle (from October to September) for a specific basin is in balance at the end of the hydrological year.Therefore, a greater correlation between the total annual streamflow and the cumulative monthly time sets was expected as the number of observed months increases.
The results from the regression analysis were subsequently assessed using the coefficient of determination (R 2 ).This allowed us to select the best fit regression model and best fit hydrological predictor (cumulative rainfall or cumulative streamflow) for each specific month of study.The estimated total annual streamflow from the regression models is an intermediary output (to be used as an input to obtain the conditioned gamma, as shown in Figure 1).
It is important to note that the best fit regression model and the best fit predictor do not necessarily need to be same for all the observed months of study.In fact, it might vary depending on the predominant hydrological process given at each specific month or season of the year (snow storage and melting processes, subterranean inflows, seasonal extreme, and sporadic rainfall events).In this Water 2018, 10, 1038 6 of 26 situation, a combined regression model should be used to achieve the best correlation values and forecast results.

Two-Parameter Gamma Cumulative Probability Distribution Function
The next step was fitting the historical annual streamflow data series to a two-parameter gamma (α,β) cumulative probability distribution function (CDF).
There is extensive literature describing the properties, parameters estimation, and applications of the two-parameters gamma (α,β) CDF [23].This distribution has been shown to fit well to rainfall and streamflow data sets, respectively, and, therefore, has been widely applied to hydrological data-driven models, such as those described in the studies of Buishand [24], Stephenson et al. [25], Wang and Nathan [26], Chen et al. [27], and Chowdhury et al. [28].
A random variable, X, is said to have a two-parameter gamma continuous probability density function if its distribution is given by: where α and β are the shape and scale parameters, respectively.The gamma CDF is given by: The mean, E(x), and variance, V(x), are given by Equations ( 4) and (5), respectively: From Equations ( 4) and ( 5), we can estimate the mean and variance of the observed series and the parameters α and β, as follows: This distribution provides the non-exceedance (or maximum) probability that the total annual streamflow will take a value less than or equal to a determined value as well as the maximum total annual streamflow value associated to a specific probability value.
It is important to check how well the gamma distribution fits with the observed data series of annual streamflows for each case study.Section 3.4 shows that for the two case studies, the gamma CDF was found to provide a very good fit to the total annual streamflow distribution.

Conditioned Two-Parameter Gamma Cumulative Probability Distribution Function (Using the Output from the Regression Models)
To obtain the conditioned gamma distribution function, we determined the influence of the estimated total annual streamflow (from the regression models, Section 2.3) on the gamma CDF (fitted to the observed data, Section 2.4).
To do that, we imposed the condition that the mean value of the new conditioned gamma CDF is equal to the result obtained from the regression model ('A annual ' or estimated total annual streamflow).The value of the scale parameter, β, is kept the same as that already obtained from the gamma CDF (fitted to the observed total annual streamflow data sets).
Based on this, the shape parameter, α, of the conditioned gamma was re-calculated as follows: α cond (Conditioned Gamma) = Mean (Conditioned Gamma)/β = = Estimated total annual streamflow (from the regression analysis)/β (8) We then estimated the conditioned gamma CDF using Equation ( 3) and the two-parameters (α cond ,β).
It is important to highlight that whilst the gamma CDF is a fixed curve (since this is derived from the observed and complete hydrological years), the conditioned gamma CDF curve will vary from month to month depending on the output obtained from the regression models.In fact, the conditioned gamma CDF curve will move to the right or left of the gamma CDF, depending on whether the prediction is for a wetter or drier hydrological year than the mean historical year, respectively.If the conditioned gamma CDF is close to the observed gamma CDF this means that the current hydrological year is expected to be similar to the mean observed hydrological year.
The conditioned gamma distribution function (α cond ,β) allowed us to assign a probability to each year of the historical series based on their total annual streamflow.Therefore, we obtained greater probabilities for those observed years whose annual streamflow is similar to the estimated annual streamflow ('A annual ') and lower probabilities for those observed years whose annual streamflow differs from the estimated annual streamflow.For each year of the historical series, we have its probability (p), given by: where x is the observed annual streamflow of the historical records (hm 3 ), α cond is the shape parameter of the conditioned gamma, and β is the scale parameter of the gamma distribution (the same as per the non-conditioned gamma distribution).

Monte Carlo Method
The Monte Carlo Method is a numerical statistical method that allows the replication of random behaviour of real, non-dynamic systems through the generation of random numbers to which an event is assigned based on their probability distribution [29].
Our model generates 10,000 random input numbers in the interval [0,1].Each of these random numbers is assigned to one year of the observed streamflows series according to the conditioned gamma CDF (which already integrates the output from the regression model).Therefore, we obtained 10,000 random observed years (from the historical series) according to the probability of success.
From the statistical analysis of those, we obtained three probabilistic forecasts: Optimistic, average, and pessimistic.For our case study, we assigned the 90th percentile to the optimistic streamflow forecast and the 10th percentile to the pessimistic forecast, though the user can easily modify these values if needed.

Model Running Time, Test, and Validation
The model takes just a few seconds to run the simulations and provide the forecast results.Therefore, this tool provides a significant improvement in saving simulation and computational times when compared with more complex and sophisticated models.
To verify the model performance, measure its predictive accuracy, and identify potential limitations, several metrics, including the root-mean-square error (RMSE) and determination coefficient (R 2 ), were used (as shown in Section 3).

Application and Limitations
The total annual streamflow was chosen as the prediction target (instead of the precipitation) to avoid the need of using an additional model to transform the precipitation into streamflow.
However, it is important to note that if there are rapid and significant land use changes within the catchment of study, these might alter the observed hydrological cycle within the basin.In this situation, the model might generate misleading streamflow forecasts.For example, a recent and rapid increase in impermeable area would generate a greater volume of surface water flows and quick streamflow peaks than the historical observed catchment response [30].In this case, the precipitation as the prediction target (instead of the streamflow) might be more appropriate and an additional model to transform the precipitation into the streamflow contribution might be required.
The proposed methodology does not consider the contributing inflows from upstream systems (for example, regulated outflow releases from other reservoirs upstream).Therefore, this methodology is applicable to headwater systems located at upper catchment areas.
There is no limitation in catchment size, however, those catchments with considerable climatic irregularity or very rare seasonal sporadic rainfall events are difficult to predict and should be carefully studied prior to progressing the modelling works (since this methodology is based on a statistical treatment of the known past to predict the future).

Study Area Description
The Guadalquivir River is the main river in southern Spain that provides water to a total population of over four million people and over eight hundred thousand hectares for irrigation purposes.This system is currently formed by an interconnected system of 64 functioning dams [31].Although there are alternative water resources from aquifers, springs, and water re-use schemes, nowadays, reservoirs are the essential infrastructure to efficiently deal with spatial and temporal climate irregularities distinctive of this catchment area.Figure 2 shows the location of the Guadalquivir River Basin in Spain and the area of study.
Water 2018, 10, x FOR PEER REVIEW 8 of 26 (for example, regulated outflow releases from other reservoirs upstream).Therefore, this methodology is applicable to headwater systems located at upper catchment areas.
There is no limitation in catchment size, however, those catchments with considerable climatic irregularity or very rare seasonal sporadic rainfall events are difficult to predict and should be carefully studied prior to progressing the modelling works (since this methodology is based on a statistical treatment of the known past to predict the future).

Study Area Description
The Guadalquivir River is the main river in southern Spain that provides water to a total population of over four million people and over eight hundred thousand hectares for irrigation purposes.This system is currently formed by an interconnected system of 64 functioning dams [31].Although there are alternative water resources from aquifers, springs, and water re-use schemes, nowadays, reservoirs are the essential infrastructure to efficiently deal with spatial and temporal climate irregularities distinctive of this catchment area.Figure 2 shows the location of the Guadalquivir River Basin in Spain and the area of study.[31] in Spain and the area of study (Latitude: 37.15855 Longitude: -3.470605).

Figure 2. Location of the Guadalquivir River Basin
The Guadalquivir River has a total contributing catchment area of 57,527 km 2 and is delimited by Sierra Morena to the north, the Betic mountain to the south, and the Atlantic Ocean.The altitude at the mountainous borders varies between 1000 m Above Ordenance Datum (AOD) and 3480 m AOD, which contrasts with the lower altitudes of the Guadalquivir River valley.The climate is Mediterranean, which is defined by the warm temperatures (16.8 °C annual average) and the irregularity of precipitations (550 mm annual average).The rains are frequently torrential and occur The Guadalquivir River has a total contributing catchment area of 57,527 km 2 and is delimited by Sierra Morena to the north, the Betic mountain to the south, and the Atlantic Ocean.The altitude at the mountainous borders varies between 1000 m Above Ordenance Datum (AOD) and 3480 m AOD, which contrasts with the lower altitudes of the Guadalquivir River valley.The climate is Mediterranean, which is defined by the warm temperatures (16.8 • C annual average) and the irregularity of precipitations (550 mm annual average).The rains are frequently torrential and occur after long periods of drought and high temperatures, with a distinct susceptibility to erosion [31].Figure 3 shows the mean annual precipitation and temperature, altitude, and land uses in the Guadalquivir River Basin.The Guadalquivir River Basin Authority (GRBA) is responsible for the elaboration of the Guadalquivir River Basin Management Plan (according to the European Water Framework Directive 2000/60/CE) as well as the administration and control of the hydraulic public domain [31].Last year, the GRBA published the draft version of the Guadalquivir River Basin Special Drought Management Plan (2017).This document establishes the general water management principles and course of action for different drought and water scarcity threshold scenarios for each sub-catchment area [32], but does not incorporate any streamflow forecast model or procedure.
The two headwater reservoirs of study are the Quéntar and Canales, located at the upper area of the Genil River within the Granada administrative area and to the south-east of the Guadalquivir River Basin (refer to Figures 2 and 4).
The Canales reservoir is located in the upstream area of the Genil River (close to Sierra Nevada), close to the town of Güejar Sierra, with a total contributing catchment area of 176.5 km 2 .The dam was built in 1989 with a total capacity of 70 hm 3 and average streamflow of 80.42 hm 3 /year.This catchment area is affected by snow storage and melting processes in the Sierra Nevada.The snow appears between the months of November and May, reaching the maximum accumulation of snow in March with an average water-equivalent volume of 20 hm 3 [33].
The Quéntar reservoir is located on the Aguas Blancas River (tributary of the Genil River), with a total contributing catchment area of 101.2 km 2 .This dam was built in 1975 with a total volume capacity of 13.5 hm 3 .The average streamflow is 28.84 hm 3 /year.This catchment area is affected by subterranean inflows due to the aquifer and lithology present in the region.The Guadalquivir River Basin Authority (GRBA) is responsible for the elaboration of the Guadalquivir River Basin Management Plan (according to the European Water Framework Directive 2000/60/CE) as well as the administration and control of the hydraulic public domain [31].Last year, the GRBA published the draft version of the Guadalquivir River Basin Special Drought Management Plan (2017).This document establishes the general water management principles and course of action for different drought and water scarcity threshold scenarios for each sub-catchment area [32], but does not incorporate any streamflow forecast model or procedure.
The two headwater reservoirs of study are the Quéntar and Canales, located at the upper area of the Genil River within the Granada administrative area and to the south-east of the Guadalquivir River Basin (refer to Figures 2 and 4).
The Canales reservoir is located in the upstream area of the Genil River (close to Sierra Nevada), close to the town of Güejar Sierra, with a total contributing catchment area of 176.5 km 2 .The dam was built in 1989 with a total capacity of 70 hm 3 and average streamflow of 80.42 hm 3 /year.This catchment area is affected by snow storage and melting processes in the Sierra Nevada.The snow appears between the months of November and May, reaching the maximum accumulation of snow in March with an average water-equivalent volume of 20 hm 3 [33].
The Quéntar reservoir is located on the Aguas Blancas River (tributary of the Genil River), with a total contributing catchment area of 101.2 km 2 .This dam was built in 1975 with a total volume capacity of 13.5 hm 3 .The average streamflow is 28.84 hm 3 /year.This catchment area is affected by subterranean inflows due to the aquifer and lithology present in the region.These two reservoirs form the key infrastructure that mainly provide water for urban and irrigation purposes.Urban water consumers are formed by the Granada city and fourteen towns of its metropolitan area, with up to 300,000 inhabitants.The Vega Alta del Río Genil traditional irrigations cover over 4000 hectares and are fed by an extensive irrigation channel system that diverts water from the Genil River (downstream of the Canales and Quéntar reservoirs).
This system is currently supplemented by a network of thirteen operating underground water wells located in the upper area of the Vega de Granada aquifer (south-east of the city of Granada, on both banks of the Monachil River and the A-395 motorway).These were built by the GRBA after the serious social, economic, and environmental consequences suffered during and after the 1992-1995 drought period (during which Canales and Quéntar reservoirs were depleted).The objective of these works was to supplement the existing surface water resources (provided by the Canales-Quéntar system) with the extraction of groundwater to serve the urban water supply of Granada and the metropolitan area.In 1995, the GRBA delivered this infrastructure to Emasagra (Local Water and Sewage Company) to guarantee its correct management, operation, and maintenance.Since then, Emasagra has been responsible for its management, while the GRBA is responsible for the supervision, reservoir management, and compliance with water allocation.
During years of average precipitation, water demands can be met with surface water resources.However, in years of below-normal precipitation, typically higher underground water volume extractions and/or reductions in per right allocation are required.During prolonged periods of drought and water scarcity, the urban water supply has theoretically been assigned priority of use over agricultural demand.
Strategic decisions made by the GRBA on the controlled released outflows from the Quéntar-Canales system are critical to ensure the most resource-efficient, cost-efficient, and sustainable allocation of the available water resources.These decisions are especially relevant before the intensive irrigation campaign starts (usually in April), when the water authorities must make responsible management decisions about the best allocation of the available water volume between different water consumers and environmental needs for the rest of the hydrological year.
However, despite their vital importance, decisions made by the GRBA are mainly based on the These two reservoirs form the key infrastructure that mainly provide water for urban and irrigation purposes.Urban water consumers are formed by the Granada city and fourteen towns of its metropolitan area, with up to 300,000 inhabitants.The Vega Alta del Río Genil traditional irrigations cover over 4000 hectares and are fed by an extensive irrigation channel system that diverts water from the Genil River (downstream of the Canales and Quéntar reservoirs).
This system is currently supplemented by a network of thirteen operating underground water wells located in the upper area of the Vega de Granada aquifer (south-east of the city of Granada, on both banks of the Monachil River and the A-395 motorway).These were built by the GRBA after the serious social, economic, and environmental consequences suffered during and after the 1992-1995 drought period (during which Canales and Quéntar reservoirs were depleted).The objective of these works was to supplement the existing surface water resources (provided by the Canales-Quéntar system) with the extraction of groundwater to serve the urban water supply of Granada and the metropolitan area.In 1995, the GRBA delivered this infrastructure to Emasagra (Local Water and Sewage Company) to guarantee its correct management, operation, and maintenance.Since then, Emasagra has been responsible for its management, while the GRBA is responsible for the supervision, reservoir management, and compliance with water allocation.
During years of average precipitation, water demands can be met with surface water resources.However, in years of below-normal precipitation, typically higher underground water volume extractions and/or reductions in per right allocation are required.During prolonged periods of drought and water scarcity, the urban water supply has theoretically been assigned priority of use over agricultural demand.Strategic decisions made by the GRBA on the controlled released outflows from the Quéntar-Canales system are critical to ensure the most resource-efficient, cost-efficient, and sustainable allocation of the available water resources.These decisions are especially relevant before the intensive irrigation campaign starts (usually in April), when the water authorities must make responsible management decisions about the best allocation of the available water volume between different water consumers and environmental needs for the rest of the hydrological year.
However, despite their vital importance, decisions made by the GRBA are mainly based on the current water storage in reservoirs instead of using the results from streamflow forecast models.Therefore, we believe the proposed tool can positively contribute towards the development of a practical, quick, and reliable tool to support water authorities and managers on the strategic decision process to achieve a sustainable water resources management approach.

Data Sources and Treatment
The 'Automatic Hydrological Information System (SAIH)' is a free and public online portal maintained by the Guadalquivir River Basin Authority in Spain [34].The SAIH offers information, such as streamflow, reservoir outflows, rainfall, reservoir level, temperature, etc. for up to 57 reservoirs, 52 non-regulated rivers, 20 canals, and 10 hydro power plants.Other information on rain, snow, and temperature gauging stations across the basin is also offered.The available temporal data sets vary depending on the specific sub-catchment area, but usually these are available from 1989.The information can be downloaded in an hourly, daily, or monthly time step.
The historical monthly precipitation and streamflow data series for Canales (from October 1988 to present) and Quéntar (from October 1977 to present) were obtained.From this information, the cumulative precipitation and streamflow monthly time series for the first three, four, five, six, seven, and eight observed months were determined (as described in Section 2.2).

Regression Analysis, Correlation, and Selection of the Best Fit Estimator and Regression Model
A complete statistical study was carried out to assess the correlation between the total annual streamflow and the cumulative monthly precipitation and streamflow sets.Figures 5 and 6 show the regression analysis for the first three, four, five, six, seven, and eight observed months for Canales and Quéntar, respectively.
From a visual inspection of Figures 5 and 6, it can easily be observed how the correlation is greater as the number of observed months increases.This is corroborated by the R 2 values shown in Table 2.     Table 2 shows the results obtained for the Canales and Quéntar reservoirs.
The comparative critical assessment of the coefficient of determination (R 2 ) allowed us to select the best fit regression model and best fit hydrological predictor (cumulative rainfall or cumulative streamflow) for each reservoir and specific month of study.The estimated total annual streamflow from the regression models is an intermediary output (to be used as an input to obtain the conditioned gamma).
Table 2 shows that that the lineal regression model is the best fit model for the Canales reservoir.It can be observed that the cumulative precipitation presents the best R 2 values (varying from 0.7253 to 0.9083) for the first three, four, and five observed months whilst the cumulative streamflow achieves the best R 2 values (varying from 0.9429 to 0.9839) for the six, seven, and eight observed months.This fact is considered to be due to the influence of snow storage and melting processes distinctive of this basin.The Canales catchment river hydrograph shows a pluvio-nival pattern with two flow peaks.The first one was in January and February due to the surface water runoff from rainfall events.The second one was around April and May due to snow-melt water flows from the Sierra Nevada.
Therefore, the Canales regression model follows the equation: where A annual is the annual streamflow (hm 3 ), x cum . is the cumulative monthly precipitation (for the first three, four, and five observed months in mm) and cumulative monthly streamflow (for the first six, seven, and eight months in hm 3 ), and a and b are the lineal regression model coefficients.
Table 2 shows that the regression model (linear with logarithmic transformation) is the best fit model for the Quéntar reservoir.It can be observed that the cumulative streamflow presents the best R 2 values (varying from 0.5442 to 0.9894) for all the months of study.The most important hydrological process given within the Quéntar catchment is due to the underground aquifer flows.The snow storage and melting processes are not relevant if compared with the Canales reservoir.Therefore, the base flow contribution into the total annual reservoir inflow is important.This might be the reason why the strongest correlation is found between the cumulative monthly streamflow and the annual streamflow.
Therefore, the Quéntar regression model follows the equation: where A cum . is the cumulative monthly streamflow (hm 3 ), and c and d are the regression model coefficients.
The estimated annual streamflow ('A annual ') can therefore be estimated from the obtained regression models, which are based on complete hydrological years (from October to September, both inclusive) of observed precipitation and streamflow data.That is, the regression models should contain all the historical records up to September of the last year that is needed to be forecasted.

Two-Parameter Gamma Cumulative Probability Distribution Function (Observed Data) and Conditioned Gamma Cumulative Probability Distribution Function (Using the Output from the Regression Models)
Following the methodology described in Sections 2.4 and 2.5, we obtained two-parameter gamma CDF for each reservoir of study shown in Figure 7.It was found that the gamma distribution fits very well with rainfall in general [35] and particularly with the observed series of annual streamflows for the two reservoirs of study.We obtained a correlation coefficient (R) of 0.99 and 0.98 for Quéntar and Canales, respectively, between the observed annual streamflow series and those given by the distribution function.
It can be seen from Figure 7 that the forecast given by the model in April 2017 for Canales and Quéntar was for a drier year than the mean historical hydrological year (since the conditioned gamma distribution was to left side of the gamma CDF), which was what happened last year.fits very well with rainfall in general [35] and particularly with the observed series of annual streamflows for the two reservoirs of study.We obtained a correlation coefficient (R) of 0.99 and 0.98 for Quéntar and Canales, respectively, between the observed annual streamflow series and those given by the distribution function.
It can be seen from Figure 7 that the forecast given by the model in April 2017 for Canales and Quéntar was for a drier year than the mean historical hydrological year (since the conditioned gamma distribution was to the left side of the gamma CDF), which was what happened last year.

Assigning Probabilities and Application of Monte Carlo Method
As explained in Section 2.6, we obtained 10,000 random observed years (from the historical series) using the Monte Carlo simulation method.
From the statistical analysis, we obtained three forecasts: Optimistic, average, and pessimistic.For our case study, we assigned the 90th percentile to the optimistic forecast and the 10th percentile to the pessimistic forecast, but the user can easily modify these values if needed.
Figure 8 shows, as an example, the April 2017 forecast output graph for the Quéntar reservoir model (six observed months and six forecast months).Lines in blue correspond to each observed year and the dotted red line to the historical average year.The orange line is the model average forecast output while the yellow dotted lines are the 90th and 10th percentiles forecasts (optimistic and pessimistic, respectively).series) using the Monte Carlo simulation method.
From the statistical analysis, we obtained three forecasts: Optimistic, average, and pessimistic.For our case study, we assigned the 90th percentile to the optimistic forecast and the 10th percentile to the pessimistic forecast, but the user can easily modify these values if needed.
Figure 8 shows, as an example, the April 2017 forecast output graph for the Quéntar reservoir model (six observed months and six forecast months).Lines in blue correspond to each observed year and the dotted red line to the historical average year.The orange line is the model average forecast output while the yellow dotted lines are the 90th and 10th percentiles forecasts (optimistic and pessimistic, respectively).

Model Validation
To verify the model performance and measure its predicting accuracy, the following metrics were applied [36]:

Model Validation
To verify the model performance and measure its predicting accuracy, the following metrics were applied [36]: The results are listed in Tables 3 and 4 for Canales and Quéntar, respectively.The performance indices presented in Table 3 show that the model performs relatively better at forecasting annual rather than monthly streamflows.This highlights the difficulty of forecasting the monthly streamflow distribution along the hydrological year.This may be because the snow melt processes occurring in the Canales catchment area are more difficult to predict monthly (due to its variability).
The monthly forecast results achieved for the Canales (R 2 values vary from 0.53 to 0.83 and average of 0.69) reservoir are better than those found for the Quéntar reservoir (R 2 values vary from 0.23 to 0.51 and average of 0.41).The optimal results in our study are given for the Canales reservoir for the seven observed months that achieved the highest R 2 = 0.83 and lowest MAPE = 32.45.
The results also show that, generally, the model performs better as the number of observed months increases (the values of R 2 , IOA, and NSE increase while the values of RMSE, MAE, and MAPE decrease).The higher values of error correspond to those observed hydrological years where hydrographs differ considerably from the mean historical observed year.Particularly, the worst MAPE value for the Canales and Quéntar reservoirs occurs in the driest years of 1994-1995 and 2004-2005.To gain better insight into the model performance in relation to the lower and upper forecasts (i.e., 10th and 90th percentiles, respectively), the inclusion coefficient was calculated.An inclusion coefficient of approximately 80% for the monthly streamflow forecasts and higher than 80% for the annual forecasts was achieved.These results show that the 10 and 90 percentiles are representative values to support risk-based management decisions for these specific catchments.
Figures 9 and 10 present the observed and estimated monthly and yearly streamflow hydrographs (with six months of observed data) for the Canales and Quéntar reservoirs, respectively.The model predicted the entire historical set relatively well, with a tendency to slightly overestimate the flows.The error distribution suggests that the model is less likely to be more effective in forecasting hydrograph peaks (very dry or very wet years).
Myronidis et al. [17] developed ARIMA models to forecast the mean monthly streamflows for the next three months and for 10 different catchments in Cyprus, achieving moderate accuracy.The results achieved R 2 values varying from 0.10 to 0.63 (average of 0.37) and MAPE values varying from 25.02 to 1112.38.Myronidis et al. acknowledged similar results achieved by other authors also using ARIMA models.Their optimal results obtained in this study were found for catchment number 6, which achieved the highest R 2 = 0.63 and lowest MAPE = 25.02.
If we compare the ARIMA model results with those achieved in this study (Tables 3 and 4), in average terms, our model performed similarly or even slightly better at providing results for a longer forecasted period (for up to nine months in advance), using a simpler methodology.
Given that the model has been developed to support management decisions associated with annual or quarterly cycles, we consider that the model is fit for its purpose and the results obtained are satisfactory.

Model Outputs
The proposed method can be implemented using mathematical programming software or simply using an Excel ® spreadsheet.
Figure 11 shows the user-friendly interface (taken from our model implemented in Excel).The user only needs to introduce the observed monthly rainfall and streamflow information in the dark grey columns and press the button called "Run the model".Almost instantly, the user can see the numerical results and associated graphs on the right hand side of the screen showing the probabilistic mean total annual streamflow and the monthly distribution as well as the optimistic and pessimistic forecasts.
Water 2018, 10, x FOR PEER REVIEW 21 of 26

Model Outputs
The proposed method can be implemented using mathematical programming software or simply using an Excel ® spreadsheet.
Figure 11 shows the user-friendly interface (taken from our model implemented in Excel).The user only needs to introduce the observed monthly rainfall and streamflow information in the dark grey columns and press the button called "Run the model".Almost instantly, the user can see the numerical results and associated graphs on the right hand side of the screen showing the probabilistic mean total annual streamflow and the monthly distribution as well as the optimistic and pessimistic forecasts.Figure 8 shows the historical data set of observed years and the historical average monthly streamflow as well as the prediction.This graph helps the user to quickly see how far the current hydrological year is from the mean historical hydrological year as well as similar historical observed years.

First Operational Year 2016-2017: Results
The model was first put into operation in October 2016.Figure 12 presents a comparative analysis of the annual and monthly streamflow forecasts with the observed values.
It can be observed that the 2016-2017 hydrological year was predicted to be drier than the average historical year and all predictions were between the average year shown in dark grey and the observed values shown in red.It can also be appreciated that the prediction given in January was below, but closer, to the average historical year, whilst as the number of observed months increased, the predictions were getting closer to the observed data.Figure 8 shows the historical data set of observed years and the historical average monthly streamflow as well as the prediction.This graph helps the user to quickly see how far the current hydrological year is from the mean historical hydrological year as well as similar historical observed years.

First Operational Year 2016-2017: Results
The model was first put into operation in October 2016.Figure 12 presents a comparative analysis of the annual and monthly streamflow forecasts with the observed values.
It can be observed that the 2016-2017 hydrological year was predicted to be drier than the average historical year and all predictions were between the average year shown in dark grey and the observed values shown in red.It can also be appreciated that the prediction given in January was below, but closer, to the average historical year, whilst as the number of observed months increased, the predictions were getting closer to the observed data.
It was observed for the Canales reservoir that the accuracy of the prediction output achieved in March 2017 (for the 2016/2017 total annual streamflow) was very good at approximately 92%.The forecast annual streamflow value was 41.52 hm 3 whilst the observed value was 38.34 hm 3 (compared with the observed historical mean annual streamflow of 66.71 hm 3 ).Logically, the prediction results improved as the number of observed months increased.It was observed for the Canales reservoir that the accuracy of the prediction output achieved in March 2017 (for the 2016/2017 total annual streamflow) was very good at approximately 92%.The forecast annual streamflow value was 41.52 hm 3 whilst the observed value was 38.34 hm 3 (compared with the observed historical mean annual streamflow of 66.71 hm 3 ).Logically, the prediction results improved as the number of observed months increased.
For the Quéntar reservoir, the accuracy of the prediction given in March 2017 (for the 2016/2017 total annual streamflow) was slightly lower than the Canales reservoir, thus, achieving an accuracy For the Quéntar reservoir, the accuracy of the prediction given in March 2017 (for the 2016/2017 total annual streamflow) was slightly lower than the Canales reservoir, thus, achieving an accuracy of 80%, which is good.The total annual streamflow forecast was 12.65 hm 3 whilst the observed value was 10.68 hm 3 (compared with the observed historical mean annual streamflow of 20.81 hm 3 ).
We investigated in more detail whether the Quéntar model could be improved.A more detailed evaluation of the behaviour and influence of the interannual underground flow in the total annual contribution to the reservoir was carried out.It was found that there is a relatively small correlation between the annual streamflow and with the previous years.It was also examined whether when applying the Monte Carlo simulation, the median value might provide better forecasts than using the mean value.Even though for the hydrological year of 2016-2017, the median provided better forecast results, when assessing the historical data, the mean value provided more accurate forecast results.For this particular sub-basin, we also concluded that the minimum number of observed years (to minimise the error) should be 40 years.
The model allows for the inclusion of seasonal predictions, for example, those provided by the AEMET in Spain.The seasonal probability prediction values estimated for the next quarter of the current year assigned to each tertile (wet, normal, and dry) are applied to the model outputs by multiplying them with seasonal predictions (upper, central, and lower estimations).Equally, the model allows the integration of climate change effects by scoring and weighing recent years of the historic data.

Conclusions and Future Research Directions
In Mediterranean countries, risk-based streamflows forecasts are critical, especially, before the intensive irrigation campaign starts (usually in April).During this time, water authorities must make responsible management decisions on the best allocation of the available water volume between different water users and environmental needs.
This research contributes towards the development of a user-guided, simple, low cost, and robust procedure for forecasting monthly and annual streamflows during the current hydrological year.
The methodology is innovative because of its combination of well-known techniques and its ability to achieve satisfactory and reliable results.Firstly, regression analysis techniques were applied to obtain the relationship between the total annual streamflow and the cumulative monthly rainfall and monthly streamflow.Secondly, the historical streamflow data sets were fitted to a two-parameter gamma continuous and cumulative probability distribution function.The estimated total annual streamflow output obtained from the regression analysis was made to be equal to the mean value of a new distribution, named the conditioned two-parameter gamma cumulative probability distribution function.This new distribution allowed the allocation of the probability for each year of the historical data set based on the estimated annual streamflow (from the regression analysis) and the gamma distribution (previously fitted to the historical data set).Finally, the Monte Carlo simulation was applied to obtain 10,000 simulations.The model outputs were the probabilistic mean annual and monthly streamflows along with the 10th and 90th percentiles (which can be easily modified by the user if needed).This allows decision-makers to adopt a water resource management risk-based approach, which can be adjusted to their own needs.
The method was successfully applied to two headwater reservoirs located at the upper area of the Genil River (Guadalquivir River Basin), namely, the Quéntar and Canales.Several metrics (including MAE, MAPE, IOA, RMSE, R 2 , and NSE) were used during the validation process, which showed good levels of accuracy and increased with the number of months observed.The model was first put into operation during the last hydrological year (2016-2017), obtaining accuracy levels of 92% and 80% respectively, for Canales and Quéntar, for the annual streamflow forecast given in March 2017.
The regression analysis shows that the best descriptor for forecasting the total annual streamflow for the Quéntar reservoir was the cumulative monthly streamflow (due to the influence of the subterranean flows).For Canales, the cumulative precipitation was the best predictor for the first three, four, and five months observed whilst the cumulative streamflow was the best predictor for the six, seven, and eight months observed.It is important to highlight that the best fit regression model and best fit predictor may vary depending on the predominant hydrological process at each specific basin and season of the year.In this situation, a combined regression model should be used to achieve the best correlation and predictive results.
The key advantages of this procedure are: (i) The use of relatively simple algorithms and well-known techniques, which can be easily implemented in an excel spreadsheet; (ii) the use of a minimum number of hydrological variables (streamflow and precipitation); (iii) the model saves substantial computational time (the model takes just a few seconds to run the simulations); (iv) the model does not depend on other software to achieve the results; (v) the model offers an intuitive user interface, which can be easily used by anyone; and (vi) the results achieved are similar or slightly better than those achieved with other more complex models.
The method also has some limitations: (i) It is applicable to headwater systems; (ii) if there are new rapid and significant land use changes within the catchment of study (not reflected in the historical data sets), the model might not capture these changes and may generate misleading streamflow forecasts; and (iii) whilst there is no limitation in catchment size, catchments with considerable climatic irregularity or very rare seasonal sporadic rainfall are difficult to predict and should be carefully studied.
To obtain the best results, this methodology should be applied when there is good quality input data and, at least, 30 years of observed data (although 40 years would be desirable to increase the forecast accuracy).The minimum number of observed years should be greater in those catchments with a greater standard deviation.The performance of the model can be further improved (and reduce prediction errors) by increasing the number of observed years.
Current and future research is in line with the full integration of seasonal climate predictions as well as the effects of climate change.Equally, we are working towards the integration of this model with other types of hydrological models (conceptual or distributed models) to check whether the accuracy could be further improved.
We conclude that the proposed methodology to forecast streamflows presented in this paper, although simple and of low cost, provides satisfactory, consistent, and robust results with a relatively small error margin.The outputs of this model can help with the early detection of critical events, such as droughts and periods of water scarcity, and support strategic water management decisions.This can improve resource and cost efficiency when deciding on optimum water allocation.

DISTRIBUTION 1 . 4 .
Historical observed monthly and yearly precipitation and streamflow data series.The average value of the Conditioned Gamma distribution is forced to be equal to the estimated total annual streamflow (intermediate output from the best fit regression model).
Water 2018, 10, x FOR PEER REVIEW 10 of 26

Figure 5 .
Figure 5. Canales reservoir: Regression analysis for the first three, four, five, six, seven, and eight months using the cumulative monthly reservoir inflow and precipitation.

Figure 5 .
Figure 5. Canales reservoir: Regression analysis for the first three, four, five, six, seven, and eight months using the cumulative monthly reservoir inflow and precipitation.

Figure 6 .
Figure 6.Quéntar reservoir: Regression analysis for the first three, four, five, six, seven, and eight months using the cumulative monthly reservoir inflow and precipitation.

Figure 6 .
Figure 6.Quéntar reservoir: Regression analysis for the first three, four, five, six, seven, and eight months using the cumulative monthly reservoir inflow and precipitation.

Figure 8 .
Figure 8. Example of typical model forecast outputs results compared with the historical observed streamflow data.

Figure 8 .
Figure 8. Example of typical model forecast outputs results compared with the historical observed streamflow data.
(a) Coefficient of determination (R 2 ); (b) root-mean-square error (RMSE); (c) Mean Absolute Error (MAE); (d) Index of Agreement (IOA); (e) Mean Absolute Percent Error (MAPE); (f) Coefficient of Nash-Sutcliffe efficiency (NSE); and (g) Inclusion Coefficient (IC) defined as the percentage of times that the observed streamflow falls within the 10th and 90th percentiles (our pessimistic and optimistic forecast values).

Figure 9 .
Figure 9. Observed monthly streamflows (hm 3 ) and predictions using the model with the first six months of observed data of the hydrological year and 10th and 90th percentiles (lower and upper limits) (hm 3 ) of the (a) Canales reservoir and (b) Quéntar reservoir.

Figure 10 .
Figure 10.Observed annual streamflows (hm 3 ) and predictions using the model with the first six months of observed data of the hydrological year and 10th and 90th percentiles (lower and upper limits) (hm 3 ) of the (a) Canales reservoir and (b) Quéntar reservoir

Figure 10 .
Figure 10.Observed annual streamflows (hm 3 ) and predictions using the model with the first six months of observed data of the hydrological year and 10th and 90th percentiles (lower and upper limits) (hm 3 ) of the (a) Canales reservoir and (b) Quéntar reservoir

Figure 11 .
Figure 11.Example of typical model input requirements and outputs results.

Figure 11 .
Figure 11.Example of typical model input requirements and outputs results.

Figure 12 .
Figure 12.Predictions of annual and monthly cumulative streamflow (hm 3 ) and observed annual and monthly cumulative streamflow (hm 3 ) of the (a) Canales reservoir and (b) Quéntar reservoir.

Figure 12 .
Figure 12.Predictions of annual and monthly cumulative streamflow (hm 3 ) and observed annual and monthly cumulative streamflow (hm 3 ) of the (a) Canales reservoir and (b) Quéntar reservoir.

Table 1 .
Relationship between the observed and forecasted periods.

Table 2 .
Coefficient of determination (R 2 ) from the regression analysis.