Water Temperature Ensemble Forecasts: Implementation Using the CEQUEAU Model on Two Contrasted River Systems

In some hydrological systems, mitigation strategies are applied based on short-range water temperature forecasts to reduce stress caused to aquatic organisms. While various uncertainty sources are known to affect thermal modeling, their impact on water temperature forecasts remain poorly understood. The objective of this paper is to characterize uncertainty induced to water temperature forecasts by meteorological inputs in two hydrological contexts. Daily ensemble water temperature forecasts were produced using the CEQUEAU model for the Nechako (regulated) and Southwest Miramichi (natural) Rivers for 1–5-day horizons. The results demonstrate that a larger uncertainty is propagated to the thermal forecast in the unregulated river (0.92–3.14 ◦C) than on the regulated river (0.73–2.29 ◦C). Better performances were observed on the Nechako with a mean continuous ranked probability score (MCRPS) <0.85 ◦C for all horizons compared to the Southwest Miramichi (MCRPS ≈ 1 ◦C). While informing the end-user on future thermal conditions, the ensemble forecasts provide an assessment of the associated uncertainty and offer an additional tool to river managers for decision-making.


Introduction
The inherent links between fish biological processes and water temperature have been well documented over the last fifty years [1][2][3]. It has been shown that sustained periods of high water temperature can result in impaired fish swimming capacity [4], weight loss, disease proliferation [5] and death [6].
In unregulated rivers, the thermal regime is governed by environmental characteristics such as meteorological forcings, topography, hydrology and geology [7,8]. The impoundment of a watercourse can alter this thermal regime in different ways depending on the type of dam, the timing and magnitude of the water releases, reservoir stratification [9][10][11] and the depth in the upstream reservoir from which water is released [12,13]. In some hydrological systems where dams have altered natural flows and thermal patterns (e.g., Klamath River, US; Delaware River, US), management strategies have been implemented to protect aquatic communities while maintaining socio-economic benefits delivered by freshwater resources. Research on water temperature has provided the water management community with a plethora of modeling tools (see Benyahya et al. [14] for a partial review). Many of these models are key elements for dam operators because they help to meet environmental flow requirements and water quality criteria, while at the same time assisting in the optimization of operations [13,15].
The total enthalpy is estimated by summing the initial enthalpy, the advective fluxes and the various energy fluxes at the air-water interface, according to Equation (2): where all terms are computed in MJ. H ini is the initial enthalpy, H s is the net solar radiation, H IR is the net longwave radiation, H e is the evaporative heat flux, H c is the sensible heat flux and H adv represents the advective fluxes. These advective fluxes include the energy transferred by surface runoff, interflow, groundwater, and overflow from lakes and marshes [31]. The temperature of the surface runoff is assumed to be equal to air temperature, base flow temperature is assumed to be constant and equal to the mean annual air temperature, and interflow is the average of the two previous terms. When water temperature is modeled downstream of a dam, both discharge and water temperature of the water release are provided to the model. These values are used to calculate the advective fluxes on the grid cell downstream of the dam. When the local energy budget is completed, including the advective fluxes and the surface fluxes, the energy is routed downstream according to the discharge and water temperature of each cell. The terms of the energy budget are estimated using Equations (3)- (7): where β is described as: H e = C e L e AH (6) H c = C c A(0.2U(T a − T w )) (7) where A is the heat-exchange surface (m 2 ), corresponding to the air-water interface, R s is net solar radiation (MJ m 2 ), Ta is air temperature ( • K), σ is the Stefan-Boltzmann constant (4.9 × 10 −9 MJ m −2 • K −4 ), β is the sky emissivity (0-1), L e is the height of evaporated water as calculated by the hydrological model (m), H is the latent heat of vaporization (2480 MJ m −3 ), CC is the cloud cover fraction (0-1), U is wind speed (km/h), e a is air vapor pressure (mm Hg) and Cs, C i , C e and C c are empirical coefficients to be adjusted during model calibration. In both the hydrological and the thermal model, H is estimated using a modified Thornthwaite equation [30]: XIT and XAA are empirical coefficient calculated from mean monthly air temperatures. Both the water budget and the water temperature calculations are performed on each square area of a predefined grid. A complete description of the original model is available in [30]. A thorough description of recent model modifications is available in [32].

Ensemble Forecasting System
In order to provide an estimation of the uncertainty associated with water temperature forecasts, an ensemble method is proposed. The core of this system is the CEQUEAU model described in Section 2.1, fed with meteorological ensemble forecasts as inputs.
In a typical operational setting, the hydrological and thermal forecasting process begins by estimating the best possible values for the initial state of the watershed. This initial state is described by state variables, for instance current soil humidity, which are typically difficult to monitor. Thus, this estimation is most often performed by running the hydrological model in simulation mode for at least a year prior to the initial time of the forecast (spin up period). In the context of this study, instead of repeating this process of "state estimation then forecast", the initial states were estimated for each time step in a single run and preserved for future use. Then, for each time step (t), the hydrological and thermal initial states of the watershed are retrieved at time t and provided to the model. The ensemble meteorological forecasts for time t + 1 to t + 5 are used as model inputs to produce a 1-5-day forecasts. Those meteorological ensemble forecasts are described in greater details in Section 3.4. Each set of meteorological forecasts, called members (k), that composes the ensemble is fed to the model individually. To produce a 5-day forecast of K members, the CEQUEAU model is thus run K times. The output is a distribution of K hydrological and thermal forecasts for each forecasted time step. Those outputs for each forecasting horizons are then stored individually to build complete time series of each of the five lead time forecasts. The temperature ensemble forecasts for these five days are henceforth analyzed separately and referenced as forecasting horizons one to five (hz1 to hz5). The forecasting framework is summarized in Figure 1.

Model Calibration
Both the thermal and the hydrological models contain parameters that must be calibrated based on a comparison of streamflow simulations with observations. They were calibrated using a split sample method. Because of the cascade structure of the model, the hydrological parameters were adjusted prior to those of the thermal model. The parameters of both models were optimized using a covariance matrix adaptation evolution strategy (CMA-ES; [33]). The hydrological model includes  The model was run from 2009 to 2014 during the summer period (i.e., 15 June to 15 September) for both the regulated and natural system. The length of this period is limited by the availability of the archived ensemble meteorological forecasts and the water temperature data.

Model Calibration
Both the thermal and the hydrological models contain parameters that must be calibrated based on a comparison of streamflow simulations with observations. They were calibrated using a split sample method. Because of the cascade structure of the model, the hydrological parameters were adjusted prior to those of the thermal model. The parameters of both models were optimized using a covariance matrix adaptation evolution strategy (CMA-ES; [33]). The hydrological model includes 28 parameters from which 16 have physical meaning and 12 are adjusted based on goodness of fit only. The reader is referred to St-Hilaire et al. [31] for a complete list and description of the parameters. The hydrological model was optimized based on Nash-Sutcliffe efficiency criterion (NS; Equation (9)) and bias (Equation (10)), while the calibration of the thermal model relied on minimizing the root mean squared error (RMSE; Equation (11)) between observed and simulated water temperatures. In all cases, the parameters were obtained from 1500 iterations of the CMA-ES optimization algorithm.
In Equations (9)-(11), n is sample size, y t is the observed value (flow or temperature) at time t, andŷ t is the simulated value (flow or temperature) at time t.

Forecasts Verification and Explicit Consideration of Uncertainty
An extensive toolbox of evaluation criteria was developed to rightfully assess the quality an ensemble forecast [34][35][36][37]. One of the most frequently encountered scores in the hydrological ensemble forecasting literature is the Continuous Ranked Probability Score (CRPS). Basically, the CRPS compares the cumulative distribution function (CDF) of the forecast members with the CDF of the observation. If the observation is represented by a single value, its CDF is a Heaviside function. As demonstrated by Gneiting and Raftery [37], the mean CRPS (Equation (12); MCRPS) is the probabilistic analog of the mean absolute error (MAE). The MCRPS is defined as: where F(x) is the cumulative density function (CDF) of the forecasted variable at time step t (x t ) for all ensemble members, H(x t ≥ y t ) is the Heaviside (step) function of the t th observed value (y t ) and n is the number of time steps. A normal distribution is typically accepted for temperature [38] and was thus used in the present analysis. A Monte-Carlo approximation proposed by Gneiting and Raftery [37] was used to solve the integral in Equation (12), which is therefore estimated using Equation (13): for which X and X' are two independent random samples (n = 1000) drawn from the empirical cumulative distribution function of the forecast x t that includes all K members. Standard model evaluation strategies are based on the quality of the fit between the measurements and the simulations (or forecasts). However, Hague and Patterson [28] suggest that this type of evaluation might not "aptly evaluate" a model's performance when it comes to water temperature modeling for fisheries management. They suggest looking at the capacity of a model to predict a threshold exceedance instead of solely focusing on the traditional best fit. Given the fact that in many systems, thermal forecasts are issued in order to keep water temperature below a target threshold [13,15,39] the evaluation of threshold prediction capacity of the forecasting system is essential to assess its usefulness. Hence, in addition to the MCRPS, the Brier Score (BS; Equation (14)) [40] was calculated as: where p(x i ) is the probability of exceeding the threshold according to the forecast on day i and p(y) is the outcome according to the observation (y). In the case of a deterministic forecast, the value of p(x i ) can only be 1, if the temperature xi exceeds the threshold or 0 if it is predicted not to happen. The same logic applies to p(y). On the other hand, if the forecast is a distribution of values (i.e., an ensemble forecast), p(x i ) can take any value between 0 and 1. The overall uncertainty of the ensembles was tracked for all forecasting horizons by calculating the spread of the daily distribution (maximum-minimum). Reliability plots [41] were also produced to assess the reliability of the water temperature forecasts. Confidence intervals, named forecast nominal probability, for confidence levels of 10% increments were first calculated for the ensemble forecast only (hz1-hz5). Then, the observed probability of each confidence interval was computed by calculating the frequency at which the observation was included in a given interval. The results were then displayed as a scatter plot with the forecast nominal probabilities on the x-axis and the observed probability on the y-axis. For a perfectly reliable forecasting system, nominal probabilities should be equal to the observed probabilities. For instance, the computed 90% interval should include the observed temperature 9 days out of 10 on average.

Study Sites and Data
The CEQUEAU model described above was implemented on the Nechako watershed (125 • 59' W and 53 •

Study Sites and Data
The CEQUEAU model described above was implemented on the Nechako watershed (125°59' W and 53°46' N) in the province of British Columbia, Canada and on the Southwest Miramichi River (65°82' W and 46°74' N) in the province of New Brunswick, Canada ( Figure 2).

Nechako
The Nechako River flows from the Skins Lake Spillway eastward for about 245 km before it reaches its junction with the Stuart River. It begins its course in the coastal range, and then it drains the Nechako Plateau and flows into the Fraser River at Prince George, British Columbia. The elevation of the watershed ranges from 630 m to 1810 m. The watershed covers about 47,000 km 2 . The Stuart and Nautley Rivers are its major tributaries ( Figure 2). A major dam in the Nechako Canyon and nine saddle dams were built in the early 1950s to create the Nechako reservoir, which is about 181 kilometers long [42]. The Skins Lake spillway ( Figure 2) typically discharges between 170 and 283 m 3 /s from 11 July to 20 August and between 14.2 and 49 m 3 /s throughout the rest of the year.
An average total annual precipitation of 417 mm was monitored at the Ootsa Lake meteorological station between 1981 and 2010, 37% of which fell as snow. The average air temperature is 3.2 °C with an average maximum of 19.8 °C occurring in August and an average minimum of −11.8 °C in January.
A protocol called the Summer Temperature Management Program (STMP) was implemented in the Nechako River to monitor water temperature downstream from the dam and to ensure that water temperature remains below 20 °C at a control section near the city of Finmoore (i.e., upstream of the confluence of the Nechako and Stuart Rivers). This protocol is in effect between 20 July and 20 August (Nechako Fisheries Conservation Program (NFCP), 2005). The measure was introduced to benefit sockeye salmon (Oncorhynchus nerka) during its spawning period. During that period, water

Nechako
The Nechako River flows from the Skins Lake Spillway eastward for about 245 km before it reaches its junction with the Stuart River. It begins its course in the coastal range, and then it drains the Nechako Plateau and flows into the Fraser River at Prince George, British Columbia. The elevation of the watershed ranges from 630 m to 1810 m. The watershed covers about 47,000 km 2 . The Stuart and Nautley Rivers are its major tributaries ( Figure 2). A major dam in the Nechako Canyon and nine saddle dams were built in the early 1950s to create the Nechako reservoir, which is about 181 kilometers long [42]. The Skins Lake spillway ( Figure 2) typically discharges between 170 and 283 m 3 /s from 11 July to 20 August and between 14.2 and 49 m 3 /s throughout the rest of the year.
An average total annual precipitation of 417 mm was monitored at the Ootsa Lake meteorological station between 1981 and 2010, 37% of which fell as snow. The average air temperature is 3.2 • C with an average maximum of 19.8 • C occurring in August and an average minimum of −11.8 • C in January.
A protocol called the Summer Temperature Management Program (STMP) was implemented in the Nechako River to monitor water temperature downstream from the dam and to ensure that water temperature remains below 20 • C at a control section near the city of Finmoore (i.e., upstream of the confluence of the Nechako and Stuart Rivers). This protocol is in effect between 20 July and 20 August (Nechako Fisheries Conservation Program (NFCP), 2005). The measure was introduced to benefit sockeye salmon (Oncorhynchus nerka) during its spawning period. During that period, water releases at Skins Lake are increased in order to reach a minimum discharge of 170 m 3 /s at Cheslatta Falls. Discharge is then maintained between 170 m 3 /s and 283 m 3 /s. When the 20 • C threshold is expected to be exceeded, discharge at the Skins Lake Spillway is increased to 453 m 3 /s. This operation rapidly increases discharge of the Nechako River downstream and limit its warming at its confluence with the Stuart River. The decision to whether of not increase released discharge at the Skins Lake Spillway is based on a water temperature forecast. To assess the capacity of the model to predict the exceedance of the 20 • C threshold, a Brier Score (Equation (14)) was calculated for values of 20 • C identified as critical temperature (TH). A Brier Score for a threshold value of 18 • C was also calculated and labeled as early warning (EW).
The model was set up for the management of summer water releases at the Skins Lake Spillway. Therefore, only the portion of the watershed located between the Skins Lake Spillway and the confluence of the Nechako and Stuart Rivers ( Figure 2) was modeled. At its upstream boundary, the model was fed with the discharge released at the Skins Lake Spillway. Water temperature was also necessary at the model upstream boundary. Data were available for 2013 and 2014 but modeling was required to provide input reservoir temperatures for previous years (2009)(2010)(2011)(2012)). An autoregressive model with exogenous variables (ARX) that uses air temperature residuals as predictors [43] was calibrated using data from moored thermographs. The calibrated model was subsequently used to generate water temperature at the surface of the reservoir. Model performances are presented in the results section.
The travel time between the Spillway and the site that is subject to the thermal constrain is estimated to be 5 days [44]. Discharge and water temperature at the outlet of the Nautley River were kept constant for all forecasting horizons (hz1 to hz5) in forecasting mode. The last observation of both discharge and water temperature recorded at the Nautley station ( Figure 2) were thus fed to the model. This was done in order to replicate the actual forecasting framework used for operations.

Southwest Miramichi
The Southwest Miramichi flows in the main stem of Miramichi River before it empties in the Atlantic Oceans. It occupies about 60% (7800 km 2 ) of the total drainage area (13,000 km 2 ) of the Miramichi watershed. The elevation of the watershed ranges from sea level to 640 m. Measurement sites are located at Blackville (discharge) and Wades Lodges (water temperature) which drains about 5600 km 2 . The two measurement sites are separated by 10 km of river. Mean discharge at the Blackville hydrological station ( Figure 2) is 120 m 3 /s with peaks reaching more than 2000 m 3 /s and low flows around 20 m 3 /s. Its hydraulic regime is considered as naturel by the Water Survey of Canada (http://wateroffice.ec.gc.ca). It receives an average total precipitation of 1175 mm (Doaktown -Environment and Climate Change Canada) of which 25% falls as snow.
Water temperatures in the Miramichi system are known to exceed the optimal range of 16-20 • C for Atlantic salmon (Salmo salar) [45]. Between 1997 and 2016, mean daily summer water temperature recorded in the Southwest Miramichi was 19.33 • C with a maximum of 25.87 • C (9 July, 2010) and a minimum of 11.36 • C (21 September 2014). Thresholds of 20 • C for minimum daily water temperature and 23 • C for maximum daily water temperature were identified for management strategies in the system [46]. Brier Scores were therefore calculated for thresholds values of 20 • C (early warning) and 23 • C (critical temperature).
Since the Miramichi is unregulated, no input discharge or water temperatures had to be provided to the model as a boundary condition. The period between 15 June and 5 July 2011 on the Miramichi were removed from the analysis because the radiation forecasts were not available. For ease of reading, the Southwest Miramichi will be referred to as the Miramichi.

Meteorological and Hydrological Data
On the Nechako, meteorological data used to adjust the hydrological model parameters (pTot, tMin and tMax) were provided by British Columbia Wildfire (BC Wildfire) management branch, Environment and Climate Change Canada (ECCC) and Rio Tinto (RT), an aluminum producer who manages the watershed for hydropower production ( Figure 2). BC Wildfire data were preferred over most of the data from RT's meteorological stations because of their better representation of the meteorological conditions on the watershed. Cloud cover (CC) was retrieved from ECCC stations Bella Coola, Quesnel, Prince George and Terrace. Net solar radiation (Rs), relative humidity (RH) and wind speed (U) were extracted from the NASA Prediction of Worldwide Energy Resource (POWER) on a grid with 1 • horizontal resolution.
A string of 13 temperature loggers (Onset HOBO Pendant temperature loggers; ±0.53 • C accuracy) placed at a 0.91 m (3 feet) vertical distance from each other was attached to a buoy, located at about 150 m upstream of the Skins Lake Spillway from May to October 2013 and 2014 and for July to October during the summer of 2015. These data confirmed that the water column was well mixed during the period of the year where the STMP applies. Data from the first two sensors from the surface were used to calibrate the ARX surface water temperature model in the reservoir, which is subsequently used as the upstream thermal boundary condition. On the Miramichi, pTot, tMin and tMax were provided by Environment and Climate Change Canada (ECCC), and net solar radiation, relative humidity and wind speed were also extracted from the POWER database. CC was derived from the proportion of extra-terrestrial solar radiation reaching the earth surface. All of the aforementioned data were downscaled to the grid used by CEQUEAU using a three nearest neighbors method.
Discharge data at Blackville (station #01BO001) were retrieved from the Water Survey of Canada database. Water temperature data from Department of Fisheries and Oceans Canada at the Wades Lodges station (46.6716 N and 65.7738 W) retrieved from the rivTemp database (http://rivtemp.ca) were used. Gaps in the data were filled using corrected data from Millerton station (46. On both watersheds, RH and mean daily air temperature (T a ; • C) were used to estimate air vapor pressure (e a ; kPa) with [47] formula as given by Equations (15) and (16). e s = 6.11 · 10 ( 7.5·Ta 237.3+Ta ) (15) e a = RH · e s /100 (16) where e s is the saturated vapor pressure. The ensemble meteorological forecasts were extracted in grib2 format through The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) portal. They include variables that are typically used for the purpose of hydrological forecasting, namely, total precipitation, and minimum and maximum air temperature. In this study the net solar radiation, dew point temperature, wind speed, and cloud cover are also required. These additional variables are used by the thermal model that follows the hydrological model in the modeling/forecasting cascade. Forecasts for the relative humidity are not available on the TIGGE portal. Therefore, the dew point (T d ; • C) was used to estimate air vapor pressure (e a ; kPa) using Equation (15) with T d replacing T a . The two orthogonal vectors for wind speed, 10 u (east-west axis) and 10 v (north-south axis), were transformed in a single wind component.
Forecasts emitted by the Canadian Meteorological Center (CMC) were preferred, as they are the most susceptible of being used operationally in Canada. However, CMC's net solar radiation forecasts are not available on the TIGGE portal. Consequently, net solar radiation forecasts from the European Centre for Medium Range Weather Forecasts (ECMWF) were used. Forecasts for all other variables are from the CMC. The Canadian ensemble forecasts are composed of 20 members (K = 20) generated from different initial conditions of the atmosphere, physical parameters and perturbed observations [48]. ECMWF forecasts are composed of 50 members also generated from different initial conditions and model's physics perturbations. In order to use the ECMWF and CMC forecasts in the same forecasting framework, only 20 members out of the 50 emitted by the ECMWF were retained for each forecast (hz1 to hz5). At each time step, a sub-sample of 20 members was drawn randomly and kept for the five forecasting horizons. Another random sub-sample was then drawn for the subsequent time step. All forecasts were extracted at a 0.6 • horizontal resolution and downscaled on the CEQUEAU grid using a bilinear interpolation. A CEQUEAU grid resolution of 5 km was used for Nechako watershed and 12 km for the Miramichi watershed. The difference between the grid resolutions is due to the use of pre-existing structures. Previous work by Dugdale et al. [49] showed minor differences in model performance between grids resolutions of 2.5 km and 20 km.
A MCRPS (Equation (12)) was calculated between observations at the meteorological stations and the corresponding grid cell of the meteorological forecasts for tMin, tMax, pTot and R s . Note that MCRPS for R s was only calculated for 2014 (25 July to 20 August) on the Nechako watershed. With regards to the remaining R s forecast and the other input variables (CC, U and e a ), MCRPS was calculated by comparing data interpolated on the CEQUEAU grid including data extracted from the POWER database with the forecasts interpolated on the same grid. This was done in order to assess the quality of ensemble meteorological forecasts used as inputs for the hydrological and thermal models. A particular attention was paid to solar radiation because of its importance in the estimation of the surface thermal budget of a river.

Model Calibration and Evaluation
Hydrological and water temperature model calibration performance metrics are summarized in Table 1. Both model show satisfactory performances, with relatively low RMSE values for simulated water temperatures during the critical summer period. The parameters of the thermal model were optimized to best reproduce water temperature at Vanderhoof (Nechako) and Wades Lodges (Miramichi) during the summer period (15 June to 15 September). This was successfully achieved, with a RMSE under 1.6 • C during both calibration and validation periods Table 1). The ARX model (temperature of the water release; see Section 2.5.1.) yielded a RMSE of 1.18 • C and 1.28 • C in calibration and validation, respectively.

Meteorological Forecasts Evaluation
The MCRPS calculated for the meteorological forecasts are summarized in Table 2. Total precipitation showed relatively low MCRPS for all lead times on both watersheds, ranging between 0.7 mm (hz1-Nechako) to 3.0 mm (hz5-Miramichi). With regards to air temperature, a higher MCRPS was found for tMin (1.97-3.80 • C) compared to tMax (1.35-1.81 • C). Between 25 July and 20 August, MCRPS for R s ranges between 2.5 MJ/m 2 (hz1) and 3.1 MJ/m 2 (hz5) at Skins Lake on the Nechako watershed. In relative terms, it represents 13% and 21% of the observed values, respectively. Troccoli and Morcrette [50] calculated relative MAE's between 18% and 45% for direct solar radiation forecasted by the ECMWF's Global Atmospheric Model in Australia. These values are of the same order of magnitude than those obtained in the present study.  Figure 3 shows MCRPSs calculated between the forecasted radiation interpolated (hz5) on the CEQUEAU grid and the corresponding observations interpolated on the same grid. For ease of reading, only radiation is presented because of its major importance for thermal modeling. Figure 3A highlights an east-west gradient of the adequacy between the grid of observed and forecasted solar radiation on the Nechako watershed. Lower MCRPS values were found in the western section of the watershed while higher values were observed in the middle section before it diminishes again towards the eastern part. In the eastern section, where sits the main stem of the Nechako, MCRPSs remained fairly high with values above 3 MJ/m 2 . Although not shown here, such an east-west gradient was observed for all forecasted variables.   Figure 3 shows MCRPSs calculated between the forecasted radiation interpolated (hz5) on the CEQUEAU grid and the corresponding observations interpolated on the same grid. For ease of reading, only radiation is presented because of its major importance for thermal modeling. Figure  3A highlights an east-west gradient of the adequacy between the grid of observed and forecasted solar radiation on the Nechako watershed. Lower MCRPS values were found in the western section of the watershed while higher values were observed in the middle section before it diminishes again towards the eastern part. In the eastern section, where sits the main stem of the Nechako, MCRPSs remained fairly high with values above 3 MJ/m 2 . Although not shown here, such an east-west gradient was observed for all forecasted variables. On the Miramichi watershed, the MCRPS values are slightly higher in the northeastern part of the watershed but no clear spatial pattern was observed with respect to the accuracy of the solar radiation forecast ( Figure 3B) or other forecasted meteorological variables. As observed for the Nechako watershed, the MCRPS for solar radiation is quite high across the watershed and remained above 3 MJ/m 2 . On the Miramichi watershed, the MCRPS values are slightly higher in the northeastern part of the watershed but no clear spatial pattern was observed with respect to the accuracy of the solar radiation forecast ( Figure 3B) or other forecasted meteorological variables. As observed for the Nechako watershed, the MCRPS for solar radiation is quite high across the watershed and remained above 3 MJ/m 2 .
On the Nechako watershed, the total precipitation showed large relative bias (>100%). In absolute terms, this bias remained fairly low (<2 mm) while the same metrics were much lower on the Miramichi watershed with relative biases under 25% (<1 mm). With regards to air temperature, both tMax and tMin had a relative bias under 20% on both watersheds, which translates into a larger bias for tMax in absolute terms (>3 • C) compared to tMin (<1 • C). On the Nechako watershed, maximum air temperature was strongly biased in the upper part the watershed while bias diminishes towards its eastern border. Air temperature is used in the estimation of longwave radiation (Equation (4)), convection (Equation (7)) and evaporation (Equation (8)). Cloud Cover (CC) on the Nechako watershed showed a positive bias of 32% which was higher in the eastern part of the watershed than in the western part. On the Miramichi maximum biases of 20% were calculated and were constant across the watershed. Forecasted e a had a maximum relative bias of 30% on the Nechako watershed while it reached 12% on the Miramichi. The highest bias values were found in the lower eastern part of the Nechako watershed and were spatially constant on the Miramichi. CC and e a are both used in the calculation of sky emissivity (Equation (4)), involved in the estimation of energy fluxes associated with longwave radiation (Equation (5)). Wind speed, which is used along with air temperature in the calculation of convective heat fluxes (Equation (7)), was biased by 35% (2.5 km/h) on average with a stronger bias in the upper part of the Nechako watershed. On the Miramichi watershed, biases were constant around 15% (1.2 km/h). Solar radiation, used directly as heat input in thermal modeling, was positively biased by 17% (3.14 MJ/m 2 ) on the Nechako watershed and negatively biased by 6% (−1.07 MJ/m 2 ) on the Miramichi watershed.

Hydrological and Thermal Forecasts Evaluation
The first step of the forecasting cascade is the hydrological forecast. Since discharge was imposed at the Skins Lake Spillway and at the outlet of the Nautley River in forecasting mode, very low uncertainty (<1 m 3 /s) was induced to the hydrological forecast by the ensemble precipitation forecast on the Nechako watershed ( Figure 4). Therefore, MCRPSs calculated from resulting discharge ensembles ( Figure 5A), which range from 17.1 m 3 /s (hz1) to 21.4 m 3 /s (hz5) are very similar to MAE calculated from simulations using observed meteorological inputs (17.7 m 3 /s). On the Miramichi, MCRPSs ranging between 31.7 m 3 /s (hz1) and 34.4 m 3 /s (hz3; Figure 5B) were in the same order of magnitude than the MAE calculated on for simulated discharge (34.9 m 3 /s). On the Nechako watershed, the total precipitation showed large relative bias (>100%). In absolute terms, this bias remained fairly low (<2 mm) while the same metrics were much lower on the Miramichi watershed with relative biases under 25% (<1 mm). With regards to air temperature, both tMax and tMin had a relative bias under 20% on both watersheds, which translates into a larger bias for tMax in absolute terms (>3 °C) compared to tMin (<1 °C). On the Nechako watershed, maximum air temperature was strongly biased in the upper part the watershed while bias diminishes towards its eastern border. Air temperature is used in the estimation of longwave radiation (Equation (4)), convection (Equation (7)) and evaporation (Equation (8)). Cloud Cover (CC) on the Nechako watershed showed a positive bias of 32% which was higher in the eastern part of the watershed than in the western part. On the Miramichi maximum biases of 20% were calculated and were constant across the watershed. Forecasted ea had a maximum relative bias of 30% on the Nechako watershed while it reached 12% on the Miramichi. The highest bias values were found in the lower eastern part of the Nechako watershed and were spatially constant on the Miramichi. CC and ea are both used in the calculation of sky emissivity (Equation (4)), involved in the estimation of energy fluxes associated with longwave radiation (Equation (5)). Wind speed, which is used along with air temperature in the calculation of convective heat fluxes (Equation (7)), was biased by 35% (2.5 km/h) on average with a stronger bias in the upper part of the Nechako watershed. On the Miramichi watershed, biases were constant around 15% (1.2 km/h). Solar radiation, used directly as heat input in thermal modeling, was positively biased by 17% (3.14 MJ/m 2 ) on the Nechako watershed and negatively biased by 6% (−1.07 MJ/m 2 ) on the Miramichi watershed.

Hydrological and Thermal Forecasts Evaluation
The first step of the forecasting cascade is the hydrological forecast. Since discharge was imposed at the Skins Lake Spillway and at the outlet of the Nautley River in forecasting mode, very low uncertainty (<1 m 3 /s) was induced to the hydrological forecast by the ensemble precipitation forecast on the Nechako watershed ( Figure 4). Therefore, MCRPSs calculated from resulting discharge ensembles ( Figure 5A), which range from 17.1 m 3 /s (hz1) to 21.4 m 3 /s (hz5) are very similar to MAE calculated from simulations using observed meteorological inputs (17.7 m 3 /s). On the Miramichi, MCRPSs ranging between 31.7 m 3 /s (hz1) and 34.4 m 3 /s (hz3; Figure 5B) were in the same order of magnitude than the MAE calculated on for simulated discharge (34.9 m 3 /s).    Figure 5B). The spread of the ensemble was also larger on the Miramichi (0.92 °C) than on the Nechako (0.73 °C). On the Nechako, the MCRPS increased with lead time to reach 0.82 °C on hz5. On the Miramichi, MCRPS reaches 1.08 °C on hz2 and decreases to 1.00 °C on hz5. Figure 6B shows that some five-day lead time forecasts are strongly biased at the beginning of 2011 and 2012 on the Nechako. In terms of threshold exceedance predictions indicated by the BS, similar values for the early warning were obtained for both rivers with BS = 0.15 for hz1 (Nechako and Miramichi) and respectively 0.13 and 0.15 for the Nechako and the Miramichi at hz5 ( Figure 5C). When the BS is calculated for higher temperature thresholds (critical temperature), both rivers performed well with lower scores obtained on the Nechako (BS = 0.01 − 0.02) compared to Miramichi (BS = 0.05 − 0.06). Both thresholds were represented on Figure 6. Over the forecasting period, the early warning temperature was exceeded on 46% of the time on the Miramichi and 24% on the Nechako and the critical temperature was exceeded 10% of the time on the Miramichi and 3% on the Nechako.  Figure 6 presents the ensemble water temperature forecasts produced on both watershed for one-day-ahead and five-day-ahead lead times. One-day-ahead forecast showed good performances on both rivers with a MCRPS somewhat higher on the Miramichi (0.92 • C) than on the Nechako (0.77 • C) but both under 1 • C ( Figure 5B). The spread of the ensemble was also larger on the Miramichi (0.92 • C) than on the Nechako (0.73 • C). On the Nechako, the MCRPS increased with lead time to reach 0.82 • C on hz5. On the Miramichi, MCRPS reaches 1.08 • C on hz2 and decreases to 1.00 • C on hz5. Figure 6B shows that some five-day lead time forecasts are strongly biased at the beginning of 2011 and 2012 on the Nechako.
Water 2017, 9, x 13 of 21 Figure 6 presents the ensemble water temperature forecasts produced on both watershed for one-day-ahead and five-day-ahead lead times. One-day-ahead forecast showed good performances on both rivers with a MCRPS somewhat higher on the Miramichi (0.92 °C) than on the Nechako (0.77 °C) but both under 1 °C ( Figure 5B). The spread of the ensemble was also larger on the Miramichi (0.92 °C) than on the Nechako (0.73 °C). On the Nechako, the MCRPS increased with lead time to reach 0.82 °C on hz5. On the Miramichi, MCRPS reaches 1.08 °C on hz2 and decreases to 1.00 °C on hz5. Figure 6B shows that some five-day lead time forecasts are strongly biased at the beginning of 2011 and 2012 on the Nechako.
In terms of threshold exceedance predictions indicated by the BS, similar values for the early warning were obtained for both rivers with BS = 0.15 for hz1 (Nechako and Miramichi) and respectively 0.13 and 0.15 for the Nechako and the Miramichi at hz5 ( Figure 5C). When the BS is calculated for higher temperature thresholds (critical temperature), both rivers performed well with lower scores obtained on the Nechako (BS = 0.01 − 0.02) compared to Miramichi (BS = 0.05 − 0.06). Both thresholds were represented on Figure 6. Over the forecasting period, the early warning temperature was exceeded on 46% of the time on the Miramichi and 24% on the Nechako and the critical temperature was exceeded 10% of the time on the Miramichi and 3% on the Nechako.

Uncertainty and Reliability of the Forecasts
One of the main objectives of this work is to assess and represent the uncertainty that is propagated to a water temperature forecast by the meteorological inputs. Figure 4 shows that the spread of the discharge ensemble forecast is much larger on the Miramichi than on the Nechako with values between 3.74 m 3 /s and 125.9 m 3 /s. Important variability was observed during the forecasting period, especially for longer lead times with a standard deviation of 194 m 3 /s for hz5 (Figure 4). Figure 7 shows the spread of the water temperature ensemble forecasts. The central line In terms of threshold exceedance predictions indicated by the BS, similar values for the early warning were obtained for both rivers with BS = 0.15 for hz1 (Nechako and Miramichi) and respectively 0.13 and 0.15 for the Nechako and the Miramichi at hz5 ( Figure 5C). When the BS is calculated for higher temperature thresholds (critical temperature), both rivers performed well with lower scores obtained on the Nechako (BS = 0.01 − 0.02) compared to Miramichi (BS = 0.05 − 0.06). Both thresholds were represented on Figure 6. Over the forecasting period, the early warning temperature was exceeded on 46% of the time on the Miramichi and 24% on the Nechako and the critical temperature was exceeded 10% of the time on the Miramichi and 3% on the Nechako.

Uncertainty and Reliability of the Forecasts
One of the main objectives of this work is to assess and represent the uncertainty that is propagated to a water temperature forecast by the meteorological inputs. Figure 4 shows that the spread of the discharge ensemble forecast is much larger on the Miramichi than on the Nechako with values between 3.74 m 3 /s and 125.9 m 3 /s. Important variability was observed during the forecasting period, especially for longer lead times with a standard deviation of 194 m 3 /s for hz5 ( Figure 4). Figure 7 shows the spread of the water temperature ensemble forecasts. The central line displays the mean spread of the ensemble for a given lead time and the vertical bars represent one standard deviation of the same spread. We observed a larger ensemble spread on the Miramichi compared to the Nechako for both forecasting horizons. On both rivers, the spread of the ensemble increased with lead time to reach 2.29 • C on the Nechako and 3.14 • C on the Miramichi. However, the spread of the ensemble as well as the variability of the spread increased more rapidly on the Miramichi than on the Nechako (Figure 7). Therefore, more uncertainty was propagated in the Miramichi water temperature forecasting system. These observations are also visible on the time series plot on Figure 4. It should be noted that the water temperature forecasts spread on the Miramichi is exacerbated by the propagation of the uncertainty of the discharge forecast to the water temperature forecast. Miramichi than on the Nechako (Figure 7). Therefore, more uncertainty was propagated in the Miramichi water temperature forecasting system. These observations are also visible on the time series plot on Figure 4. It should be noted that the water temperature forecasts spread on the Miramichi is exacerbated by the propagation of the uncertainty of the discharge forecast to the water temperature forecast.  Figure 8 shows the reliability plots of discharge forecasts for all lead times. The reliability plot for the Nechako ( Figure 8A) clearly demonstrates the lack of spread of the discharge forecast. For all forecasting horizons, the observed probabilities stay close to zero for all forecast nominal probabilities. This indicates that observed discharge almost never falls into the ensemble and that the forecasting framework does not include enough uncertainty to be reliable. On the Miramichi (Figure 8B), the reliability increases constantly with lead time. Although the reliability of the discharge forecast is better on the Miramichi than it is on the Nechako, the lines being under the bisector indicates an under-dispersion of the ensemble. For a forecasted probability of 0.9, the observed probability goes from 0.02 (hz1) to 0.5 (hz5). Although a clear surge in reliability was visible between the first and the fifth horizon, the observed probability remained lower than the forecasted probability for all levels of confidence.  Figure 8 shows the reliability plots of discharge forecasts for all lead times. The reliability plot for the Nechako ( Figure 8A) clearly demonstrates the lack of spread of the discharge forecast. For all forecasting horizons, the observed probabilities stay close to zero for all forecast nominal probabilities. This indicates that observed discharge almost never falls into the ensemble and that the forecasting framework does not include enough uncertainty to be reliable. On the Miramichi (Figure 8B), the reliability increases constantly with lead time. Although the reliability of the discharge forecast is better on the Miramichi than it is on the Nechako, the lines being under the bisector indicates an under-dispersion of the ensemble. For a forecasted probability of 0.9, the observed probability goes from 0.02 (hz1) to 0.5 (hz5). Although a clear surge in reliability was visible between the first and the fifth horizon, the observed probability remained lower than the forecasted probability for all levels of confidence. Figure 8 shows the reliability plots of discharge forecasts for all lead times. The reliability plot for the Nechako ( Figure 8A) clearly demonstrates the lack of spread of the discharge forecast. For all forecasting horizons, the observed probabilities stay close to zero for all forecast nominal probabilities. This indicates that observed discharge almost never falls into the ensemble and that the forecasting framework does not include enough uncertainty to be reliable. On the Miramichi (Figure 8B), the reliability increases constantly with lead time. Although the reliability of the discharge forecast is better on the Miramichi than it is on the Nechako, the lines being under the bisector indicates an under-dispersion of the ensemble. For a forecasted probability of 0.9, the observed probability goes from 0.02 (hz1) to 0.5 (hz5). Although a clear surge in reliability was visible between the first and the fifth horizon, the observed probability remained lower than the forecasted probability for all levels of confidence.  Figure 9 displays the reliability plots for water temperature forecasts for all forecasting horizons. It can be seen that on both rivers reliability increases with lead time. However, more disparity is visible between hz1 and hz5 on the Miramichi. For a nominal probability of 0.9 the observed probability goes from 0.24 to 0.59 while it goes from 0.25 to 0.54 on the Nechako. In both cases, this suggests an over-confidence of the forecasts, meaning that the spread of the ensemble is narrower than it should be in order to convey the total uncertainty related to the forecasting situation. The ensembles of all five lead time forecasts are thus found to be under dispersed.  Figure 9 displays the reliability plots for water temperature forecasts for all forecasting horizons. It can be seen that on both rivers reliability increases with lead time. However, more disparity is visible between hz1 and hz5 on the Miramichi. For a nominal probability of 0.9 the observed probability goes from 0.24 to 0.59 while it goes from 0.25 to 0.54 on the Nechako. In both cases, this suggests an over-confidence of the forecasts, meaning that the spread of the ensemble is narrower than it should be in order to convey the total uncertainty related to the forecasting situation. The ensembles of all five lead time forecasts are thus found to be under dispersed.

Discussion
In this paper, we produced five-day-ahead forecasts on two Canadian watersheds with special emphasis directed towards uncertainty of meteorological inputs. As expected, the spread of the water temperature ensemble, representative of the meteorological and hydrological uncertainty propagated within the forecast, grows larger with the forecasting horizon. This translated into an improvement in reliability with lead time. This improvement of reliability results from an increase in

Discussion
In this paper, we produced five-day-ahead forecasts on two Canadian watersheds with special emphasis directed towards uncertainty of meteorological inputs. As expected, the spread of the water temperature ensemble, representative of the meteorological and hydrological uncertainty propagated within the forecast, grows larger with the forecasting horizon. This translated into an improvement in reliability with lead time. This improvement of reliability results from an increase in uncertainty rather than from a better accuracy of the forecast. As demonstrated in Section 3.2, meteorological forecasts for all variables were diagnosed as biased. Obviously, the best way to improve the reliability of forecasts is to correct those biases. However, this operation was left out of the present study for two reasons: first, there is a certain disagreement among the hydrologic ensemble forecasting community as to whether it is better to pre-process meteorological inputs, or post-process the hydrological forecasts to remove bias. For instance, Kang et al. [51] showed that post-processing was much more efficient than pre-processing to improve the quality of their final streamflow forecasts. Verkade et al. [52] found that pre-processing meteorological forecasts to remove bias and correct dispersion resulted in very little gain, if any at all, for the final streamflow forecasts. To the best of our knowledge, a comparative study between pre-and post-processing was never undertaken in the context of water temperature forecasting, since ensemble forecast and quantification of uncertainty is a very new topic to this field. Second, many different pre-and post-processing methods exist, spanning from very simple [53] to more elaborate (e.g., Bayesian Model Averaging [54]). An adequate comparison of methods is outside of the scope of the present study, which is a first attempt at assessing the potential of ensemble forecasts in the context of water temperature. An in-depth comparative study on the merits of various methods for pre-and post-processing to improve the reliability of water temperature forecasts is a potential subsequent research avenue.
On the Nechako watershed, despite the large MCRPS for the ensemble precipitation forecasts (2.8 mm at hz5), the effects of this uncertainty on the forecasted discharge are very small. This is due to the fact that discharge is imposed at the model upstream boundaries, namely the Skins Lake Spillway and the outlet of the Nautley River. The large spread observed in forecasted water temperatures for the unregulated Miramichi, even for the first forecasting horizon, suggests an influence of the precipitation uncertainty that propagates into the discharge and ultimately to the water temperature forecasts. This indicates the apparent consequences of the quality of the precipitation forecasts on the subsequent hydrological and thermal forecasting in a natural river. Naturally, the uncertainty propagated within the discharge forecast is transferred to the water temperature forecast. Further analysis would be required to quantify in details this propagation of precipitation uncertainty across the model cascade. Since the volume of water is included in the estimation of water temperature (Equation (1)), the uncertainty of the precipitation, propagated to the discharge, induces uncertainty to the water temperature forecasts. In a regulated system, like the Nechako River, this influence is highly tempered by flow regulation in opposition to a natural system, like the Miramichi River, where surface runoff freely reaches the hydrological network.
Solar radiation (Rs) is known to be the dominant variable of the energy budget in most rivers [7] while being a direct output of atmospheric models. Error in R s values should therefore be a dominant contribution to the total uncertainty of energy fluxes. However, additional work, including a detailed analysis of each forecasted flux, would be required confirm this hypothesis. Surprisingly, the ensemble forecast's accuracy (MCRPS) and its capacity to predict a threshold exceedance (BS) remained fairly constant with lead time. Both metrics (CRPS and BS) were expected to increase indicating a lower accuracy and threshold exceedance prediction capacity. This can be explained by the fact that the initial conditions of the watershed at the beginning of the forecast are estimations dependent on the spin up period and are necessarily uncertain themselves. The ensemble forecasts were produced from initial states generated from simulations with observed meteorological conditions as inputs without the assimilation of the observed hydrological and thermal conditions. Therefore, the thermal inertia of the system limits the deviation of the short-term forecasts from the initial conditions. Thiboult et al. [24] showed the important influence of the initial conditions of the watershed on hydrological forecast accuracy and reliability through data assimilation. They also suggested that these initial conditions mostly influence the quality of the forecasts for shorter lead times. In the present study, results indicate that the lower reliability for shorter lead times can be attributable to the absence of data assimilation. This is especially visible for the discharge forecast on the Nechako River where a relatively narrow ensemble spread translates into low reliability forecasts. Although the proposed model shows satisfactory performances in both calibration and validation modes, initial states still carry a modeling error. In this case, the thermal inertia of the system will carry this modeling error through the forecast horizons but will dissipate with lead time. Although the proposed forecasting framework performed well in both a regulated and an unregulated river system, the problem related with initial states uncertainty should also be addressed in future work.
The MAE obtained from archived deterministic operational forecasts using a one-dimensional unsteady state water temperature model on the Nechako (personal communication [55]; Table 3) for a one-day-ahead forecasts was lower (MAE = 0.49 • C) than those obtained in the present study but only spanned from 20 July to 20 August and did not include 2011 and 2012. For the same forecasting period and the same years, the proposed framework yields a MCRPS of 0.85 • C. However, for a five-day-ahead forecast, performance is very similar to the archived, with the archived forecasts having a MAE of 0.68 • C while the proposed framework yielded a MCRPS of 0.69 • C. Forecasts were also produced on a tributary of the Southwest Miramichi, the Little Southwest Miramichi (LSWM), by Caissie et al. [56]. An autoregressive models with air temperature as an exogenous variables (ARX) was used for 1-3-day lead times from 1992 to 2011 during the summer (June-September). Results shows RMSE of 0.87 • C for one-day-ahead forecast and 1.48 • C for a three-day-ahead forecast (Table 3). For the same lead times, the proposed method returned MCRPS of respectively 0.92 • C and 1.07 • C. Although both metrics cannot be directly compared, these results suggest a slightly better performance of the ARX model for a one-day-ahead forecast but a better performance of our proposed method for a three-day-ahead forecast. Although the main objective of this paper is to highlight the uncertainty associated with meteorological inputs in a hydrological and thermal forecast, these comparisons suggest a good potential of CEQUEAU to produce five-day lead time forecasts with reasonable accuracy. A thorough model comparison would be of interest for future work. All forecasts carry uncertainty from various sources. In the absence of knowledge on this uncertainty, the level of confidence into a forecast can hardly be estimated. One of the aims of this study was to provide information on the uncertainty from eight input variables from an atmospheric model that propagates to water temperature forecast in both a regulated and a natural river. We showed the satisfactory accuracy of the ensemble forecast and its capacity to accurately predict threshold exceedance for management purpose for a one to five-day lead time. From a management point of view, these good performances on longer lead times are valuable, given the fact that decisions and management strategies can be better adapted with longer lead time. Knowledge of the uncertainty associated with a water temperature forecast is also valuable and should be integrated in decision making processes. This can be done by basing decisions (at least in part) on probabilistic forecasts like those proposed in the present study. Such framework helps to explicitly communicate the risk associated with a decision. In this case, the probability of exceedance of the water temperature threshold can be calculated from the ensembles. For instance, when five out of 20 members predict the exceedance of 20 • C water temperature for a given water release, then the risk of having a threshold exceedance is 25%. A decision therefore can be made based on knowledge of the risk associated with this decision. In fact, it was recently shown that the level of risk aversion of a decision-maker is a key factor in assessing the comparative benefits of forecasting systems for decision-making [57]. In the case of a fully deterministic forecast, that notion of risk is much more difficult to assess and to communicate to the decision-makers, although they are usually aware of its existence [58]. From this point of view, ensemble forecasts are clearly advantageous over deterministic forecasts and can even be seen as more "complete".
While understanding of the uncertainty associated with meteorological inputs was improved with this study, further work is required to fully and accurately capture the uncertainty in water temperature ensemble forecasts, especially for short lead times (1-3 days). The lack of reliability associated with these forecasts also suggests that uncertainty sources that dominate for short lead times have yet to be fully understood and incorporated in the forecasting framework. Among other sources, the choice of particular hydrological and thermal models, initial conditions, and the level of model parameterization are known to induce uncertainty to the hydrological modeling/forecasting [59]. The combination of ensemble forecasting approaches (data assimilation, multi-model and ensemble meteorological forecasts) was shown to be effective to properly represent uncertainty over short-(one day) to long-term (10 days) hydrological forecasts [24]. River temperature forecasting would likely benefit from a similar, more complete approach.

Conclusions
The study presented in this paper proposes a modeling framework used to produce ensemble water temperature forecasts in a regulated and a natural river system. Uncertainty associated with meteorological forecast inputs was quantified throughout the forecasting process. The resulting water temperature forecasts showed good performances on both watersheds for a five-day lead time with a better accuracy on the regulated river system (Nechako) and a better reliability on the natural river system (Miramichi). The present study improved our knowledge of the risk associated with water temperature forecast. As far as the authors know, this paper represents the first published attempt to produce ensemble water temperature forecasts by feeding ensemble meteorological forecasts to a hydrological and thermal model cascade. Further work is required to address accuracy and reliability of short-term ensemble forecasts (1-3 days). Structural uncertainty and initial state error should thus be investigated in combination with ensemble meteorological forecasts in future work in water temperature forecasting.