An Evaluation of Statistical Downscaling Techniques for Simulating Daily Rainfall Occurrences in the Upper Ping River Basin

This study presents an exhaustive evaluation of the performance of three statistical downscaling techniques for generating daily rainfall occurrences at 22 rainfall stations in the upper Ping river basin (UPRB), Thailand. The three downscaling techniques considered are the modified Markov model (MMM), a stochastic model, and two variants of regression models, statistical models, one with single relationship for all days of the year (RegressionYrly) and the other with individual relationships for each of the 366 days (Regression366). A stepwise regression is applied to identify the significant atmospheric (ATM) variables to be used as predictors in the downscaling models. Aggregated wetness state indicators (WIs), representing the recent past wetness state for the previous 30, 90 or 365 days, are also considered as additional potential predictors since they have been effectively used to represent the low-frequency variability in the downscaled sequences. Grouping of ATM and all possible combinations of WI is used to form eight predictor sets comprising ATM, ATM-WI30, ATM-WI90, ATM-WI365, ATM-WI30&90, ATM-WI30&365, ATM-WI90&365 and ATM-WI30&90&365. These eight predictor sets were used to run the three downscaling techniques to create 24 combination cases. These cases were first applied at each station individually (single site simulation) and thereafter collectively at all sites (multisite simulations) following multisite downscaling models leading to 48 combination cases in total that were run and evaluated. The downscaling models were calibrated using atmospheric variables from the National Centers for Environmental Prediction (NCEP) reanalysis database and validated using representative General Circulation Models (GCM) data. Identification of meaningful predictors to be used in downscaling, calibration and setting up of downscaling models, running all 48 possible predictor combinations and a thorough evaluation of results required considerable efforts and knowledge of the research area. The validation results show that the use of WIs remarkably improves the accuracy of downscaling models in terms of simulation of standard deviations of annual, monthly and seasonal wet days. By comparing the overall performance of the three downscaling techniques keeping common sets of predictors, MMM provides the best results of the simulated wet and dry spells as well as the standard deviation of monthly, seasonal and annual wet days. These findings are consistent across both single site and multisite simulations. Overall, the MMM multisite model with ATM and wetness indicators provides the best results. Upon evaluating the combinations of ATM and sets of wetness indicators, ATM-WI30&90 and ATM-WI30&365 were found to perform well during calibration in reproducing the overall rainfall occurrence statistics while ATM-WI30&365 was found to significantly improve the accuracy of monthly wet spells over the region. However, these models perform poorly during validation at annual time scale. The use of multi-dimension bias correction approaches is recommended for future research.


Introduction
The coarse resolution of General Circulation Models (GCMs) offers some limitations while assessing the impacts of climate change on hydrological processes as they often occur at a fine catchment scale. Downscaling techniques are used to reduce this spatial mismatch by transferring the GCM outputs from coarse scale to fine scale [1,2]. Broadly, downscaling techniques can be grouped into four categories: regression model, weather pattern (weather typing) approach, stochastic weather generator and limited-area model (LAM) [1,3]. LAM is also known as dynamical downscaling or regional climate model (RCM). Regression model, weather pattern approach and stochastic weather generator can also be viewed as a statistical/stochastic downscaling approach.
The dynamical downscaling approach simulates the physical realism of the future climate change scenarios by using GCM data as a boundary condition of the region [3,4]. Although the approach can provide high-resolution climate variables, it requires intensive computing resources in order to setup and run the RCM over the study region. There have been quite a few applications of dynamical downscaling approach over Southeast Asia [5][6][7].
The statistical downscaling approach is quite flexible, easier to use and computationally inexpensive [2]. In statistical downscaling approach, a statistical relationship between current large scale climate variables and the local climate variables is established and is later applied to generate future climate scenarios under the assumption of no change in this relationship in the future. The statistical downscaling output depends on the choice of predictors, downscaling domain, downscaling techniques, the selected transfer function and the calibration period used to setup the model [3,[8][9][10][11][12]. All these factors need careful selection to produce reasonable and reliable outputs for a climate change study.
A parametric model forms one of the most common and simplest types of stochastic downscaling model. In simulating the rainfall occurrence, this type of model is normally based on a Markov chain [13]. A major drawback of the low-order Markov dependence model is the under representation of variability at longer or higher time scales (variability over years), also known as low frequency variability [14][15][16]. Mehrotra and Sharma [16] proposed a modified Markov model (MMM) to incorporate low-frequency variability in rainfall occurrence simulations. In the approach, a low-order Markov process is modified to accommodate the low-frequency variability through a "wetness indicator (WI)" that is based on the recent past behavior of rainfall state over the study region. The same model was applied by them in the downscaling context as well [17].
It is common to simulate synthetic weather sequences at a single or several locations, independently. However, spatial dependence of rainfall can play an important role on the resulting catchment response and also on the outcome of the studies dealing with the climate change impact assessment over large areas. Realizing this, Wilks [18] proposed an approach to capture the coherence of rainfall at multiple locations over a region by making use of serially independent and spatially correlated random numbers. This logic has since been successfully applied to simulate precipitation at multiple locations [13,16,[19][20][21][22][23]. The complexity of spatial intermittence of rainfall, in which precipitation amount depend on neighboring stations being wet or dry, may pose some problems in the formulation of the correlation matrix over a large network of stations and may end up with ill-defined correlation matrices [13,16].
Many downscaling studies have been conducted in the past focusing on the assessment of the impacts of climate change on hydrological processes in the upper Ping river basin (UPRB) [24][25][26][27]. Sharma and Babel [24] estimated future runoff applying the bias correction and spatial disaggregation techniques to improve the characteristics of raw ECHAM4/OPYC precipitation. The future simulations showed a decrease of 13-19% in annual streamflow compared to the baseline time period and seasonal streamflow pattern will be shifted. Wuthiwongyothin et al. [25] studied the effects of climate change on hydrology based on dynamically downscaling using bias-corrected CSM3-GCM to project future flow under A1B, A2 and B1 scenarios. The results show that the bias corrected rainfall outputs were well matched to the observed data. Future rainfall depths are suitable to be used to project future flow, which will increase 13% compared to the baseline time period. Wuthiwongyothin et al. [26] applied quantile mapping technique to correct daily rainfall from MM5-RCM over the UPRB using different distributions of Bernoulli-Weibull, Bernoulli-Gamma and non-parametric transformation. It was found that quantile mapping was an efficiency technique to improve and reduce climate model output bias. Therefore, the bias correction is normally technique to be done before quantifying any impacts of climate change study. Quantile mapping technique is commonly used to correct biases of climate model outputs. Although it is an effective technique to remove historical biases relative to observed data, it can artificially corrupt the projection of future trends [28]. Furthermore, Homsi et al. [29] found that linear scaling (LS) demonstrated the highest capability for precipitation downscaling comparing to three bias correction methods comprising power transformation (PT), general quantile mapping (GEQM) and gamma quantile mapping (GAQM). Saengsawang et al. [28] predicted future rainfall under the Representative Concentration Pathway scenarios (RCP2.6, RCP4.5 and RCP8.5) in the UPRB using regression-based downscaling to relate between rainfall depth and climate variables instead of applying raw GCM rainfall data as applied in the aforementioned studies. They found that Nested Bias Correction (NBC) is a better technique to correct the historical GCM compared to a conventional method that normally applied to raw GCM rainfall data. The results show the effects of climate change varied across the basin. Rainfall significantly decreased in the middle but increased towards the northern part of the catchment. Since each station had its own predictors, it would be interesting to investigate whether atmospheric predictors or climate change scenarios have more effect on future rainfall characteristics. This study suggested to select the same predictors to form the regression models for simulating rainfall occurrence and amount for all stations to clarify this question. These results show that bias correction is essential to correct the biases in raw climate model output before using them in climate change impact studies.
The need for intercomparison of downscaling techniques has been raised by many researchers from time to time [17,30,31]. As the selection of a downscaling model is area as well as location specific, a comparison of downscaling approaches helps us to know the performance of a model in comparison to other available alternatives and to determine why or when a model performs well at a local scale. The outcome of such studies also helps to improve our understanding of the spatial and temporal scale of the atmosphere-surface environment relationship over the study region.
Keeping these issues in mind, we present here an exhaustive comparison of two commonly used downscaling approaches, MMM and the linear regression model, in simulating daily rainfall occurrences at 22 rainfall stations located in the UPRB. A common set of predictors across all stations was selected for rainfall occurrence downscaling over the study region. These predictors were corrected using NBC before applying to the downscaling approaches. The findings of the study were used in a climate change study to project future rainfall behavior over the study region. The results also reveal whether the MMM models can be adopted over simple regression approaches to enhance model performance.

Regression Models
The multiple linear regression model proposed by Wilby et al. [2] was applied in this study to generate rainfall occurrence. In the model, daily probabilities of a wet day following a dry day or a Hydrology 2020, 7, 63 4 of 26 wet day (P 1i t ) for a given day t are calculated as a function of exogenous predictors (X t ) together with the transition probabilities of a wet day (P 1i t−1 ) for a previous day t − 1, as shown in Equation (1): in the first model formulation, transition probability of each day for all datasets is calculated by using a moving window of 31 days centered on the day of interest, resulting in 366 regression models for each calendar day (called Regression366). In the second model formulation, a single regression model is fitted to the whole data (called RegressionYrly).

Modified Markov Model (MMM)
The modified Markov model (MMM) was proposed by Mehrotra and Sharma [16] for generating rainfall occurrences exhibiting low frequency variability by including wetness state indicators (WIs) along with the transition probabilities of the simple Markov model. In the MMM, the probability of a wet day can be expressed in term of R t (k)|Z t (k), where Z t (k) represents a vector of conditioning variables at a location k and at time t and can include previous time step(s) rainfall state(s) (wet or dry) and the aggregated wetness state of immediate past time period(s) to assign daily and long-term persistence. For the simplest case of a first-order standard Markov model, Z t (k) depends on R t−1 (k). While using MMM as a downscaling model, atmospheric variables (representing the climate change conditions) are also added in the vector Z t (k) [17,21]. More specifically, the vector Z t is defined as R t−1 ,X t and the transition probability P(R t |R t−1 ,X t ) is calculated by using the following parameterization (reproduced from Mehrotra and Sharma [17] for the sake of completeness only): where P 1i is the transition probability of a first order Markov. The second term of Equation (2) is the additional variables (X t ) in the vector Z t and can be approximated by using a mixture of multivariate normal distribution to represent the conditional probability as expressed in Equation (3).
where P 1i is the transition probability of a wet day following a dry day or a wet day; µ 1,i and µ 0,i are the mean vectors of X t when R t−1 = i and R t = 1 and 0, respectively; V 1,i and V 0,i are the corresponding variance-covariance matrix of X t when R t−1 = i and R t = 1 and 0, respectively; and det() represents the determinant operation.

Study Area
The study area is the upper Ping river basin (UPRB) in northern Thailand. It is located between latitude 17 • 14 30" and 19 • 47 52" north and between longitude 98 • 4 30" and 99 • 22 30" east. The UPRB covers the area of 25,370 km 2 in Chiang Mai and Lamphun provinces, which is covered mostly by mountain ranges and forests. The average annual rainfall of the basin (1960-2007) is around 1100 mm, which is mostly influenced by the Southwestern monsoon in the wet season starting from May to October ( Figure 1). Since nearly 90% of rainfall occurs during the wet season, water shortage during the dry season is normal in rain-fed areas. Runoff from the UPRB flows into Bhumipol Dam, which has an active storage capacity of 9.7 × 10 9 m 3 . The water supply is mainly used for rice production in the lower Chao Phraya Basin and delivering water to Bangkok, the capital city of Thailand. The impacts of climate change on rainfall regimes will have a direct influence on reservoir inflows.
Hydrology 2020, 7, x FOR PEER REVIEW 5 of 26 to October (Figure 1). Since nearly 90% of rainfall occurs during the wet season, water shortage during the dry season is normal in rain-fed areas. Runoff from the UPRB flows into Bhumipol Dam, which has an active storage capacity of 9.7 × 10 9 m 3 . The water supply is mainly used for rice production in the lower Chao Phraya Basin and delivering water to Bangkok, the capital city of Thailand. The impacts of climate change on rainfall regimes will have a direct influence on reservoir inflows.

Rainfall Data
Daily rainfall data at 22 rainfall stations in the UPRB for 46 years from 1960 to 2005 were selected for the application of both statistical downscaling techniques. Figure 1 shows the locations of these rainfall stations. These stations have more than 96% complete records. Some missing data were approximated from the data at nearby stations using the Inverse Distance Square Weighting (IDSW) method. In generating daily rainfall occurrence, a wet day was classified as wet if daily rainfall was greater than 0.3 mm [17].

Reanalysis Data
The National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) provide large-scale historical atmospheric variables (predictors) known as NCEP/NCAR reanalysis data [32]. The reanalysis data have a spatial resolution of 2.5 • latitude by 2.5 • longitude and a temporal coverage of four times daily, daily and monthly values starting from 1948. Daily time series of twenty-six climate variables at four grids covering the UPRB were downloaded from the website (https://www.esrl.noaa.gov/psd/data/reanalysis/reanalysis.shtml) for use as potential predictors in the downscaling approaches.

General Circulation Model (GCM) Data
The daily output of a representative GCM, MPI-ESM-LR model (by Max Planck Institute for Meteorology (MPI-M), Germany) available under the Coupled Model Intercomparison Project 5 (CMIP5) was used in the study. The model has a horizontal grid resolution of 1.875 • by 1.875 • . The GCM output at four grid points covering and surrounding the UPRB is downloaded from the website http://esgf-data.dkrz.de/esgf-web-fe/ for the baseline (1960-2005) time period.

Predictor Selection
The selected 26 potential atmospheric (ATM) variables from NCEP/NCAR reanalysis data at 4 selected grids are interpolated at 22 rainfall stations using IDSW approach. Significant atmospheric predictors at each rain gauge station are identified using a stepwise multiple linear regression which has been successfully used for selecting ATM variables for rainfall downscaling [2,21]. Following Mehrotra et al. [21], atmospheric predictors which showed high correlations (>0.85) amongst themselves were dropped, resulting in the optimal parameter set, which includes the four key atmospheric variables. These variables consisting of hgt700 (geopotential height at the pressure level 700 hPa), uwnd700 (eastward wind at the pressure level 700 hPa), vwnd850 (northward wind at the pressure level 850 hPa) and ome850 (omega at the pressure level 850 hPa) were selected as a set of predictors to be further used to establish different models in simulating daily rainfall occurrence at each station.

Wetness State Indicators (WIs)
A variable representing "how wet it has been over a period of time in the past" is defined as "wetness-state indicator (WI)", i.e., the ratio of the number of wet days to total number of days in a fixed time period in the recent past. WI was successfully used by Mehrotra and Sharma [16] to represent the low frequency variability (seasonal, annual and inter-annual) in daily rainfall occurrences. Following Mehrotra and Sharma [16], we decided to include combinations of previous 30, 90 and 365 days WIs as extra predictors in our downscaling model. The usefulness of WIs was investigated by forming set of predictors with and without WIs and atmospheric variables, running the downscaling models and comparing the results.

Single Site and Multisite Cases
Downscaling of rainfall occurrence and amount is usually carried out at a single station separately. However, rainfall of stations located in the same catchment can show strong dependence and can influence the simulated catchment response. To study this aspect, we applied both single and multisite downscaling approaches in the study. The multisite downscaling approach was an extension of the single site wherein serially independent and spatially correlated random numbers at several locations were used to maintain the spatial dependence in the daily rainfall simulations at 22 stations. For single site simulation, the probability of a wet day was compared to a uniformly distributed random number [0,1]. A wet day was generated if the random number was greater or equal to the wet day probability. In the multisite simulation procedure, the uniformly distributed random numbers were first generated for each site together. These were then modified such that they were spatially dependent yet serially independent. These modified random numbers were next compared against the wet day probability at individual station to simulate a wet or dry day reflecting the observed spatial dependence in the simulations. The results of both single and multisite downscaling models were compared. More details on the procedure are available in the works of Wilks [18], Mehrotra and Sharma [16] and Mehrotra et al. [19].

Downscaling Models
As described in the previous section, 8 sets of predictors were formed using combinations of ATM and WIs (ATM, ATM-WI30, ATM-WI90, ATM-WI365, ATM-WI30&90, ATM-WI30&365, ATM-WI90&365 and ATM-WI30&90&365) and used to run the three downscaling techniques to obtain 24 sets of rainfall simulations at 22 stations using single site models. Similarly, 24 sets of additional simulations were obtained by employing the same 8 predictor sets with 3 multisite models.

Bias Correction
Since the statistical characteristics of the raw historical GCM data differ from the reanalysis data, bias correction of GCM outputs is essential. Multiplicative factors, change factors, delta change approach and quantile mapping are a few examples of simple approaches often used to bias correct the raw GCM output and also as downscaling add on [29,[33][34][35][36]. These approaches have some limitations as well, for example, adopting observed distribution and capturing average statistics of historical time series without focusing on the extremes [37][38][39]. More comprehensive multi-dimensional bias correction approaches considering time, space and across variables biases are also available [40][41][42][43][44][45][46].
In this study, nested bias correction (NBC) proposed by Johnson and Sharma [40] was applied to reduce the differences across GCM and historical statistics. The NBC corrects for biases at multiple time scales and reproduces the observed low frequency variability in the bias corrected time series and provides a better reproduction of observed behavior of variables than other simpler approaches [47].
The NBC was carried out by removing mean, standard deviation and lag 1 autocorrelation from the GCM outputs and substituting with those of reanalysis data at daily, monthly and annual timescales. Essentially, starting from the daily time scale, the raw GCM data were first standardized by removing series means, standard deviations (Equation (4)) and thereafter correction for lag 1 autocorrelations (Equation (5)) was applied to this standardized time series. Means and standard deviations of reanalysis data were added back to create the bias corrected time series at that time scale (Equation (6)).
The bias corrected daily values (x i ) were aggregated to form monthly time series (y m ) and the same bias correction procedure was applied. The bias corrected monthly time series was aggregated to annual time scale (z n ) and the correction procedure was repeated. The corrections at monthly and annual time scales were incorporated at the daily time scale by modifying the daily bias corrected time series using the factors as specified in Equation (7). In this equation, t stands for day, m for month and n for year. The combined weighting factor is the ratio of the monthly corrected GCM to the raw GCM for month m, multiplied by the ratio of the annual corrected GCM to the aggregated raw GCM for year n.
x t = y m y m z n z n x t , More details on the NBC are available in the work of Johnson and Sharma [40].

Model Calibration and Validation
All three downscaling techniques were calibrated using observed rainfall and reanalysis data for the time period 1960-2005 and validated using bias corrected historical GCM data for the time period 1960-2005 (46 years). The results of each model and data case are described and discussed for both calibration and validation periods in terms of reproduction of important statistics of interest to water resource managers.

Comparison of Statistics
As stochastic models produce random outputs, there is always some uncertainty on how representative the generated data are if only one simulation (realization) is used in the analysis. Thus, we simulated 100 realizations of 46 years of daily rainfall occurrences at 22 rainfall stations for all 48 model and data combination cases to capture the essential statistical characteristics of the climate incorporating model structure and parameter uncertainty. The performance of these cases was analyzed in terms of reproduction of statistics at daily, monthly, seasonal and annual time scales. These include means, standard deviations and auto correlations of number of wet days and wet and dry spells of different durations. Root mean square error (RMSE) is a popular error metric for general numerical prediction purposes [48]. However, unlike relative RMSE (Rel. RMSE), the RMSE is not normalized by the mean value, which makes it rather unsuitable for assessing errors for a variable whose observations can vary by several orders of magnitude. Thus, Rel. RMSE was selected to indicate overall model performance. This indicator is calculated by dividing root mean square error with average value of measured data. Model accuracy is considered excellent when Rel. RMSE < 10% (indicated in blue highlight), good if 10% < Rel. RMSE < 20% (indicated in green highlight), fair if 20% < Rel. RMSE < 30% (indicated in yellow highlight), poor if 30% < Rel. RMSE < 40% (indicated in orange highlight) and very poor if Rel. RMSE > 40% (indicated in grey highlight). Spatial dependence across stations was analyzed in terms of cross correlation of monthly and annual wet days. These results are explained and discussed next.

Preliminary Screening of Model and Data Combination Cases
Owing to many model and dataset combination cases, a preliminary analysis was conducted to select the best performing datasets for the detailed analysis. This analysis was primarily focused on the differences in the reproduction of means, standard deviations and wet and dry spells by these cases as defined by Rel. RMSE. Table 1 presents the Rel. RMSE of all the 48 cases for the calibration time period while scatter plots of means and standard deviations (SDs) of observed and models simulated annual wet days are presented in Figure 2. Mean is reproduced well by all models (Column 1, Figure 2), although Rel. RMSE of mean is minimum for RegressionYrly model with ATM as predictors (Table 1). SD at monthly and annual time scale is the best reproduced by multisite MMM with ATM and WI30&365 as predictors (Columns 2 and 4, Figure 2). Similarly, number of wet and dry spells is better reproduced by MMM with ATM and WI30, whereas cross correlations are better reproduced by multisite MMM with ATM and WI30&365 as predictors. All models, in general, perform well during calibration period as they were fitted and tested using the same dataset. It is the validation period that really tests the performance of a model. Table 2 presents the Rel. RMSE for the annual, monthly and seasonal time periods simulated by three downscaling techniques for the validation (using historical GCM data) time period. Scatter plots of means and SD are presented in Figure 3. Daily mean is simulated well by both regression models, whereas monthly mean is simulated well by R366 models (Column 1, Figure 3). Statistics related to the monthly and annual standard deviations are better simulated by multisite MMM with ATM and WI30&90&365 predictor variables (Last column and bottom two rows of Figure 3). Number of dry and wet spells is better simulated by MMM with ATM only and ATM-WI90&365 variables. Cross correlations are better simulated by regression R366 model with ATM at annual time scale while by RYrly model with ATM-WI30&365 at monthly time scale. models (Column 1, Figure 3). Statistics related to the monthly and annual standard deviations are better simulated by multisite MMM with ATM and WI30&90&365 predictor variables (Last column and bottom two rows of Figure    Following these results and keeping in mind that downscaled results of this exercise will be used in a climate change study, all subsequent analysis is based on the presentation and discussion of detailed calibration and validation results of the multisite downscaling models with ATM, ATM-WI30&90 and ATM-WI30&365 datasets as predictors.

Number of Wet Days
Means and SDs of observed and models simulated monthly, seasonal and annual number of wet days using three selected predictor sets are presented in Table 3 for both calibration and validation periods. The table also shows Rel. RMSE in brackets. Similarly, scatter plots of these statistics are presented in Figures 4 and 5 for both calibration and validation periods.
All downscaling techniques closely reproduce the observed wet day totals at monthly and annual time scales Column 3, Figures 2 and 3) for calibration and validation periods with regression models performing better than MMM. On seasonal basis, overall MMM performs better in comparison to regression models irrespective of datasets and time periods. In general, all models struggle in reproducing seasonal standard deviation during validation, although MMM better simulates seasonal SDs during validation (Figures 4 and 5). The inclusion of WI predictors in the predictor set helps in reproducing the observed aggregated time scales statistics in the downscaled sequences. Although not so obvious, inclusion of 365 days WI predictor helps further improve the representation of observed annual SDs in the downscaled simulations (Table 3).

Number of Wet Days
Means and SDs of observed and models simulated monthly, seasonal and annual number of wet days using three selected predictor sets are presented in Table 3 for both calibration and validation periods. The table also shows Rel. RMSE in brackets. Similarly, scatter plots of these statistics are presented in Figures 4 and 5 for both calibration and validation periods.

Number of Wet and Dry Spells
Sustained wet and dry periods are of prime concern in catchment management studies. Table 4 presents the station averaged results of observed and models simulated dry and wet spells of varying durations produced by three downscaling techniques for both calibration and validation time periods using a selected dataset (ATM-WI30&365). As mentioned above, a day is classified as wet if daily rainfall is greater than or equal to 0.3 mm, thus a wet spell is defined as the continuous sequence of days with daily rainfall greater than or equal to 0.3 mm. Similarly, a dry spell is defined as the sequence of days with daily rainfall less than 0.3 mm. At site statistics in the form of scatter plots are presented in Figure 6 for both calibration and validation time periods and for three datasets. MMM provides the most accurate simulation of these statistics for all datasets and for both calibration and validation time periods by having accurate fit and minimum Rel. RMSE ( Figure 6 and Table 4). Models, in general, overestimate the number of dry spells of 2-9 and 10-18 days while underestimate the number of dry spells of >18 days ( Figure 6). It appears that the inclusion of 30 days WI helps maintaining wet and dry spells related characteristics in the downscaled sequences in a better way. The extreme wet and dry spells (>7 days and >18 days, respectively) are intrinsically rare events (on average, occurring once in every two seasons) and improved performance of MMM is an indication of the capability of model for use in water resources related applications.

Spatial Dependence of Rainfall Occurrence
Accurate reproduction of the spatial dependence of rainfall events is needed for correct simulation of catchment response over a large area. The spatial dependence in the downscaled sequences is introduced by making use of spatially dependent and serially independent random innovations. It would also be of interest to see the additional role of ATM and WI predictors on the rainfall spatial dependence. The role of ATM is more interesting as this predictor set is common across all the stations. As discussed above, the ATM, ATM-WI30&90 and ATM-WI30&365 predictor sets were found to be the top three performers in reproducing the overall rainfall occurrence statistics. Therefore, they were chosen as the final set of predictors to compare the performance of the three downscaling techniques. Since we were dealing with the reproduction of spatial dependence by the downscaling models, results of both single and multisite cases were considered.

Spatial Dependence of Rainfall Occurrence
Accurate reproduction of the spatial dependence of rainfall events is needed for correct simulation of catchment response over a large area. The spatial dependence in the downscaled sequences is introduced by making use of spatially dependent and serially independent random innovations. It would also be of interest to see the additional role of ATM and WI predictors on the rainfall spatial dependence. The role of ATM is more interesting as this predictor set is common across all the stations. As discussed above, the ATM, ATM-WI30&90 and ATM-WI30&365 predictor sets were found to be the top three performers in reproducing the overall rainfall occurrence statistics. Therefore, they were chosen as the final set of predictors to compare the performance of the three downscaling techniques. Since we were dealing with the reproduction of spatial dependence by the downscaling models, results of both single and multisite cases were considered. Figures 7 and 8 present scatter plots of observed and simulated cross correlation of daily, monthly and annual wet days for each station pair provided by three downscaling techniques using ATM, ATM-WI30&90 and ATM-WI30&365 predictors, for single site and multisite cases, respectively, during calibration time period. All single site models with all sets of predictors appear to be only marginally improving the spatial dependence (top three rows, Figure 7). This is expected as a single site model simulates rainfall at individual location in isolation and completely ignores the spatial dependence. However, the reproduction of spatial dependence improves with time aggregation and WI predictors as these variables tend to impart a weak spatial coherence in the rainfall simulation. At annual time scale, single site MMM with ATM-WI30&365 predictor set performs well. It appears that ATM and low frequency variability predictors which do not vary greatly in space are able to reproduce a part of observed spatial dependence in the downscaled simulations. All multisite models, irrespective of predictor sets, seem to perform well at daily time scale (top three rows, Figure 8). At monthly time scale, Regression366 provides slight overestimation of this statistic. At annual time scale, MMM with ATM-WI30&365 predictor set performs better in comparison to regression models (bottom row, Figure 8). ATM-WI30&90 and ATM-WI30&365 predictors, for single site and multisite cases, respectively, during calibration time period. All single site models with all sets of predictors appear to be only marginally improving the spatial dependence (top three rows, Figure 7). This is expected as a single site model simulates rainfall at individual location in isolation and completely ignores the spatial dependence. However, the reproduction of spatial dependence improves with time aggregation and WI predictors as these variables tend to impart a weak spatial coherence in the rainfall simulation. At annual time scale, single site MMM with ATM-WI30&365 predictor set performs well. It appears that ATM and low frequency variability predictors which do not vary greatly in space are able to reproduce a part of observed spatial dependence in the downscaled simulations. All multisite models, irrespective of predictor sets, seem to perform well at daily time scale (top three rows, Figure 8) Similar to the calibration case, Figures 9 and 10 present scatter plots of observed and models simulated cross correlations of daily, monthly and annual wet days using three predictor sets, for single site and multisite cases, respectively, for the validation time period. All single site models, irrespective of predictor sets, perform poorly at daily and annual time scales. This is because the single site model ignores the spatial dependence across stations in the downscaled simulations. At monthly time scale Regression366 model with either predictor set performs better in comparison to remaining two models (Rows 4-6, Figure 9). For multisite case, all models perform well at daily and monthly time scale by repeating the calibration results. At annual time scale, all models with either predictor set fail to reproduce the observed spatial dependence behavior. It appears that the univariate nested bias correction ignores the spatial dependence and cross variable dependence aspects and use of a multivariate bias correction approach, as described by Mehrotra and Sharma [43][44][45], might improve the behavior. As annual cross correlations play a major role during the planning, design and operation of manmade components of hydrosystems, their accurate simulation is important. Similar to the calibration case, Figures 9 and 10 present scatter plots of observed and models simulated cross correlations of daily, monthly and annual wet days using three predictor sets, for single site and multisite cases, respectively, for the validation time period. All single site models, irrespective of predictor sets, perform poorly at daily and annual time scales. This is because the single site model ignores the spatial dependence across stations in the downscaled simulations. At monthly time scale Regression366 model with either predictor set performs better in comparison to remaining two models (Rows 4-6, Figure 9). For multisite case, all models perform well at daily and monthly time scale by repeating the calibration results. At annual time scale, all models with either predictor set fail to reproduce the observed spatial dependence behavior. It appears that the univariate nested bias correction ignores the spatial dependence and cross variable dependence aspects and use of a multivariate bias correction approach, as described by Mehrotra and Sharma [43][44][45], might improve the behavior. As annual cross correlations play a major role during the planning, design and operation of manmade components of hydrosystems, their accurate simulation is important.

Conclusions
Water resources assessments of climate change impacts at catchment scales, where most adaptation measures are planned and implemented, are of great interest to planners and water resources managers. This study compared the application of three approaches for downscaling precipitation occurrence patterns at 22 rain gauge stations given wetness state indicators and atmospheric variables. Some important conclusions of the study are as follows.
The results of validation indicate that the four selected atmospheric variables comprising hgt700, uwnd700, vwnd850 and ome850 can represent the rainfall patterns effectively for this study region. While the structure of Regression366 and MMM require many parameters to define the processes, results of cross-validation show that, despite this, these models are well balanced and not overparameterized.
A thorough comparison of observed and models simulated at site and across the sites statistics at multiple time scales suggest multisite MMM as the best downscaling model for the region. The MMM is formulated to allow a mix of discrete and continuous predictor variables within the Markovian model structure assumed. The wetness-state indicators as continuous variables are formed solely from the previous wet days' values in the downscaled sequence. The use of wetness-state indicators as additional conditioning predictors further improves the model performance at monthly, seasonal and annual time scales. Use of spatially correlated random innovations coupled with ATM and WI effectively improves the reproduction of spatial dependence, more specifically during calibration, but performs poorly during validation at annual time scale. However, to properly reproduce the spatial and across variable dependences in atmospheric variables and downscaled results, use of multi-dimension bias correction approaches is necessary [43][44][45], as spatial temporal and multi-variable attributes are often misrepresented by climate models. We intend to use a multivariate bias correction approach in our next stage of research. The spatial and high and low frequency temporal dependencies of the climate variables can have significant influence on the outcome where time series of climate variables at multiple locations are used, for example in catchment modeling. Similarly, annual cross correlations assume importance during the planning, design and operation of man-made components of a water resources system.
It may be noted that, while the approaches adopted have been described elsewhere, their application to the study area is attempted for the first time. The model setup over the new area is a unique challenging task and requires expert judgement. Similarly, identification of meaningful

Conclusions
Water resources assessments of climate change impacts at catchment scales, where most adaptation measures are planned and implemented, are of great interest to planners and water resources managers. This study compared the application of three approaches for downscaling precipitation occurrence patterns at 22 rain gauge stations given wetness state indicators and atmospheric variables. Some important conclusions of the study are as follows.
The results of validation indicate that the four selected atmospheric variables comprising hgt700, uwnd700, vwnd850 and ome850 can represent the rainfall patterns effectively for this study region. While the structure of Regression366 and MMM require many parameters to define the processes, results of cross-validation show that, despite this, these models are well balanced and not overparameterized.
A thorough comparison of observed and models simulated at site and across the sites statistics at multiple time scales suggest multisite MMM as the best downscaling model for the region. The MMM is formulated to allow a mix of discrete and continuous predictor variables within the Markovian model structure assumed. The wetness-state indicators as continuous variables are formed solely from the previous wet days' values in the downscaled sequence. The use of wetness-state indicators as additional conditioning predictors further improves the model performance at monthly, seasonal and annual time scales. Use of spatially correlated random innovations coupled with ATM and WI effectively improves the reproduction of spatial dependence, more specifically during calibration, but performs poorly during validation at annual time scale. However, to properly reproduce the spatial and across variable dependences in atmospheric variables and downscaled results, use of multi-dimension bias correction approaches is necessary [43][44][45], as spatial temporal and multi-variable attributes are often misrepresented by climate models. We intend to use a multivariate bias correction approach in our next stage of research. The spatial and high and low frequency temporal dependencies of the climate variables can have significant influence on the outcome where time series of climate variables at multiple locations are used, for example in catchment modeling. Similarly, annual cross correlations assume importance during the planning, design and operation of man-made components of a water resources system.
It may be noted that, while the approaches adopted have been described elsewhere, their application to the study area is attempted for the first time. The model setup over the new area is a unique challenging task and requires expert judgement. Similarly, identification of meaningful predictors is unique to each climatic region and requires knowledge of the climate processes and governing physics.
As with any other research field, the downscaling techniques are also subjected to a few limitations and assumptions. Statistical downscaling, e.g., regression and MMM used here, focus on the overall model accuracy and pay less attention to physical interpretation of the climate [49]. The downscaling techniques are developed based on our present understanding of the physical processes and the temporal and spatial limits of these relationships are unknown. The downscaling models are used to study the effect of changing climate on our water resources under the assumption that the downscaling model developed and calibrated under current climate is stable and the relationship between atmospheric variables and local climate (rainfall) will continue to hold true in the future as well. Although the changes were found to be relatively small and atmospheric patterns may be more robust to these time shifts [14,17], further testing is recommended. Finally, considering overall model performances and datasets used, we recommend a multisite MMM model with four atmospheric variables and previous 30-and 365-day wetness indicators as a prospective downscaling model setup to project future rainfall behavior over the study region.