Modified Approach to Reduce GCM Bias in Downscaled Precipitation: A Study in Ganga River Basin

Abstract: Reanalysis data are widely used to develop predictor-predictand models, which are then used to downscale coarse gridded general circulation model (GCM) data to a local scale. However, large variability in the downscaled product across different GCMs is still a big challenge. The first objective of this study was to assess the performance of reanalysis data for downscaling precipitation using different GCMs. High bias in downscaled precipitation was observed across GCMs, so a different downscaling approach is proposed in which historical GCM data are used to develop the predictor-predictand model. The earlier approach is termed “Re-Obs” and the proposed approach “GCM-Obs”. Both models were assessed using a mathematical derivation and generated synthetic series. The inter-model bias in precipitation downscaled from different GCMs using the Re-Obs and GCM-Obs models was also checked. Coupled Model Inter-comparison Project-5 (CMIP5) data of ten different GCMs were used to downscale precipitation in different urbanized, rural, and forest regions of the Ganga river basin. Different measures were used to represent the relative performance of one downscaling approach over the other in terms of the closeness of downscaled precipitation to observed precipitation and the reduction of bias across GCMs. The effect of GCM spatial resolution on downscaling was also checked. The model performance, convergence, and skill score were computed to assess the ability of the GCM-Obs and Re-Obs models. The proposed GCM-Obs model was found to be better than the Re-Obs model for statistically downscaling GCM data. The GCM-Obs model was able to reduce both GCM-observed and GCM-GCM bias in the downscaled precipitation in the Ganga river basin.


Introduction
General circulation models (GCMs) cannot be directly used for climate studies at the regional scale due to their coarse spatial resolution. Wood et al. [1] suggested that products of GCMs should not be directly
The driving ideas of this study were that (i) there is always dissimilarity between the predictor sets of reanalysis and GCM data due to differences in their development, so a predictor-predictand model based on reanalysis predictors might not be good enough to downscale GCMs, and (ii) a better predictor-predictand relationship between reanalysis and observed data does not assure good downscaling with different GCMs. This may be better understood as follows. Say,
Observed = f(Obs) (1)
Here, 'Obs' and 'Obs1' are the characteristics of the observed and reanalysis data time series, respectively. As the reanalysis data are developed using observations, Obs1 is similar to Obs. 'G' includes the characteristics of GCM1 and GCM2. GCM1 and GCM2 are products of the same GCM but for different time periods.
As a standard statistical approach, Observed, Reanalysis and GCM1 data are available for the same time period, while the GCM2 product has to be downscaled using the predictor-predictand relationship.
If one wishes to find the unknown value of GCM2 in accordance with the observed data, there are two possible approaches, Re-Obs and GCM-Obs. GCM-Obs is the better choice, as it develops the predictor-predictand relationship from the same group of variables that needs to be predicted.
Each GCM simulation is based on assumptions about the climate system, initial conditions, parameterizations, and the numerical methods used to solve the non-linear differential equations of the fluid motion of the atmosphere and the ocean [43,44]. Reanalysis data are based on observations, so they are closer to the observed data than GCM outputs are. The relationship between reanalysis and observed data may therefore be better than the relationship between GCM and observed data. However, this does not guarantee better GCM downscaling. GCM downscaling using the Re-Obs model is similar to predicting a time series using the relationship between two similar time series that both differ from the series being predicted. Racherla et al. [7] also concluded that a better model does not necessarily translate to better climate projections. GCM downscaling using the GCM-Obs model may be the better choice, as the predictor-predictand relationship between GCM and observed data already accounts for the GCM uncertainty, which may result in downscaled GCM products that are closer to the observations. Change Factor Methodologies (CFM) [6] use a similar approach to downscale GCM data: CFM uses only GCM data to downscale future GCM projections, applying additive or multiplicative measures to observations. CFM is widely used across the world in climate change impact assessment studies and programs [45,46].
An effort has been made in this study to reduce the GCM bias in downscaled products by building the predictor-predictand relationship with historical GCM data itself as the predictor. This technique is referred to as GCM-Obs. Using historical GCM data as the predictor might provide better future projections of the respective GCM, as the inherent uncertainty of the GCM is already considered in model development. Change factor methodology takes a similar approach, using the GCM-Obs relationship to downscale future projections of climate.
After the brief introduction and background of the study in Section 1, the GCM-Obs logic is described using mathematical expressions and synthetic series in Section 2. The Re-Obs and GCM-Obs models are compared using a case study in the Ganga river basin. The study area and data used are described in Section 3, the methodology adopted is given in Section 4, and Section 5 describes the outcome of the study. Discussions are provided in Section 6 followed by limitations in Section 7. The study is concluded in Section 7. A methodology to develop the predictor-predictand relationship is described in Appendix A.
Definition 1. Definition of Bias: The term bias used in this study represents the differences between precipitation downscaled using different GCMs and the differences between downscaled and observed precipitation.

Mathematical Explanation
The performance of a prediction model can be assessed by checking the deviations of predicted values from observed data. For example, consider a variable $O$, with mean $\mu_O$ and standard deviation $\sigma_O$, that depends on the variable $R$ ($\mu_R$ and $\sigma_R$). It also depends on the variable $G$ ($\mu_G$ and $\sigma_G$). The correlation between $O$ and $R$ is $\rho_{OR}$ and that between $O$ and $G$ is $\rho_{OG}$. Some part of the data is used for training ($tr$) and the remainder is used for testing ($ts$). The simplest relationship between two variables can be defined by a linear regression model.
Here, $\alpha$ and $\beta$ are the intercept and slope; $\hat{\alpha}$ and $\hat{\beta}$ are the best-fit values defining the relationship between the independent and dependent variables. This relationship can be used to predict the response of the testing data $ts$. So,
$$O(pR_{ts})_j = \hat{\alpha}_{R_{tr}} + \hat{\beta}_{R_{tr}}\,(G_{ts})_j \quad (11)$$
$$O(pG_{ts})_j = \hat{\alpha}_{G_{tr}} + \hat{\beta}_{G_{tr}}\,(G_{ts})_j \quad (12)$$
Here, $O_{pR}$ and $O_{pG}$ are the predictions for the $G_{ts}$ data from the O-R and O-G models using Equations (11) and (12), respectively.
The quality of the predicted response can be checked in terms of its deviation from the actual value. Comparing Equations (11) and (14), and assuming that the deviations in the intercept terms of the O-R and O-G models are almost equal, the deviation of the prediction from the actual value can be written for each model. It can be noted that the difference between the best-fit slope $\hat{\beta}$ and the normal slope $\beta$ for the same variable will certainly be lower than the value in Equation (15). To better understand this, we assume the specific case in which $\hat{\beta}_R = \beta_R$ and $\hat{\beta}_G = \beta_G$, i.e., there is already a best-fit relationship between the dependent and independent variables. So, from Equations (15) and (16), the O-G model is expected to perform better than the O-R model in predicting the response for unknown values of the variable G.
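The argument can also be checked numerically. The following is a minimal sketch, not the authors' code: the series `O`, `R` and `G`, their means and noise levels, and the 69/46 train/test split are all hypothetical choices made only for illustration. It fits the O-R and O-G lines on the training part and then drives both with the same test-period `G` values, as in Equations (11) and (12).

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_train = 115, 69  # hypothetical: roughly 60 % of the record for training

# Hypothetical series: R resembles O closely, G is biased and noisier.
O = 50.0 + 10.0 * rng.standard_normal(n)
R = O + 2.0 * rng.standard_normal(n)
G = 20.0 + 0.3 * (O - 50.0) + 5.0 * rng.standard_normal(n)

def fit_line(x, y):
    """Least-squares intercept and slope of y = alpha + beta * x."""
    beta, alpha = np.polyfit(x, y, 1)  # polyfit returns the slope first
    return alpha, beta

tr, ts = slice(0, n_train), slice(n_train, n)
alpha_R, beta_R = fit_line(R[tr], O[tr])  # O-R model
alpha_G, beta_G = fit_line(G[tr], O[tr])  # O-G model

# Both models are applied to the same test-period G values (Eqs. 11 and 12).
O_pR = alpha_R + beta_R * G[ts]
O_pG = alpha_G + beta_G * G[ts]

def rmse(pred, obs):
    """Root-mean-square error between prediction and observation."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

rmse_R, rmse_G = rmse(O_pR, O[ts]), rmse(O_pG, O[ts])
print(f"RMSE with O-R coefficients: {rmse_R:.2f}")
print(f"RMSE with O-G coefficients: {rmse_G:.2f}")
```

In this toy setting the O-R line fits its own training data more tightly, yet it transfers poorly to `G` because `G` has a different mean and scale than `R`.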
In support of the above statement, an example of statistical downscaling using the Re-Obs and GCM-Obs models is also shown in the Supplementary Information. A simple statistical downscaling using multiple linear regression (MLR) is applied to downscale precipitation at the grid point at 28.25° latitude and 73.25° longitude. Monthly Global Precipitation Climatology Center (GPCC) data were used as observed precipitation (details of the GPCC data are provided in Section 3). Only a few predictor variables are selected to keep the example simple. A summary of the data and method is presented in Table 1. The statistical downscaling using the Re-Obs and GCM-Obs models is compared in Table 2. The results show that the GCM-Obs model performed better than the Re-Obs model in downscaling GCM data. The Re-Obs model showed a better predictor-predictand relationship than GCM-Obs, because reanalysis data are closer to observations, but the GCM-Obs model still performed better in downscaling the GCM.

Explanation Using Synthetic Series
The logic of using historical GCM data instead of reanalysis data is also explained by considering two different sets of synthetic time series: (i) Set1, a time series having no seasonal component, i.e., annual or seasonal values, and (ii) Set2, a time series which has some cyclic effects due to inter-annual patterns or natural atmospheric oscillations, e.g., ENSO.
Each set of series consists of observed, reanalysis, and historical GCM data of the same length. Both reanalysis and GCM data are correlated to some degree (even if very small) with observed data. Here, Obs, Re1, Re2, GCM1 and GCM2 are the time series of observed data, reanalysis data for the first variable, reanalysis data for the second variable, GCM data for the first variable and GCM data for the second variable, respectively. $\rho_1$ and $\rho_2$ are the cross correlations of the reanalysis and GCM variables with the observed data series.
The values of the variables at time $i$ for Set1 are expressed in terms of $\mu$, $\sigma$ and $\rho$, the mean, standard deviation and cross correlation (with the observed data) of the reanalysis/GCM series, and $N(0, 1^2)_j$, the $j$th standard normal random number.
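One common way to build a series with a prescribed mean, standard deviation and cross correlation with an observed series is sketched below. This is an assumed construction, not necessarily the expression used in the study: the function name `correlated_series` and all parameter values are hypothetical.

```python
import numpy as np

def correlated_series(obs, mean, std, rho, rng):
    """Series with the given mean/std whose expected correlation with obs is rho.

    ASSUMED form: x_i = mean + std * (rho * z_i + sqrt(1 - rho^2) * e_i),
    where z is the standardized observed series and e ~ N(0, 1).
    """
    z = (obs - obs.mean()) / obs.std()
    noise = rng.standard_normal(obs.size)
    return mean + std * (rho * z + np.sqrt(1.0 - rho**2) * noise)

rng = np.random.default_rng(42)
n = 115  # sample size quoted in the text
obs = 100.0 + 15.0 * rng.standard_normal(n)           # hypothetical mu, sigma
re1 = correlated_series(obs, 95.0, 14.0, 0.9, rng)    # reanalysis: close to obs
gcm1 = correlated_series(obs, 60.0, 25.0, 0.5, rng)   # GCM: biased, weaker link

print("corr(obs, re1): ", np.corrcoef(obs, re1)[0, 1])
print("corr(obs, gcm1):", np.corrcoef(obs, gcm1)[0, 1])
```

With only 115 values the sample correlations scatter around the target `rho`; they converge to it as the series length grows.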
Set2 is generated considering seasonality. Here, the seasonality is composed of three sinusoidal wave forms, and the data series are expressed as
$$\mathrm{Re2}_i = A_4 \sin(2\pi n_7 i) + A_5 \sin(2\pi n_8 i) + A_6 \sin(2\pi n_9 i) + \mu_{\mathrm{Re2}} + \sigma_{\mathrm{Re2}}\,N(0, \ldots)$$
$$\mathrm{GCM1}_i = A_6 \sin(2\pi n_{10} i) + A_7 \sin(2\pi n_{11} i) + A_8 \sin(2\pi n_{12} i) + \ldots$$
$$\mathrm{GCM2}_i = A_9 \sin(2\pi n_{13} i) + A_{10} \sin(2\pi n_{14} i) + A_{11} \sin(2\pi n_{15} i) + \ldots$$
Here, $A$ is the amplitude and $n$ is the frequency of the wave form. Typical patterns of the synthetic time series for Set1 and Set2 are shown in Figure 1a,b, respectively. The parameter values considered for Set1 and Set2 are given in Tables 3 and 4, respectively.
The parameters of the synthetic reanalysis series are chosen to be closer to those of the observed series, as reanalysis data are assumed to be more similar to observed data than the corresponding GCM data: the former are based on observations, while the latter are generated by assuming an ocean-atmosphere relationship.
For each variable, 1000 series of sample size 115 are generated for Set1 and Set2 and tested with statistical downscaling using GCM-Obs and the earlier approach using reanalysis data (Re-Obs). To reduce complexity, a simple linear regression model is used to develop the relationship between the predictors, i.e., GCM/reanalysis data, and the predictand, i.e., observed data. Around 60% of the data is used for training the model and the rest is used for testing. The predictor-predictand relationship is used to downscale/predict the testing data. The predicted response of both models is compared with observed data in terms of mean, standard deviation, skewness, lag1 autocorrelation, correlation coefficient, and root-mean-square-error (RMSE).
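The six comparison measures can be computed as in the following sketch. It is illustrative only: the helper names, the single-sinusoid series (the text uses three wave forms) and every numeric parameter are hypothetical, and the skewness and lag1 autocorrelation estimators are standard textbook forms rather than ones confirmed by the source.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    d = x - x.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

def skewness(x):
    """Population (biased) skewness coefficient."""
    d = x - x.mean()
    return float(np.mean(d**3) / np.mean(d**2) ** 1.5)

def evaluate(pred, obs):
    """Measures named in the text; paired entries are (predicted, observed)."""
    return {
        "mean": (float(pred.mean()), float(obs.mean())),
        "std": (float(pred.std(ddof=1)), float(obs.std(ddof=1))),
        "skew": (skewness(pred), skewness(obs)),
        "lag1": (lag1_autocorr(pred), lag1_autocorr(obs)),
        "corr": float(np.corrcoef(pred, obs)[0, 1]),
        "rmse": float(np.sqrt(np.mean((pred - obs) ** 2))),
    }

rng = np.random.default_rng(7)
n = 115
i = np.arange(n)
# Hypothetical Set2-like series: one shared 12-step cycle plus noise.
obs = 100 + 10 * np.sin(2 * np.pi * i / 12) + 5 * rng.standard_normal(n)
gcm = 60 + 20 * np.sin(2 * np.pi * i / 12) + 8 * rng.standard_normal(n)

n_tr = round(0.6 * n)  # about 60 % of the data for training
slope, intercept = np.polyfit(gcm[:n_tr], obs[:n_tr], 1)
pred = intercept + slope * gcm[n_tr:]

scores = evaluate(pred, obs[n_tr:])
for name, value in scores.items():
    print(name, value)
```

Repeating this over many generated series, once with reanalysis-trained and once with GCM-trained coefficients, would give the kind of comparison described in the text.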
The predicted response of the test data from both models for Set1 is compared with observed data in Figure 2. All measures used to test the prediction against observed data are calculated for the testing period. There is a large bias in the mean and standard deviation of the predicted response with the Re-Obs model, while the GCM-Obs model shows a high degree of similarity. The root mean square error shows the true deviation of the predicted values from the observed data. The GCM-Obs model shows an RMSE close to 1, while the Re-Obs model shows a very high RMSE of the order of 25-30. The choice of prediction model had little to no effect on the skewness and lag1 autocorrelation of the prediction; both models predict skewness and lag1 autocorrelation similar to those of the observed data. Figure 3 shows the performance of the GCM-Obs and Re-Obs models for Set2. Here also, the GCM-Obs model shows good agreement in terms of mean and standard deviation. An RMSE of around 30 is found for the Re-Obs model, while for the GCM-Obs model it is below 1. Both models show the same degree of similarity to observed data in skewness and lag1 autocorrelation. For both models, the correlation between the predicted response and observed data is higher when the correlation of the GCM data in the generated synthetic series is higher.
This shows that the GCM-Obs model should perform better than the Re-Obs model to reduce bias and produce better predictions.

The GCM-Obs model: Case Study
The GCM-Obs model was found to be better than the Re-Obs model for downscaling GCM data while considering synthetic series in the previous section. Here, the GCM-Obs model was used to downscale precipitation at different locations in the Ganga river basin and was compared with the Re-Obs model.


Study Area and Data
The river Ganga begins at the confluence of the rivers Alaknanda and Bhagirathi at Devprayag in the state of Uttarakhand. Gangotri glacier is the primary source of the Ganga river and is also where the river Bhagirathi originates. The terminus of Gangotri glacier is at Gaumukh in Uttarakhand, which is considered the true source of the Ganga river. The Ganga river travels around 2525 km from its origin at Gaumukh to its terminus at the Bay of Bengal [47]. The catchment area of the Ganga river basin is around 1,086,000 km², which lies between latitudes 22°
The Ganga river basin map is shown in Figure 4. The elevation in the basin ranges from near mean sea level (MSL) to about 8000 m above MSL. Five regions/zones of the Ganga river basin were considered, representing different climate types defined by an updated Köppen-Geiger global climate classification [49]. These are:
The Global Precipitation Climatology Center (GPCC) is a leading agency providing global monthly gridded observed precipitation data at a high resolution, with a grid interval of 0.5° × 0.5° [50]. Another gridded data set, from the Climate Research Unit (CRU) [51,52], has also been used and agrees with the GPCC data set. GPCC precipitation data for the years 1948 to 2005 were used as gridded observed data for zones 1 to 5 in this study.
Observed precipitation data provided by India Water Portal (IWP; available at indiawaterportal.org) were also used to check the effect of the observed-data provider on the downscaled data. Monthly precipitation data for two districts in the Ganga river basin were obtained from IWP for the years 1948 to 2002: Uttarkashi district (latitude 30°30′ N to 31° N, longitude 78°25′ E to 78°75′ E) in the state of Uttarakhand, in which the Ganga river originates, and Darjeeling district (latitude 26°30′ N to 27° N, longitude 88°25′ E to 88°75′ E) in the state of West Bengal, which receives considerably higher rainfall. The location of the study area in the Ganga river basin is also shown in Figure 4.
Mean monthly global NCEP/NCAR reanalysis data [12] of six climatological variables, i.e., temperature (ta), geopotential height (zg), specific humidity (hus), zonal and meridional wind components (Ua and Va, respectively) and mean sea level pressure (psl), at a grid interval of 2.5° × 2.5° for the years 1948 to 2015 were used to develop the predictor-predictand model.
Coupled Model Inter-comparison Project-5 (CMIP5) historical mean monthly data of the same six climatological variables as in the reanalysis data, available at different grid intervals depending on the specific GCM, were obtained for the years 1948 to 2010 from the ESGF website (available at https://pcmdi9.llnl.gov/projects/cmip5). A brief description of the GCMs used in the study is given in Table 5. Detailed descriptions of the CMIP5 products of the GCMs, their modeling centers, and their resolutions can be found in previous studies [53,54].

Methodology
The first objective of the study was to check whether reanalysis data are best suited to downscale GCM data. In the Re-Obs model, a predictor-predictand model was developed using reanalysis data as the predictor and observed data as the predictand. The GCM-Obs model was developed using historical data of the GCM itself as the predictor and observed data as the predictand. Both reanalysis and GCM variable data were re-gridded to a grid interval of 0.5° × 0.5° to match the observed data grid. The predictor variables considered at different pressure levels, in millibar (mbar), in the development of the predictor-predictand models are tabulated in Table 6.
Principal component analysis (PCA) of the predictor data was carried out to reduce the dimension and to remove multicollinearity [55]. Multiple linear regression (MLR) and Artificial Neural Network (NN) methods were used to develop predictor-predictand relationships. The NN method was found to be better than MLR, so the NN method was used for downscaling GCM data.
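The PCA-then-regression step can be sketched as follows. This is a simplified illustration using MLR, the simpler of the two methods mentioned (the study ultimately used the NN method, which is not shown here). The function names, the 99% variance threshold, and the synthetic predictors are hypothetical choices, not values taken from the study.

```python
import numpy as np

def pca_fit(X, var_keep=0.99):
    """Return mean, scale and loadings retaining `var_keep` of the variance."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sd
    # Rows of Vt are principal directions; s**2 is proportional to variance.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(frac, var_keep) + 1)
    return mu, sd, Vt[:k].T

def pca_transform(X, mu, sd, W):
    """Project standardized predictors onto the retained components."""
    return ((X - mu) / sd) @ W

def mlr_fit(pcs, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(pcs)), pcs])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mlr_predict(pcs, coef):
    return coef[0] + pcs @ coef[1:]

rng = np.random.default_rng(3)
base = rng.standard_normal((200, 2))
# Six hypothetical, strongly collinear predictors built from two latent factors.
X = base @ rng.standard_normal((2, 6)) + 0.05 * rng.standard_normal((200, 6))
y = 3.0 + 2.0 * base[:, 0] - 1.0 * base[:, 1] + 0.1 * rng.standard_normal(200)

mu, sd, W = pca_fit(X[:120])  # statistics come from the training rows only
coef = mlr_fit(pca_transform(X[:120], mu, sd, W), y[:120])
y_hat = mlr_predict(pca_transform(X[120:], mu, sd, W), coef)
print("components kept:", W.shape[1])
print("test correlation:", float(np.corrcoef(y_hat, y[120:])[0, 1]))
```

Fitting the standardization and loadings on the training period and reusing them for the downscaling period keeps the test data out of model development.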
A detailed description of predictor-predictand model development is given in Appendix A. Historical GCM data were downscaled with the developed Re-Obs and GCM-Obs predictor-predictand models. The numbers of training, validation/testing, and downscaling years considered in model development and downscaling are given in Table 7. Both models were evaluated in terms of performance and convergence capabilities. Performance was checked by assessing the similarity of precipitation downscaled using different GCMs to observed precipitation in terms of normalized standard deviation, correlation, skill score, and normalized root mean square deviation (NRMSD). The convergence skill of a model is inversely related to GCM uncertainty; that is, a model with lower GCM uncertainty shows better convergence and vice versa. Measures such as normalized root mean square deviation and correlation between precipitation downscaled using different GCMs were used to analyze the similarity or uncertainty of both models. The measures that define the performance and convergence capabilities of both models are discussed in detail in the results and discussions section.
Step-by-step procedures followed in this study are shown in Figure 5.

Results
Re-Obs and GCM-Obs models were used to downscale historical GCM data at each grid point of zones 1 to 5 and in the Uttarkashi and Darjeeling districts. Downscaled precipitation from both models was compared with observed precipitation for the model performance assessment. Zonal averaged values of observed and downscaled precipitation were considered for comparison in each zone. Time series of observed and downscaled precipitation by the Re-Obs model with CMCC-CMS GCM and the GCM-Obs model with GFDL-CM3 GCM for zone 1 and the Darjeeling district are shown in Figures 6 and 7, respectively. A considerable difference in downscaled precipitation between the Re-Obs and GCM-Obs models can be observed. The coefficient of determination ($R^2$) between observed and downscaled precipitation varies from 0.5 to 0.69 for the Re-Obs model and from 0.8 to 0.81 for the GCM-Obs model. Higher variability between observed and downscaled precipitation can be seen with the Re-Obs model than with the GCM-Obs model for Darjeeling. Similar results were also found for other regions.
The downscaling performance of both models for zones 1 to 5 and the Uttarkashi and Darjeeling districts is also represented by Taylor diagrams [56] in Figures 8-14, respectively.
A Taylor diagram is a useful plot that concisely shows the degree of similarity between observed and modeled data. In this study, Taylor diagrams are used to show the relative performance of both methods in downscaling precipitation with different GCMs. Radial lines from the origin show the correlation between observed and downscaled precipitation. The X and Y axes indicate the normalized standard deviation, which is computed by the following formula:

$$\mathrm{NSD} = \frac{\text{Standard deviation of downscaled precipitation}}{\text{Standard deviation of observed precipitation}} \quad (35)$$
Here, NSD is the normalized standard deviation. NSDs are represented by two sets of concentric arcs of circles centered at the origin and at the observed data point. The NSD and correlation coefficient of the observed data are always unity, so the observed data are marked at unit correlation and unit NSD in the Taylor diagrams. Points in green and red represent precipitation downscaled with the Re-Obs and GCM-Obs models, respectively. A GCM multi-model ensemble averaged precipitation is also calculated by giving more weight to the GCMs whose downscaled precipitation is highly correlated with observed precipitation, using the following formula.
Here, $MM_t$ is the GCM multi-model ensemble averaged precipitation at time $t$, $r_i$ is the correlation between downscaled precipitation (DP) and observed precipitation for a particular GCM, and $i = 1, 2, \ldots, n$ indexes the GCMs (here $n$ is 10).
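The Taylor diagram statistics and a correlation-weighted ensemble can be sketched as below. Two points are assumptions rather than statements from the source: the ensemble is taken to be $MM_t = \sum_i r_i\,DP_{i,t} / \sum_i r_i$, which is one natural reading of "more weight to highly correlated GCMs", and the function names and synthetic data are hypothetical.

```python
import numpy as np

def taylor_stats(downscaled, observed):
    """Normalized standard deviation (Eq. 35) and correlation with observations."""
    nsd = float(np.std(downscaled, ddof=1) / np.std(observed, ddof=1))
    r = float(np.corrcoef(downscaled, observed)[0, 1])
    return nsd, r

def taylor_xy(nsd, r):
    """Position on a Taylor diagram: radius = NSD, angle = arccos(r)."""
    r = min(1.0, max(-1.0, r))  # guard against rounding just outside [-1, 1]
    return nsd * r, nsd * np.sqrt(1.0 - r * r)

def weighted_ensemble(dp, observed):
    """Correlation-weighted mean of dp with shape (n_gcm, n_time).

    ASSUMED form: MM_t = sum_i r_i * DP_{i,t} / sum_i r_i, valid for r_i > 0.
    """
    r = np.array([np.corrcoef(row, observed)[0, 1] for row in dp])
    weights = r / r.sum()
    return weights @ dp, weights

# Hypothetical monthly series: three "GCMs" with increasing noise.
rng = np.random.default_rng(11)
obs = 80.0 + 30.0 * rng.standard_normal(240)
dp = np.stack([obs + s * rng.standard_normal(240) for s in (5.0, 20.0, 60.0)])

mm, w = weighted_ensemble(dp, obs)
for row in dp:
    nsd, r = taylor_stats(row, obs)
    print("NSD, r, (x, y):", nsd, r, taylor_xy(nsd, r))
print("weights:", w)
```

The observed series itself maps to the point (1, 0), which is the reference point of the diagram.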
Notations used in this study to represent precipitation downscaled by the Re-Obs and GCM-Obs models with different GCMs are given in Table 8.
Figure 8 shows the Taylor diagram for zone 1. Precipitation downscaled by the Re-Obs model with the MIROC-ESM-CHEM and FGOALS-s2 GCMs is the least correlated with observed precipitation, with correlation coefficients (r) of 0.36 and 0.38, respectively. The GCM-Obs model shows a significant improvement in the similarity of downscaled precipitation to observed precipitation compared with the Re-Obs model: the correlation coefficient improved to 0.9 and 0.83 using the GCM-Obs model with the MIROC-ESM-CHEM and FGOALS-s2 GCMs, respectively. NSD values from the GCM-Obs model are considerably better than those from the Re-Obs model, indicating less bias between observed and GCM-downscaled precipitation. Similarly, the Taylor diagrams for zones 2 to 5 show that precipitation downscaled by the GCM-Obs model is closer to observations than that from the Re-Obs model. Figures 13 and 14 show the Taylor diagrams for the Uttarkashi and Darjeeling districts, respectively, for which IWP observed data were used. These also show more significant improvements in downscaled precipitation with the GCM-Obs model than with the Re-Obs model. The Taylor diagrams for most of the study area show that precipitation downscaled by the GCM-Obs model with different GCMs forms a cluster.
Skill score [56] is a popular measure of a model's ability to simulate results close to the target data. The skill score increases with increasing correlation between simulated and observed data. It also increases as the variance of the modeled data approaches that of the observed data. Skill score is defined as follows:
Here, $R$ is the correlation between the modeled and observed series, $\sigma$ is the ratio of the standard deviations of the modeled and observed series, $K$ is a penalty parameter imposed for low correlation (here $K = 2$) and $R_0$ is the maximum possible correlation, which is assumed to be unity here.
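A skill score with the properties described above can be sketched as follows. The exact expression is an assumption: the code uses a Taylor-type form, $S = 4(1+R)^K / \left[(\sigma + 1/\sigma)^2 (1+R_0)^K\right]$, with $K$ treated as the exponent on the correlation term, which may differ from the formula used in the study. The function name and the `sigma_ratio` of 1.0 in the example are hypothetical.

```python
def skill_score(r, sigma_ratio, k=2, r0=1.0):
    """Taylor-type skill score in [0, 1]; ASSUMED form, see the text above.

    r: correlation between modeled and observed series
    sigma_ratio: std(modeled) / std(observed), must be positive
    k: penalty exponent for low correlation; r0: maximum attainable correlation
    """
    if sigma_ratio <= 0:
        raise ValueError("sigma_ratio must be positive")
    spread = (sigma_ratio + 1.0 / sigma_ratio) ** 2
    return 4.0 * (1.0 + r) ** k / (spread * (1.0 + r0) ** k)

# Correlations quoted in the text for zone 1 (MIROC-ESM-CHEM), combined with a
# hypothetical sigma_ratio of 1.0 because the NSD values are not listed here.
print(skill_score(0.36, 1.0))  # Re-Obs correlation
print(skill_score(0.90, 1.0))  # GCM-Obs correlation
```

Under this form the score reaches 1 only when the correlation equals `r0` and the two standard deviations match, and it is symmetric in over- and under-estimated variance.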
The skill score of downscaled precipitation by Re-Obs and GCM-Obs models with different GCMs is shown in Figure 15. Peaks and troughs are visible in the plot, where downscaled precipitation by GCM-Obs and Re-Obs models are at peaks and troughs, respectively. Skill scores by the GCM-Obs model vary from 65% to 95% and 15% to 80% with the Re-Obs model. Zone 4 shows the least skill score under the Re-Obs model, which improved significantly by using the GCM-Obs model. Here, R is correlation between modeled and observed series, σ is the ratio of standard deviation of modeled and observed series, K is a penalty parameter imposed for low correlation (here K = 2) and R 0 is maximum possible correlation which is assumed to be unity here.
The skill score of downscaled precipitation by Re-Obs and GCM-Obs models with different GCMs is shown in Figure 15. Peaks and troughs are visible in the plot, where downscaled precipitation by GCM-Obs and Re-Obs models are at peaks and troughs, respectively. Skill scores by the GCM-Obs model vary from 65% to 95% and 15% to 80% with the Re-Obs model. Zone 4 shows the least skill score under the Re-Obs model, which improved significantly by using the GCM-Obs model.
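The skill score described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the Taylor-type form 4(1+R)^K / [(σ + 1/σ)^2 (1+R0)^K] implied by the definitions of R, σ, K and R0, and the function and argument names are hypothetical.

```python
import numpy as np

def taylor_skill_score(modeled, observed, k=2, r0=1.0):
    """Skill score in [0, 1]; the functional form is assumed, see lead-in."""
    modeled = np.asarray(modeled, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # R: Pearson correlation between modeled and observed series
    r = np.corrcoef(modeled, observed)[0, 1]
    # sigma: ratio of modeled to observed standard deviation
    sigma = modeled.std() / observed.std()
    return 4.0 * (1.0 + r) ** k / ((sigma + 1.0 / sigma) ** 2 * (1.0 + r0) ** k)

# Hypothetical monthly values, for illustration only
observed = np.array([10.0, 40.0, 120.0, 300.0, 90.0, 20.0])
perfect = taylor_skill_score(observed, observed)
inverted = taylor_skill_score(-observed, observed)
```

Under these assumptions a series identical to the observations scores 1, and a perfectly anti-correlated series with the same variance scores 0; multiplying by 100 gives the percentage scale used in Figure 15.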
Root Mean Square Deviation (RMSD) is also a good measure to assess the predictive power of a model. It directly relates the difference between modeled and target data at each data point. Lower values of RMSD indicate a better match between modeled and target data. RMSD is generally presented in a normalized form to remove the scale difference between different data sets. The following formula is used to calculate the normalized RMSD (NRMSD):

$$ \mathrm{NRMSD} = \frac{\sqrt{\dfrac{1}{N}\sum_{t=1}^{N}\left(\theta_{t} - \hat{\theta}_{t}\right)^{2}}}{\theta_{\max} - \theta_{\min}} $$

Here, $\theta_{t}$ and $\hat{\theta}_{t}$ are the observed and simulated values at time $t$, $t = 1, 2, \ldots, N$, where $N$ is the number of months, and $\theta_{\max}$ and $\theta_{\min}$ are the maximum and minimum values of the observed series.
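As a small worked example, the normalized RMSD can be computed as below. This is a sketch under the definitions given above (RMSD divided by the range of the observed series); the function name and the sample values are hypothetical.

```python
import numpy as np

def nrmsd(simulated, observed):
    """RMSD between simulated and observed, normalized by the observed range."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmsd = np.sqrt(np.mean((observed - simulated) ** 2))
    return rmsd / (observed.max() - observed.min())

# Hypothetical values: every simulated month is 2 units too high
observed = [0.0, 10.0, 20.0]
simulated = [2.0, 12.0, 22.0]
value = nrmsd(simulated, observed)  # RMSD = 2, observed range = 20
```

Here the RMSD is 2 and the observed range is 20, so the NRMSD is 0.1, i.e., 10% on the percentage scale used in the figures.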
NRMSD of downscaled precipitation with different GCMs by Re-Obs and GCM-Obs models is shown in Figure 16. Downscaled precipitation by the Re-Obs model at peaks and by the GCM-Obs model at troughs in the plot clearly shows the improvement in the closeness of downscaled data with observed data. FGOALS-g2, GFDL-CM3 and INMCM4 GCMs show comparatively higher NRMSD.
The GCM-Obs model was found to perform better than the Re-Obs model: its downscaled precipitation stays in a close range of, and follows a similar pattern to, the observed precipitation. The different measures adopted to check the performance of both models indicate the higher performance of the GCM-Obs model over the Re-Obs model in downscaling precipitation with different GCMs.
Two measures are adopted to show convergence skill of both models: (i) correlation matrix and (ii) NRMSD matrix. These measures are discussed in detail in following paragraphs.
Correlation coefficient (r) is a good measure to check linear similarity between two datasets. Higher value of r shows higher similarity. The Pearson correlation coefficient is adopted in this study, which is computed as follows:

$$ r_{X,Y} = \frac{\sum_{t=1}^{N}\left(x_{t} - \bar{x}\right)\left(y_{t} - \bar{y}\right)}{\sqrt{\sum_{t=1}^{N}\left(x_{t} - \bar{x}\right)^{2}\,\sum_{t=1}^{N}\left(y_{t} - \bar{y}\right)^{2}}} $$

Here, $r_{X,Y}$ is the Pearson correlation coefficient between datasets $X$ and $Y$, $\bar{x}$ and $\bar{y}$ are the means of datasets $X$ and $Y$ respectively, and $x_{t}$ and $y_{t}$ are the values of the datasets at time $t = 1, 2, \ldots, N$.
Correlation coefficients between precipitation downscaled with different GCMs by the Re-Obs and GCM-Obs models are presented in the form of a correlation matrix. The correlation between downscaled and observed precipitation is also shown. The diagonal of the correlation matrix shows a histogram of the respective data series. The correlation matrices of downscaled precipitation for zone 4 by the Re-Obs and GCM-Obs models are shown in Figures 17 and 18, respectively. Similar matrices are shown for Uttarkashi district in Figures 19 and 20.
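The correlation matrix described above can be assembled from pairwise Pearson coefficients. The sketch below is illustrative only: the series names and values are hypothetical, and the histograms on the diagonal of the published figures are not reproduced.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def correlation_matrix(series):
    """Pairwise correlations for a dict of name -> series (insertion order kept)."""
    names = list(series)
    mat = np.array([[pearson_r(series[a], series[b]) for b in names]
                    for a in names])
    return names, mat

# Hypothetical observed and downscaled series
example = {
    "observed": [1.0, 2.0, 3.0, 4.0],
    "gcm_a": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with observed
    "gcm_b": [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
}
names, mat = correlation_matrix(example)
```

Off-diagonal entries close to 1 between GCM rows correspond to the clustering of downscaled series that the text calls convergence.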
Caption of the correlation matrix figures (cf. Table 5): numbers in the scatter plots show the correlation coefficient between two GCMs. The diagonal of the matrix represents the histogram of observed precipitation and of downscaled precipitation for a particular GCM.

NRMSD was again used to judge the convergence skill of both models in downscaled precipitation with different GCMs. To show the closeness of downscaled precipitation with different GCMs under the Re-Obs and GCM-Obs models, a NRMSD matrix is prepared as follows:

$$ (\mathrm{NRMSD})_{X,Y} = \frac{\sqrt{\dfrac{1}{N}\sum_{t=1}^{N}\left(\theta_{X,t} - \theta_{Y,t}\right)^{2}}}{\theta_{obs\_max} - \theta_{obs\_min}} $$

Here, $(\mathrm{NRMSD})_{X,Y}$ is the normalized root mean square deviation between downscaled precipitation for GCM $X$ and GCM $Y$, $X, Y = 1, 2, \ldots, 10$; $\theta_{X,t}$ and $\theta_{Y,t}$ are the downscaled precipitation of GCM $X$ and GCM $Y$ at time $t$, $t = 1, 2, \ldots, N$, where $N$ is the number of months; and $\theta_{obs\_max}$ and $\theta_{obs\_min}$ are the maximum and minimum observed precipitation, respectively.
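A vectorized sketch of the NRMSD matrix follows. It assumes the downscaled series are stacked as one row per GCM and that the normalization uses the observed range, as in the definitions above; the function name and sample values are hypothetical.

```python
import numpy as np

def nrmsd_matrix(downscaled, observed):
    """Pairwise NRMSD between rows of `downscaled` (n_gcm x n_months),
    normalized by the range of the observed series."""
    data = np.asarray(downscaled, dtype=float)
    observed = np.asarray(observed, dtype=float)
    obs_range = observed.max() - observed.min()
    # diff[i, j, t] = series i minus series j at month t
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt(np.mean(diff ** 2, axis=2)) / obs_range

# Hypothetical example: two GCMs that differ by a constant 2 units
downscaled = [[0.0, 10.0], [2.0, 12.0]]
observed = [0.0, 20.0]
mat = nrmsd_matrix(downscaled, observed)
```

The matrix is symmetric with a zero diagonal; appending the observed series as an extra row would give the GCM-observed entries alongside the GCM-GCM ones.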
The patterns of the NRMSD matrix of downscaled precipitation by both models for Zone 4 and Uttarkashi are shown in Figures 21 and 22, respectively. Two GCMs, i.e., FGOALS-g2 and GFDL-CM3, downscaled by the Re-Obs model show the highest variability with respect to other GCMs and observed data for Zone 4 and Uttarkashi. NRMSD values of the order of 70%-90% with the Re-Obs model are reduced to 5%-15% by the GCM-Obs model for zone 4. Similarly, for Uttarkashi, the NRMSD of 15%-30% using the Re-Obs model is reduced to 5%-15% using the GCM-Obs model. A significant reduction in the NRMSD value with respect to observed data can also be seen using the GCM-Obs model. It can also be seen that the spatial resolution of the GCM is not an influencing parameter in downscaling: a mixed pattern of downscaling performance is obtained on moving from the coarser gridded CMCC-CESM to the relatively finer gridded CMCC-CM GCM. Similar results were also found for other regions.

It can be noted from the results above that the GCM-Obs model performs better than the Re-Obs model in statistical downscaling. However, bias correction methods are generally adopted with the Re-Obs model to reduce GCM bias. Li et al. [32] proposed the Equidistant CDF matching (EDCDFm) bias correction method, which is widely used to correct bias in monthly precipitation and temperature. The same method is used to correct the bias in downscaled precipitation using the Re-Obs and GCM-Obs models. The performance of both models with the EDCDFm bias correction method for zone 4 is presented by NRMSD matrix and correlation plots.

The correlation matrices for the bias corrected Re-Obs and GCM-Obs models are shown in Figures 23 and 24, respectively. It can be observed from Figures 17 and 23 that the bias correction method improved the precipitation downscaled by the Re-Obs model. However, bias correction also improved the precipitation downscaled by the GCM-Obs model. So, the bias correction method improved the inter-GCM and GCM-observed correlations, and the GCM-Obs model performed better than the Re-Obs model.

The NRMSD matrix of bias corrected downscaled precipitation using the Re-Obs and GCM-Obs models is shown in Figure 25. Here also, the bias correction method improved the downscaling performance of both models, and the GCM-Obs model was found to be better than the Re-Obs model.

The measures adopted to judge the convergence skill of both models indicate the GCM-Obs model's capability to reduce GCM bias, and show a better convergence skill than the Re-Obs model with or without bias correction. Overall, the GCM-Obs model performs better than the Re-Obs model.
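For readers unfamiliar with equidistant CDF matching, the idea can be sketched as follows: each value to be corrected is shifted by the difference between the observed and modeled quantiles at that value's own non-exceedance probability. The sketch below is an assumption-laden simplification, not the implementation used in the study: it uses empirical quantiles instead of fitted distributions, the zero clipping for precipitation is an added assumption, and all names and numbers are hypothetical.

```python
import numpy as np

def edcdfm(obs_hist, mod_hist, mod_target):
    """Empirical-quantile sketch of equidistant CDF matching.

    obs_hist   : observed series in the calibration period
    mod_hist   : modeled (downscaled) series in the calibration period
    mod_target : modeled series to be corrected
    """
    obs_hist = np.asarray(obs_hist, dtype=float)
    mod_hist = np.asarray(mod_hist, dtype=float)
    mod_target = np.asarray(mod_target, dtype=float)
    # Non-exceedance probability of each target value within its own series
    ranks = np.argsort(np.argsort(mod_target))
    p = (ranks + 0.5) / mod_target.size
    # Quantiles of observed and modeled calibration data at the same probability
    obs_q = np.quantile(obs_hist, p)
    mod_q = np.quantile(mod_hist, p)
    corrected = mod_target + obs_q - mod_q
    # Assumption: precipitation cannot be negative, so clip at zero
    return np.maximum(corrected, 0.0)

# Hypothetical case: the model is 5 units too dry at every quantile
mod_hist = np.array([10.0, 20.0, 30.0, 40.0])
obs_hist = mod_hist + 5.0
corrected = edcdfm(obs_hist, mod_hist, np.array([15.0, 25.0, 50.0]))
```

With a constant 5-unit offset between observed and modeled calibration quantiles, every target value is raised by 5 units, which is the behaviour expected of a quantile-wise additive correction.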

Discussion
This article discusses the choice of the better statistical downscaling model between Re-Obs and GCM-Obs to downscale GCM data. The former, Re-Obs, is adopted by the majority of researchers. However, the bias of GCM downscaled products relative to observations, and the differences between products downscaled from different GCMs, are a major concern for climate researchers [8,15,21]. The Re-Obs and GCM-Obs models are compared using three methods: (i) mathematical derivation, (ii) synthetic series, and (iii) a case study considering real observed data. The performance of the GCM-Obs model was found to be better than that of the Re-Obs model for statistically downscaling GCM data. It may be argued that if the reanalysis data is closer to the observed data, it should be able to better downscale GCM, but the Re-Obs model could produce better results if it is the reanalysis products that are to be downscaled. However, if the goal is to downscale GCM, the obvious choice for the predictor-predictand model should be the GCM-Obs model, as it already considers the characteristics of the GCM that will be downscaled.
The performance of the Re-Obs and GCM-Obs models for statistically downscaling precipitation in different regions of the Ganga river basin was checked using different skill scores [35,41]. The case study indicates that the GCM-Obs model could be a better choice for statistically downscaling GCM. The GCM-Obs model can be used to downscale different CMIP5 experiments [57], as reported in the IPCC AR5 for assessing different climate states while considering different assumptions.
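The argument can be illustrated with a toy synthetic experiment. Everything in the sketch below is an assumption made for illustration: a single hypothetical predictor, a linear predictor-predictand model, and invented offsets and amplitudes in which the GCM predictor is biased and damped relative to the reanalysis predictor. It is not the synthetic series used in the study, and the outcome follows from the bias built into the setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 240                                   # 20 years of monthly values
season = np.sin(2 * np.pi * np.arange(n) / 12)

# Hypothetical series sharing one seasonal signal
obs = 100 + 80 * season + rng.normal(0, 5, n)      # observed precipitation
rean = 10 + 5 * season + rng.normal(0, 0.3, n)     # reanalysis predictor
gcm = 14 + 3 * season + rng.normal(0, 0.3, n)      # biased, damped GCM predictor

cal, val = slice(0, 120), slice(120, n)            # calibration / validation

def nrmsd(simulated, observed):
    rmsd = np.sqrt(np.mean((observed - simulated) ** 2))
    return rmsd / (observed.max() - observed.min())

# Re-Obs: fit on the reanalysis predictor, then apply to the GCM predictor
slope_re, icpt_re = np.polyfit(rean[cal], obs[cal], 1)
re_obs_pred = slope_re * gcm[val] + icpt_re

# GCM-Obs: fit on the historical GCM predictor itself
slope_gcm, icpt_gcm = np.polyfit(gcm[cal], obs[cal], 1)
gcm_obs_pred = slope_gcm * gcm[val] + icpt_gcm

nrmsd_re_obs = nrmsd(re_obs_pred, obs[val])
nrmsd_gcm_obs = nrmsd(gcm_obs_pred, obs[val])
```

In this toy setup the Re-Obs relationship carries the GCM's offset and amplitude error straight into the downscaled series, while the GCM-Obs fit absorbs them, so `nrmsd_gcm_obs` comes out smaller than `nrmsd_re_obs`.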

Limitations of the Study
The study aimed to propose a different downscaling approach to improve the similarity of the downscaled variable and observed data and to reduce variability in downscaled data using different GCMs. Downscaling can be further improved by using a different set of predictor variables, downscaling methods, and other techniques. In this study, ten different GCMs are considered; the results can be verified with other GCMs as well, and are likely to be in agreement with this study. Uncertainty is inherent in GCMs due to the different boundary conditions, equations, methods and other factors used in their development. GCM uncertainty cannot be fully removed, but in this study an effort has been made to reduce the variability in downscaled variables using a different downscaling approach. Observed data is obtained from renowned agencies, which take utmost care in sampling and in the production of datasets, but errors in the data still cannot be fully ruled out.

Conclusions
Statistical downscaling of precipitation in different regions in the Ganga river basin was carried out with ten different GCMs along with the Re-Obs and GCM-Obs models. The predictor-predictand relationship used to downscale GCM at a local scale was developed using reanalysis and historical GCM data as predictors in Re-Obs and GCM-Obs models, respectively. Different measures were adopted to judge the relative performances of Re-Obs and GCM-Obs models to downscale precipitation with different GCMs.
Downscaled precipitation with different GCMs by the Re-Obs and GCM-Obs models showed significant differences in each region of the study area. Although the predictor-predictand model showed a good connection between reanalysis and observed data, the same model could not adequately simulate the downscaling of GCM. Higher variance and lower correlation between modeled and observed precipitation were shown by the Re-Obs model. Development of the predictor-predictand relationship with historical data of the GCM itself as the predictor and observed data as the predictand showed a high similarity and less variability between downscaled and observed precipitation. The skill score of downscaled precipitation also showed a more significant improvement with the GCM-Obs model than with the Re-Obs model. Datasets of downscaled precipitation with different GCMs using the GCM-Obs model fall near to each other in the form of clusters, as represented by the Taylor diagram.
Intercomparison of downscaled precipitation with different GCMs by the Re-Obs model also showed higher GCM-GCM bias, as indicated by the NRMSD and correlation matrices. The GCM-Obs model significantly reduced GCM bias in downscaled precipitation for all regions and all GCMs. The GCM-Obs model showed its robustness in downscaling with different GCMs for all regions and both sets of observed data, i.e., GPCC and IWP. The GCM-Obs model was found to be more reliable in terms of performance and convergence skill than the Re-Obs model. Multi-model ensemble average precipitation data showed better resemblance with observed data than most of the individual GCMs. It is also found that the spatial resolution of the GCM does not have a considerable effect on performance and convergence skill for both models. The bias correction method also improves the downscaling performance of both the Re-Obs and GCM-Obs models.
It can be said that using historical GCM data to develop the predictor-predictand relationship is a better choice to simulate the precipitation and to reduce GCM uncertainty. The GCM-Obs model is robust against bias due to different data observing agencies. Predictor-predictand model development using historical data of the GCM itself can be applied to downscale other atmospheric variables, e.g., temperature, humidity, and evapotranspiration. Improvement in performance and reduction in GCM bias are also expected when downscaling other variables under the GCM-Obs model, because precipitation is in least agreement with GCM in downscaling [58]. The bias correction measures may still be used to further improve the quality of the downscaled variable. This study will be helpful for climate change researchers to develop better downscaling models and to be more certain when they decide the ranges of downscaled atmospheric variables.