Development of a Parametric Regional Multivariate Statistical Weather Generator for Risk Assessment Studies in Areas with Limited Data Availability

Risk analysis of water resources systems can use statistical weather generators coupled with hydrologic models to examine scenarios of extreme events caused by climate change. These require multivariate, multi-site models that mimic the spatial, temporal, and cross correlations of observed data. This study developed a statistical weather generator to facilitate bottom-up approaches to assess the impact of climate change on water resources systems for cases of limited data. While existing weather generator models have impressive features, this study suggested a simple weather generator which is straightforward to implement and can employ any distribution function for variables such as precipitation or temperature. It is based on (1) a first-order, two-state Markov chain to simulate precipitation occurrences; (2) the use of Wilks’ technique to produce correlated weather variables at multiple sites with the conservation of spatial, temporal, and cross correlations; (3) the capability to vary the statistical parameters of the weather variables. The model was applied to studies of the Diyala River basin in Iraq, which is a case with limited observed records. Results show that it exhibits high values (e.g., over 0.95) for the Nash–Sutcliffe and Kling–Gupta metric tests, preserves the statistical properties of the observed variables, and conserves the spatial, temporal, and cross correlations among the weather variables in the meteorological stations.


Introduction
Climate change impacts are of increasing concern to hydrologists who assess risks in the management of water resources systems. Their models of climate scenarios for extreme events can be derived from global climate models (GCMs), stochastic-statistical weather generators (SWGs), or a combination. Although they have their own advantages, some argue that the GCM scenarios are inadequate and limit decision-making options because they represent only specific scenarios for climatic variability and have large uncertainties [1][2][3][4][5][6]. On the other hand, others think that SWGs can produce a wide range of scenarios to study system responses and provide more insights about the system performance under climate change [7][8][9][10]. The drawbacks of the SWGs are that they have a stochastic-basis and cannot provide future change insights. Therefore, the SWGs and GCMs have been linked to generate forecasting scenarios and to assign a probability of each SWG scenario by fitting a distribution to the GCM outcomes [11][12][13]. In this way, SWGs can then be used to generate probabilistic synthetic scenarios with the aid of the GCM information and which are statistically similar to observed data and used to investigate which climate states cause system failure [4,[14][15][16][17][18][19][20][21][22][23]. Where historic records are limited, synthetic weather sequences based on SWGs are especially suitable [24].
Given the previous work, the main objective of this paper is to develop a SWG that can be used in a bottom-up approach to generate daily synthetic scenarios to evaluate the impacts of long-term climate change on system performance and suggest robust adaptations to cope with anticipated negative impacts that will be examined in a follow up study. Emphasis is placed on areas with low data availability, and the model is demonstrated for Diyala River basin in Iraq for the four historic weather variables (e.g., precipitation, maximum and minimum temperatures, and wind speed magnitude) with daily time steps from 1948 to 2006.

Literature Review
Generally, SWGs can be grouped into parametric, non-parametric, and semi-parametric methods. In the parametric method, the weather variables are assumed to fit one continuous probability distribution or two combined distributions. The parameters are usually estimated from historic observations [24][25][26][27][28]. In the non-parametric method, the weather variables are resampled from historic observations using techniques such as empirical distributions, neural networks, and maximum entropy bootstrap [29][30][31][32]. The semi-parametric method is a mixture between parametric and non-parametric methods.
Albeit other approaches have their advantages, the parametric SWG in the bottom-up approach is preferable because the parameters can be altered to simulate different weather scenarios and facilitate climate change studies [16]. Verdin et al., [15], Furrer and Katz [18], Buishand and Brandsma [33], Seneviratne et al., [34] noted that the non-parametric method has limitations in generating extreme events because values can only be in the range of the observations. Using only the observed sequences ignores climate change's impacts on altering the intensities of the variables and is insufficient in assessing the future response of water resources systems because it leads to single results corresponding only to these observed sequences [22,23,25,[34][35][36].
Most existing SWGs are for single sites and cannot capture the spatial and cross correlations between the variables, which are essential for generating realistic climate change scenarios. Schaake et al., [37] stated that "relationships between physically dependent variables like precipitation and temperature should be respected". Single site SWGs can fail to capture the extreme events of the generated runoff, which are essential to develop realistic adaptation strategies to cope with flood and drought events, especially where a high runoff in one sub-basin can be offset by the low runoff in adjacent sub-basins [26,35,38].
Moreover, the misrepresentation of spatial and cross correlations (e.g., correlations between the precipitation and temperature) leads to biased generated streamflows as this correlation determines the water availability for evapotranspiration and snowmelt [32,39,40]. Therefore, SWGs should capture the characteristics of each site and the spatial dependence among them.
Recently, multi-site and multi-variable SWGs have been developed using different approaches. Steinschneider and Brown, [4] developed a semi-parametric model using a k-nearest-neighbor resampling scheme to simulate multiple spatially distributed variables using wavelet decomposition and autoregressive model to account for low-frequency oscillations. They used a Markov chain of first-order with three states to identify the precipitation states (e.g., dry, wet, and extremely wet). This model had difficulty in preserving the weather statistics besides the cross correlation. Additionally, it is not clear how to diagnose the differences between the precipitation states (e.g., wet and extremely wet).
Srivastav and Simonovic, [32] developed a non-parametric model using the maximum entropy bootstrap technique to capture the time-dependent structure and statistical characteristics. They used an orthogonal transformation to capture the spatial correlations. Even though the model preserves the historical characteristics, Verdin et al., [15] and Chen et al., [40] showed that the maximum entropy bootstrap technique is limited to the historical data range leading to inadequacy in climate change studies. It is difficult to employ this model to create different climate scenarios through variations of parameters.
Li and Babovic, [41] proposed a two-stage parametric model using an empirical copula to generate spatial distribution templates. Then, they developed a rank ordering technique that depended on historic data ranks with an empirical copula technique to preserve the correlations between the variables. The model preserves correlations between the variables and sites but is limited to the historic record length. For example, the model cannot generate more than 30 years of simulation if the historic observations are 30 years. Therefore, the model is not useful in areas with limited data length as an insufficient projection length may lead to wrong conclusions in risk assessment studies [42,43].
Verdin et al., [15] presented a model using a Bayesian hierarchical technique. The precipitation amounts are modeled using gamma distributions and maximum and minimum temperatures are modeled using a normal distribution. The statistical coefficients within them are modeled as spatial Gaussian processes to account for the correlations. Besides the complexity of model structure, the model has difficulty in preserving the statistical properties of the variables (especially the standard deviation of the minimum temperature is extremely underestimated by the model). Additionally, the model underestimates the spatial correlation between the variables. Furthermore, their results do not demonstrate the model's ability to preserve the cross correlation between the variables as well as the temporal correlation.

Model Description
The goal here is to develop a parametric regional weather generator (PR-WG) to generate daily stochastic weather variables that preserve their statistical parameters, such as the mean and standard deviations, as well as the spatial, temporal, and cross correlations among them. It should be easy to implement and adapt by altering the statistical parameters to generate synthetic future climate scenarios. The generated scenario must exceed the historic record length and observation range.
The novel contribution is to use a parametric approach to create a flexible model that can adapt to any continuous probability distribution. This will enable the use of the most accurate distribution for each weather variable, and the user can employ other distributions according to the data availability and scope of the study.

Precipitation States
The first step in developing the PR-WG is to establish the precipitation states. They are defined here as: wet days if the daily amounts equal or exceed 0.1 mm and dry days otherwise. This is similar to the approach by Verdin et al., [15] and Li and V. Babovic [41]. The approach is to use the first-order two-state Markov chain (FTMC), which is the most popular method to produce dry and wet precipitation occurrences. It works well in different climate types and performs as well as higher Markov chain orders [21,22].
Let S (k,t,m) denote the precipitation state (S = 0 is a dry day and S = 1 is a wet day) at spatial location k ∈ N, time index t ∈ N in days, and month index m = {1,2, . . . 12}. The dry or wet day occurrence is obtained from the following conditional probabilities: where, κ 0 is the probability of a dry day following a dry day, and κ 1 is the probability of a wet day following a wet day. These probabilities were estimated from the daily historical precipitation observations for each month.

Precipitation Amount
Precipitation amounts were calculated by using the joint probability distribution between the occurrence and amount. For example, once a wet day is predicted from the FTMC, the precipitation amount is calculated. A skewed normal distribution (SN) was selected because it was recommended by other researchers and estimates the daily precipitation amount better than other distributions such as exponential, gamma, Weibull, mixed-exponential, and generalized Pareto in capturing the mean, standard deviation, and extreme values [20,21,36,44,45].
Let P denote the precipitation amount in mm/day and where µX and σX are the mean and standard deviation of TX, res each month m according to (to account for precipitation state e as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly me maximum and minimum air temperature ( o C/day) for S= 0 and 1 matrices of standard normal deviates, such that ʋ and δ ~ N Equations (6), (7), (8), and (9) were estimated from the historic ob [49] showed that the most accurate function to simulate the is Weibull with three and two parameters, respectively, followed b wind speed is affected by precipitation states and amounts [50] decomposed into the same distribution type. As the Weibull distri two Weibulls (although gamma can be [51]), wind speed magni distribution (GM) in this study. Let WS denote the daily wind spe as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. WS for each month m, according to , was estimated implicitly f where α0, α1, β0, and β1 are the shape and scale parameters for S=0 m, h is an independent parameter, and λ is the cumulative uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters observations using MLEs.
Ψ denote the indicator of precipitation state condition ψ. P returns to a value obtained implicitly from Equation (4) [46] if the condition ψ holds ( in o C, respectively. In which, TX is (and TN) is: ( )~ ( ( ) , ( ) ) (5) where µX and σX are the mean and standard deviation of TX, respectively. Solving Equation (5) for each month m according to (to account for precipitation state effects), Tx and TN can be computed as: ( , , ) = ( , ) + ( , ) × ʋ ( , , ) [ ( , , ) ] (9) where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly mean and standard deviation for the maximum and minimum air temperature ( o C/day) for S= 0 and 1, respectively, and ʋ and δ are the matrices of standard normal deviates, such that ʋ and δ ~ N (0,1) ϵ ℝ. The parameter values of Equations (6), (7), (8), and (9) were estimated from the historic observations using MLEs. [49] showed that the most accurate function to simulate the daily wind speed magnitude (WS) is Weibull with three and two parameters, respectively, followed by gamma. Given the condition that wind speed is affected by precipitation states and amounts [50], the selected distribution must be decomposed into the same distribution type. As the Weibull distribution cannot be decomposed into two Weibulls (although gamma can be [51]), wind speed magnitude was modeled by the gamma distribution (GM) in this study. Let WS denote the daily wind speed magnitude (m/s) for k locations, as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. Similarly for the temperature, the WS for each month m, according to , was estimated implicitly from the following equations: where α0, α1, β0, and β1 are the shape and scale parameters for S=0 and 1, respectively, for each month m, h is an independent parameter, and λ is the cumulative probability, which is distributed uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters were estimated from the historic observations using MLEs.
[S=1] ) and returns to zero otherwise in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of TX, respectively. Solv each month m according to (to account for precipitation state effects), Tx and as: ( , , ) = ( , ) + ( , ) × ʋ ( , , ) [ ( , , )  where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly mean and standa maximum and minimum air temperature ( o C/day) for S= 0 and 1, respectively, matrices of standard normal deviates, such that ʋ and δ ~ N (0,1) ϵ ℝ. The Equations (6), (7), (8), and (9) were estimated from the historic observations usin [49] showed that the most accurate function to simulate the daily wind sp is Weibull with three and two parameters, respectively, followed by gamma. Giv wind speed is affected by precipitation states and amounts [50], the selected decomposed into the same distribution type. As the Weibull distribution cannot two Weibulls (although gamma can be [51]), wind speed magnitude was mod distribution (GM) in this study. Let WS denote the daily wind speed magnitude as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. Similarly for WS for each month m, according to , was estimated implicitly from the follow [ ( where α0, α1, β0, and β1 are the shape and scale parameters for S=0 and 1, respecti m, h is an independent parameter, and λ is the cumulative probability, w uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters were estimate observations using MLEs.
[S=0] , as follows: The maximum and minimum daily air temperatures distribution (N) [47], [48]. Let TX and TN denote the maximum in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of TX, each month m according to (to account for precipitation sta as: (  where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly maximum and minimum air temperature ( o C/day) for S= 0 an matrices of standard normal deviates, such that ʋ and δ ~ Equations (6), (7), (8), and (9) were estimated from the historic [49] showed that the most accurate function to simulate t is Weibull with three and two parameters, respectively, followe wind speed is affected by precipitation states and amounts [ decomposed into the same distribution type. As the Weibull di two Weibulls (although gamma can be [51]), wind speed ma distribution (GM) in this study. Let WS denote the daily wind s as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respective WS for each month m, according to , was estimated implicit where α0, α1, β0, and β1 are the shape and scale parameters for S m, h is an independent parameter, and λ is the cumulati uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale paramete observations using MLEs.

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures distribution (N) [47], [48]. Let TX and TN denote the maximum in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of TX, each month m according to (to account for precipitation sta as: (  where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly maximum and minimum air temperature ( o C/day) for S= 0 an matrices of standard normal deviates, such that ʋ and δ ~ Equations (6), (7), (8), and (9) were estimated from the historic [49] showed that the most accurate function to simulate t is Weibull with three and two parameters, respectively, followe wind speed is affected by precipitation states and amounts [ decomposed into the same distribution type. As the Weibull di two Weibulls (although gamma can be [51]), wind speed ma distribution (GM) in this study. Let WS denote the daily wind s as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respective WS for each month m, according to , was estimated implicit where α0, α1, β0, and β1 are the shape and scale parameters for S m, h is an independent parameter, and λ is the cumulati uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale paramete observations using MLEs. [S(k,t,m)=0] where θ is the matrix of the standard normal deviates θ~N(0,1) R, and µ p , σ p , and γ p , are the mean, standard deviation, and skew coefficient of the precipitation for month m. The values of the parameters µ p , σ p , and γ p were estimated from the daily historical observations using the method of maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures are usually modeled by the normal distribution (N) [47,48]. Let T X and T N denote the maximum and minimum daily air temperature in • C, respectively. In which, T X is (and T N ) is: where µ X and σ X are the mean and standard deviation of T X , respectively. Solving Equation (5) for each month m according to standard deviation, and extreme values [20], [21], [36], [44], [45]. Let P denote the precipitation amount in mm/day and denote the indicator of precipitatio state condition ψ. P returns to a value obtained implicitly from Equation (4) [46] if the condition ψ holds ( [ ] ) and returns to zero otherwise ( [ ] ), as follows: (4 where θ is the matrix of the standard normal deviates θ ~ N(0,1) ϵ ℝ, and µ p , σp, and γp, are the mean standard deviation, and skew coefficient of the precipitation for month m. The values of th parameters µp, σp, and γp were estimated from the daily historical observations using the method o maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures are usually modeled by the norma distribution (N) [47], [48]. Let TX and TN denote the maximum and minimum daily air temperatur in o C, respectively. In which, TX is (and TN) is: (9) where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly mean and standard deviation for th maximum and minimum air temperature ( o C/day) for S= 0 and 1, respectively, and ʋ and δ are th matrices of standard normal deviates, such that ʋ and δ ~ N (0,1) ϵ ℝ. The parameter values o Equations (6), (7), (8), and (9) were estimated from the historic observations using MLEs. [49] showed that the most accurate function to simulate the daily wind speed magnitude (WS is Weibull with three and two parameters, respectively, followed by gamma. Given the condition tha wind speed is affected by precipitation states and amounts [50], the selected distribution must b decomposed into the same distribution type. As the Weibull distribution cannot be decomposed int two Weibulls (although gamma can be [51]), wind speed magnitude was modeled by the gamm distribution (GM) in this study. Let WS denote the daily wind speed magnitude (m/s) for k locations as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. Similarly for the temperature, th WS for each month m, according to , was estimated implicitly from the following equations: where α0, α1, β0, and β1 are the shape and scale parameters for S=0 and 1, respectively, for each mont m, h is an independent parameter, and λ is the cumulative probability, which is distribute uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters were estimated from the histori observations using MLEs.
Ψ (to account for precipitation state effects), T x and T N can be computed as: as exponential, gamma, Weibull, mixed-exponential, and generalized standard deviation, and extreme values [20], [21], [36], [44], [45]. Let P denote the precipitation amount in mm/day and denote state condition ψ. P returns to a value obtained implicitly from Equa where θ is the matrix of the standard normal deviates θ ~ N(0,1) ϵ ℝ, an standard deviation, and skew coefficient of the precipitation for m parameters µp, σp, and γp were estimated from the daily historical obs maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures are usu distribution (N) [47], [48]. Let TX and TN denote the maximum and m in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of TX, respecti each month m according to (to account for precipitation state effect as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly mean a maximum and minimum air temperature ( o C/day) for S= 0 and 1, res matrices of standard normal deviates, such that ʋ and δ ~ N (0,1) Equations (6), (7), (8), and (9) were estimated from the historic observa [49] showed that the most accurate function to simulate the daily is Weibull with three and two parameters, respectively, followed by ga wind speed is affected by precipitation states and amounts [50], the decomposed into the same distribution type. As the Weibull distributio two Weibulls (although gamma can be [51]), wind speed magnitude distribution (GM) in this study. Let WS denote the daily wind speed m as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. Sim WS for each month m, according to , was estimated implicitly from where α0, α1, β0, and β1 are the shape and scale parameters for S=0 and 1 m, h is an independent parameter, and λ is the cumulative prob uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters were observations using MLEs.
(k,t,m) f or as exponential, gamma, Weibull, mixed-exponential, and gen standard deviation, and extreme values [20], [21], [36], [44], [ Let P denote the precipitation amount in mm/day and state condition ψ. P returns to a value obtained implicitly fr where θ is the matrix of the standard normal deviates θ ~ N(0 standard deviation, and skew coefficient of the precipita parameters µp, σp, and γp were estimated from the daily histo maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures distribution (N) [47], [48]. Let TX and TN denote the maximu in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of TX each month m according to (to account for precipitation s as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthl maximum and minimum air temperature ( o C/day) for S= 0 a matrices of standard normal deviates, such that ʋ and δ Equations (6), (7), (8), and (9) were estimated from the histori [49] showed that the most accurate function to simulate is Weibull with three and two parameters, respectively, follow wind speed is affected by precipitation states and amounts decomposed into the same distribution type. As the Weibull d two Weibulls (although gamma can be [51]), wind speed m distribution (GM) in this study. Let WS denote the daily wind as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectiv WS for each month m, according to , was estimated implic where α0, α1, β0, and β1 are the shape and scale parameters for m, h is an independent parameter, and λ is the cumula uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parame observations using MLEs.
as exponential, gamma, Weibull, mixed-exponential, and generalized standard deviation, and extreme values [20], [21], [36], [44], [45]. Let P denote the precipitation amount in mm/day and denot state condition ψ. P returns to a value obtained implicitly from Equa holds ( [ ] ) and returns to zero otherwise ( [ ] ), as follows: where θ is the matrix of the standard normal deviates θ ~ N(0,1) ϵ ℝ, a standard deviation, and skew coefficient of the precipitation for parameters µp, σp, and γp were estimated from the daily historical obs maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures are usu distribution (N) [47], [48]. Let TX and TN denote the maximum and m in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of TX, respect each month m according to (to account for precipitation state effec as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly mean a maximum and minimum air temperature ( o C/day) for S= 0 and 1, res matrices of standard normal deviates, such that ʋ and δ ~ N (0,1 Equations (6), (7), (8), and (9) were estimated from the historic observa [49] showed that the most accurate function to simulate the daily is Weibull with three and two parameters, respectively, followed by ga wind speed is affected by precipitation states and amounts [50], the decomposed into the same distribution type. As the Weibull distributi two Weibulls (although gamma can be [51]), wind speed magnitude distribution (GM) in this study. Let WS denote the daily wind speed m as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. Sim WS for each month m, according to , was estimated implicitly from where α0, α1, β0, and β1 are the shape and scale parameters for S=0 and m, h is an independent parameter, and λ is the cumulative pro uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters wer observations using MLEs.
(k,t,m) f or Climate 2020, 8, x FOR PEER REVIEW as exponential, gamma, Weibull, mixed-exponential, and ge standard deviation, and extreme values [20], [21], [36], [44], [ Let P denote the precipitation amount in mm/day and state condition ψ. P returns to a value obtained implicitly fr where θ is the matrix of the standard normal deviates θ ~ N(0 standard deviation, and skew coefficient of the precipita parameters µp, σp, and γp were estimated from the daily hist maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperatures distribution (N) [47], [48]. Let TX and TN denote the maximu in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of T each month m according to (to account for precipitation s as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the month maximum and minimum air temperature ( o C/day) for S= 0 matrices of standard normal deviates, such that ʋ and Equations (6), (7), (8), and (9) were estimated from the histor [49] showed that the most accurate function to simulate is Weibull with three and two parameters, respectively, follow wind speed is affected by precipitation states and amounts decomposed into the same distribution type. As the Weibull two Weibulls (although gamma can be [51]), wind speed m distribution (GM) in this study. Let WS denote the daily wind as follows: where α0, α1, β0, and β1 are the shape and scale parameters fo m, h is an independent parameter, and λ is the cumula uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parame observations using MLEs.

Wind Speed Magnitude
as exponential, gamma, Weibull, mixed-exponential, and ge standard deviation, and extreme values [20], [21], [36], [44], Let P denote the precipitation amount in mm/day and state condition ψ. P returns to a value obtained implicitly f holds ( where θ is the matrix of the standard normal deviates θ ~ N( standard deviation, and skew coefficient of the precipita parameters µp, σp, and γp were estimated from the daily hist maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperature distribution (N) [47], [48]. Let TX and TN denote the maximu in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of T each month m according to (to account for precipitation as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the month maximum and minimum air temperature ( o C/day) for S= 0 matrices of standard normal deviates, such that ʋ and Equations (6), (7), (8), and (9) were estimated from the histor [49] showed that the most accurate function to simulat is Weibull with three and two parameters, respectively, follo wind speed is affected by precipitation states and amount decomposed into the same distribution type. As the Weibull two Weibulls (although gamma can be [51]), wind speed m distribution (GM) in this study. Let WS denote the daily win as follows: where α0, α1, β0, and β1 are the shape and scale parameters fo m, h is an independent parameter, and λ is the cumul uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale param

Wind Speed Magnitude
as exponential, gamma, Weibull, mixed-exponential, and ge standard deviation, and extreme values [20], [21], [36], [44], Let P denote the precipitation amount in mm/day and state condition ψ. P returns to a value obtained implicitly f holds ( where θ is the matrix of the standard normal deviates θ ~ N( standard deviation, and skew coefficient of the precipita parameters µp, σp, and γp were estimated from the daily hist maximum likelihood estimation (MLE).

Maximum and Minimum Air Temperature
The maximum and minimum daily air temperature distribution (N) [47], [48]. Let TX and TN denote the maximu in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviation of T each month m according to (to account for precipitation as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the month maximum and minimum air temperature ( o C/day) for S= 0 matrices of standard normal deviates, such that ʋ and Equations (6), (7), (8), and (9) were estimated from the histor [49] showed that the most accurate function to simulat is Weibull with three and two parameters, respectively, follo wind speed is affected by precipitation states and amount decomposed into the same distribution type. As the Weibull two Weibulls (although gamma can be [51]), wind speed m distribution (GM) in this study. Let WS denote the daily win as follows: where α0, α1, β0, and β1 are the shape and scale parameters fo m, h is an independent parameter, and λ is the cumul

Wind Speed Magnitude
Ref. [49] showed that the most accurate function to simulate the daily wind speed magnitude (WS) is Weibull with three and two parameters, respectively, followed by gamma. Given the condition that wind speed is affected by precipitation states and amounts [50], the selected distribution must be decomposed into the same distribution type. As the Weibull distribution cannot be decomposed into two Weibulls (although gamma can be [51]), wind speed magnitude was modeled by the gamma distribution (GM) in this study. Let WS denote the daily wind speed magnitude (m/s) for k locations, as follows: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the monthly mean and standard deviatio maximum and minimum air temperature ( o C/day) for S= 0 and 1, respectively, and ʋ and matrices of standard normal deviates, such that ʋ and δ ~ N (0,1) ϵ ℝ. The parameter Equations (6), (7), (8), and (9) were estimated from the historic observations using MLEs. [49] showed that the most accurate function to simulate the daily wind speed magnit is Weibull with three and two parameters, respectively, followed by gamma. Given the cond wind speed is affected by precipitation states and amounts [50], the selected distribution decomposed into the same distribution type. As the Weibull distribution cannot be decomp two Weibulls (although gamma can be [51]), wind speed magnitude was modeled by th distribution (GM) in this study. Let WS denote the daily wind speed magnitude (m/s) for k as follows:

Wind Speed Magnitude
where α and β are the shape and scale parameters, respectively. Similarly for the tempera WS for each month m, according to , was estimated implicitly from the following equatio where α0, α1, β0, and β1 are the shape and scale parameters for S=0 and 1, respectively, for ea m, h is an independent parameter, and λ is the cumulative probability, which is di uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale parameters were estimated from the observations using MLEs.
Ψ , was estimated implicitly from the following equations: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the maximum and minimum air temperature ( o C/day) matrices of standard normal deviates, such tha Equations (6), (7), (8), and (9) were estimated from th [49] showed that the most accurate function to is Weibull with three and two parameters, respective wind speed is affected by precipitation states and decomposed into the same distribution type. As the two Weibulls (although gamma can be [51]), wind distribution (GM) in this study. Let WS denote the d as follows: where α0, α1, β0, and β1 are the shape and scale param m, h is an independent parameter, and λ is the uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale observations using MLEs.

Wind Speed Magnitude
[S(k,t,m)=0] [47], [48]. Let TX and TN denote the in o C, respectively. In which, TX is (and TN) is: where µX and σX are the mean and standard deviat each month m according to (to account for precip as: where, µX0, µX1, µN0, µN1, σX0, σX1, σN0, and σN1 are the maximum and minimum air temperature ( o C/day) matrices of standard normal deviates, such tha Equations (6), (7), (8), and (9) were estimated from th [49] showed that the most accurate function to is Weibull with three and two parameters, respective wind speed is affected by precipitation states and decomposed into the same distribution type. As the two Weibulls (although gamma can be [51]), wind distribution (GM) in this study. Let WS denote the d as follows: where α0, α1, β0, and β1 are the shape and scale param m, h is an independent parameter, and λ is the uniformly-λ~ U [0, 1], ϵ ℝ . The shape and scale observations using MLEs.

Wind Speed Magnitude
where α 0 , α 1 , β 0 , and β 1 are the shape and scale parameters for S = 0 and 1, respectively, for each month m, h is an independent parameter, and λ is the cumulative probability, which is distributed uniformly-λ~U [0, 1], R. The shape and scale parameters were estimated from the historic observations using MLEs.

Model Implementation
The parametric SWG should conserve the spatial, temporal, and cross correlations of the historic observations of the four weather variables. The concept is to study the behavior of the variates θ, υ, δ, and λ, hereafter referred to as anomalies. The correlations between those anomalies should be identified so the generated weather values are statistically similar to the observed values and conserve spatial, temporal, and cross correlations. The implementation of the PR-WG consists of two stages, namely preprocessing and postprocessing, as shown in Figure 1.

Preprocessing: Parameter Estimation and Matrix Preparation
In order to specify the wet and dry occurrences, a random uniform variate y~U(0, 1) must be drawn and compared with the transition probabilities obtained from Equations (1) and (2). For multi-site precipitation, the anomalies (referred to as Y R) that identify the states in k locations must be correlated so that the generated states S are correlated to the historic observations. Wilks' method was selected to generate correlated anomalies Y~N(0,1) at multiple sites. It is simple and more efficient than hidden Markov and k-nearest neighbor methods [52], accurate in generating the correlations of monthly interstations [53], and the most cited method compared to other approaches [54].
The process should be repeated for each station pair and lead to the number of realizations of k (k-1)/2 and be repeated for each month m to create the anomalies matrix ω s ∈ R. The ω s matrix is then used to develop Y that produces correlated precipitation states in k locations for month m, using the multivariate normal distribution as follows, The parametric SWG should conserve the spatial, temporal, and cross correlations of the historic observations of the four weather variables. The concept is to study the behavior of the variates θ, ʋ, δ, and λ, hereafter referred to as anomalies. The correlations between those anomalies should be identified so the generated weather values are statistically similar to the observed values and conserve spatial, temporal, and cross correlations. The implementation of the PR-WG consists of two stages, namely preprocessing and postprocessing, as shown in Figure 1.

Preprocessing: Parameter Estimation and Matrix Preparation
In order to specify the wet and dry occurrences, a random uniform variate y ~U(0, 1) must be drawn and compared with the transition probabilities obtained from Equations (1) and (2). For multisite precipitation, the anomalies (referred to as Y ϵ ℝ) that identify the states in k locations must be correlated so that the generated states S are correlated to the historic observations. Wilks' method was selected to generate correlated anomalies Y~ N(0,1) at multiple sites. It is simple and more efficient than hidden Markov and k-nearest neighbor methods [52], accurate in generating the and (12) for the WS, after estimating their parameters (e.g., µp, σp, γp for P, µX0, µX1, σX0, σX1 for TX , µN0, µN1, σN0, σN1 for TN, and α0 ,α1 β0, β1 for the WS). The Z matrix represents the anomalies of the weather variables and their elements have spatial, cross-, and auto-correlation magnitudes. To generate the Z matrix with the same observation properties, these correlations must be preserved. The first step done here was to estimate autoregressive model of order 1, AR(1), coefficients for the anomalies (φz) so that the generated variables have the observed AR(1) value (φv) applying Wilks' technique. For illustration, synthetically assume that the values of µX0, µX1, σX0, σX1 are 11.72, 9.12, 3.71, 2.21 (C o /day), respectively, and φv is 0.82 at station k of month m. The adopted procedure for obtaining the φz is as follows: 1) Generate the standard normal random deviate set y; y ~ N (0,1). (1) and (2) to identify the dry and wet days.

5) Obtain the anomalies z by standardizing x of Step 4. 6) Apply Equations
This procedure must be done for all TX and TN of each k and m. For the WS, the procedure is the same except for Step 5, converting x so it is uniformly distributed to get the WS anomalies. For example, let us assume that α0, α1, β0, and β1 are 4.04, 3.22, 0.62, 0.71, respectively, and the φv is 0.54. The corresponding φz will be 0.56, as shown in Figure 2c. This procedure allows us to preserve the auto-correlation of Tx, TN, and the WS.
The final step of the preprocessing stage is to construct the positive-definite correlation matrix of the variable anomalies ωV, as done for precipitation states using ISDC. Building the ωV allows us to preserve all the spatial, temporal, and cross correlations between the variables.

Postprocessing Stage: Variable Generation
After building all matrices and estimating the parameters in the preprocessing stage, the four weather variables can be generated for any time length of interest, as follows: 1) Use Equation (13) with ωs to generate Y anomalies that denote S. The length of Y denotes the The variable µ y denotes the 1-D mean vector for the anomalies Y, Σ denotes the covariance matrix, and d is an independent parameter. In this case, µ = [0, 0, . . . , 0] k×1 and the variance is 1, so the covariance matrix Σ s becomes the correlation matrix ω s .
The matrix ω s must be a positive-definite matrix (e.g., the matrix is symmetric and all its eigenvalues are positive) to be implemented in Equation (13). Since the elements of ω s were calculated empirically, ω s is usually a non-positive matrix. Comparing to the work of others, the most precise method to obtain a positive-definite matrix is the iterative spectral with Dykstra's correction (ISDC) [55], as follows:
Replace the negative eigenvalues of L i by a small positive value to construct L + i .
Then, replace all ω i+1 diagonal elements with 1. 6) Test whether ω i+1 is a positive-defined matrix or not. If not, repeat the steps from two to six by making i = i + 1 and ω i = ω i−1 .
After generating the matrix S at k and m, the next step is to simulate the weather variables (e.g., P, T X , T N and WS). The idea here is to examine the anomalies of these variables and generate the weather variables with the same observation properties. To account for all the spatial and cross correlation between the variables, their anomalies (θ, υ, δ, and λ) must be correlated. The temporal correlation, identified by the Lag-1 day auto-correlations, for the T X , T N and WS must also be considered. Since the precipitation amount is an intermittent variable, the auto-correlation is not considered. The following procedure was suggested to achieve this purpose. First, arrange the weather variable matrix V as follows, where, V represents the observed weather variable value and n denotes the weather variable rank (P, T X , T N and WS), n = {1, 2, 3, 4}. The total number of the rows will be T = month days × year numbers, the columns will be K × N, and the aisle will be M. This matrix arrangement enables us to consider all the spatial and cross correlations between the weather variables. Next, extract the anomalies matrix Z ∈ R from V using Equations (3) and (4) for P; Equations (6)- (9) for T x and T N ; Equations (11) and (12) for the WS, after estimating their parameters (e.g., µ p , σ p , γ p for P, µ X0 , µ X1 , σ X0 , σ X1 for T x , µ N0 , µ N1 , σ N0 , σ N1 for T N , and α 0 , α 1 β 0 , β 1 for the WS). The Z matrix represents the anomalies of the weather variables and their elements have spatial, cross-, and auto-correlation magnitudes. To generate the Z matrix with the same observation properties, these correlations must be preserved. The first step done here was to estimate autoregressive model of order 1, AR(1), coefficients for the anomalies (ϕ z ) so that the generated variables have the observed AR(1) value (ϕ v ) applying Wilks' technique. For illustration, synthetically assume that the values of µ X0 , µ X1 , σ X0 , σ X1 are 11.72, 9.12, 3.71, 2.21 (C o /day), respectively, and ϕ v is 0.82 at station k of month m. The adopted procedure for obtaining the ϕ z is as follows:

8) Use the regression equation obtained in
Step 7 with the observed value ϕ v (e.g., 0.82) to determine ϕ z . In this case, 0.88 (as shown in Figure 2b).
This procedure must be done for all T x and T N of each k and m. For the WS, the procedure is the same except for Step 5, converting x so it is uniformly distributed to get the WS anomalies. For example, let us assume that α 0 , α 1 , β 0 , and β 1 are 4.04, 3.22, 0.62, 0.71, respectively, and the ϕ v is 0.54. The corresponding ϕ z will be 0.56, as shown in Figure 2c. This procedure allows us to preserve the auto-correlation of T x , T N , and the WS.
The final step of the preprocessing stage is to construct the positive-definite correlation matrix of the variable anomalies ω V , as done for precipitation states using ISDC. Building the ω V allows us to preserve all the spatial, temporal, and cross correlations between the variables.

Postprocessing Stage: Variable Generation
After building all matrices and estimating the parameters in the preprocessing stage, the four weather variables can be generated for any time length of interest, as follows:

1)
Use Equation (13) with ω s to generate Y anomalies that denote S. The length of Y denotes the day number of the generated time series. In this case, the user can generate any length (independently of the historic observation length). (1) and (2) with the estimated FTMC parameters (κ 0 and κ 1 ) to identify the dry and wet day occurrences. 3) Apply Equation (13) with ω v to generate Z anomalies that denote the variable values. Of course, the length of Z must be the same of Y. 4) Obtain P for the wet days using Equations (3) and (4) with the estimated parameters µ p , σ p , and ι p . This will make sure the generated P have similar observed statistics. 5) Apply the AR (1) with coefficients φ z for T x , T N and the WS anomalies to consider the auto-correlation magnitude for the variables. 6) Re-standardize the anomalies for T X and T N , as follows:

2) Use Equations
where Z std represents the standardized anomalies Z of Step 5, and µ(Z) and σ(Z) are the mean and standard deviation of Z, respectively. 7) Apply Z std in Equations (6)-(9) with the estimated parameters µ X0 , µ X1 , µ N0 , µ N1 , σ X0 , σ X1 , σ N0 , and σ N1 to calculate T x and T N . 8) Convert the anomalies Z of the WS to be uniformly distributed between 0 and 1 Z U , as follows: 9) Apply Z U in Equations (11) and (12) with the estimated parameters α 0 , α 1 , β 0 , and β 1 to calculate the WS. Steps 3 to 9 enable us to preserve the observation statistics of T x , T N and the WS and the spatial, temporal, and cross correlations with consideration of the precipitation states effects through decomposing their distribution functions. 10) Repeat Steps 1 to 9 for all months m.

Case Study and Data
The developed PR-WG was tested in the Diyala River basin, which is a transboundary basin between Iran and Iraq with a total stream length of 217 km and basin area of 16,760 km 2 above Derbendikhan Dam, as shown in Figure 3. In previous work, Waheed et al., [5] implemented the daily weather data (e.g., precipitation, maximum and minimum temperature, and wind speed) in this basin at a 0.5 • spatial resolution from 1948 to 2006 and explained the implementation procedure. In this follow up study, the historic forcing data were used to validate the proposed PR-WG and test its performance. The reader should refer to the original paper for more details about the data implementation and their validation in the basin.

Model Performance Evaluation
The PR-WG was tested for its daily performance with historic observations for the period between 1948 and 2006, e.g., 58 years, in a grid composed of 24 grid-cells. The Nash-Sutcliffe coefficient efficiency (NSCE; [56]) and the Kling-Gupta efficiency (KGE; [5]) were used to evaluate the PR-WG's ability to produce spatially correlated precipitation states S similar to the observed values, as follows: where Sim and Obs are the simulations (e.g., the PR-WG outcomes) and the observations of the time index t, respectively; μobs, σobs, μsim, and σsim are the mean and standard deviation of the observations and simulations (e.g., the PR-WG outcomes), respectively, and ρ is the correlation coefficient between the observations and simulations. Figure 4 shows the comparison of 10 separate daily simulations each of the same observation length (e.g., 58 years) of PR-WG monthly dry and wet occurrences in gray color dots. The average of these 10 simulations is calculated and plotted in blue dots. The A 1-1 line is also plotted to ease the comparison. It is evident that the model works well to produce the number of dry and wet days, with

Model Performance Evaluation
The PR-WG was tested for its daily performance with historic observations for the period between 1948 and 2006, e.g., 58 years, in a grid composed of 24 grid-cells. The Nash-Sutcliffe coefficient efficiency (NSCE; [56]) and the Kling-Gupta efficiency (KGE; [5]) were used to evaluate the PR-WG's ability to produce spatially correlated precipitation states S similar to the observed values, as follows: where Sim and Obs are the simulations (e.g., the PR-WG outcomes) and the observations of the time index t, respectively; µ obs , σ obs , µ sim, and σ sim are the mean and standard deviation of the observations and simulations (e.g., the PR-WG outcomes), respectively, and ρ is the correlation coefficient between the observations and simulations. Figure 4 shows the comparison of 10 separate daily simulations each of the same observation length (e.g., 58 years) of PR-WG monthly dry and wet occurrences in gray color dots. The average of these 10 simulations is calculated and plotted in blue dots. The A 1-1 line is also plotted to ease the comparison. It is evident that the model works well to produce the number of dry and wet days, with KGE and NSCE values of 0.97. This result demonstrates the ability of the FTMC to produce the precipitation states well [21,22]. Figure 5 shows a comparison of pairwise correlations of the daily precipitation states calculated for each calendar month. It can be seen that the correlations are captured well by the PR-WG. The overall KGE and NSCE values are 0.98 and 0.99, respectively.  Figure 6 demonstrates the PR-WG performance to produce the statistical parameters (e.g., mean, standard deviation and skewness) of the four weather variables. The comparisons were done on a daily basis at each month for the 24 grid-cells. A daily time step series of 1000 years was generated to reduce the sampling bias and uncertainty in the simulations. However, the daily means of all variables and the standard deviations for TX, TN and WS were perfectly produced by the model (KGE ≈1), while σp, and γp are reasonably preserved (KGE = 0.96 and 0.86; NSCE = 0.98 and 0.93). The slight discrepancies are due to the stochastic nature of the process [57]. Figure 7 shows the daily median values with 0.05 and 0.95 quantiles of the daily values in the bounded areas, and the inverse cumulative distribution function (CDF -1 ) of the observed and simulated weather variables for grid-cell number 9, which is located in the basin heart (see Figure 3). It is seen that PR-WG well preserves the daily medians for all months. Moreover, the quantile daily estimates show good agreement with the observation quantiles, proving the model's ability to capture the maximum and minimum daily weather values. It is also noticeable from the inverse CDF the observed and simulated weather values are very close, evincing the validity of the selected distribution types. Furthermore, the simulated daily values of quantile 1 exceed the observation values which demonstrates the model's ability to produce values beyond the observation ranges.   Figure 6 demonstrates the PR-WG performance to produce the statistical parameters (e.g., mean, standard deviation and skewness) of the four weather variables. The comparisons were done on a daily basis at each month for the 24 grid-cells. A daily time step series of 1000 years was generated to reduce the sampling bias and uncertainty in the simulations. However, the daily means of all variables and the standard deviations for TX, TN and WS were perfectly produced by the model (KGE ≈1), while σp, and γp are reasonably preserved (KGE = 0.96 and 0.86; NSCE = 0.98 and 0.93). The slight discrepancies are due to the stochastic nature of the process [57]. Figure 7 shows the daily median values with 0.05 and 0.95 quantiles of the daily values in the bounded areas, and the inverse cumulative distribution function (CDF -1 ) of the observed and simulated weather variables for grid-cell number 9, which is located in the basin heart (see Figure 3). It is seen that PR-WG well preserves the daily medians for all months. Moreover, the quantile daily estimates show good agreement with the observation quantiles, proving the model's ability to capture the maximum and minimum daily weather values. It is also noticeable from the inverse CDF the observed and simulated weather values are very close, evincing the validity of the selected distribution types. Furthermore, the simulated daily values of quantile 1 exceed the observation values which demonstrates the model's ability to produce values beyond the observation ranges.   Figure 6 demonstrates the PR-WG performance to produce the statistical parameters (e.g., mean, standard deviation and skewness) of the four weather variables. The comparisons were done on a daily basis at each month for the 24 grid-cells. A daily time step series of 1000 years was generated to reduce the sampling bias and uncertainty in the simulations. However, the daily means of all variables and the standard deviations for T x , T N and WS were perfectly produced by the model (KGE ≈1), while σ p , and γ p are reasonably preserved (KGE = 0.96 and 0.86; NSCE = 0.98 and 0.93). The slight discrepancies are due to the stochastic nature of the process [57].  Figure 8 shows the spatial and cross correlation coefficient matrices of the observations and simulations for one month (e.g., m=1), while Figure 9 shows the spatial and cross correlation comparison for all variables for each m calculated at daily time steps. The number of columns of the V matrix (see Section 4.1) are 4 × 24 = 96. Therefore, the V dimensions are 96 × 96, in which the values are from 1 to 24 for P, 25 to 48 for TX, 49 to 72 for TN, and 73 to 96 for the WS. It can be observed from Figure 8 that the observed correlation among the variables varies greatly across them. P and the WS are slightly less spatially correlated as compared with TX and TN. These facts are in line with Srivastav and Simonovic [32] and Verdin et al. [15]. It is also noticeable from Figure 8 and Figure 9 that the model preserves the spatial and cross correlation well among the variables. The overall KGE and NSCE values are 0.96 and 0.97, respectively.   Figure 7 shows the daily median values with 0.05 and 0.95 quantiles of the daily values in the bounded areas, and the inverse cumulative distribution function (CDF −1 ) of the observed and simulated weather variables for grid-cell number 9, which is located in the basin heart (see Figure 3). It is seen that PR-WG well preserves the daily medians for all months. Moreover, the quantile daily estimates show good agreement with the observation quantiles, proving the model's ability to capture the maximum and minimum daily weather values. It is also noticeable from the inverse CDF the observed and simulated weather values are very close, evincing the validity of the selected distribution types. Furthermore, the simulated daily values of quantile 1 exceed the observation values which demonstrates the model's ability to produce values beyond the observation ranges. are from 1 to 24 for P, 25 to 48 for TX, 49 to 72 for TN, and 73 to 96 for the WS. It can be observed from Figure 8 that the observed correlation among the variables varies greatly across them. P and the WS are slightly less spatially correlated as compared with TX and TN. These facts are in line with Srivastav and Simonovic [32] and Verdin et al. [15]. It is also noticeable from Figure 8 and Figure 9 that the model preserves the spatial and cross correlation well among the variables. The overall KGE and NSCE values are 0.96 and 0.97, respectively.   Figure 8 shows the spatial and cross correlation coefficient matrices of the observations and simulations for one month (e.g., m=1), while Figure 9 shows the spatial and cross correlation comparison for all variables for each m calculated at daily time steps. The number of columns of the V matrix (see Section 4.1) are 4 × 24 = 96. Therefore, the V dimensions are 96 × 96, in which the values are from 1 to 24 for P, 25 to 48 for T x , 49 to 72 for T N , and 73 to 96 for the WS. It can be observed from Figure 8 that the observed correlation among the variables varies greatly across them. P and the WS are slightly less spatially correlated as compared with T x and T N . These facts are in line with Srivastav and Simonovic [32] and Verdin et al. [15]. It is also noticeable from Figures 8 and 9 that the model preserves the spatial and cross correlation well among the variables. The overall KGE and NSCE values are 0.96 and 0.97, respectively.        Figure 10 demonstrates the PR-WG capability to preserve the Lag-1 day auto-correlations of T x , T N and the WS. It is noticeable that the values differ from month to month, they are less for the WS comparing to T x and T N . However, the PR-WG captures these monthly variations very well regardless of their magnitudes with the overall KGE and NSCE values of 0.97 and 0.98, respectively.
The results presented here glimpse the model capability to preserve the statistical properties of the observations to synthesize the future scenarios. The proposed model demonstrated the Wilks technique ability to generate anomalies similarly to the observations. It is also seen that the hybrid structure of the AR and Wilks technique leads to generate data that preserve the temporal correlation beside the spatial and cross correlations.  Figure 10 demonstrates the PR-WG capability to preserve the Lag-1 day auto-correlations of TX, TN and the WS. It is noticeable that the values differ from month to month, they are less for the WS comparing to TX and TN. However, the PR-WG captures these monthly variations very well regardless of their magnitudes with the overall KGE and NSCE values of 0.97 and 0.98, respectively. The results presented here glimpse the model capability to preserve the statistical properties of the observations to synthesize the future scenarios. The proposed model demonstrated the Wilks technique ability to generate anomalies similarly to the observations. It is also seen that the hybrid The key advantage of PR-WG is that it is built to be a general model through studying the observation anomalies and mimicking them. Therefore, the model is anticipated to work well in different climate zones and topographies regardless of the data spatial and temporal scale. The model framework is flexible enough for locations observe short-term and long-term variations. Moreover, the user can reduce the cycle data length to meet their scope. e.g., they can use a data window of two weeks (or a week) instead of the monthly window that was used in this study. The computational expensive of implemented the pre-processing stage has to carefully examined.

Model Validation
In some cases, the proposed SWG produces negative daily values for precipitation. [58] indicated that the SN is not suitable when the skewness is greater than 4.5. However, in the study area, values of the skewness have not exceeded 4.5 (see Figure 6c), therefore the SN is applicable. The negatives of the daily values were checked and found to be less than 3% of the whole 1000-year time series in the 24 grid-cells. The suggestion of [32] to round the negative values to zero was considered, but it would affect the number of wet and dry calculations and the statistical parameters of precipitation. Instead, the negatives were rounded to 0.1 mm/day, which is assumed to be the minimum precipitation amount (see Section 3.1). This correction approach for negative values illustrates the slight differences in the simulated σ p and γ p (see Figure 5b,c). The user could apply another distribution function in cases where the SN is not applicable such as mixed-exponential [59,60], log-normal, gamma, etc. The key advantage of PR-WG is its flexibility in adopting any distribution of interest, such as these.
The second validation was done by checking if T N is greater than T x and was found to be less than 1% of the whole 1000-year time series in the 24 grid-cells. [41] suggested to force T x to be greater than T N through setting T N equal to T x minus 1. This procedure will affect the auto-correlation of the T N . Instead, the Chen et al., [57] approach was applied as follows, if T x < T N , T X(k,t,m) = T N(k,t,m) + (µ µx(k,m) − µ µN(k,m) ) + σ 2 X(k,m) − σ 2 N(k,m) × z std(k,t,m) f or the spatial, cross, and temporal correlations. Figure 11 shows the daily performances of the WG and EC to for the spatial and cross correlations, while Figure 12 shows the temporal correlations for the sites and month. It is seen that the EC model works well in preserving the spatial, cross, and temporal correlation; the PR-WG is slightly superior to it. However, the KGE and NSCE for the spatial and cross correlations are 0.92 and 0.93, and for the temporal 0.95, and 0.96. It also notable that the WG model poorly preserves the spatial and cross correlations but has good ability to preserve the temporal correlation. This is because the model accounts for the temporal correlation only, where the simulated data were generated independently for all variables and sites which leads to poor spatial and cross correlation accuracy. The KGE and NSCE for the spatial and cross correlations are −0.29 and −8.3, and for the temporal 0.88, and 0.89. Although the EC approach works well in general, the only drawback is that its simulation time period must be identical to the historic observation, which prevents its usage in areas with limited data availability. This is because the post processing stage of the EC model employs a re-ranking technique that extracts the ranked variables directly from the historic observations. Therefore, the model length can only be the same as the historic observations, leading to less flexibility for future scenarios, especially in data scarce regions. The PR-WG has the advantage of producing the simulation length of interest, making it useful in areas with limited data availability besides maintaining the statistical characteristics. and −8.3, and for the temporal 0.88, and 0.89. Although the EC approach works well in general, the only drawback is that its simulation time period must be identical to the historic observation, which prevents its usage in areas with limited data availability. This is because the post processing stage of the EC model employs a re-ranking technique that extracts the ranked variables directly from the historic observations. Therefore, the model length can only be the same as the historic observations, leading to less flexibility for future scenarios, especially in data scarce regions. The PR-WG has the advantage of producing the simulation length of interest, making it useful in areas with limited data availability besides maintaining the statistical characteristics. Figure 11. Performance evaluation for empirical copula (EC) and weather generator (WG) models for preserving the spatial and cross correlations of the weather variables for each month.

Simulation of the Future Forecasting Scenarios
The goal of the PR-WG is to be used later for climate variation assessments. The advantage of the model, besides the ability to preserve the statistical characteristics, is its flexibility to alter them to produce a wide range of different scenarios. However, defining the future scenario ranges to test a water resources system's performance in terms of the climate stress is a difficult task and dependent on many factors, including expert opinions [4].
These future scenarios will be applied in the Diyala River basin to discover the vulnerability of the Derbendikhan Dam and its reservoir. Moreover, different adaptation strategies will be suggested in order to test their capabilities to improve the system performance. Since the model is implemented on a stochastic basis, the future trend insights will be obtained from analyzing the GCM models. Then, this can be fed into the PR-WR to mimic the future trend as well as the statistical properties. For instance, multiplicative factors for the precipitation mean will be applied starting from a 0% change in the historical precipitation and annual linearly increasing (or decreasing) up to the specified value in the final period (e.g., +30% of the historical value). Forms other than the linear change can also be applied to synthesize the future forecast data. prevents its usage in areas with limited data availability. This is because the post processing stage of the EC model employs a re-ranking technique that extracts the ranked variables directly from the historic observations. Therefore, the model length can only be the same as the historic observations, leading to less flexibility for future scenarios, especially in data scarce regions. The PR-WG has the advantage of producing the simulation length of interest, making it useful in areas with limited data availability besides maintaining the statistical characteristics. Figure 11. Performance evaluation for empirical copula (EC) and weather generator (WG) models for preserving the spatial and cross correlations of the weather variables for each month.

Simulation of the Future Forecasting Scenarios
The goal of the PR-WG is to be used later for climate variation assessments. The advantage of the model, besides the ability to preserve the statistical characteristics, is its flexibility to alter them to produce a wide range of different scenarios. However, defining the future scenario ranges to test a water resources system's performance in terms of the climate stress is a difficult task and dependent on many factors, including expert opinions [4].
These future scenarios will be applied in the Diyala River basin to discover the vulnerability of the Derbendikhan Dam and its reservoir. Moreover, different adaptation strategies will be suggested in order to test their capabilities to improve the system performance. Since the model is implemented on a stochastic basis, the future trend insights will be obtained from analyzing the GCM models. Then, this can be fed into the PR-WR to mimic the future trend as well as the statistical properties. For instance, multiplicative factors for the precipitation mean will be applied starting from a 0% change in the historical precipitation and annual linearly increasing (or decreasing) up to the specified value in the final period (e.g., +30% of the historical value). Forms other than the linear change can also be applied to synthesize the future forecast data.

Simulation of the Future Forecasting Scenarios
The goal of the PR-WG is to be used later for climate variation assessments. The advantage of the model, besides the ability to preserve the statistical characteristics, is its flexibility to alter them to produce a wide range of different scenarios. However, defining the future scenario ranges to test a water resources system's performance in terms of the climate stress is a difficult task and dependent on many factors, including expert opinions [4].
These future scenarios will be applied in the Diyala River basin to discover the vulnerability of the Derbendikhan Dam and its reservoir. Moreover, different adaptation strategies will be suggested in order to test their capabilities to improve the system performance. Since the model is implemented on a stochastic basis, the future trend insights will be obtained from analyzing the GCM models. Then, this can be fed into the PR-WR to mimic the future trend as well as the statistical properties. For instance, multiplicative factors for the precipitation mean will be applied starting from a 0% change in the historical precipitation and annual linearly increasing (or decreasing) up to the specified value in the final period (e.g., +30% of the historical value). Forms other than the linear change can also be applied to synthesize the future forecast data.

Conclusions
It was shown that a PR-WG accurately preserves the statistical properties (mean, standard deviation, and skewness coefficient) of the weather variables (overall KGE and NSCE test values were 0.98). The PR-WG also preserves the spatial, temporal, and cross correlations among the weather variables. While other SWGs may have more features, the one developed in this study enables a bottom-up vulnerability assessment study to be implemented in areas with limited data availability.
The PR-WG effectively estimates the dry and wet day occurrences using a FTMC with overall KGE and NSCE values of 0.97, a result that is in line with those in [21,22]. The results also demonstrate the effectiveness of Wilks' technique to produce spatially correlated precipitation states (KGE of 0.98; NSCE of 0.99) and spatially and cross correlated weather variables (KGE of 0.96; NSCE of 0.97), as well as temporally correlated variables (KGE of 0.97; NSCE of 0.98). The model is also capable of preserving the maximum and minimum daily weather values as well as producing values beyond the observed ranges. Furthermore, the PR-WG outperforms the EC and WG models in preserving the spatial, cross, and temporal correlations in the meteorological stations.
While the PR-WG was validated in the Diyala River basin, it should be effective and applicable in other places and with other weather variables, such as solar radiation. The advantages of PR-WG are its flexibility to select any distribution function for each weather variable, ability to simulate any number of years within or beyond the historic observation length, capability to generate values outside the observation range, and its ability to produce synthetic scenarios through the alteration of the weather variable parameters for the study of climate change's impacts. The PR-WG is easy to construct and understand with little computational intensity to build the spatial and cross correlation matrices of the anomalies. Increasing computational power will facilitate the work.