Comparative Study of Different Stochastic Weather Generators for Long-Term Climate Data Simulation

Climate is one of the single most important factors affecting watershed ecosystems and water resources. The effect of climate variability and change has been studied extensively in some places; in many places, however, assessments are hampered by limited availability of long-term continuous climate data. Weather generators provide a means of synthesizing long-term climate data that can then be used in natural resource assessments. Given their potential, there is a need to evaluate the performance of the generators. In this study, three commonly used weather generators, CLImate GENerator (CLIGEN), Long Ashton Research Station Weather Generator (LARS-WG), and Weather Generators (WeaGETS), were compared with regard to their ability to capture the essential statistical characteristics of observed data (distribution, occurrence of wet and dry spells, number of snow days, growing season temperatures, and growing degree days). The study was based on observed 1966–2015 weather station data from the Western Lake Erie Basin (WLEB), from which 50 different realizations were generated, each spanning 50 years. Both CLIGEN and LARS-WG performed fairly well with respect to representing the statistical characteristics of observed precipitation and minimum and maximum temperatures, although CLIGEN tended to overestimate values at the extremes. CLIGEN also overestimated dry sequences by 18%–30% and snow-day counts by 12%–19% when considered over the entire WLEB. It was, however, well able to simulate parameters specific to crop growth such as growing degree days and had an added advantage over the other generators in that it simulates a larger number of weather variables. LARS-WG overestimated wet sequence counts across the basin by 15%–38%. In addition, the optimal growth period simulated by LARS-WG exceeded that obtained from observed data by 16%–29% basin-wide.
Preliminary results with WeaGETS indicated that additional evaluation is needed to better define its parameters. Results provided insights into the suitability of both CLIGEN and LARS-WG for use with water resource applications.


Climate 2017, 5, 26

Introduction
Climate is one of the single most important factors affecting ecosystems and water resources [1]. Any change in climatic variables can result in adverse conditions; for example, the warming of lakes and rivers can lead to phenological shifts, changes in organism abundance and productivity, a prolonged depletion of oxygen in deeper layers, and decreased surface layer nutrient concentrations. Variables such as precipitation, temperatures, and atmospheric carbon dioxide concentration influence various hydrological parameters such as streamflow, surface runoff, and evapotranspiration. A strong relationship between climate variables and components of the hydrologic cycle has been reported in the literature [2]. The effect of climate variability and change has been extensively studied in various sectors including human health [3]; biodiversity [4]; food production [5]; economic growth [6]; and water resources [7][8][9].
In many places, however, assessments are hampered by paucity of data as continuous long-term climate data series are generally required to predict changes in spatial and temporal patterns of various hydrological parameters. This deficiency in climate data at some locations has prompted the use of weather generators to synthesize the climate series for a station for any number of years where data are only partially available or have missing values even within longer term datasets. Weather generators (WGs) are statistical models that generate numerous possible weather variables including precipitation, temperatures, solar radiation, and wind velocity at a daily time step and, ideally, with the same statistical characteristics as those of observed data [10]. Weather generators can also be used to generate weather data at ungauged sites by interpolating parameters from gauged sites that are nearby or have topographical resemblance to the ungauged site [11].
Although the generators were first developed for hydrological application [12,13], they have been widely used to investigate the influence of weather conditions on water resources and agricultural yields [14]. For example, [15] used stochastic weather generators to estimate the natural variability of discharge of the Chute-du-Diable watershed (Quebec, QC, Canada), while [16] coupled a stochastic weather generator with crop simulation models to assess yields and economic returns in regions of the Pampas of Argentina. These generators are different from numeric general circulation models (GCMs) as they are based on small spatial scales, are computationally faster, and, ideally, should produce outputs that have the same distributional properties as observed time series [17]. They can, however, be used to downscale GCM predictions from a monthly time step or grid scale to a daily time step or local scale [18][19][20].
There are numerous WGs currently being used, including Weather Generator, WGEN [21,22]; USCLIMATE [23]; Climate Generator, CLIMGEN [24]; CLImate GENerator, CLIGEN [25]; Stochastic Weather Generators, WeaGETS [26]; and the Long Ashton Research Station Weather Generator, LARS-WG [27]. Because of the ability to downscale weather variables, weather generators are widely used in climate change studies, in which case parameters can be altered to better represent changes in precipitation and temperature [14,28]. In this study, three commonly used generators (CLIGEN, LARS-WG, and WeaGETS) are compared with regard to their ability to capture essential statistical characteristics of observed data. These generators are widely used, likely because of their ability to provide long-term climate data even in areas without long observed data records, and the combination of distributions (skewed, semi-empirical, and mixed exponential) and Markov chain approaches (first, second, and third order) that they offer. Moreover, these generators are relatively easy to run and their outputs are easy to post-process and interpret. These selected generators have been used in different climate change impact studies, including studies with CLIGEN [29][30][31]; LARS-WG [32][33][34]; and WeaGETS [35][36][37]. In addition to other measures as detailed in the methodology, differences between the means of simulated values and observed data were evaluated using Cohen's Effect Size [38], which provides a more robust measure than conventional hypothesis testing where datasets are large [39][40][41][42].
The study area for this work is the Western Lake Erie Basin (WLEB) in the U.S. Great Lakes region. A number of climate-related studies have been conducted in this basin including hydrologic and water quality modeling studies [43,44]; evaluation of conservation practices [43][45][46][47][48]; and evaluation of phosphorus responses in Lake Michigan under different GCMs [49]. Generated weather data offer potential for use (and are often used) in such modeling and related applications. Thus, there is a need to evaluate weather generator performance in order to determine their suitability for use, particularly in hydrologic, water resources, and agricultural applications. In general, generated weather data should go beyond matching values with observed data to mirroring the statistical properties of the observed data [14,26]. With climate change concerns and the use of generated data in future-cast studies,

Weather variables simulated
CLIGEN: Maximum temperature, minimum temperature, dew point temperature, probability of today's precipitation, amount of precipitation, time to peak, radiation, wind direction and velocity
LARS-WG: Maximum temperature, minimum temperature, precipitation, and solar radiation
WeaGETS: Maximum temperature, minimum temperature, and precipitation

CLIGEN (CLImate GENerator)
CLIGEN was developed as a component of the Water Erosion Prediction Project (WEPP) model to generate climate data to simulate erosion processes and sediment delivery at the hillslope profile and small watershed scale [25,50,51]. A unique aspect of CLIGEN is its capacity to simulate intensity patterns within a storm including storm duration, peak rain intensity, and time to peak [25]. Basic climate variables required for any water resources study, including precipitation, temperatures, solar radiation, and wind velocity, are simulated by CLIGEN on a daily basis using monthly statistics (means, standard deviations, skew coefficients, etc.) of the observed weather parameters. RAND, a pseudo random number generator in CLIGEN, produces a sequence of random numbers that are statistically independent of each other using a mathematical formula or pre-calculated list.
CLIGEN needs a standard parameter (*.par) file to generate synthetic data of unlimited length. The input required to make the PAR file is raw weather data which should contain some minimum information about precipitation, and maximum and minimum temperatures. Other parameters may need to be interpolated from nearby stations if data are not available at the location of interest.
Usually, the raw data are obtained in the form of a *.dly file from NOAA. The information in *.dly is translated to a *.DAT file which later can be transformed to a *.par file using python scripting (DAT2PAR.exe program). The basic information contained in the PAR file includes mean, standard deviation, and skewness for precipitation on a wet day; probability of wet day occurrence provided the previous day is wet or dry; average and standard deviation for maximum and minimum daily air temperatures; mean daily solar radiation and its standard deviation; dew point temperature; and time to peak rainfall intensity. CLIGEN is able to handle missing data to some extent, although the PAR file will not be generated if the input dataset has more than two months of continuous missing data.
In CLIGEN, temperatures are assumed to be normally distributed. To simplify the mathematics behind the simulation process, CLIGEN produces the maximum, minimum, and dew point temperatures with an assumption that they are independent of each other, although there are checks in place to ensure that the values obtained are reasonable [52]. For example, there are checks in place within CLIGEN to ensure that simulated daily minimum temperature does not exceed simulated daily maximum temperature, which can happen if the underlying assumption of temperatures being normally distributed for each month is violated [52,53]. Precipitation is calculated using a joint probability distribution. Fourier series interpolation/disaggregation is considered to give more reliable results than linear interpolation, or interpolation performed to preserve monthly averages for daily simulation from monthly data.
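The PAR file described above stores the probability of a wet day conditioned on whether the previous day was wet or dry. As an illustration of how such conditional probabilities can drive a daily occurrence sequence, the following is a minimal Python sketch of a generic two-state, first-order Markov chain. It is not CLIGEN's own code: the function name, the probability values, and the choice to start on a dry day are all assumptions made for the example.

```python
import random


def simulate_wet_dry(n_days, p_wet_given_wet, p_wet_given_dry, seed=0):
    """Return a list of booleans (True = wet day) from a two-state,
    first-order Markov chain. Hypothetical helper, not part of CLIGEN."""
    rng = random.Random(seed)
    wet = False  # assumption: the sequence starts from a dry day
    sequence = []
    for _ in range(n_days):
        # Today's wet probability depends only on yesterday's state.
        p_wet = p_wet_given_wet if wet else p_wet_given_dry
        wet = rng.random() < p_wet
        sequence.append(wet)
    return sequence


# Hypothetical monthly probabilities, chosen for illustration only.
days = simulate_wet_dry(31, p_wet_given_wet=0.45, p_wet_given_dry=0.25, seed=42)
wet_fraction = sum(days) / len(days)
```

Fixing the seed makes a realization repeatable, while changing it yields a different sequence with the same underlying probabilities.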

LARS-WG (Long-Ashton Research Station Weather Generator)
LARS-WG is a license-protected weather generator built on a semi-empirical distribution, which uses 21 different parameters to denote interval bounds and count the number of events within each interval, and from which precipitation occurrence is predicted and amounts are quantified [54]. Weather data are generated in three distinct steps. The first step is model calibration, where the observed dataset is analyzed statistically. In the second stage, the synthetic data and observed data are checked for any statistical difference. Lastly, using the observed data, a parameter file is created using monthly statistics as determined from observed data. For LARS-WG, as little as a single year of data is sufficient to generate the synthetic climate data. However, the use of at least 20-30 years of data is recommended [55] to better represent the climate for the site in question. For extreme events, a longer observed data record may help capture less frequent climate events. The LARS-WG parameter file contains information on semi-empirical distributions for the length of dry and wet series, precipitation, minimum and maximum temperatures and radiation calculated for dry and wet days, and correlation and auto correlation coefficients. Missing data in the observed data set may alter the statistics but the long-term climate series will still be generated. The generator outputs the results of statistical tests comparing the observed and generated data. The random seed that makes LARS-WG stochastic has a default value of 541 but can be set to any number.
The precipitation threshold to define if a day is dry or wet is 0.0 mm, which differs from CLIGEN and WeaGETS where the threshold is 0.1 mm. Daily means and standard deviations are conditioned on the basis of whether a day is dry or wet, in order to generate daily maximum and minimum air temperatures. Residuals are computed following a normal distribution, whereas daily maximum and minimum air temperatures are generated based on a finite Fourier series of order 3. The pre-set auto-correlation is 0.6 for maximum and minimum temperature, which means that there is at minimum 60% temporal autocorrelation for maximum and minimum air temperatures, analyzed from the observed residuals obtained by removing the mean value from the observed dataset. Similar to CLIGEN, a check is made during the simulation process to preclude a situation where simulated minimum temperatures are higher than maximum temperatures.

WeaGETS (Weather Generator)
WeaGETS is a flexible, stochastic, MATLAB-based weather generator which simulates daily precipitation depth, and maximum and minimum daily air temperatures in a sequence of unlimited length. The uniqueness of WeaGETS is that it can use first, second, or third order Markov chain models to generate precipitation occurrence, and four distributions (exponential, gamma, skewed normal and mixed exponential) to simulate precipitation quantity. This gives a total of 12 submodels that can be used to simulate weather data within WeaGETS. In addition, there is an option for smoothing of precipitation parameters and low frequency correction. Fourier harmonics [21] can be used to smooth the precipitation generating parameters. Conditional and unconditional schemes are available to simulate the maximum and minimum daily air temperatures. The underestimated variability at monthly and inter-annual steps can be corrected using a spectral correction approach [56]. The transitional probabilities required to account for precipitation occurrence are computed biweekly. There is a provision to smooth the variations lying within a 2-week period on a daily basis so as to account for true yearly distributions of the transitional probabilities associated with precipitation occurrence. There are four different orders of Fourier harmonics that can be used within the smoothing process.
The input data required for weather generation in WeaGETS consist of daily precipitation, and maximum and minimum temperatures. The generator does not account for 29 February, and any significant precipitation event occurring on 29 February is redistributed equally between 28 February and 1 March. The temperatures for the leap day (29 February) are simply removed [57]. The same constraints were applied in processing input data for this study. WeaGETS is able to handle missing values in the observed data set. There is not much information on random seed generation within WeaGETS, although it is presumed that it uses the control random number generation function within MATLAB to generate random numbers that may be repeatable or different.
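The leap-day treatment described above can be sketched in Python as follows. This is an illustrative sketch rather than the WeaGETS preprocessing code: the record layout and function name are hypothetical, all precipitation on 29 February is redistributed (the source does not define what counts as a "significant" event), and the input is assumed to be sorted by date with no missing values around the leap day.

```python
from datetime import date


def drop_leap_days(records):
    """Remove 29 February rows from daily records.

    records: list of (date, precip_mm, tmax_c, tmin_c), sorted by date.
    Precipitation on 29 February is split equally between 28 February
    and 1 March; the leap-day temperatures are discarded.
    Hypothetical helper written for illustration only.
    """
    out = []
    carry = 0.0  # half of the leap-day precipitation owed to 1 March
    for day, precip, tmax, tmin in records:
        if day.month == 2 and day.day == 29:
            half = precip / 2.0
            if out:
                # Add the first half to the previous day (28 February).
                prev_day, prev_p, prev_tmax, prev_tmin = out[-1]
                out[-1] = (prev_day, prev_p + half, prev_tmax, prev_tmin)
            carry = half
            continue
        out.append((day, precip + carry, tmax, tmin))
        carry = 0.0
    return out


sample = [
    (date(2016, 2, 28), 1.0, 5.0, -2.0),
    (date(2016, 2, 29), 4.0, 6.0, -1.0),
    (date(2016, 3, 1), 0.0, 7.0, 0.0),
]
cleaned = drop_leap_days(sample)
```

In this sketch the total precipitation is preserved while each year is reduced to 365 days.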

Weather Generators (WGs) Configuration
For each of the selected stations, all three weather generators (CLIGEN, LARS-WG, and WeaGETS) were configured in their default state to simulate precipitation and minimum and maximum temperatures using the observed data as described. WeaGETS was configured considering a combination of three different orders of Markov models with two of the four available distributions (skewed normal and mixed exponential), no smoothing, and using a conditional scheme to simulate temperatures. This yielded six different submodels, herein termed WeaGETS-01 through WeaGETS-06. Submodels WeaGETS-01 through WeaGETS-03 represent simulation outputs with a skewed normal distribution and first, second, and third order Markov models, respectively, while WeaGETS-04 through WeaGETS-06 represent simulation based on mixed exponential distribution and first, second, and third order Markov models, respectively. Observed data for the period 1966-2015 were used to generate input for the weather generators, for example, the parameter (*.par) file for CLIGEN; the Weather Generator (*.wgx) file in LARS-WG; and input for the MATLAB code for WeaGETS. For each generator and at each station, output time series of daily climate variables comprising 50 realizations, each 50 years in length, were generated for use in this study. Different random number seeds were used to generate each realization; thus, each set of 50-year outputs was expected to be different from the others, representing a range of variability in the generated values. The random seed number can be fed externally to the CLIGEN and LARS-WG weather generators, while WeaGETS creates its own random seed number for each realization.
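The submodel labeling described above can be reproduced with a short Python sketch. It only enumerates the combinations and labels as stated in the text; the dictionary layout and variable names are hypothetical and do not correspond to any WeaGETS input format.

```python
from itertools import product

# Distributions and Markov chain orders as listed in the text.
distributions = ["skewed normal", "mixed exponential"]
markov_orders = [1, 2, 3]

# WeaGETS-01..03: skewed normal with orders 1-3;
# WeaGETS-04..06: mixed exponential with orders 1-3.
submodels = {
    f"WeaGETS-{i:02d}": {"distribution": dist, "markov_order": order}
    for i, (dist, order) in enumerate(product(distributions, markov_orders), start=1)
}

for label, config in submodels.items():
    print(label, config["distribution"], config["markov_order"])
```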

Study Site Description
The Western Lake Erie Basin (WLEB, Figure 1) has an areal extent of 29,137 km² (7.2 million acres) and its tributaries run through three states: Michigan, Indiana, and Ohio [58]. Lake Erie is the shallowest of the Great Lakes with the depth averaging 7.3 m (24 feet). The WLEB receives an average annual precipitation ranging from 838 to 940 mm [59]. Average monthly temperatures in the basin range from −12.5 °C (9.2 °F) in February to 27 °C (80 °F) in July based on the period of record from 1887 to 2016 [60]. The lowest monthly precipitation depth ranges between 49 and 50 mm/month, and is typically recorded in February for most of the WLEB [60]. May, June, and July generally receive the greatest monthly precipitation, ranging between 97 and 110 mm/month (3.8-4.3 in/month) across the basin [60]. The WLEB has a wide spread of National Climatic Data Center (NCDC) climate stations with weather records dating back from the 1890s to the present. Several of these stations have consistent records for daily precipitation depth, and maximum and minimum daily air temperatures. For this study, a set of eight stations (Figure 1) was selected for analysis, specifically those with more than 95% coverage in terms of data availability and continuity, and considering the need for spatial coverage across the entire basin including Indiana, Ohio, and Michigan. Of the selected stations, Fort Wayne had the most consistent dataset (100%) while Sandusky, Ohio had the least consistent dataset (98.4%). Missing data generally were for short periods spanning 1 to 10 days, although in some instances (1-2 years) there were 50-60 days of continuous missing values. This range of missing values was, however, within the limits of all three weather generators; thus, these data were deemed suitable for analysis.

Preliminary Analysis
The climate data for each of the stations were selected for the period 1966-2015 based on a preliminary analysis on long-term precipitation and temperature data at the Fort Wayne International Airport Station (Fort Wayne, Figure 1). This station has a continuous and relatively consistent climate data record spanning from 1 August 1939 to the present day. Based on this preliminary analysis (Figure 2), upward trends were observed in annual precipitation over the entire dataset and for 1966-2015, although trends were not significant (p-value > 0.05). While a decrease in precipitation was observed in the period between 1958 and 1965, the upward trend observed between 1966 and 2015 was fairly consistent and reflective of current trends in climate in the region [60]. Thus, this period was selected for further analysis. For this study, there was no imputation done for missing data so as to minimize uncertainties that may arise in generated data due to imputed data [61].
Once the time period had been determined, further analysis was conducted on the selected eight stations to check for similarities or differences among the stations and, thus, further refine the stations to be included in the weather generator comparisons. The stations that were distinctly different from each other in terms of climatic patterns and basic characteristics of precipitation and minimum and maximum temperatures were selected for use in evaluating the weather generators. Their spatial location was considered as a secondary factor so as to have suitable coverage of the WLEB, thus providing suitable representation of climate in the basin.
Based on the analysis, maximum precipitation patterns and characteristics of precipitation and minimum and maximum temperatures varied among some of the stations. For example, the mean daily precipitation values ranged from 2.4 mm at Bowling Green and Sandusky to 2.7 mm at Norwalk, Bucyrus, and Lima WWTP; the daily average maximum temperature varied from 14.4 °C at Sandusky to 15.9 °C at Lima WWTP; the daily average minimum temperature ranged from 3.3 °C at Adrian to 5.9 °C at Sandusky. The percentage of days without precipitation ranged from 63.2% at Bucyrus to 72.1% at Bowling Green. Data at Norwalk had the highest values of skewness (7.1) and kurtosis (116.3). Maximum precipitation patterns were similar for Adrian, Lima WWTP, Bowling Green, and Fort Wayne, while data at Norwalk, Bucyrus, and Sandusky showed some distinct peaks. Details of this analysis are provided in Appendix A (Table A1 and Figure A1).
Based on the aforementioned criteria, three stations (Adrian, Michigan; Norwalk, Ohio; Fort Wayne, Indiana) were selected for the analysis. Fort Wayne represented the upstream conditions of the basin and its dataset had 100% availability over the study period, and Adrian was the only station in Michigan. Norwalk's characteristics were generally distinct from those at the other two stations, particularly with regard to precipitation. The station is located on the eastern side of the basin, consistent with the criteria for spatial coverage. All three of the stations selected had a consistent observed climate dataset.

Data Analysis
Initial analysis involved plotting the distributions and computing descriptive statistics for the daily precipitation and minimum and maximum air temperature values predicted by each weather generator and comparing the results with the same statistics obtained from the observed data [62]. In addition, the number of days with zero magnitude precipitation was also computed for each generator and for observed data. This information was converted to percentages so as to make it convenient for comparison. Plots of the distributions were also examined in order to obtain a visual comparison of how closely distributions of the generated values matched those of the observed data. Further analysis involved the comparison of variations observed in precipitation, and maximum and minimum temperatures within each realization to those of the observed data. In addition, the range of values from simulated results from each realization for each station was evaluated so as to determine the extent of variability captured by each weather generator when compared to observed data. Differences between the means of the observed data and those from each realization were evaluated using Cohen's Effect Size (Cohen's d), which is preferable to conventional p-values where datasets are large [39][40][41][42]. The Cohen's-d value was calculated as the difference between the two means divided by the standard deviation of the observed data. Cohen's-d values can range from 0 to 1. Values of 0.2, 0.5 and 0.8 represent small, medium, and large effects, respectively [38]. Larger effects indicate larger differences in means between two populations and less overlap between their distributions [63].
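The effect size calculation described above can be sketched in Python as follows. This is a minimal illustration, not the study's code: the function name and sample values are hypothetical, the sample standard deviation is assumed (the text does not state whether the sample or population form was used), and the absolute difference is taken so that the result is non-negative.

```python
import statistics


def cohens_d(simulated, observed):
    """Absolute difference between the two means divided by the
    standard deviation of the observed data, as described in the text.
    Assumption: sample (n - 1) standard deviation."""
    diff = statistics.mean(simulated) - statistics.mean(observed)
    return abs(diff) / statistics.stdev(observed)


# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [2.0, 3.0, 4.0, 5.0, 6.0]
d = cohens_d(simulated, observed)
```

Applied to each of the 50 realizations in turn, a function of this kind would yield one effect size per realization for comparison against the 0.2, 0.5, and 0.8 reference values.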
More detailed analysis involved an examination of extreme events and properties related to crop growth, specifically:
Dry spells: Based on existing literature [64,65], a period was considered a dry spell if there were at least 15 consecutive days in which none of the days had greater than 0.1 mm of rainfall. The 0.1 mm threshold was used in this study for consistency as two of the three weather generators used this threshold to define a dry or wet day as previously described. Dry spells are indicators of drought and at the same time also affect aquatic life and hydropower generation [66,67]. They are also important in estimating irrigation water demand, which also depends on the length of dry spells [68].
Wet spells: Based on [69], a wet spell was considered to occur where there were more than three days with precipitation greater than or equal to 0.1 mm.In a given period, the wet spell was considered to have ended when two continuous dry days were encountered.A record of wet spells is important for water management as it helps in water resource allocation and distribution.Moreover, information on wet spells helps in planning flood control measures and managing sediment transport and deposition in river basins.
Wet day and dry day count: In addition to wet and dry spells, a count of the number of dry and wet days in each month was obtained, with a wet day and a dry day being defined as previously described.The number of dry and wet days within a month provides information that is important for planning seed bed preparation, deciding planting date, and estimating crop water requirements.
Snow days: Based on [70], precipitation occurring on a day with temperature lower than 2 • C is more likely to be snow than rain, thus this temperature was taken as the threshold for snow occurrence for days in which there was precipitation.Snowmelt forms an important component of surface hydrology and can be related to the number of days in which snow occurred [71].Thus, any variation in the number of snow days can give a clear picture about implications of climate change on the hydrology of any watershed where snowmelt forms a component of water budget.
Growing season temperature requirements/period of optimal growth: Temperature plays a critical role in the plant germination and growth.An optimum temperature is required by the plants for the process of photosynthesis; for corn, which is the most important crop in most of the watersheds in Midwest USA, a day with a temperature ranging between 20 and 25 • C is considered to be ideal for growing corn and supporting its growth [72].Thus, the period of optimal growth is defined as the period of the year for which daily mean temperatures range between 20 and 25 • C.However, a temperature of 10 • C is considered to be a threshold below which corn will not grow or will grow very slowly.Such a temperature is termed as a base temperature.Corn is an economically important crop in the study area, with approximately 17,000 km 2 under cultivation [73,74].Thus, identifying the temperature range helps in deciding optimum requirements required for corn to flourish as a function of temperature.
Growing degree days (GDD): GDD or Heating Units (HU) are a measure of heat accumulation that controls the development characteristics of temperature-dependent plants. This measure is used to classify the maturity of plant hybrids. It is based on the concept that plant growth and development are a function of the cumulative values of daily average temperature above a certain base temperature over a period of time, provided temperature is the only limiting condition for crop growth and considering only positive or zero values in the accumulation [72]. GDD for each day are determined by subtracting the base temperature from the mean daily temperature (Equation (1)):

$$\mathrm{GDD} = \max\left(0,\; T_{\mathrm{mean}} - T_{\mathrm{base}}\right) \qquad (1)$$

where $T_{\mathrm{mean}}$ is the mean daily temperature and $T_{\mathrm{base}}$ is the base temperature (10 °C for corn).
The accumulated GDD values at the typical times of planting and harvesting corn in the Midwest U.S. were computed and averaged for four dates in a year (1 and 15 May (planting), 1 and 15 October (harvesting)).
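The daily GDD calculation and its accumulation to a target date can be sketched as follows. This is an illustration only: taking the mean daily temperature as the average of the daily maximum and minimum is an assumption, as is the choice of the start date for accumulation, which the text does not state. The names `daily_gdd` and `accumulated_gdd` are hypothetical.

```python
from datetime import date, timedelta

BASE_TEMP_C = 10.0  # base temperature for corn stated in the text


def daily_gdd(tmax_c, tmin_c, base_c=BASE_TEMP_C):
    """Daily GDD: mean temperature minus base, with negatives set to zero.

    Assumption: mean daily temperature = (tmax + tmin) / 2.
    """
    mean_c = (tmax_c + tmin_c) / 2.0
    return max(0.0, mean_c - base_c)


def accumulated_gdd(start, tmax_c, tmin_c, until, base_c=BASE_TEMP_C):
    """Sum daily GDD from `start` through `until` (inclusive).

    `start` is the date of the first value in the series; where the
    accumulation should begin in practice is an assumption left to the user.
    """
    total = 0.0
    for offset, (hi, lo) in enumerate(zip(tmax_c, tmin_c)):
        if start + timedelta(days=offset) > until:
            break
        total += daily_gdd(hi, lo, base_c)
    return total
```

For example, `accumulated_gdd(date(2015, 1, 1), tmax, tmin, date(2015, 5, 15))` would give the heat accumulated up to the second planting date if accumulation is assumed to start on 1 January.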
For this study, all analyses were conducted on daily data, as these are the data that are typically used in hydrologic, environmental, and agricultural modeling applications. Analysis of daily data also allowed extraction of pertinent characteristics such as dry and wet spells and accumulated GDD.

Summary Statistics and Data Distribution
All three generators captured mean values of daily precipitation depths relatively well, although some of the WeaGETS submodels tended to underestimate the values. However, this underestimation was by a small margin (2.55-2.56 mm compared with 2.60 mm from observed data). With the exception of LARS-WG, the standard deviation of the simulated values was lower than that of the observed data, with values from CLIGEN having the smallest spread (6.64 mm). For each realization, output from all generators had the same standard error as the observed data (0.05 mm). All generators captured the number of zero-precipitation days relatively well, with the exception of LARS-WG, which simulated, on average, about 6 (1.6%) more days with zero precipitation than were evident from the observed data. Precipitation depth values generated using a mixed exponential distribution (WeaGETS-04-06) were less skewed than the observed data, while those simulated using WeaGETS submodels 01-03 were much more skewed than the observed data. LARS-WG was the most accurate in capturing the skewness of the observed data (4.85 compared with 4.77 from observed data). LARS-WG was also the most accurate in capturing the kurtosis coefficient, while the WeaGETS submodels behaved as they did for skewness. In both cases, CLIGEN performance was similar to that of WeaGETS-04-06, which was unexpected since the simulations were based on different distributions. LARS-WG captured the maximum daily precipitation fairly accurately (111.0 mm compared with 111.8 mm from observed data), while both CLIGEN and WeaGETS tended to overestimate this value.
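The descriptive statistics compared above can be computed along the following lines. This is a sketch, not the procedure used in the study: it assumes moment-based estimators for skewness and excess kurtosis (other software may apply sample-size corrections), and the name `summary_stats` is hypothetical.

```python
import math
import statistics


def summary_stats(values):
    """Descriptive statistics for a daily series (for example, precipitation).

    Assumptions: sample standard deviation (n - 1), standard error = sd / sqrt(n),
    and moment-based skewness and excess kurtosis.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    # Central moments about the mean, using divisor n.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": sd,
        "se": sd / math.sqrt(n),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis
        "zero_days": sum(1 for v in values if v == 0),
        "max": max(values),
    }
```

With this definition of the standard error, a standard deviation of 6.64 mm over 18,262 daily values gives roughly 0.05 mm, in line with the value reported above.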
All generators reproduced the mean maximum daily air temperatures well, with those obtained from the WeaGETS simulations perfectly matching those from the observed data. The standard deviation obtained from the CLIGEN simulations was equal to that from the observed data (11.8 °C), while that from LARS-WG was lower (11.2 °C) and from WeaGETS was higher (12.

All generators captured mean minimum daily air temperatures relatively well; however, they were unable to capture the modal value of minimum temperatures, with deviations ranging from −1.8 °C (WeaGETS-06) to 17.2 °C (LARS-WG). As with maximum temperatures, LARS-WG performed best in capturing the range of values of minimum temperatures, while both CLIGEN and WeaGETS tended to overestimate this range.
No appreciable differences were seen in the distribution of precipitation between observed and simulated values, with the exception of the maximum value, which only LARS-WG was able to capture. For maximum and minimum temperatures, the main differences were seen at the extremes, with all generators generally being able to capture the data distribution between the 25th and 97.5th percentiles. LARS-WG generally performed well for temperature simulation, showing only slight deviations for maximum temperatures and at the 97.5th percentile for minimum temperatures. Both CLIGEN and WeaGETS showed deviations at the maximum and minimum values of daily air temperature, with WeaGETS having the largest deviations. Simulated values from CLIGEN and LARS-WG were far better than those from WeaGETS in terms of deviations from observed data for the different weather variables, including daily precipitation, maximum temperature, and minimum temperature.
Detailed results are presented in Appendix A (Table A2).

Extreme Variables
From Table 2, simulated dry sequences from CLIGEN and WeaGETS-03 were closest to those based on observed data (24 and 26, respectively, compared with 18 from observed data). WeaGETS-06 generated 207 wet sequences while LARS-WG and WeaGETS-05 both generated 209 wet sequences, all of which were closest to the 213 wet sequences obtained from the observed data. The Growing Degree Days (GDD) accumulation was simulated best by CLIGEN, with an exception near the harvesting date (1 October), for which simulations from WeaGETS-05 were closer to the observed data. All other WeaGETS submodels overestimated GDD, while LARS-WG underestimated the GDD values. LARS-WG, WeaGETS-04, and WeaGETS-05 accurately reproduced the average of 33 snow days in a year, while WeaGETS-06 was fairly close with an average of 32 snow days in a year. The total number of snow days observed in 50 years (1623) was simulated best by LARS-WG (1669) and WeaGETS-04-06 (1619-1644). For the number of growth range days per year, CLIGEN, WeaGETS-03, and WeaGETS-06 outperformed the other generators, generating 61 days per year compared with 63 days per year obtained from observed data. However, these models underestimated the total number of growth range days in 50 years, although the CLIGEN-simulated value (3068 days) was closest to the value computed from observed data (3161 days).
The number of dry and wet days in a month generated by the different weather generators showed maximum variations in the growing season, with some variations seen in winter months. Figure 3 shows these variations during the growing season months (May-October). In most cases, the WeaGETS submodels either underestimated or overestimated the number of dry and wet days in a month. In very few cases, LARS-WG underestimated the number of wet days while overestimating the number of dry days. Generally, CLIGEN performed best at simulating the number of dry and wet days in a month. Overall, the WeaGETS model did not perform well based on descriptive statistics, output value distributions, and evaluation of extreme variables. These discrepancies were thought to be due to intricacies in WeaGETS parameterization, which warrants further assessment before the model can be evaluated fully. Results from this generator were thus not considered further in comparative assessments.

Analysis of Distributions
The range of values for mean precipitation simulated from each weather generator was generally within the range of means computed from observed data (Table 3). Values obtained for Cohen's d were generally small, indicating that the mean differences between simulated and observed data were small. A slight exception was observed at Fort Wayne, for which Cohen's d values were slightly above the 0.2 threshold for small differences. However, these values were smaller than the 0.5 threshold for medium differences; thus, the differences were still considered small. In addition, the non-overlap region between the simulated distribution and observed data distribution ranged from 0% to 21.3% [75] for all three weather variables and weather stations. LARS-WG generally captured the standard deviations relatively well based on the range of values obtained from simulated data. The range of standard deviation values obtained for CLIGEN-simulated precipitation was, however, generally lower when compared with the observed data (Table 3). The observed values for standard error, number of days with zero precipitation, skewness, and kurtosis for each station were captured relatively well by both generators, although the range of skewness values obtained was somewhat large at the Norwalk station (both generators). A similar observation was made for kurtosis at the same station. Generally, no appreciable differences were observed in precipitation values until about the 99.5th percentile value. The largest differences were seen with maximum precipitation, with CLIGEN tending to overestimate this value at all stations.
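For readers unfamiliar with the effect-size measures used here, the sketch below computes Cohen's d with a pooled standard deviation and converts it to a percentage of non-overlap. It assumes that the non-overlap figures quoted above correspond to Cohen's U1 statistic for two normal distributions with equal variance; under that assumption, d = 0.3 gives 21.3% non-overlap, which agrees with the upper bound reported. The function names are hypothetical.

```python
import math
import statistics


def cohens_d(simulated, observed):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(simulated), len(observed)
    v1 = statistics.variance(simulated)
    v2 = statistics.variance(observed)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.fmean(simulated) - statistics.fmean(observed)) / pooled_sd


def non_overlap_u1(d):
    """Cohen's U1 (fraction of non-overlap) for two equal-variance normals."""
    # Standard normal CDF evaluated at |d| / 2, written with math.erf.
    phi = 0.5 * (1.0 + math.erf(abs(d) / 2.0 / math.sqrt(2.0)))
    return (2.0 * phi - 1.0) / phi
```

Unlike a p-value, neither quantity grows with sample size, which is why such measures remain interpretable for series with hundreds of thousands of daily values.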
Mean values of maximum temperature were captured well by both generators based on Table 3. Cohen's d values were especially low for both generators and at all stations, indicating that the mean value was maintained for all realizations and that there was small or negligible non-overlap between the simulated and observed distributions for maximum temperature. LARS-WG tended to overestimate maximum temperatures at lower quantiles, while CLIGEN captured these values relatively well. However, CLIGEN tended to overestimate the higher values, with the largest difference seen at the maximum (Q100), while LARS-WG tended to underestimate these higher values (Table 3). CLIGEN was better at capturing skewness and kurtosis, and both generators captured the standard error fairly well.
A similar behavior was seen with mean values for minimum temperatures, although Cohen's d values were much larger than those for maximum temperatures, especially at Fort Wayne (Table 3). Standard deviations obtained from LARS-WG simulated data were lower than those associated with observed data, while values for CLIGEN-simulated data matched observed values relatively well. Both generators missed the standard error by a small margin with the exception of CLIGEN at Fort Wayne, and similarly for skewness with the exception of CLIGEN at Norwalk (Table 3). Neither generator was able to capture the kurtosis with respect to minimum temperature. Both generally captured the lowest values relatively well. As with maximum temperatures, CLIGEN tended to overestimate the higher values, especially at the maximum, while LARS-WG tended to underestimate these values (Table 3). The density distributions based on daily precipitation (Table 3) revealed that all the weather generators were able to simulate the precipitation with zero values well. Density charts for maximum and minimum temperature are provided in Figures A2 and A3 in Appendix A, respectively.
Based on the distribution of daily maximum precipitation (Figure 4), CLIGEN was able to simulate the fluctuations/noise seen in the observed data well. This was thought to be because of the skewed normal distribution used to account for precipitation within CLIGEN, which allowed the generator to pick up the fluctuations. While LARS-WG also captured observed patterns of maximum precipitation, it had the tendency to smooth the values, possibly attributable to its use of a semi-empirical distribution to determine precipitation amounts. Observed data patterns for maximum and minimum temperatures were generally captured well by both LARS-WG and CLIGEN (Figures 5 and 6).

Table 3. Descriptive statistics for daily precipitation (mm), maximum temperature (°C), and minimum temperature (°C) obtained from simulated values from the weather generators (CLIGEN and LARS-WG) for the three weather stations in the Western Lake Erie Basin compared to the statistics obtained from the observed precipitation data.

Distribution of Distributions
Figures 7-9 show the results of an in-depth assessment of the distributions of the skewness, means, and standard deviations across all 50 realizations. Based on Figure 7, both generators generally captured the skewness of the daily precipitation relatively well except at Norwalk, for which the patterns were sporadic for both generators and the observed value was only captured in a few of the realizations. Both generators also generally performed well for the skewness of the maximum and minimum daily air temperatures, although values from LARS-WG were overestimated, albeit by a small margin. Values from CLIGEN closely matched the observed value for all 50 realizations (Figure 7).
From Figure 8, LARS-WG captured the mean value of observed precipitation relatively well for the three weather stations. CLIGEN also simulated the mean values relatively well except at Norwalk, where it underestimated the mean value in all 50 realizations, albeit by a small margin (0.15 mm). Overall, LARS-WG performed better than CLIGEN in simulating the mean values for precipitation for all three weather stations. However, both weather generators performed well across all 50 realizations in capturing the mean values of precipitation for the given weather station. Both generators captured the maximum and minimum daily air temperatures relatively well, consistent with previous observations. LARS-WG generally performed better with respect to standard deviations for precipitation relative to the observed values over the 50 realizations (Figure 9), although variations in standard deviations were greater than those in CLIGEN-simulated data. Standard deviations from CLIGEN data were more or less consistent across realizations, although values were slightly underestimated compared to the observed value for most of the realizations. Larger discrepancies were seen at Norwalk than at Adrian and Fort Wayne. For all 50 realizations, standard deviations for maximum and minimum temperatures at the three weather stations were underestimated based on LARS-WG, while those based on CLIGEN were almost perfectly matched with observed values.

Extreme Events/Variables
To account for the maximum variation in weather variables from the weather generators for all three weather stations, the counts of days with precipitation, maximum temperature, and minimum temperature values above the 95th and 99.5th percentile values from observed data were computed (Table A4). The only discrepancy noticed was in the case of maximum temperature simulation at the 95th and 99.5th percentiles for Adrian and Norwalk and for minimum temperature at the 99.5th percentile for Fort Wayne, where the number of days was slightly underestimated. Growing days and snow days simulated by LARS-WG and CLIGEN, respectively, were overestimated for all three weather stations. The simulation for growing degree day accumulation by LARS-WG for Norwalk was in line with that for Fort Wayne, while for Adrian, CLIGEN overestimated the GDD accumulation value. The wet sequence simulated for Adrian and Norwalk by LARS-WG and CLIGEN, respectively, was similar to the dry sequence seen for Fort Wayne (Table A4). CLIGEN overestimated the number of dry days for August for Adrian and Norwalk, which was the opposite of results at Fort Wayne, in which the corresponding number of dry days was underestimated (Figure A7). Both CLIGEN and LARS-WG overestimated the number of wet days for April for Adrian, unlike at Norwalk and Fort Wayne, where they were able to simulate the number of wet days reasonably well (Figure A7).
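The threshold-exceedance counts described above can be sketched as follows: the percentile is taken from the observed series and then used as a fixed threshold for the simulated series. The nearest-rank percentile used here is an assumption (the study does not state its percentile method), and the function names are hypothetical.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q in 0-100) of a non-empty sequence.

    Assumption: nearest-rank method; interpolating methods give slightly
    different thresholds.
    """
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def exceedance_count(simulated, observed, q):
    """Count simulated values above the q-th percentile of the observed data."""
    threshold = percentile(observed, q)
    return sum(1 for v in simulated if v > threshold)
```

Because each generator was run for 50 realizations, counts obtained this way would need to be averaged or otherwise normalized per realization before being compared with the single observed record.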

Discussion
This study was aimed at evaluating the performance of three stochastic weather generators (CLIGEN, LARS-WG, and WeaGETS) for simulating weather data, with the Western Lake Erie Basin (WLEB) as a pilot study site. The study accounted for the characteristics of the simulated weather variables in terms of distributions, extreme variables, and variations in simulated statistical parameters based on 50 realizations. Because of the combination of distributions (exponential, gamma, skewed normal, and mixed exponential), Markov models, and other options that WeaGETS provides, there were a large number of potential sub-models that could be included, which made it challenging to evaluate this generator effectively within the scope of this study. Thus, work on this generator was discontinued early in the process.
Both CLIGEN and LARS-WG performed fairly well with respect to representing the statistical characteristics of observed daily precipitation and minimum and maximum daily air temperatures. LARS-WG performed better in capturing the multimodal peaks seen in observed temperatures and was able to reproduce the observed data for both weather variables (temperature and precipitation) better than the other two generators, as observed from the density plots [62]. LARS-WG also had very good representation of the dry and wet day sequences, consistent with [76], and performed well at simulating snow days and precipitation events, making it suitable for hydrologic studies. CLIGEN overestimated the dry sequences and snow days but had the added advantage of simulating a wide range of climate variables (nine, compared with three to four variables for LARS-WG and WeaGETS). This generator was also well able to simulate the period of optimal growth and growing degree days, making it especially suitable for crop growth simulation, consistent with [77].
The use of a semi-empirical distribution to account for and quantify the amount of precipitation, as in the case of LARS-WG, appears to be better for simulating wet and dry spells and daily precipitation, especially for extreme rainfall simulation (Table 1). This is consistent with climate studies by [78]. Although LARS-WG can work with small datasets to simulate long-term climate, longer observed datasets can help in generating better parameters. LARS-WG was developed specifically for generating climate data of sufficient length to help understand climate change impacts on agriculture and hydrology in small areas [79]. The length of climate record required for better simulation is an important topic for future research.
Temperatures in CLIGEN are simulated independently from precipitation. This is done so as to better represent weather parameters including rainfall intensity and time to peak, which is somewhat more difficult to achieve with weather generators where the parameters are dependent [52]. In this study, CLIGEN tended to overestimate the dry and wet sequences for all of the weather stations. A possible reason for the discrepancy is the use of two random variables by CLIGEN to preserve the auto- and cross-correlations for and between maximum and minimum temperatures, which may result in variations from observed data in order to preserve statistics and distributions during simulation [80]. A first-order linear autoregressive model is used in LARS-WG, which seems to perform better at preserving the observed auto- and cross-correlations [81]. The temperatures reproduced by LARS-WG are not conditioned on each other as in CLIGEN but are conditioned on precipitation status. Compared with GCMs, weather generators focus on small scales, are computationally faster, and can produce results with many different realizations based on random seed generators. However, data from GCMs and RCMs, which cannot be used directly as input to hydrologic models for climate change impact studies, need to be downscaled to the required spatial and temporal scale, and weather generators (LARS-WG and CLIGEN) provide a means to statistically downscale the data.
As was the case in this study, such analyses often result in large datasets, which present specific challenges with respect to statistical analysis. In particular, goodness of fit tests for such large datasets will likely produce significant results (p-value < 0.05), which could lead to false decisions. Moreover, in climate studies, statistical significance does not always provide an adequate basis for decision-making; for example, a rise in temperature by two degrees Celsius may not be statistically significant but it can adversely affect vegetation growth and lead to ecological imbalances, possibly due to habitat alterations and/or melting glaciers [82-84]. In this study, emphasis was placed on visualization and an alternative to significance testing (Cohen's d value), which provided a more reliable means of analyzing the resulting datasets [39,41,62].

Conclusions
Overall, LARS-WG and CLIGEN both performed well at simulating the long-term climate data in the Western Lake Erie Basin (WLEB) and either one is suitable for further work in the basin. More work is needed to evaluate WeaGETS for use in the basin. With their capacity to generate a wide range of weather variables and their potential for use in statistical downscaling, weather generators have a promising future for climate studies. They are simple to understand and work with and can be used to fill missing data with the least bias if the gaps are short. In addition, the WGs can be used to develop synthetic climate data series for ungauged areas of interest using the long-term continuous and consistent records of neighboring or similar stations. An in-depth understanding of distributions, intent or application purpose, and range of data availability can help choose the most appropriate weather generator. Since weather generators produce random variables, a number of realizations are needed in order to cover the range of variability in climate. Additional work is required to determine definitively the number of realizations needed to have the most reliable data reproduced by weather generators at the least computational expense. Further work is also needed for more reliable parametrization to better account for extreme events while mirroring the statistical properties of the observed data. Results of this work are specific to the WLEB and might not be directly applicable elsewhere. Methodologies and approaches can, however, be used in other areas where similar assessments are being considered.

* Values corresponding to the 95th and 99.5th percentiles were calculated based on observed data and used as thresholds. Simulated data were evaluated against these thresholds. These values were as follows: Precipitation: 95th percentile-Adrian (14.

Figure 1. Study site location showing the Western Lake Erie Basin and the eight different weather stations considered in this study.

3 °C). The modal value of about 28.0 °C was captured well by both LARS-WG and CLIGEN, while WeaGETS was not at all able to capture this value (−6.6 °C to 3.7 °C based on WeaGETS simulations). Only LARS-WG was able to capture the range of maximum temperature values with reasonable accuracy (−22.8 °C to 39.2 °C, range = 62 °C, compared with −23.9 °C to 41.1 °C, range = 65 °C, from observed data). Both CLIGEN and WeaGETS tended to overestimate the range of values, with a range as high as 83.8 °C being obtained from WeaGETS.

Figure 3. Number of dry days and wet days based on values simulated by LARS-WG, CLIGEN, and WeaGETS for Fort Wayne compared with those from the observed data. WeaGETS01 represents simulation outputs from WeaGETS with daily precipitation amounts based on a skewed normal distribution and precipitation occurrence generated based on first order Markov models; WeaGETS04 represents simulation outputs from WeaGETS with daily precipitation amounts based on a mixed exponential distribution and precipitation occurrence generated based on first order Markov models. (Details are provided in Table A3 in Appendix A.)

Figure 4. Distribution of maximum precipitation amount (in mm) during each Day of Year (DOY) generated from each weather generator (grey region from 50 different realizations) compared with the observed data (bold black line).

Figure 5. Distribution of mean maximum temperature (in degrees Celsius) during each Day of Year (DOY) generated from each weather generator (grey region from 50 different realizations) compared with the observed data (bold black line).

Figure 6. Distribution of mean minimum temperature (in degrees Celsius) during each Day of Year (DOY) generated from each weather generator (grey region from 50 different realizations) compared with the observed data (bold black line).

Figure 7. Box and whisker plot showing skewness of precipitation, maximum temperature, and minimum temperature simulated from 50 different realizations of weather generators (CLIGEN, LARS-WG) compared to observed data (dashed line) for the three weather stations (Adrian, Norwalk, and Fort Wayne). (Dot plots for skewness of each weather variable over 50 different realizations are provided in Figure A4 in Appendix A.)

Figure A1. Comparison of daily maximum precipitation amounts and patterns among the selected eight stations in the Western Lake Erie Basin.

Figure A2. Distribution of daily maximum temperatures as simulated by LARS-WG and CLIGEN in comparison to the observed data for select weather stations in the Western Lake Erie Basin (Adrian, Norwalk, and Fort Wayne). † The horizontal axis represents maximum temperature with the magnitude divided into several intervals; the vertical axis represents density expressed as the inverse of the difference of maximum temperature within a specific bin. Probability density can be expressed as a frequency by multiplying the interval range on the horizontal axis by the density on the vertical axis.

Figure A3. Distribution of daily minimum temperatures as simulated by LARS-WG and CLIGEN in comparison to the observed data for select weather stations in the Western Lake Erie Basin (Adrian, Norwalk, and Fort Wayne). † The horizontal axis represents minimum temperature with the magnitude divided into several intervals; the vertical axis represents density expressed as the inverse of the difference of minimum temperature in a specific bin. Probability density can be expressed as a frequency by multiplying the interval range on the horizontal axis by the density on the vertical axis.

Figure A4. Distribution of distributions (realizations 1-50) for skewness of weather variables (precipitation, maximum temperature, and minimum temperature) simulated from weather generators (CLIGEN, LARS-WG) compared to observed data (dashed line) for the three weather stations (Adrian, Norwalk, and Fort Wayne).

Figure A5. Distribution of distributions (realizations 1-50) for mean of weather variables (precipitation, maximum temperature, and minimum temperature) simulated from weather generators (CLIGEN, LARS-WG) compared to observed data (dashed line) for the three weather stations (Adrian, Norwalk, and Fort Wayne).

Figure A6. Distribution of distributions (realizations 1–50) for the standard deviation of weather variables (precipitation, maximum temperature, and minimum temperature) simulated by the weather generators (CLIGEN, LARS-WG) compared to observed data (dashed line) for the three weather stations (Adrian, Norwalk, and Fort Wayne).

Figure A7. Number of dry and wet days in each month of the year simulated by the weather generators (CLIGEN, LARS-WG) for three weather stations (Adrian, Norwalk, and Fort Wayne) compared with observed data from each weather station.

Table 2. Evaluation of the weather generators in simulating different extreme variables/events associated with precipitation and temperature for the Fort Wayne station compared to the values obtained from the observed data.
* WeaGETS-01, 02, and 03 represent simulation outputs from WeaGETS with daily precipitation amounts based on a skewed normal distribution and precipitation occurrence generated with first-, second-, and third-order Markov models, respectively; WeaGETS-04, 05, and 06 represent simulation outputs from WeaGETS with daily precipitation amounts based on a mixed exponential distribution and precipitation occurrence generated with first-, second-, and third-order Markov models, respectively. The number of observations is 18,262 (50 years) for observed data; 913,100 for CLIGEN (50 years for each of 50 different realizations, with leap years taken into account); and 912,500 for LARS-WG and WeaGETS (50 years for each of 50 different realizations, excluding 29 February in leap years, which these two generators do not take into account). Bolded values indicate the simulated values from the weather generators that were closest to the observed data.
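The observation counts quoted in the Table 2 footnote can be checked with a short calculation. The sketch below assumes that each CLIGEN realization contains the same number of leap days as the 1966–2015 observation period; the variable names are illustrative and not taken from any of the generators.

```python
import calendar

START_YEAR, END_YEAR = 1966, 2015  # observation period stated in the study
N_REALIZATIONS = 50                # realizations generated per station

years = range(START_YEAR, END_YEAR + 1)

# Days in the period with 29 February counted in leap years.
days_with_leap = sum(366 if calendar.isleap(y) else 365 for y in years)
# Days in the period when every year is treated as 365 days long.
days_no_leap = 365 * len(years)

observed = days_with_leap                      # observed record
cligen = days_with_leap * N_REALIZATIONS       # assumes leap days are kept
lars_weagets = days_no_leap * N_REALIZATIONS   # 29 February excluded

print(observed, cligen, lars_weagets)  # 18262 913100 912500
```

The difference of 600 days between the two simulated totals corresponds to the 12 leap days in 1966–2015 multiplied by 50 realizations.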

Table A3. Number of dry and wet days per month simulated by the different weather generators (LARS-WG, CLIGEN, WeaGETS) for the Fort Wayne station compared with the observed data. WeaGETS-01, 02, and 03 represent simulation outputs from WeaGETS with daily precipitation amounts based on a skewed normal distribution and precipitation occurrence generated with first-, second-, and third-order Markov models, respectively; WeaGETS-04, 05, and 06 represent simulation outputs from WeaGETS with daily precipitation amounts based on a mixed exponential distribution and precipitation occurrence generated with first-, second-, and third-order Markov models, respectively.

Table A4. Evaluation of the weather generators in simulating different extreme variables/events associated with daily precipitation (mm), maximum daily air temperature (°C), and minimum daily air temperature (°C) for the three weather stations in the Western Lake Erie Basin compared to the values obtained from the observed data.