An Open-Access Web-Based Tool to Access Global, Hourly Wind and Solar PV Generation Time-Series Derived from the MERRA Reanalysis Dataset

Wind and solar energy resources account for an increasingly large fraction of generation in global electricity systems. However, the variability of these resources necessitates new datasets and tools for understanding their economics and integration in electricity systems. To enable such analyses and more, we have developed a free web-based tool (Global Renewable Energy Atlas & Time-series, or GRETA) that produces hourly wind and solar photovoltaic (PV) generation time series for any location on the globe. To do so, this tool applies the Boland–Ridley–Laurent and Perez models to NASA’s (National Aeronautics and Space Administration) Modern-Era Retrospective Analysis for Research and Applications (MERRA) solar irradiance reanalysis dataset, and the Archer and Jacobson model to the MERRA wind reanalysis dataset, to produce resource and power data for a given technology’s power curve. This paper reviews solar and wind resource datasets and models, describes the employed algorithms, and introduces the web-based tool.


Introduction
Global wind and solar photovoltaic (PV) capacities have been growing rapidly, reaching 432 GW for wind [1] and 180 GWp for solar [2] in 2015. With this rapid deployment of variable renewable energy (VRE) resources, a greater understanding of their spatial-temporal nature is desirable for resource characterizations, grid integration analyses, and project development planning. However, freely available VRE datasets have either limited temporal resolution or limited spatial coverage. While commercial datasets including Meteonorm [3], SolarGIS [4], and Vaisala [5] provide improved resolution, they are costly for preliminary analyses or research applications. This paper closes this gap by publishing a free tool for downloading hourly, global wind and solar PV generation time series, called GRETA (Global Renewable Energy Atlas & Time-series). GRETA adds to the growing body of work that develops publicly available research datasets [6][7][8][9][10] and modelling tools [11][12][13][14]. With regard to solar data more specifically, the US Department of Energy's recent "Orange Button" initiative will create open data standards to facilitate data exchange, reduce soft costs, and expedite financing [15].
The GRETA platform includes several advantageous attributes: free access, hourly temporal resolution, global spatial coverage, a multi-decadal historical data period (from 1979), formulation of both wind and solar resources, and production of intermediate parameters in addition to power generation. The user-friendly and open-access design of the GRETA portal will serve a broad range of end-users, including "prosumers" optimizing decentralized energy exchange [16], commercial self-consumers evaluating alternative scenarios [17], as well as residential households performing economic analyses [18], balancing PV generation and electric vehicle charging [19], or optimizing battery storage system design [20]. Further, GRETA supports the energy system modelling community focused on generating VRE integration alternatives [21]. Such modelling exercises, which have been reviewed by [22], embody a range of objectives and scales. Country or regional analyses of renewable resource potential are an essential step for any long-term energy planning exercise, including diffusion, market, or econometric analyses [23]. The implications of renewable resource variability characteristics [24] have been reviewed by [25]. Deterministic or stochastic unit commitment dispatch models have been used for a variety of VRE integration analyses, for example, to determine the optimal procurement of reserves [26]. Longer-term modelling exercises, which explore the implications of climate change on future energy systems, also often rely on long historical meteorological time series data [27].
This paper summarizes the models that convert meteorological data into generation data, reviews previous validations of the MERRA reanalysis data, and introduces the freely available web tool GRETA. An overview of the GRETA inputs, outputs, and algorithm is shown schematically in Figure 1, and described in more detail in the remainder of the paper. A screenshot of the website tool interface is shown in Figure 2. Finally, an example of the GRETA data output is shown in Figure 3.

Wind and Solar Resource Datasets
Wind and solar PV time series can be derived from either meteorological station data or reanalysis data. Resource datasets simulated from meteorological station data typically scale and/or time shift the measured data to nearby hypothetical plants, as in [28][29][30][31][32], and others. However, challenges with this approach include the following: data gaps [33], limited global coverage [34], data quality control issues [35], limited collection durations [35], data inconsistencies [35], and susceptibility to the local topography [36,37]. Moreover, surface observations are often unavailable [36].
Reanalysis datasets aggregate weather observations from satellite and surface stations, aircraft, and balloons through numerical weather prediction (NWP) modelling [36]. Such datasets offer key advantages when creating VRE resource estimates, namely their global coverage, long data collection periods [38], and consistent extrapolation methodology [36,39]. While reanalysis data cannot represent site-specific resource characteristics at a sub-grid scale, they represent a region of interest well [35].
GRETA uses the Modern-Era Retrospective Analysis for Research and Applications (MERRA) reanalysis dataset developed by NASA (National Aeronautics and Space Administration) [40,41]. This dataset provides the variables required to compute wind and solar generation potential on a global 1/2° by 2/3° latitude-longitude grid with hourly resolution, from early 1979 to within 2 months of the present. Thus, the proposed GRETA platform is primarily of interest for preliminary VRE assessments and exploration of long-term planning options at a large spatial (utility) scale; ground measurements would be required for site-specific resource data, for example, at the distribution scale. Several studies have validated the MERRA wind data against other datasets, including those of the National Renewable Energy Laboratory [39], the National Climatic Data Centre and the University of Massachusetts wind stations [42], and historic met mast and wind generation data [35,43,44], while others have compared the derived power output against that of wind farms [45]. The MERRA solar data have been compared to solar production data from five North American sites [46], the Helio-Clim-1 dataset and in situ irradiance measurements [47], the ERA-Interim reanalysis dataset and ground measurements [48], and the Meteosat-based CM-SAF SARAH satellite dataset [49]. Many of these studies have concluded that MERRA provides an accurate representation of wind speed and solar irradiance at the hourly time scale, for electricity generation modelling applications and characterization analyses. However, other studies have developed site- or country-specific correction factors to compensate for spatial bias [45], and such analyses could be expanded to more areas in future research endeavors.
Several studies have employed the MERRA dataset for VRE analyses, including assessing the practical global wind power availability [50,51], inter-annual wind power production variability [42], and wind variability [35]. In addition, Gunturu (with others) used MERRA data to investigate wind characteristics in the USA [39], Europe [52], southern Africa [53], and Australia [54]. Juruš et al. [47] simulate the variability of 33 years of hourly PV production in the Czech Republic. Finally, studies have examined wind and solar covariation to model large-scale production in South Africa [55], Australia [56], and Europe [57]. The MERRA data have also been used to develop a model to synthetically simulate hourly wind values in the South West region of Western Australia [58].

Methodology for Calculating Solar PV Generation
The MERRA dataset provides the global horizontal irradiance (GHI) on a horizontal plane on the Earth's surface, including both the direct and diffuse irradiance components (tavg1_2d_rad_Nx data product). The GHI is separated into its direct and diffuse components (Section 3.1), so that the irradiance on an inclined surface can be calculated using trigonometry for the direct component, and the approach in Section 3.2 for the diffuse component. Finally, the power production data are calculated for a given technology's power curve in Section 3.3. Unless indicated otherwise, all variables have hourly temporal resolution.

Calculating the Direct and Diffuse Fractions from GHI
Calculation of the direct and diffuse components from the global irradiation relies on the clearness index ($k_t$), defined in [59] as:

$$k_t = \frac{I_{global}}{H_0} \tag{1}$$

where $I_{global}$ is the surface incident shortwave flux received on a horizontal plane at the Earth's surface, and $H_0$ is the top of atmosphere incident shortwave flux prior to any attenuation, both of which are provided by MERRA. The fraction $d$ of total radiation $I_{global}$ received on a horizontal plane as diffuse radiation ($I_{diffuse}$) (integrated over the hour) is defined as [59]:

$$d = \frac{I_{diffuse}}{I_{global}} \tag{2}$$

Several models calculate the direct and diffuse components of global irradiation, using polynomial correlations, logistic functions, or process dynamics models [60]. The polynomial models that relate the clearness index to the diffuse fraction of radiation can further be divided into first order [61,62], second order [63], third order [64][65][66], and fourth order models [67,68]. Logistic functions incorporate physical principles by relating the clearness index to variables such as cloud cover, air mass, water vapor, turbidity, and albedo [69]. Logistic models include the direct insolation simulation code (DISC) [59,69,70], Skartveit-Olseth model [71,72], DirInt model [73], Muneer-Munawwar model [74], and Boland-Ridley-Laurent (BRL) model [75]. Finally, process dynamic models include multiple predictors, such as the clearness index [76], stratospheric sulfate aerosol loading [77], and atmospheric turbidity [78].
The current application necessitates a universal model, applicable in both Northern and Southern Hemispheres. Logistic functions are intrinsically more generic, while piecewise functions would tend to be site-specific [75]. In particular, the BRL model was found to perform marginally better in the Northern Hemisphere, and substantially better in the Southern Hemisphere, than previous models, and is therefore provisionally recommended as a universal model [75].

The BRL Model
The BRL model begins with the clearness index ($k_t$), to predict the diffuse fraction ($d$) as [75]:

$$d = \frac{1}{1 + e^{\beta_0 + \beta_1 k_t}} \tag{3}$$

where the optimal values of the constants $\beta_0$ and $\beta_1$ have been established by minimizing the squared error difference between the model and the back-transformed data [75]. To describe the spread of the diffuse term, more predictors were added, including the persistence indicator, the daily clearness index, apparent solar time, and solar angle. The persistence indicator accounts for atmospheric inertia, calculated as the average of the previous and successive hour's clearness index, for hours between sunrise and sunset:

$$\psi = \begin{cases} \dfrac{k_{t-1} + k_{t+1}}{2}, & \text{sunrise} < t < \text{sunset} \\ k_{t+1}, & t = \text{sunrise} \\ k_{t-1}, & t = \text{sunset} \end{cases} \tag{4}$$

where $\psi$ is the persistence indicator, $k_{t-1}$ is the previous hour's clearness index, and $k_{t+1}$ is the successive hour's clearness index. At sunrise, $\psi$ is equal to the successive hour's index, while at sunset $\psi$ is equal to the previous hour's index. The daily clearness index ($K_t$) is computed from the hourly values summed over the day:

$$K_t = \frac{\sum_{j=1}^{24} I_{global,j}}{\sum_{j=1}^{24} H_{0,j}} \tag{5}$$

where $I_{global,j}$ is the global radiation at hour $j$, and $H_{0,j}$ is the extraterrestrial radiation at hour $j$. Finally, the combined equation to calculate the diffuse fraction of global irradiation on a horizontal surface, including the five predictors, is as follows:

$$d = \frac{1}{1 + e^{\beta_0 + \beta_1 k_t + \beta_2 AST + \beta_3 \alpha + \beta_4 K_t + \beta_5 \psi}} \tag{6}$$

The apparent solar time ($AST$) and solar angle ($\alpha$, in degrees) have their usual meaning. Ridley et al. [75] used least squares minimization on data from seven locations worldwide to determine the coefficient ($\beta$) values in each location; the authors concluded that these coefficient values were similar enough to create a generic model. After amalgamating the data from all of the locations and minimizing the residual sum of squares, Ridley et al. [75] propose the following generic multiple predictor logistic model, known as the Boland-Ridley-Laurent (BRL) model:

$$d = \frac{1}{1 + e^{-5.38 + 6.63 k_t + 0.006 AST - 0.007 \alpha + 1.75 K_t + 1.31 \psi}} \tag{7}$$
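As an illustration of how the predictors above combine, the following minimal Python sketch evaluates the clearness index, persistence indicator, daily clearness index, and the generic BRL diffuse fraction. The function names and the list-based hourly inputs are assumptions made for this example; they are not taken from the GRETA implementation. The sketch also assumes that sunrise and sunset fall on different hours within the supplied day.

```python
import math


def clearness_index(i_global, h0):
    """Hourly clearness index k_t = I_global / H_0 (zero when H_0 is not positive)."""
    return i_global / h0 if h0 > 0.0 else 0.0


def persistence(kt, sunrise_idx, sunset_idx):
    """Persistence indicator psi for one day of hourly k_t values.

    Assumes sunrise_idx < sunset_idx and that both are valid indices into kt.
    Hours outside the daylight window are left at zero.
    """
    psi = [0.0] * len(kt)
    for t in range(sunrise_idx, sunset_idx + 1):
        if t == sunrise_idx:
            psi[t] = kt[t + 1]          # sunrise: successive hour only
        elif t == sunset_idx:
            psi[t] = kt[t - 1]          # sunset: previous hour only
        else:
            psi[t] = 0.5 * (kt[t - 1] + kt[t + 1])
    return psi


def daily_clearness(i_global_day, h0_day):
    """Daily clearness index K_t as the ratio of the daily sums."""
    total_h0 = sum(h0_day)
    return sum(i_global_day) / total_h0 if total_h0 > 0.0 else 0.0


def brl_diffuse_fraction(kt, ast, alpha_deg, daily_kt, psi):
    """Generic BRL diffuse fraction with the coefficients of Ridley et al. [75]."""
    exponent = (-5.38 + 6.63 * kt + 0.006 * ast - 0.007 * alpha_deg
                + 1.75 * daily_kt + 1.31 * psi)
    return 1.0 / (1.0 + math.exp(exponent))
```

With these helpers, the horizontal diffuse irradiance for an hour would follow as `d * i_global` and the horizontal direct irradiance as `(1 - d) * i_global`.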

Calculating Irradiance on a Tilted Plane
The BRL model estimates the diffuse and direct fraction of global irradiation on a horizontal plane; however, practical solar PV applications often incline modules to maximize electricity generation. Of the three components of solar radiation (direct, reflected, and diffuse), the first two can be calculated on an inclined surface by trigonometry [79]:

$$I_{b,T} = I_{b,h} \frac{\cos\theta}{\cos\theta_Z} \tag{8}$$

where $I_{b,T}$ is the incident direct radiation on a tilted surface, $I_{b,h}$ is the incident direct radiation on a horizontal surface, $\theta$ is the angle of incidence on the tilted surface, and $\theta_Z$ is the solar zenith angle (the angle of incidence on a horizontal surface).

$$I_{g,T} = I \rho_g \frac{1 - \cos\beta}{2} \tag{9}$$

where $I_{g,T}$ is the ground reflected radiation on a tilted module, $I$ is the global radiation on a horizontal surface, $\rho_g$ is the albedo, and $\beta$ is the angle of the module relative to the horizontal. Calculating diffuse irradiance on an inclined plane requires a second type of model, falling into one of three categories [80]. First generation models include isotropic models, which assume that diffuse radiation comes equally from all parts of the sky [81], and circumsolar models, which assume that diffuse radiation emanates from the direction of the solar disk. Second generation models differentiate the diffuse radiation from clear versus overcast sky, by introducing a horizon brightening component and circumsolar diffuse component [82], developing a modifier for overcast [82] and clear sky conditions [83], incorporating weighted diffuse components for both circumsolar and uniform isotropic skies [84], and developing a correction factor for horizon brightening [85]. Third generation models employ two or three diffuse components (isotropic, circumsolar, and horizon brightening), and include the Gueymard model [86], the Muneer model [80,87,88], and the Perez model [89].
The Perez model has been cited frequently, due to its accuracy in locations such as Spain [90], Italy [91], Switzerland [92], Iran [93], and 27 other worldwide sites [94]. However, it has been outperformed by site-specific models in Athens, Greece [95], and Valladolid, Spain [96]. While site-specific coefficients can improve the Perez model accuracy [97], the generic coefficients developed by Perez are suitable where site-specific coefficients are not available.
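The trigonometric relations for the direct and ground-reflected components described earlier in this section can be sketched in a few lines of Python. The function names are hypothetical, and the lower bound on the cosine of the zenith angle (here the cosine of roughly 85°) is an assumption added for this example to keep the ratio finite near sunrise and sunset; it is not a value taken from the paper.

```python
import math


def beam_on_tilt(i_bh, incidence_deg, zenith_deg, min_cos_zenith=0.087):
    """Direct irradiance on a tilted surface: I_bT = I_bh * cos(theta) / cos(theta_Z).

    min_cos_zenith is an assumed guard against division by a near-zero cosine
    when the sun is close to the horizon.
    """
    cos_theta = max(0.0, math.cos(math.radians(incidence_deg)))  # sun behind module -> 0
    cos_zenith = max(min_cos_zenith, math.cos(math.radians(zenith_deg)))
    return i_bh * cos_theta / cos_zenith


def ground_reflected_on_tilt(i_global, albedo, tilt_deg):
    """Ground-reflected irradiance: I_gT = I * rho_g * (1 - cos(beta)) / 2."""
    return i_global * albedo * (1.0 - math.cos(math.radians(tilt_deg))) / 2.0
```

A horizontal module (tilt of 0°) receives no ground-reflected irradiance under this relation, while a vertical module sees half of the albedo-weighted global irradiance.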

The Perez Model
The Perez model computes three sky components: an isotropic dome, a circumsolar component, and a horizon brightening component [98] (original), and [99] (updated). The model is built on a geometric description of the sky hemisphere, which superimposes on an isotropic background a circumsolar disk, to account for forward scattering by aerosols, and a horizon band, to account for multiple Rayleigh scattering and re-scattering near the horizon. Perez defines two parameters to describe these diffuse irradiance components: the circumsolar region radiance, which is equal to $F_1$ times the background, and the horizon region radiance, which is equal to $F_2$ times the background [99]. These two brightness coefficients are then established empirically as a function of insolation conditions. Perez proposes the following simplified version of the model:

$$D_c = D_h \left[ \frac{1 + \cos s}{2} (1 - F_1) + F_1 \frac{a}{c} + F_2 \sin s \right] \tag{10}$$

where $D_c$ is the diffuse irradiance on a tilted surface, $D_h$ is the diffuse irradiance on a horizontal surface, $s$ is the module tilt angle, $F_1$ is the circumsolar brightness coefficient, $F_2$ is the horizon brightness coefficient, $a$ is the circumsolar solid angle weighted by its average incidence on the slope, and $c$ is the circumsolar solid angle weighted by its average incidence on the horizontal. The parameters $F_1$, $F_2$, $a$, and $c$ are defined in detail in the Perez paper [99]. The solid angle values $a$ and $c$ were modified slightly in an updated revision, to correct for numerical errors that occurred during dawn and dusk hours, where the solar elevation angles approached zero [100]. In addition, lower bounds were placed on the zenith/elevation angles at dawn and dusk. The brightness coefficients ($F_1$ and $F_2$) are described by relatively simple analytic functions of the insolation conditions: the position of the sun ($Z$, the solar zenith angle), the brightness of the sky dome ($\Delta$, the horizontal diffuse irradiance normalized to extraterrestrial global irradiance, as defined below), and the clearness ($\varepsilon$, the sum of diffuse and direct normal irradiance divided by the diffuse irradiance).

$$\Delta = \frac{D_h \, m}{I_0} \tag{11}$$

where $m$ is the relative air mass, and $I_0$ is the normal incidence extraterrestrial radiation. Perez notes that this generic set of coefficients, which is intended to satisfy a broad climatic spectrum, could be re-derived for a local environment to minimize error.
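To make the structure of the simplified Perez model concrete, the sketch below evaluates the three-term sum for the tilted diffuse irradiance and the sky brightness parameter. It takes the brightness coefficients and the solid-angle terms as inputs, since their empirical lookup tables are given in the Perez papers [99,100] and are not reproduced here. The function names are hypothetical, and the clamp to non-negative irradiance is an assumption added for robustness.

```python
import math


def perez_tilted_diffuse(d_h, tilt_deg, f1, f2, a, c):
    """Diffuse irradiance on a tilted plane from the simplified Perez form.

    f1, f2: circumsolar and horizon brightness coefficients (supplied by the caller).
    a, c:   circumsolar solid-angle terms for the slope and the horizontal.
    """
    s = math.radians(tilt_deg)
    isotropic = 0.5 * (1.0 + math.cos(s)) * (1.0 - f1)
    circumsolar = f1 * a / c
    horizon = f2 * math.sin(s)
    # Assumed safeguard: irradiance cannot be negative.
    return max(0.0, d_h * (isotropic + circumsolar + horizon))


def sky_brightness(d_h, air_mass, i0_normal):
    """Sky brightness Delta = D_h * m / I_0."""
    return d_h * air_mass / i0_normal
```

Setting `f1 = f2 = 0` reduces the expression to the isotropic sky model, which gives a convenient sanity check for any implementation.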

Solar PV Generation Calculation
The irradiance on a tilted surface can be used to calculate solar PV electricity generation, given a module power curve. GRETA includes a range of power curves for currently available solar PV module technologies, as well as a user interface for entering another power curve of choice. The solar PV module efficiency is often dependent on module temperature ($T_m$); the following formula proposed by Sandia National Laboratories [101] is used for this adjustment:

$$T_m = E_{POA} \, e^{a + b \cdot WS} + T_a \tag{12}$$

where $T_m$ is the estimated module temperature, $E_{POA}$ is the solar irradiance incident on the module, $WS$ is the wind speed, $T_a$ is the ambient temperature in °C, and $a$ and $b$ are parameters that depend on the module construction and mount configuration. The Sandia Module Temperature Model specifies $a$ and $b$ parameter values for glass/cell/glass, glass/cell/polymer, polymer/thin-film/steel, and linear concentrator module constructions, as well as open rack, roof mount, insulated back, and tracker mounting configurations [101].
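The Sandia module temperature adjustment is a single expression, sketched below in Python. The default values of `a` and `b` are the commonly quoted Sandia parameters for an open-rack glass/cell/glass module and are used here only as an illustrative assumption; the paper does not state which configuration GRETA applies by default.

```python
import math


def sandia_module_temperature(e_poa, wind_speed, t_ambient, a=-3.47, b=-0.0594):
    """Estimated module temperature in deg C: T_m = E_POA * exp(a + b * WS) + T_a.

    e_poa:      plane-of-array irradiance in W/m^2
    wind_speed: wind speed in m/s
    t_ambient:  ambient air temperature in deg C
    a, b:       empirical parameters (assumed open-rack glass/cell/glass defaults)
    """
    return e_poa * math.exp(a + b * wind_speed) + t_ambient
```

For example, `sandia_module_temperature(1000.0, 1.0, 20.0)` yields a module temperature roughly 29 °C above ambient, and the estimate falls as wind speed increases.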

Calculation of Wind Speed at Hub Height
The MERRA atmospheric single-level diagnostics dataset contains the wind speed in the northward and eastward directions at 2 and 10 m above displacement height, and 50 m above the surface (tavg1_2d_slv_Nx data product). The Archer and Jacobson least squares fitting approach extrapolates the wind speed to hub height [102], using either:

$$V(z) = V_R \left( \frac{z}{z_R} \right)^{\alpha} \tag{13}$$

where $V(z)$ is the wind speed at elevation $z$, $V_R$ is the wind speed at reference elevation $z_R$, and $\alpha$ is a friction coefficient (typically 1/7), or:

$$V(z) = V_R \frac{\ln(z / z_0)}{\ln(z_R / z_0)} \tag{14}$$

where $z_0$ is the roughness length (typically 0.01 m). However, since both formulas use $V_R$ as a multiplying factor, they incorrectly assign a value of zero to the wind speed at every elevation $z$ when $V_R = 0$. For such cases, Archer and Jacobson derive an additional formula, Equation (15) [102], expressed in terms of $V_i$, the wind speed observed at vertical height $i$, and $N$, the selected number of free input data points. Equations (13)-(15) assume that wind speed increases with increasing height, and therefore assume the wrong concavity if the wind speed decreases with height; in these cases, Archer and Jacobson (2003) extrapolate to hub height using linear regression, Equation (16). The employed method is then chosen by minimizing the wind speed residual squared error, where the residual $R$ is defined as [102]:

$$R = \sum_{i=1}^{N} \left[ V_i - V(z_i) \right]^2 \tag{17}$$

where $V(z_i)$ is the wind speed calculated by each of the four equations. Setting the partial derivative of $R$ with respect to $\alpha$ in (13), and with respect to $z_0$ in (14), to zero and solving yields the least squares estimates $\alpha^{LS}$ and $\ln z_0^{LS}$, Equations (18) and (19) [102]. The four fitting parameters and their respective residuals are calculated for each hour. The best fitting parameter, which minimizes the residual, then determines the wind speed at hub height according to its respective equation [102].
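The selection logic described above can be illustrated with a simplified Python sketch that compares only the power-law and logarithmic profiles and picks whichever gives the smaller residual. This is not the full Archer and Jacobson procedure: the additional formula for a zero reference speed and the linear regression fallback are omitted, the log law uses the typical roughness length rather than a fitted one, and the exponent is fitted with a log-space least squares linearization that is an assumption of this example. All function names are hypothetical.

```python
import math


def power_law(z, v_ref, z_ref, alpha=1.0 / 7.0):
    """Power-law profile: V(z) = V_R * (z / z_R) ** alpha."""
    return v_ref * (z / z_ref) ** alpha


def log_law(z, v_ref, z_ref, z0=0.01):
    """Logarithmic profile: V(z) = V_R * ln(z / z0) / ln(z_R / z0)."""
    return v_ref * math.log(z / z0) / math.log(z_ref / z0)


def fit_alpha(heights, speeds, z_ref, v_ref):
    """Least squares exponent in log space (assumed linearization), default 1/7."""
    num = den = 0.0
    for z, v in zip(heights, speeds):
        if v > 0.0:
            log_z = math.log(z / z_ref)
            num += math.log(v / v_ref) * log_z
            den += log_z * log_z
    return num / den if den > 0.0 else 1.0 / 7.0


def residual(profile, heights, speeds):
    """Sum of squared differences between observed and modelled speeds."""
    return sum((v - profile(z)) ** 2 for z, v in zip(heights, speeds))


def extrapolate_to_hub(heights, speeds, hub_height, ref_index=0):
    """Return (profile name, hub-height speed) for the better-fitting profile."""
    z_ref, v_ref = heights[ref_index], speeds[ref_index]
    if v_ref <= 0.0:
        # The zero-reference-speed case needs the additional formula, not sketched here.
        raise ValueError("reference wind speed must be positive for these profiles")
    alpha = fit_alpha(heights, speeds, z_ref, v_ref)
    candidates = {
        "power_law": lambda z: power_law(z, v_ref, z_ref, alpha),
        "log_law": lambda z: log_law(z, v_ref, z_ref),
    }
    best = min(candidates, key=lambda name: residual(candidates[name], heights, speeds))
    return best, candidates[best](hub_height)
```

A call such as `extrapolate_to_hub([2.0, 10.0, 50.0], speeds, 80.0, ref_index=1)` uses the 10 m level as the reference and returns both the chosen profile and the extrapolated speed.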

Wind Generation Calculation
The wind speed at hub height can then be used to calculate a generation value using a given turbine technology's power curve. GRETA includes a range of currently available wind turbine technologies, as well as a user interface for entering a power curve of choice. The wind power production is dependent on the air density at hub height ($\rho$), which is calculated at surface height directly using MERRA parameters and the equation of state formula. The change in air pressure from the surface to hub height is then calculated iteratively using the vertical fluid pressure formula, and applying the typical temperature variation of 6 K per kilometer of height.
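The two steps in this section, the air density at hub height and the power curve lookup, can be sketched as follows. The sketch assumes dry air (specific gas constant of 287.05 J/(kg K)), a fixed 1 m integration step for the hydrostatic pressure change, and linear interpolation between power curve points with zero output outside the tabulated range. These choices, and the function names, are assumptions for illustration; the paper does not specify them.

```python
R_DRY_AIR = 287.05   # J/(kg K), assumed dry-air gas constant
GRAVITY = 9.80665    # m/s^2
LAPSE_RATE = 0.006   # K/m, i.e. 6 K per kilometer as stated in the text


def air_density(pressure_pa, temperature_k):
    """Equation of state for an ideal gas: rho = p / (R * T)."""
    return pressure_pa / (R_DRY_AIR * temperature_k)


def density_at_hub(surface_pressure_pa, surface_temp_k, hub_height_m, step_m=1.0):
    """Step upward from the surface, applying dp = -rho * g * dz at each step."""
    pressure, temperature, z = surface_pressure_pa, surface_temp_k, 0.0
    while z < hub_height_m:
        dz = min(step_m, hub_height_m - z)
        pressure -= air_density(pressure, temperature) * GRAVITY * dz
        temperature -= LAPSE_RATE * dz
        z += dz
    return air_density(pressure, temperature)


def power_from_curve(wind_speed, curve):
    """Linear interpolation on (speed, power) points sorted by strictly increasing speed.

    Output is zero outside the tabulated range (below cut-in or above cut-out).
    """
    if wind_speed < curve[0][0] or wind_speed > curve[-1][0]:
        return 0.0
    for (v0, p0), (v1, p1) in zip(curve, curve[1:]):
        if v0 <= wind_speed <= v1:
            return p0 + (p1 - p0) * (wind_speed - v0) / (v1 - v0)
    return curve[-1][1]
```

As a usage example with a hypothetical three-point curve, `power_from_curve(7.5, [(3.0, 0.0), (12.0, 2000.0), (25.0, 2000.0)])` interpolates halfway up the ramp between cut-in and rated speed.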

Conclusions
To facilitate a wide variety of wind and solar PV assessments by researchers and practitioners alike, we have developed GRETA, a web tool that can be used to calculate historical hourly wind and solar generation time series data. GRETA is currently freely available at the following URL: energy.utoronto.ca/GRETA. The downloadable comma-separated values (CSV) format data files vary in size depending on the job request, and require CSV-reading software to access. The key equations used to create these wind and solar PV generation time series datasets are herein summarized. GRETA implements these formulations on publicly available data, while providing users with the ability to create their own datasets with their own input parameters. Compared to existing wind and solar resource datasets, the current approach has the advantages of being free and of offering hourly temporal resolution, global spatial coverage, explicit modelling algorithms, and convenient visualization and data download.

Figure 1. Workflow of the model process.

Figure 3. Example average solar power output for Ontario, Canada.
